Trajectory Format

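Hermes Agent saves conversation trajectories in ShareGPT-compatible JSONL format. They serve as training data, debugging artifacts, and reinforcement learning datasets.

Source files: `agent/trajectory.py`, `run_agent.py` (search for `_save_trajectory`), `batch_runner.py`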

File Naming Convention

Trajectories are written to files in the current working directory:

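| File | When |
|------|------|
| `trajectory_samples.jsonl` | Conversations that completed successfully (`completed=True`) |
| `failed_trajectories.jsonl` | Conversations that failed or were interrupted (`completed=False`) |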

The batch runner (`batch_runner.py`) writes to a custom output file per batch (e.g., `batch_001_output.jsonl`), and its entries carry additional metadata fields.

You can override the filename through the `filename` parameter of `save_trajectory()`.
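The naming rule above can be sketched as a small helper. This is an illustration only, not the code in `agent/trajectory.py`: the function name `append_trajectory` and its `directory` argument are hypothetical, and only the two default filenames and the `completed` flag come from this page.

```python
import json
from pathlib import Path
from typing import Optional


def append_trajectory(entry: dict, directory: str = ".",
                      filename: Optional[str] = None) -> Path:
    """Append one entry as a single JSON line and return the file written to.

    Hypothetical helper: it mirrors the documented naming rule only.
    """
    if filename is None:
        # Default filenames are chosen by the `completed` flag.
        filename = ("trajectory_samples.jsonl" if entry.get("completed")
                    else "failed_trajectories.jsonl")
    path = Path(directory) / filename
    # JSONL: one self-contained JSON object per line, appended.
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return path
```

Passing `filename="custom.jsonl"` bypasses the default choice, which is the same effect the `filename` parameter is described as having.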

JSONL Entry Format

Each line in a file is a self-contained JSON object. There are two variants:

CLI/Interactive Format (from `_save_trajectory`)

{
  "conversations": [... ],
  "timestamp": "2026-03-30T14:22:31.456789",
  "model": "anthropic/claude-sonnet-4.6",
  "completed": true
}

Batch Runner Format (from `batch_runner.py`)

{
  "prompt_index": 42,
  "conversations": [... ],
  "metadata": { "prompt_source": "gsm8k", "difficulty": "hard" },
  "completed": true,
  "partial": false,
  "api_calls": 7,
  "toolsets_used": ["code_tools", "file_tools"],
  "tool_stats": {
    "terminal": {"count": 3, "success": 3, "failure": 0},
    "read_file": {"count": 2, "success": 2, "failure": 0},
    "write_file": {"count": 0, "success": 0, "failure": 0}
  },
  "tool_error_counts": {
    "terminal": 0,
    "read_file": 0,
    "write_file": 0
  }
}

The `tool_stats` and `tool_error_counts` dictionaries are normalized to include every possible tool (from `model_tools.TOOL_TO_TOOLSET_MAP`) with zero defaults. This gives all entries a consistent schema for HuggingFace dataset loading.
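A minimal sketch of that zero-fill normalization follows. The `ALL_TOOLS` list and the function name `normalize_tool_stats` are assumptions made for illustration; the real tool list is derived from `model_tools.TOOL_TO_TOOLSET_MAP`, which is not reproduced here.

```python
# Hypothetical stand-in for the keys of model_tools.TOOL_TO_TOOLSET_MAP.
ALL_TOOLS = ["terminal", "read_file", "write_file"]


def normalize_tool_stats(raw_stats: dict, raw_error_counts: dict,
                         all_tools=ALL_TOOLS):
    """Return (tool_stats, tool_error_counts) with one key per known tool."""
    tool_stats = {}
    tool_error_counts = {}
    for name in all_tools:
        seen = raw_stats.get(name, {})
        # Tools that were never called still get a fully populated record.
        tool_stats[name] = {
            "count": seen.get("count", 0),
            "success": seen.get("success", 0),
            "failure": seen.get("failure", 0),
        }
        tool_error_counts[name] = raw_error_counts.get(name, 0)
    return tool_stats, tool_error_counts
```

Because every entry ends up with the same keys, a columnar loader sees one schema regardless of which tools a particular run actually used.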


## Conversations Array (ShareGPT Format) \{#conversations-array-sharegpt-format}

The `conversations` array uses ShareGPT role conventions:

| API Role | ShareGPT `from` |
|----------|-----------------|
| system | `"system"` |
| user | `"human"` |
| assistant | `"gpt"` |
| tool | `"tool"` |
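The role mapping in the table can be expressed as a lookup. This is a simplified sketch: `to_sharegpt_turn` is a hypothetical name, and the actual converter also applies the reasoning and tool normalization that this page covers separately.

```python
# Mapping taken directly from the table above.
ROLE_TO_SHAREGPT = {
    "system": "system",
    "user": "human",
    "assistant": "gpt",
    "tool": "tool",
}


def to_sharegpt_turn(message: dict) -> dict:
    """Convert one API-style message into a ShareGPT turn (roles only)."""
    return {
        "from": ROLE_TO_SHAREGPT[message["role"]],
        # API messages may carry content=None (e.g. pure tool-call turns).
        "value": message.get("content") or "",
    }
```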

### Complete Example \{#complete-example}

```json
{
  "conversations": [
    {
      "from": "system",
      "value": "You are a function calling AI model. You are provided with function signatures within <tools> </tools> XML tags. You may call one or more functions to assist with the user query. If available tools are not relevant in assisting with user query, just respond in natural conversational language. Don't make assumptions about what values to plug into functions. After calling & executing the functions, you will be provided with function results within <tool_response> </tool_response> XML tags. Here are the available tools:\n<tools>\n[{\"name\": \"terminal\", \"description\": \"Execute shell commands\", \"parameters\": {\"type\": \"object\", \"properties\": {\"command\": {\"type\": \"string\"}}}, \"required\": null}]\n</tools>\nFor each function call return a JSON object, with the following pydantic model json schema for each:\n{'title': 'FunctionCall', 'type': 'object', 'properties': {'name': {'title': 'Name', 'type': 'string'}, 'arguments': {'title': 'Arguments', 'type': 'object'}}, 'required': ['name', 'arguments']}\nEach function call should be enclosed within <tool_call> </tool_call> XML tags.\nExample:\n<tool_call>\n{'name': <function-name>,'arguments': <args-dict>}\n</tool_call>"
    },
    {
      "from": "human",
      "value": "What Python version is installed?"
    },
    {
      "from": "gpt",
      "value": "<think>\nThe user wants to know the Python version. I should run python3 --version.\n</think>\n<tool_call>\n{\"name\": \"terminal\", \"arguments\": {\"command\": \"python3 --version\"}}\n</tool_call>"
    },
    {
      "from": "tool",
      "value": "<tool_response>\n{\"tool_call_id\": \"call_abc123\", \"name\": \"terminal\", \"content\": \"Python 3.11.6\"}\n</tool_response>"
    },
    {
      "from": "gpt",
      "value": "<think>\nGot the version. I can now answer the user.\n</think>\nPython 3.11.6 is installed on this system."
    }
  ],
  "timestamp": "2026-03-30T14:22:31.456789",
  "model": "anthropic/claude-sonnet-4.6",
  "completed": true
}
```

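Normalization Rules

Reasoning Content Markup

The trajectory converter normalizes all reasoning into `<think>` tags, regardless of how the model originally produced it:

Native thinking tokens (the `msg["reasoning"]` field from providers such as Anthropic and the OpenAI o-series): wrapped as `<think>\n{reasoning}\n</think>\n` and prepended to the content.
REASONING_SCRATCHPAD XML (used when native thinking is disabled and the model reasons through XML requested by the system prompt): `<REASONING_SCRATCHPAD>` tags are converted to `<think>` by `convert_scratchpad_to_think()`.
Empty think blocks: every `gpt` turn is guaranteed to contain a `<think>` block. If no reasoning was produced, an empty block `<think>\n</think>\n` is inserted, which keeps the format consistent for training data.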
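The three cases above can be sketched in one function. This is an approximation under stated assumptions: `normalize_reasoning` is a hypothetical name, and the plain tag replacement stands in for `convert_scratchpad_to_think()`, whose actual behavior may differ.

```python
from typing import Optional


def normalize_reasoning(content: Optional[str],
                        reasoning: Optional[str] = None) -> str:
    """Return assistant content that always starts with or contains <think>."""
    content = content or ""
    if reasoning:
        # Case 1: native thinking tokens are wrapped and prepended.
        return f"<think>\n{reasoning}\n</think>\n{content}"
    # Case 2: scratchpad XML is renamed to <think> (simplified stand-in
    # for convert_scratchpad_to_think()).
    content = (content
               .replace("<REASONING_SCRATCHPAD>", "<think>")
               .replace("</REASONING_SCRATCHPAD>", "</think>"))
    if "<think>" not in content:
        # Case 3: no reasoning at all, so insert an empty block.
        content = "<think>\n</think>\n" + content
    return content
```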

Tool Call Normalization

Tool calls in API format (with `tool_call_id`, a function name, and arguments as a JSON string) are converted to XML-wrapped JSON:

<tool_call>
{"name": "terminal", "arguments": {"command": "ls -la"}}
</tool_call>

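Arguments are parsed from JSON strings back into objects, so they are not double-encoded.
If JSON parsing fails (which should not happen, because arguments are validated during the conversation), an empty `{}` is used and a warning is logged.
Multiple tool calls in one assistant turn produce multiple `<tool_call>` blocks in a single `gpt` message.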
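A minimal sketch of this conversion, assuming OpenAI-style tool call dictionaries of the form `{"id": ..., "function": {"name": ..., "arguments": "<JSON string>"}}`. The function name `format_tool_calls` is hypothetical and the input shape is an assumption, not something this page specifies.

```python
import json
import logging

logger = logging.getLogger(__name__)


def format_tool_calls(tool_calls: list) -> str:
    """Render API-style tool calls as newline-joined <tool_call> blocks."""
    blocks = []
    for call in tool_calls:
        fn = call["function"]
        try:
            # Parse the JSON string so arguments are not double-encoded.
            args = json.loads(fn["arguments"])
        except (json.JSONDecodeError, TypeError):
            logger.warning("Unparseable arguments for %s; using {}", fn["name"])
            args = {}
        payload = json.dumps({"name": fn["name"], "arguments": args},
                             ensure_ascii=False)
        blocks.append(f"<tool_call>\n{payload}\n</tool_call>")
    # One block per call, all inside the same gpt message.
    return "\n".join(blocks)
```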

Tool Response Normalization

All tool results that follow an assistant message are grouped into a single `tool` turn containing XML-wrapped JSON responses:

<tool_response>
{"tool_call_id": "call_abc123", "name": "terminal", "content": "output here"}
</tool_response>

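If the tool content looks like JSON (it starts with `{` or `[`), it is parsed so that the `content` field holds a JSON object or array rather than a string.
Multiple tool results are joined with newlines into one message.
Tool names are matched by position against the parent assistant's `tool_calls` array.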
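The grouping rules above might look like the following sketch. The function name `format_tool_responses`, the OpenAI-style message shapes, and the `"unknown"` fallback name for unmatched results are all assumptions made for illustration.

```python
import json


def format_tool_responses(assistant_tool_calls: list,
                          tool_messages: list) -> dict:
    """Group consecutive tool results into a single ShareGPT `tool` turn."""
    blocks = []
    for index, msg in enumerate(tool_messages):
        # Tool name is taken by position from the parent tool_calls array.
        if index < len(assistant_tool_calls):
            name = assistant_tool_calls[index]["function"]["name"]
        else:
            name = "unknown"  # assumed fallback, not from the source
        content = msg.get("content", "")
        # JSON-looking strings are parsed so content is an object/array.
        if isinstance(content, str) and content.strip().startswith(("{", "[")):
            try:
                content = json.loads(content)
            except json.JSONDecodeError:
                pass  # keep the raw string if it only looked like JSON
        payload = json.dumps(
            {"tool_call_id": msg.get("tool_call_id", ""),
             "name": name,
             "content": content},
            ensure_ascii=False,
        )
        blocks.append(f"<tool_response>\n{payload}\n</tool_response>")
    return {"from": "tool", "value": "\n".join(blocks)}
```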

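System Message

The system message is generated at save time rather than taken from the conversation. It follows the Hermes function-calling prompt template and contains:

A preamble describing the function-calling protocol
A `<tools>` XML block containing the JSON tool definitions
A schema reference for `FunctionCall` objects
A `<tool_call>` example

Tool definitions include `name`, `description`, `parameters`, and `required` (set to `null` to match the canonical format).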
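As a rough illustration of the tool-definition shape, the sketch below builds only the `<tools>` block. It assumes the input tools use the OpenAI-style `{"type": "function", "function": {...}}` schema; both function names are hypothetical, and the preamble and schema text of the real template are not reproduced.

```python
import json


def to_hermes_tool_definition(tool: dict) -> dict:
    """Reduce a tool schema to the four documented fields."""
    fn = tool.get("function", tool)  # accept wrapped or bare definitions
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "parameters": fn.get("parameters", {}),
        "required": None,  # serialized as null, per the canonical format
    }


def render_tools_block(tools: list) -> str:
    """Render the <tools> XML block that holds the JSON definitions."""
    definitions = [to_hermes_tool_definition(t) for t in tools]
    return "<tools>\n" + json.dumps(definitions, ensure_ascii=False) + "\n</tools>"
```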

Loading Trajectories

Trajectories are standard JSONL. Load them with any JSON Lines reader:

import json

def load_trajectories(path: str):
    """Load trajectory entries from a JSONL file."""
    entries = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                entries.append(json.loads(line))
    return entries

# Filter to successful completions only
successful = [e for e in load_trajectories("trajectory_samples.jsonl")
              if e.get("completed")]

# Extract just the conversations for training
training_data = [e["conversations"] for e in successful]

Loading with HuggingFace Datasets

from datasets import load_dataset

ds = load_dataset("json", data_files="trajectory_samples.jsonl")

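The normalized `tool_stats` schema ensures that all entries have the same columns, which prevents Arrow schema mismatch errors during dataset loading.

Controlling Trajectory Saving

In the CLI, trajectory saving is controlled by: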

# config.yaml
agent:
  save_trajectories: true  # default: false

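Alternatively, pass the `--save-trajectories` flag. When the agent is initialized with `save_trajectories=True`, the `_save_trajectory()` method is called at the end of each conversation turn.

The batch runner always saves trajectories, since that is its primary purpose.

Samples with zero reasoning across all turns are automatically discarded by the batch runner, to avoid polluting training data with non-reasoning examples.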
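A comparable check can be applied when post-filtering saved files. The sketch below is one possible reading of "zero reasoning": every `<think>` block in every `gpt` turn is empty or whitespace. The function name `has_any_reasoning` is hypothetical, and the batch runner's actual criterion may differ.

```python
import re

# Captures the body of each <think>...</think> block, across newlines.
THINK_BLOCK = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def has_any_reasoning(entry: dict) -> bool:
    """True if at least one gpt turn has a non-empty <think> block."""
    for turn in entry.get("conversations", []):
        if turn.get("from") != "gpt":
            continue
        bodies = THINK_BLOCK.findall(turn.get("value", ""))
        if any(body.strip() for body in bodies):
            return True
    return False
```

With this helper, `[e for e in entries if has_any_reasoning(e)]` keeps only entries that contain some reasoning.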
