SFT (LLM)

Dataset Format

Choose the file type for your dataset. Currently supported types are:

jsonl (JSON Lines)
zip: The dataset directory should be archived in a .zip file and stored in object storage.
Example zip command: cd path/to/dataset_dir && zip -r dataset_dir.zip ./*
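
For pipelines that build the archive from a script, the same step can be sketched in Python with the standard library. The function name zip_dataset_dir and the file name train.jsonl are hypothetical, and the layout expected inside the directory is not specified here. Note one difference: shutil.make_archive also includes top-level hidden files, which the ./* glob in the shell command skips.

```python
import os
import shutil
import tempfile
import zipfile


def zip_dataset_dir(dataset_dir, archive_stem):
    """Archive the contents of dataset_dir into <archive_stem>.zip.

    Paths inside the archive are relative to dataset_dir, as when
    running zip from inside the directory.
    """
    return shutil.make_archive(archive_stem, "zip", root_dir=dataset_dir)


def demo():
    """Build a throwaway dataset directory, zip it, and list the archive."""
    with tempfile.TemporaryDirectory() as tmp:
        dataset_dir = os.path.join(tmp, "dataset_dir")
        os.makedirs(dataset_dir)
        # Hypothetical file name, used only for this demonstration.
        with open(os.path.join(dataset_dir, "train.jsonl"), "w", encoding="utf-8") as f:
            f.write('{"messages": []}\n')
        archive = zip_dataset_dir(dataset_dir, os.path.join(tmp, "dataset_dir"))
        with zipfile.ZipFile(archive) as zf:
            return os.path.basename(archive), zf.namelist()
```

Calling demo() returns the archive file name and the list of entries stored in it, which is a quick way to confirm that files sit at the archive root rather than under an extra parent folder.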

Each line in a .jsonl file should represent one complete training example. Two formats are supported: ShareGPT and OpenAI SFT.

ShareGPT Format

{
  "system": "<system>",
  "conversation": [
    {"human": "<query1>", "assistant": "<response1>"},
    {"human": "<query2>", "assistant": "<response2>"}
  ]
}

Example JSONL File

{"system": "...", "conversation": [{"human": "...", "assistant": "..."}]}
{"system": "...", "conversation": [{"human": "...", "assistant": "..."}]}
{"system": "...", "conversation": [{"human": "...", "assistant": "..."}]}
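
A minimal sketch of producing such a file in Python, assuming records are already held in memory as dictionaries. The helper names write_jsonl, read_jsonl, and roundtrip, and the sample text, are illustrative only. Serializing with json.dumps keeps each record on a single line because newlines inside strings are escaped.

```python
import json
import os
import tempfile


def write_jsonl(records, path):
    """Write one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            # json.dumps escapes newlines inside strings, so every
            # record occupies exactly one line of the file.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_jsonl(path):
    """Parse every non-empty line as a standalone JSON object."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


# Illustrative records only; the text is made up for this sketch.
records = [
    {"system": "You are a helpful assistant.",
     "conversation": [{"human": "Hi", "assistant": "Hello!\nHow can I help?"}]},
    {"system": "You are a terse assistant.",
     "conversation": [{"human": "2 + 2?", "assistant": "4"}]},
]


def roundtrip(records):
    """Write records to a temporary file and return (line count, parsed records)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "train.jsonl")
        write_jsonl(records, path)
        with open(path, encoding="utf-8") as f:
            line_count = sum(1 for _ in f)
        return line_count, read_jsonl(path)
```

Reading the file back with read_jsonl before uploading is a cheap check that every line parses on its own.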

Message fields

system: The initial system instruction that sets the behavior or tone for the assistant.
conversation: A list of human-assistant message pairs forming the dialogue history.
- human: A user query or input in the conversation.
- assistant: The assistant’s response to the corresponding human input.
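
The field descriptions above can be turned into a small pre-upload check. This is a sketch, not the platform's own validation: the function name validate_sharegpt is hypothetical, and treating system as optional is an assumption, since the text does not say whether it is required.

```python
def validate_sharegpt(record):
    """Return a list of problems found in one ShareGPT-style record."""
    if not isinstance(record, dict):
        return ["record is not a JSON object"]

    problems = []
    # Assumption: "system" may be omitted, but must be a string when present.
    if "system" in record and not isinstance(record["system"], str):
        problems.append("'system' must be a string")

    conversation = record.get("conversation")
    if not isinstance(conversation, list) or not conversation:
        problems.append("'conversation' must be a non-empty list")
        return problems

    for i, turn in enumerate(conversation):
        if not isinstance(turn, dict):
            problems.append(f"conversation[{i}] must be an object")
            continue
        # Each turn pairs one human query with one assistant response.
        for key in ("human", "assistant"):
            if not isinstance(turn.get(key), str):
                problems.append(f"conversation[{i}].{key} must be a string")
    return problems
```

An empty list means the record matches the structure described here; any other result names the offending field by position.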

OpenAI SFT Format

{
  "messages": [
    {"role": "system", "content": "<system>"},
    {"role": "user", "content": "<query1>"},
    {"role": "assistant", "content": "<response1>"},
    {"role": "user", "content": "<query2>"},
    {"role": "assistant", "content": "<response2>"}
  ]
}

Example JSONL File

{"messages": [{"role": "...", "content": "..."}]}
{"messages": [{"role": "...", "content": "..."}]}
{"messages": [{"role": "...", "content": "..."}]}

Message fields

messages: An ordered list of role-based messages representing a full conversation.
- role: The identity of the message sender (e.g., system, user, assistant).
- content: The actual text of the message corresponding to the role.
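
Because both formats carry the same information, a ShareGPT record can be mapped onto the messages layout mechanically. The sketch below assumes that human turns correspond to the user role and that an empty or missing system prompt is simply omitted; the function name sharegpt_to_messages is hypothetical.

```python
def sharegpt_to_messages(record):
    """Convert one ShareGPT-style record into the OpenAI SFT layout."""
    messages = []

    # Assumption: an empty or missing system prompt is left out entirely.
    system = record.get("system")
    if system:
        messages.append({"role": "system", "content": system})

    # Each human/assistant pair becomes two consecutive messages.
    for turn in record["conversation"]:
        messages.append({"role": "user", "content": turn["human"]})
        messages.append({"role": "assistant", "content": turn["assistant"]})

    return {"messages": messages}


# Placeholder values mirror the samples shown in this section.
example = {
    "system": "<system>",
    "conversation": [
        {"human": "<query1>", "assistant": "<response1>"},
        {"human": "<query2>", "assistant": "<response2>"},
    ],
}
converted = sharegpt_to_messages(example)
```

Applied to the ShareGPT sample from this page, the result has the same five messages, in the same order, as the OpenAI SFT sample.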
