
# SFT (LLM)

> Provides schema examples for structuring conversations in SFT training.

## **Dataset Format**

Choose the file type for your dataset. Currently supported types are:

* `jsonl` (JSON Lines)
* `zip`: The dataset directory should be archived in a `.zip` file and stored in object storage.\
  Example zip command: `cd path/to/dataset_dir && zip -r dataset_dir.zip ./*`

Each line in a `.jsonl` file should represent a complete training example. The supported format styles are:

## **ShareGPT Format**

```json theme={null}
{
  "system": "<system>",
  "conversation": [
    {"human": "<query1>", "assistant": "<response1>"},
    {"human": "<query2>", "assistant": "<response2>"}
  ]
}
```

## Example JSONL File

```json theme={null}
{"system": "...", "conversation": [{"human": "...", "assistant": "..."}]}
{"system": "...", "conversation": [{"human": "...", "assistant": "..."}]}
{"system": "...", "conversation": [{"human": "...", "assistant": "..."}]}
```

### **Message fields**

* `system`: The initial system instruction that sets the behavior or tone for the assistant.
* `conversation`: A list of human-assistant message pairs forming the dialogue history.
  * `human`: A user query or input in the conversation.
  * `assistant`: The assistant's response to the corresponding human input.
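
As a minimal sketch of how a file in this format could be checked before upload, the Python below validates one JSONL line against the fields described above. The function name `validate_sharegpt_line` is hypothetical, and the strictness of the checks (string values, a non-empty `conversation`, `system` treated as optional) is an assumption for illustration, not a documented platform rule.

```python
import json


def validate_sharegpt_line(line: str) -> list[str]:
    """Return a list of problems found in one JSONL line (empty if none).

    Assumption: every field is a string and each turn carries both
    "human" and "assistant"; the platform's real checks may differ.
    """
    try:
        record = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(record, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    # "system" is treated as optional here, but must be a string if present.
    if not isinstance(record.get("system", ""), str):
        problems.append('"system" must be a string')

    conversation = record.get("conversation")
    if not isinstance(conversation, list) or not conversation:
        return problems + ['"conversation" must be a non-empty list']

    for i, turn in enumerate(conversation):
        if not isinstance(turn, dict):
            problems.append(f"turn {i} must be an object")
            continue
        for key in ("human", "assistant"):
            if not isinstance(turn.get(key), str):
                problems.append(f'turn {i}: "{key}" must be a string')
    return problems


example = {
    "system": "You are a helpful assistant.",
    "conversation": [{"human": "Hi", "assistant": "Hello!"}],
}
print(validate_sharegpt_line(json.dumps(example)))  # prints []
```

Running the function over every line of a `.jsonl` file and reporting the line numbers with non-empty results is one simple way to catch malformed examples early.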

## **OpenAI SFT Format**

```json theme={null}
{
  "messages": [
    {"role": "system", "content": "<system>"},
    {"role": "user", "content": "<query1>"},
    {"role": "assistant", "content": "<response1>"},
    {"role": "user", "content": "<query2>"},
    {"role": "assistant", "content": "<response2>"}
  ]
}
```

## Example JSONL File

```json theme={null}
{"messages": [{"role": "...", "content": "..."}]}
{"messages": [{"role": "...", "content": "..."}]}
{"messages": [{"role": "...", "content": "..."}]}
```

### **Message fields**

* `messages`: A sequential list of role-based messages representing a full conversation.
* `role`: The identity of the message sender (e.g., `system`, `user`, `assistant`).
* `content`: The actual text of the message corresponding to the role.
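
The two formats carry the same information, so a dataset written in the ShareGPT layout can be rewritten into the role-based layout. The sketch below assumes that `human` maps to the `user` role and that an empty or missing `system` field is simply skipped; the helper name `sharegpt_to_messages` is hypothetical and not part of any platform API.

```python
import json


def sharegpt_to_messages(record: dict) -> dict:
    """Convert one ShareGPT-style record into the role-based messages layout.

    Assumption: "human" corresponds to the "user" role, and a missing or
    empty "system" field produces no system message.
    """
    messages = []
    if record.get("system"):
        messages.append({"role": "system", "content": record["system"]})
    for turn in record["conversation"]:
        messages.append({"role": "user", "content": turn["human"]})
        messages.append({"role": "assistant", "content": turn["assistant"]})
    return {"messages": messages}


record = {
    "system": "You are a helpful assistant.",
    "conversation": [{"human": "Hi", "assistant": "Hello!"}],
}
# json.dumps emits a single line, which is what one JSONL row needs.
print(json.dumps(sharegpt_to_messages(record)))
```

Writing one `json.dumps(...)` result per line produces a `.jsonl` file in the role-based format.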
