Advanced Configuration


{
  "input_features": [
    {
      "name": "question",
      "type": "text",
      "preprocessing": {
        "max_sequence_length": 4096
      }
    }
  ],
  "output_features": [
    {
      "name": "answer",
      "type": "text",
      "preprocessing": {
        "max_sequence_length": 4096
      }
    }
  ],
  "quantization": {
    "bits": 4,
    "llm_int8_threshold": 6,
    "llm_int8_has_fp16_weight": false,
    "bnb_4bit_compute_dtype": "float16",
    "bnb_4bit_use_double_quant": true,
    "bnb_4bit_quant_type": "nf4"
  },
  "trainer": {
    "type": "finetune",
    "learning_rate_scheduler": {
      "warmup_fraction": 0.01,
      "decay": "linear"
    }
  },
  "preprocessing": {
    "sample_ratio": 1
  },
  "backend": {
    "type": "local"
  }
}

The training configuration provides a flexible way to define the inputs, outputs, and other advanced settings for your model.

  • Max Sequence Length (input and output features): The "max_sequence_length" parameter in the config sets the maximum number of tokens in the prompt and the response, respectively.
  • If quantization of the model is not required (for smaller models such as 2B or 3B), you can remove the "quantization" key from the advanced configuration so that all parameters are used in full precision. This generally yields better accuracy, although training time increases.
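For example, a full-precision variant of the configuration above simply omits the "quantization" block; every other key stays the same:

```json
{
  "input_features": [
    {
      "name": "question",
      "type": "text",
      "preprocessing": {
        "max_sequence_length": 4096
      }
    }
  ],
  "output_features": [
    {
      "name": "answer",
      "type": "text",
      "preprocessing": {
        "max_sequence_length": 4096
      }
    }
  ],
  "trainer": {
    "type": "finetune",
    "learning_rate_scheduler": {
      "warmup_fraction": 0.01,
      "decay": "linear"
    }
  },
  "preprocessing": {
    "sample_ratio": 1
  },
  "backend": {
    "type": "local"
  }
}
```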

Quantization Config

  {
    "bits": 4,
    "llm_int8_threshold": 6,
    "llm_int8_has_fp16_weight": false,
    "bnb_4bit_compute_dtype": "float16",
    "bnb_4bit_use_double_quant": true,
    "bnb_4bit_quant_type": "nf4"
  }
  • Ensure that the names of the input and output features match the corresponding dataset columns exactly (e.g., "question" for the input and "answer" for the output).
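As a quick sanity check before launching a run, you can verify that every configured feature name exists as a column in your dataset. This is an illustrative sketch, not part of the training tool; the helper name and the in-memory CSV are made up for the example.

```python
import csv
import io

# Hypothetical helper: confirm every configured input/output feature name
# exists as a column in the dataset before training starts.
def check_feature_columns(config, csv_file):
    reader = csv.DictReader(csv_file)
    columns = set(reader.fieldnames or [])
    features = config["input_features"] + config["output_features"]
    missing = [f["name"] for f in features if f["name"] not in columns]
    if missing:
        raise ValueError(f"Features not found in dataset columns: {missing}")
    return True

config = {
    "input_features": [{"name": "question", "type": "text"}],
    "output_features": [{"name": "answer", "type": "text"}],
}

# In-memory stand-in for a real training CSV with matching columns.
sample = io.StringIO("question,answer\nWhat is 2+2?,4\n")
print(check_feature_columns(config, sample))
```

If a feature name does not match any column (say the config uses "question" but the dataset column is "prompt"), the check fails immediately instead of partway through preprocessing.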

Prompt Template

In cases where you want to form a prompt from multiple columns, you can use a prompt template to combine them. Each placeholder in curly braces is filled in from the dataset column of the same name.

Config with a prompt template:

{
  "prompt": {
    "template": "{system_prompt}\n\n {question}\n\n Response:\n"
  },
  "input_features": [
    {
      "name": "prompt",
      "type": "text",
      "preprocessing": {
        "max_sequence_length": 4096
      }
    }
  ],
  "output_features": [
    {
      "name": "response",
      "type": "text",
      "preprocessing": {
        "max_sequence_length": 4096
      }
    }
  ],
  "quantization": {
    "bits": 4,
    "llm_int8_threshold": 6,
    "llm_int8_has_fp16_weight": false,
    "bnb_4bit_compute_dtype": "float16",
    "bnb_4bit_use_double_quant": true,
    "bnb_4bit_quant_type": "nf4"
  },
  "trainer": {
    "type": "finetune",
    "learning_rate_scheduler": {
      "warmup_fraction": 0.01,
      "decay": "linear"
    }
  },
  "preprocessing": {
    "sample_ratio": 1
  },
  "backend": {
    "type": "local"
  }
}
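To see what the template produces, here is a minimal sketch that fills the template from the config above using Python's str.format; the row values are made up for illustration.

```python
# The prompt template from the config above; each {placeholder} is filled
# from the dataset column of the same name.
template = "{system_prompt}\n\n {question}\n\n Response:\n"

# Made-up example row with the two columns the template references.
row = {
    "system_prompt": "You are a helpful assistant.",
    "question": "What is the capital of France?",
}

prompt = template.format(**row)
print(prompt)
# The rendered prompt contains the system prompt, the question,
# and the trailing "Response:" marker for the model to complete.
```

Note that the combined feature is named "prompt" in input_features, so the model sees the fully rendered string rather than the individual columns.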