You can view the supported models here.
Upload your training dataset. Refer to Dataset Preparation for more info.
The supported file format is JSONL.
Please note that pre-processing is not handled by our training suite.
The data should already be in the required JSONL format, with any necessary transformations (such as image resizing, rotation, or PDF/image preprocessing) applied beforehand.
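For example, a minimal pre-processing sketch might look like the following. This assumes a simple prompt/completion schema and the Pillow library for image handling; the field names are illustrative, not a required schema.

```python
# Minimal pre-processing sketch: resize images and write a prompt/completion
# JSONL file before upload. Field names here are illustrative only.
import json
from PIL import Image

def resize_image(src_path: str, dst_path: str, size=(448, 448)) -> str:
    """Resize an image so it is ready before the dataset references it."""
    img = Image.open(src_path).convert("RGB")
    img.resize(size).save(dst_path)
    return dst_path

records = [
    {"prompt": "Summarise the attached report.", "completion": "The report covers ..."},
    {"prompt": "Classify the sentiment of this review.", "completion": "positive"},
]

# JSONL: one JSON object per line.
with open("train.jsonl", "w", encoding="utf-8") as f:
    for record in records:
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```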
Update the training parameters based on your requirements for the training job.
Here is a short explanation of these parameters (a brief sketch after this list shows how they interact):
Gradient Accumulation Steps: Number of steps to accumulate gradients before updating model weights.
Learning Rate: Controls how much the model adjusts its weights during training.
Batch Size: Number of samples processed together in one training iteration. Available GPU RAM dictates the maximum batch size.
Trainer Epochs: Total number of times the model trains on the complete dataset.
Adapter Alpha: Scaling factor for the LoRA update. The weight changes applied to the original model weights are scaled by alpha divided by the rank, balancing new adaptation against the original model knowledge.
Adapter R: The integer rank of the update matrices. A lower rank creates smaller update matrices, reducing the number of trainable parameters. Alpha is commonly recommended to be twice the rank.
Adapter Dropout: The dropout probability for the LoRA layers, used to prevent overfitting. A value of 0.1 means each LoRA unit has a 10% chance of being dropped during training.
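To see how these knobs fit together, here is a small illustrative sketch using the Hugging Face peft library, which exposes the same LoRA parameters; it is not the Simplismart API, and the numeric values are examples only.

```python
# Illustrative only: the same LoRA knobs expressed with Hugging Face peft.
from peft import LoraConfig

rank = 16                    # Adapter R: rank of the LoRA update matrices
alpha = 2 * rank             # Adapter Alpha: commonly set to twice the rank
lora_config = LoraConfig(
    r=rank,
    lora_alpha=alpha,        # updates are scaled by alpha / r (here 2.0)
    lora_dropout=0.1,        # Adapter Dropout: 10% chance to drop a LoRA unit
)

# Effective batch size seen by each weight update:
batch_size = 4               # limited by available GPU RAM
gradient_accumulation_steps = 8
effective_batch_size = batch_size * gradient_accumulation_steps  # 32
```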
Update Training Configuration:
The training configuration provides a flexible way to define the inputs, outputs, and other advanced settings for your model. Below is a sample configuration:
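(The exact schema is defined by the Simplismart training suite; the sketch below uses hypothetical field names purely to illustrate the kinds of settings such a configuration covers.)

```python
# Hypothetical sketch, not the Simplismart schema -- it simply mirrors the
# parameters described above to show what a configuration may cover.
sample_training_config = {
    "dataset_path": "train.jsonl",            # uploaded JSONL dataset
    "output_dir": "lora-adapter-out",         # where the trained adapter is saved
    "trainer_epochs": 3,
    "batch_size": 4,
    "gradient_accumulation_steps": 8,
    "learning_rate": 2e-4,
    "adapter": {"r": 16, "alpha": 32, "dropout": 0.1},
}
```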
Once the configuration is updated, start the training job and monitor its progress in the Recent Jobs section of the UI. Keep track of metrics, logs, and any intermediate results to ensure the training meets your requirements.
After the model is trained, you can also deploy the LoRA via Simplismart. Click here to learn how.