Submit a new training job with the specified configuration, training data, and metadata.
JWT for authentication.
Bearer token for authentication and authorization.
Organization ID associated with the training job.
"0bf00b43-430a-4ca3-a8b3-b13cc8dc6d4f"
Name assigned to the training experiment.
"launch-simplismart-causal_lm-lora"
JSON-formatted string containing dataset preprocessing and split configuration.
"{\n \"preprocessing\": {\n \"lazy_tokenize\": true,\n \"streaming\": false,\n \"prompt\": {\n \"system\": null,\n \"max_length\": 4096,\n \"template\": null\n }\n },\n \"split\": {\n \"type\": \"random\",\n \"ratios\": [0.9, 0.1]\n }\n}\n"
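Because this value is a JSON-formatted string rather than a nested object, it can be easier to build it as a dictionary and serialize it than to write the escaped string by hand. The following is a minimal sketch of that approach. The helper name `build_dataset_config` is hypothetical, and the check that the split ratios sum to 1 is an assumption, not documented behavior of the API.

```python
import json
import math


def build_dataset_config(train_ratio: float = 0.9, max_length: int = 4096) -> str:
    """Return the preprocessing and split configuration as a JSON string.

    Hypothetical helper: keys mirror the example value above.
    """
    ratios = [train_ratio, round(1.0 - train_ratio, 10)]
    # Assumption: ratios should be non-negative and sum to 1.
    if min(ratios) < 0 or not math.isclose(sum(ratios), 1.0):
        raise ValueError(f"invalid split ratios: {ratios}")

    config = {
        "preprocessing": {
            "lazy_tokenize": True,
            "streaming": False,
            "prompt": {"system": None, "max_length": max_length, "template": None},
        },
        "split": {"type": "random", "ratios": ratios},
    }
    # json.dumps turns Python None/True/False into JSON null/true/false.
    return json.dumps(config)
```

Calling `build_dataset_config()` with its defaults produces a string carrying the same keys and values as the example, without the pretty-printing whitespace.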
JSON-formatted string containing model configuration including base model, quantization, and ownership details.
"{\n \"base_model\": \"meta-llama/Llama-3.2-1B-Instruct\",\n \"ownership\": \"public\",\n \"source_type\": \"hf\",\n \"model_type\": \"llm\",\n \"quantization\": {\n \"quant_bits\": 4\n }\n}\n"
JSON-formatted string containing training configuration including hyperparameters, adapter settings, and distributed training options.
"{\n \"type\": \"sft\",\n \"torch_dtype\": \"bfloat16\",\n \"task_type\": \"causal_lm\",\n \"train_type\": \"lora\",\n \"tuner_backend\": \"simplismart\",\n \"hyperparameters\": {\n \"num_epochs\": 1,\n \"per_device_train_batch_size\": 8,\n \"per_device_eval_batch_size\": 8,\n \"gradient_checkpointing\": true,\n \"save_steps\": 500,\n \"save_total_limit\": 2,\n \"eval_steps\": 500,\n \"logging_steps\": 5,\n \"learning_rate\": 0.0001,\n \"dataloader_num_workers\": 1\n },\n \"adapter_config\": {\n \"r\": 16,\n \"alpha\": 16,\n \"dropout\": 0.1,\n \"targets\": [\"all-linear\"]\n },\n \"distributed\": {\n \"type\": \"ddp\"\n }\n}\n"
JSON-formatted string containing dataset information including path, format, and access credentials.
"{\n \"dataset_name\": \"dataset-name\",\n \"dataset_path\": \"s3://training-dev-datasets/ds/sharegpt_ds_half.jsonl\",\n \"dataset_description\": \"\",\n \"dataset_type\": \"jsonl\",\n \"dataset_format\": \"sharegpt\",\n \"source_type\": \"s3\",\n \"ownership\": \"private\",\n \"secret_id\": \"<your-secret-key>\",\n \"region\": \"us-west-2\"\n}\n"
JSON-formatted string containing infrastructure requirements including GPU type, count, and node configuration.
"{\n \"gpu_type\": \"h100\",\n \"gpu_count\": 2,\n \"infra_type\": \"simplismart\",\n \"node_count\": 2\n}\n"
Training job submitted successfully.
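The request fields described above can be assembled as in the sketch below, which serializes each configuration dictionary into a JSON-formatted string and prepares the Bearer header. This reference does not show the field names or the endpoint URL, so `org_id`, `experiment_name`, `model_config`, and `infra_config` are hypothetical placeholders, as are both helper functions. Replace them with the names from the actual API schema before sending anything.

```python
import json


def build_training_job_fields(org_id: str, experiment_name: str, configs: dict) -> dict:
    """Build request fields, serializing every config dict to a JSON string.

    All field names here are hypothetical; the real schema may differ.
    """
    fields = {"org_id": org_id, "experiment_name": experiment_name}
    for name, value in configs.items():
        fields[name] = json.dumps(value)
    return fields


def auth_headers(token: str) -> dict:
    """Return the Authorization header carrying the Bearer token."""
    return {"Authorization": f"Bearer {token}"}


# Usage with values taken from the examples in this reference.
fields = build_training_job_fields(
    org_id="0bf00b43-430a-4ca3-a8b3-b13cc8dc6d4f",
    experiment_name="launch-simplismart-causal_lm-lora",
    configs={
        "model_config": {
            "base_model": "meta-llama/Llama-3.2-1B-Instruct",
            "ownership": "public",
            "source_type": "hf",
            "model_type": "llm",
            "quantization": {"quant_bits": 4},
        },
        "infra_config": {
            "gpu_type": "h100",
            "gpu_count": 2,
            "infra_type": "simplismart",
            "node_count": 2,
        },
    },
)
```

Keeping the configurations as dictionaries until the last step avoids hand-escaping quotes and newlines, and every value in `fields` ends up as a plain string suitable for a form-style request body.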