Create a chat completion for the given messages, with streaming support
JWT token for authentication
Model UUID for the request (e.g., f49b2e20-fef3-4441-9358-897f946b8ae2 for Llama 70B)
Array of messages in the conversation
Model identifier (e.g., llama3_3)
Whether to stream the response
Sampling temperature (0 <= x <= 2)
Maximum number of tokens to generate (1 <= x <= 4096)
Nucleus sampling parameter (0 <= x <= 1)
Sequences where the API will stop generating
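A minimal sketch of assembling a request body for this endpoint, enforcing the documented parameter ranges. The endpoint URL, exact field names, and the helper itself are assumptions for illustration; adapt them to the actual API.

```python
import json

# Hypothetical endpoint -- substitute the real base URL and model UUID.
API_URL = "https://api.example.com/chat/completions/{model_uuid}"


def build_chat_request(messages, model="llama3_3", stream=False,
                       temperature=1.0, max_tokens=1024, top_p=1.0,
                       stop=None):
    """Assemble the JSON body, validating the documented ranges."""
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must be in [0, 2]")
    if not 1 <= max_tokens <= 4096:
        raise ValueError("max_tokens must be in [1, 4096]")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must be in [0, 1]")
    body = {
        "messages": messages,       # array of conversation messages
        "model": model,             # model identifier
        "stream": stream,           # whether to stream the response
        "temperature": temperature, # sampling temperature
        "max_tokens": max_tokens,   # generation limit
        "top_p": top_p,             # nucleus sampling parameter
    }
    if stop is not None:
        body["stop"] = stop         # stop sequences
    return body


# The JWT goes in the Authorization header, not the body.
headers = {"Authorization": "Bearer <JWT>"}
payload = build_chat_request([{"role": "user", "content": "Hello"}])
print(json.dumps(payload))
```

The helper only builds and validates the payload; sending it (and consuming a streamed response) is left to whatever HTTP client the surrounding code uses.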