Create chat completion with Llama 8B
LLMs
Llama 3.1 8B API
Create a chat completion for given messages with streaming support
POST
Create chat completion with Llama 8B
Authorizations
JWT token for authentication
Body
application/json
Array of messages in the conversation
Model identifier
Available options:
meta-llama/Meta-Llama-3.1-8B-Instruct Whether to stream the response
Sampling temperature
Required range:
0 <= x <= 2Maximum number of tokens to generate
Required range:
1 <= x <= 4096Nucleus sampling parameter
Required range:
0 <= x <= 1Sequences where the API will stop generating