Llama 3.1 8B API
Create a chat completion for the given messages, with streaming support.
POST
Authorizations
JSON Web Token (JWT) used to authenticate the request.
Headers
Model UUID for the request (e.g., f49b2e20-fef3-4441-9358-897f946b8ae2 for Llama 3.1 8B).
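The two request headers above can be sketched as a simple dictionary. This is a minimal illustration, not the canonical client: the reference does not show the name of the header that carries the model UUID, so `X-Model-Id` below is a placeholder, and the Bearer scheme for the JWT is an assumption based on common practice.

```python
# Sketch of the request headers for this endpoint.
# Assumptions (not stated in the reference):
#   - the JWT is sent as a standard "Authorization: Bearer <token>" header
#   - "X-Model-Id" is a PLACEHOLDER name for the model-UUID header;
#     substitute the actual header name from the full documentation.
JWT_TOKEN = "eyJhbGciOi...your-token-here"  # placeholder token value

headers = {
    "Authorization": f"Bearer {JWT_TOKEN}",
    # Llama 3.1 8B UUID taken from the example in the reference:
    "X-Model-Id": "f49b2e20-fef3-4441-9358-897f946b8ae2",
    "Content-Type": "application/json",
}
```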
Body
application/json
Array of messages in the conversation
Model identifier
Available options: llama3_1
Whether to stream the response
Sampling temperature
Required range: 0 <= x <= 2
Maximum number of tokens to generate
Required range: 1 <= x <= 4096
Nucleus sampling parameter
Required range: 0 <= x <= 1
Sequences where the API will stop generating
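A request body combining the parameters above can be sketched as follows. The reference does not list the JSON field names, so the names used here (`messages`, `model`, `stream`, `temperature`, `max_tokens`, `top_p`, `stop`) are an assumption based on the common OpenAI-compatible chat-completions schema; the documented ranges are enforced before the body is built.

```python
import json

def build_chat_request(messages, stream=True, temperature=0.7,
                       max_tokens=1024, top_p=0.9, stop=None):
    """Build a chat-completion request body.

    Field names assume an OpenAI-compatible schema (an assumption;
    the reference only gives the parameter descriptions and ranges).
    """
    # Enforce the documented ranges before building the body.
    if not 0 <= temperature <= 2:
        raise ValueError("temperature must satisfy 0 <= x <= 2")
    if not 1 <= max_tokens <= 4096:
        raise ValueError("max_tokens must satisfy 1 <= x <= 4096")
    if not 0 <= top_p <= 1:
        raise ValueError("top_p must satisfy 0 <= x <= 1")

    body = {
        "model": "llama3_1",   # the only documented option
        "messages": messages,  # array of conversation messages
        "stream": stream,
        "temperature": temperature,
        "max_tokens": max_tokens,
        "top_p": top_p,
    }
    if stop is not None:
        body["stop"] = stop    # sequences where generation stops
    return body

payload = build_chat_request(
    [{"role": "user", "content": "Hello!"}],
    stop=["</s>"],
)
print(json.dumps(payload, indent=2))
```

To send the request, POST this JSON to the chat-completions endpoint (the path is not shown in this excerpt) with the Authorization and model-UUID headers set; with `stream` enabled, read the response incrementally rather than waiting for the full body.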