The LLM (Large Language Model) playground offers a versatile space for text generation and conversational AI. Users can:

  • Chat: Engage in interactive conversations with the model, simulating real-world dialogue (a minimal request sketch follows this list).
  • Configure Output: Adjust settings such as output length, temperature, and top-P to customize the responses the model generates.
  • Experiment with Prompts: Input different prompts and scenarios to see how the model responds, allowing for creative and practical applications.
  • Evaluate Interactions: Analyze the generated text for coherence, creativity, and relevance to ensure it meets the required standards.
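
As a rough illustration of the Chat workflow, the sketch below sends a role-tagged message history to a chat endpoint and appends the reply to continue the dialogue. The endpoint URL, payload layout, and response shape are assumptions for illustration only; consult the Llama API documentation linked below for the real ones.

    import requests

    # Hypothetical endpoint -- substitute the real URL from the API documentation.
    API_URL = "https://example.com/v1/chat/completions"

    # A conversation is a list of role-tagged messages; the full history is
    # sent on every turn so the model sees the whole dialogue.
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain nucleus sampling in one sentence."},
    ]

    response = requests.post(API_URL, json={"messages": messages})
    reply = response.json()["choices"][0]["message"]["content"]  # assumed shape
    print(reply)

    # Append the reply and the next user turn to keep the conversation going.
    messages.append({"role": "assistant", "content": reply})
    messages.append({"role": "user", "content": "Now give a concrete example."})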


Settings explained

output tokens: The maximum number of tokens in the generated response; use it to cap the verbosity of the output.

temperature: Controls randomness in the output; higher values produce more creative results, while lower values yield more deterministic responses.

top-P: Applies nucleus sampling, restricting choices to the smallest set of tokens whose cumulative probability reaches P, balancing creativity and coherence.

stop sequence: Specific sequences that, when generated, will halt further output.

system prompt: The initial instruction or context that sets the behavior of the model.
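
The settings above typically map one-to-one onto request parameters in a chat-completion API. The parameter names in the sketch below follow common convention and are assumptions, not the documented Llama API; check the documentation linked below for the exact names.

    import requests

    API_URL = "https://example.com/v1/chat/completions"  # hypothetical endpoint

    payload = {
        "messages": [
            # system prompt: sets the model's behavior for the conversation
            {"role": "system", "content": "Answer in plain, concise English."},
            {"role": "user", "content": "Summarize what a stop sequence does."},
        ],
        "max_tokens": 256,   # output tokens: upper bound on response length
        "temperature": 0.2,  # low temperature for near-deterministic output
        "top_p": 0.9,        # nucleus sampling over the top 90% of probability mass
        "stop": ["\n\n"],    # stop sequence: generation halts when this is emitted
    }

    response = requests.post(API_URL, json=payload)
    print(response.json())

Raising temperature or top_p toward 1.0 trades determinism for variety; the playground sliders adjust these same values interactively.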


Access the Llama model API documentation here for endpoints, parameters, and code examples.
