output tokens:
The maximum number of tokens the model may generate in its response; useful for controlling the verbosity (and cost) of the output.
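As an illustrative sketch (not any particular provider's API), a generation loop can enforce this budget by capping the number of tokens it emits; the `next_token` callable here is a hypothetical stand-in for the model:

```python
def generate(next_token, max_output_tokens):
    # Stop once the response reaches the token budget,
    # or earlier if the model signals end-of-sequence (None).
    out = []
    while len(out) < max_output_tokens:
        tok = next_token(out)
        if tok is None:
            break
        out.append(tok)
    return out

# Dummy "model" that just emits the current position.
tokens = generate(lambda out: len(out), 5)  # -> [0, 1, 2, 3, 4]
```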
temperature:
Controls the randomness of the output; higher values produce more varied, creative results, while lower values yield more deterministic, repeatable responses.
top-P:
Uses nucleus sampling: restricts the choice of next token to the smallest set whose cumulative probability reaches P, balancing creativity and coherence.
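The nucleus filter can be sketched as follows: sort tokens by probability, keep the smallest prefix whose cumulative mass reaches P, and renormalise over that set.

```python
def top_p_filter(probs, p):
    # Sort token indices by descending probability.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cum = [], 0.0
    for i in order:
        kept.append(i)
        cum += probs[i]
        if cum >= p:  # nucleus reached: stop adding tokens
            break
    # Renormalise the surviving probabilities so they sum to 1.
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}

probs = [0.5, 0.3, 0.15, 0.05]
kept_probs = top_p_filter(probs, 0.8)  # keeps tokens 0 and 1
```

Sampling then proceeds only over the kept tokens, so low-probability tails are cut off while the high-probability "nucleus" stays in play.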
stop sequence:
Specific character sequences that, when generated, halt further output.
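Client-side, the effect is equivalent to truncating the generated text at the earliest occurrence of any stop sequence, as in this sketch:

```python
def truncate_at_stop(text, stop_sequences):
    # Cut the text at the earliest stop sequence, if any appears.
    cut = len(text)
    for stop in stop_sequences:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

clean = truncate_at_stop("Answer: 42\nUser:", ["\nUser:", "###"])
# -> "Answer: 42"
```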
system prompt:
The initial instruction or context that sets the behaviour and persona of the model before any user input.
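In chat-style APIs this is commonly expressed as the first message in the conversation, with the role `system`. The exact payload shape varies by provider; this is a hypothetical example, not a specific API:

```python
# Hypothetical chat payload: the system message comes first and
# frames how the model should respond to everything that follows.
messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Explain top-P sampling in one sentence."},
]
```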