Perform in-depth, customizable evaluations of LLM outputs using custom datasets and a range of evaluator types, including programmatic, human, and AI-based evaluators.
Only one LLM deployment can be selected at a time.
A max tokens value of 1024 means the response will be capped at 1024 tokens. A higher value allows longer outputs but also increases resource usage.
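The cap described above can be sketched as a simple generation loop; the helper and the toy token stream below are illustrative assumptions, not the platform's actual implementation:

```python
def generate_capped(generate_next_token, max_tokens=1024):
    # Illustrative loop: generation stops once max_tokens tokens are produced,
    # even if the model has not yet emitted a stop signal.
    tokens = []
    while len(tokens) < max_tokens:
        tok = generate_next_token()
        if tok is None:  # model signals natural completion
            break
        tokens.append(tok)
    return tokens

# Toy "model" that would emit 2000 tokens if left uncapped.
stream = iter(range(2000))
out = generate_capped(lambda: next(stream, None), max_tokens=1024)
# The response is truncated at the 1024-token limit.
```

This is why a larger cap increases resource usage: every additional allowed token is another decoding step the model may run.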
Temperature ranges from 0 to 1:
- A lower value (e.g., 0.2) → more deterministic and focused responses
- A higher value (e.g., 0.8) → more diverse and creative responses
- The default of 0.7 balances creativity and consistency
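The effect of temperature on sampling can be illustrated with a minimal sketch of temperature-scaled softmax; the function and the example logits are hypothetical, not the platform's internals:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by the temperature before the softmax:
    # a low temperature sharpens the distribution (more deterministic),
    # a high temperature flattens it (more diverse).
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
low = softmax_with_temperature(logits, 0.2)   # probability concentrates on the top token
high = softmax_with_temperature(logits, 0.8)  # probability spreads across tokens
```

At 0.2 the top token dominates, so repeated runs tend to produce the same output; at 0.8 lower-ranked tokens are sampled more often, which is what makes responses more varied.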