Choosing the best model for your agent
Selecting the right model determines how your agent behaves in real scenarios: it influences accuracy, stability, speed, and cost.
Choosing a model for an agent is an iterative process, not a one-time decision. You choose an initial model during design so you can build and test the agent, and later refine that choice once evaluations show how different models behave with your prompts, tools, data, and failure scenarios. Evaluations often show that a lower-cost model meets the same quality requirements as a more expensive option, or that a different model performs more reliably on specific edge cases.
This section helps you understand model settings, apply best practices, and run the steps needed to choose the most effective and cost-efficient model for your use case.
Understand model settings
Model settings control how the underlying AI model produces outputs. Two settings have the most impact on agent performance: model and temperature.
The model choice affects capability, latency, cost, and specialized strengths. Different models excel at tasks such as reasoning, coding, or summarization. To understand which one fits your agent’s workload, run evaluations that compare how models perform with your actual prompts and scenarios, as described in the following sections.
Temperature controls the randomness and creativity of the model's responses. Typical ranges include the following (a minimal sketch after this list shows how temperature is typically passed to a model call):
- Low Temperature (0.0 - 0.3): More deterministic and focused responses, better for factual tasks.
- Medium Temperature (0.4 - 0.7): Balanced creativity and consistency, good for most conversational agents.
- High Temperature (0.8 - 2.0): More creative and diverse responses, better for creative writing.
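The sketch below illustrates the effect of temperature, assuming a hypothetical complete(prompt, model, temperature) client function and placeholder model names; it is not a UiPath API. Lower temperature should produce fewer distinct answers when the same prompt is sent repeatedly.

```python
def complete(prompt: str, model: str, temperature: float) -> str:
    """Stand-in for whatever LLM client your agent uses; replace with a real call."""
    raise NotImplementedError

def count_distinct_outputs(prompt: str, model: str, temperature: float, runs: int = 5) -> int:
    """Send the same prompt several times and count distinct answers.
    Lower temperature typically yields fewer distinct outputs (more deterministic)."""
    return len({complete(prompt, model, temperature) for _ in range(runs)})

# Illustrative usage (placeholder model name):
# count_distinct_outputs("Summarize this invoice.", "model-a", temperature=0.0)  # usually 1
# count_distinct_outputs("Summarize this invoice.", "model-a", temperature=0.9)  # usually more than 1
```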
Start with an initial model during design
During design, select a model that broadly fits your agent’s workload. This initial model serves as the baseline you use to build prompts, integrate tools, and test flows. As a best practice, start with a general-purpose or lower-cost model and expect to revisit this choice after evaluations.
For the full list of models supported in UiPath Automation Cloud and their regional availability, see Model availability and routing.
Model availability depends on your organization type. Enterprise and Enterprise Trial organizations can select from multiple supported models and configure bring-your-own-model integrations. Community organizations have access to a single model offered for free.
Set temperature conservatively
Temperature controls how deterministic or variable a model’s responses are. In most enterprise agents, consistency is more important than creativity.
- Low temperature produces repeatable, stable outputs
- Higher temperature increases variation and creativity
Best practice: Use temperature 0.0 for most production-oriented agents. If quality issues appear, change the model or prompt before increasing temperature.
Temperature should be tuned sparingly and always validated through evaluations.
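As a rough illustration of that default, the following minimal sketch expresses the configuration as a hypothetical settings dictionary, not the actual Agent Builder settings; the model name is a placeholder.

```python
# A conservative production default: deterministic output by default.
AGENT_MODEL_SETTINGS = {
    "model": "model-a",   # placeholder baseline model
    "temperature": 0.0,   # adjust prompt or model first if quality issues appear
}
```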

Use evaluations to validate and revise model choice
Evaluations are where model selection becomes evidence-based.
First, start with a working agent. Run it in debug mode with different inputs across your key scenarios to confirm the full flow behaves as expected. Once the agent works end to end, build your evaluation set from real runs, either by using Add to evaluation set directly from a debug run or by downloading runtime runs and importing them into your evaluation set.
Build evaluation sets using:
- Typical user inputs
- Edge cases
- Known failure cases
Avoid relying only on synthetic or auto-generated cases, which can overestimate real-world performance.
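As a rough illustration of the mix an evaluation set should contain, the following sketch assumes a simple list-of-dicts layout rather than the actual evaluation set format; the inputs and expected outcomes are invented examples.

```python
# Hypothetical evaluation cases drawn from real debug and runtime runs.
EVALUATION_CASES = [
    {"source": "debug run",   "kind": "typical",
     "input": "Approve invoice INV-1042 under the standard policy.",
     "expected": "Invoice approved and routed to finance."},
    {"source": "runtime run", "kind": "edge case",
     "input": "Invoice total is 0.00 and the PO number is missing.",
     "expected": "Agent asks for the PO number instead of approving."},
    {"source": "runtime run", "kind": "known failure",
     "input": "Duplicate of an already processed invoice.",
     "expected": "Agent flags the duplicate and does not approve."},
]
# Sets dominated by synthetic cases tend to overestimate real-world performance.
```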
Configure different models
Run the same evaluation set across multiple models and configurations. At this stage, you can decide whether differences in quality justify differences in cost, latency, or stability. Running the same scenarios across configurations makes these trade-offs visible. It is common and expected to change the selected model after reviewing evaluation results.
To configure and compare different model settings within an evaluation set:
1. From the Agent Builder Explorer panel, select Evaluation sets.
2. Select an evaluation set.
3. Select the gear icon to open Evaluation settings.
4. In the Evaluation set properties panel, add multiple temperature and model combinations. For example:
   - Temperature 0.2, Model A
   - Temperature 0.5, Model A
   - Temperature 0.7, Model A
   - Temperature 0.5, Model B
   Each combination creates a separate evaluation run, allowing you to compare how small configuration changes affect behavior.
5. Select Evaluate set to run all configurations. After the runs complete, open the Results tab to compare them.
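As a rough sketch of what step 4 sets up, the combinations form a small configuration matrix, and each entry produces its own run over the same evaluation set. The run_evaluation helper below is hypothetical, not a UiPath API, and the model names are placeholders.

```python
COMBINATIONS = [           # (model, temperature) pairs mirroring the example above
    ("model-a", 0.2),
    ("model-a", 0.5),
    ("model-a", 0.7),
    ("model-b", 0.5),
]

def run_evaluation(evaluation_set: list, model: str, temperature: float) -> dict:
    """Stand-in for executing the evaluation set with one configuration."""
    raise NotImplementedError

def run_all(evaluation_set: list) -> dict:
    """One evaluation run per configuration; prompts, tools, and context stay identical."""
    return {(model, temp): run_evaluation(evaluation_set, model, temp)
            for model, temp in COMBINATIONS}
```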

Compare models and analyze results
To make comparisons fair:
- Keep prompts, tools, and context identical.
- Add multiple models and temperature configurations in the Evaluation set properties panel.
- Run the same evaluation set for each configuration.
Each model added to your evaluation set triggers a new run, and you can review the results for each run in the Results table. When reviewing results, you are not just looking for the highest score; you are deciding which trade-offs matter most for your agent.
Review the evaluation results to understand how each configuration performs. Look for:
- Evaluator scores: Identify which settings produce accurate, high-quality outputs.
- Time performance: Compare response times across configurations.
Do not select a model based only on an average score. When reviewing evaluation results, consider:
- Where and how failures occur
- Consistency across scenarios
- Latency and execution time
- Cost relative to quality gains
A slightly lower-scoring model may be preferable if it is significantly cheaper and more stable.
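A minimal sketch of that kind of analysis, assuming each case result is a dict with score (0 to 1), latency_s, and scenario fields and that you supply a per-run cost figure; none of these field names come from the product.

```python
from statistics import mean, pstdev

def summarize_run(case_results: list[dict], run_cost: float) -> dict:
    """Summarize one configuration's run beyond its average score."""
    scores = [r["score"] for r in case_results]
    return {
        "mean_score": mean(scores),
        "score_spread": pstdev(scores),  # consistency across scenarios
        "failed_scenarios": [r["scenario"] for r in case_results if r["score"] < 0.5],  # illustrative threshold
        "mean_latency_s": mean(r["latency_s"] for r in case_results),
        "cost": run_cost,  # weigh against quality gains
    }
```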
Recommended workflow
The steps below summarize the core process described in this section. Use them as a quick reference when selecting and refining a model for your agent; a compact sketch of the selection rule follows the list:
- Start with an initial model during agent design.
- Use a low temperature to prioritize consistency.
- Build a working agent and validate end-to-end behavior.
- Create evaluation sets from real agent runs.
- Compare multiple models using the same evaluation set.
- Select the lowest-cost model that consistently meets your quality bar.
- Re-run evaluations as the agent evolves or new models become available.
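The following compact sketch expresses the selection rule in the last steps, assuming each candidate configuration is summarized like the output of the earlier summarize_run sketch; the thresholds are illustrative, not recommended values.

```python
QUALITY_BAR = 0.85      # minimum acceptable mean evaluator score (illustrative)
MAX_LATENCY_S = 10.0    # maximum acceptable mean response time (illustrative)

def pick_configuration(candidates: list[dict]) -> dict | None:
    """Return the lowest-cost configuration that meets the quality bar, or None.
    Re-run the evaluations and this selection whenever the agent or the model list changes."""
    qualifying = [c for c in candidates
                  if c["mean_score"] >= QUALITY_BAR and c["mean_latency_s"] <= MAX_LATENCY_S]
    return min(qualifying, key=lambda c: c["cost"]) if qualifying else None
```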