Google Gemini Models
Google's Gemini models for multimodal understanding, reasoning, and agentic use cases.
Google's Gemini models provide powerful multimodal capabilities with advanced reasoning features. Tambo supports Gemini models across three generations: Gemini 3, Gemini 2.5, and Gemini 2.0.
Known Issues
Gemini models may occasionally resist rendering as requested. They sometimes complete the request, but behavior can be inconsistent. Try clarifying instructions (e.g., "Return a bulleted list only"). Outputs may also have formatting quirks, so be cautious when structure matters.
Model Families
Gemini 3 Family
The latest generation of Gemini models with enhanced multimodal and reasoning capabilities.
gemini-3-pro-preview
Status: Tested
Google's most powerful model as of November 2025, best for multimodal understanding and agentic use cases.
- API Name: gemini-3-pro-preview
- Context Window: 1,048,576 tokens
- Provider Documentation: Gemini 3.0 Pro
Best For:
- Complex multimodal understanding tasks
- Agentic workflows requiring reasoning
- Advanced problem-solving scenarios
Notes:
- Expected to have improved performance over 2.5 models
- Supports reasoning via thinking configuration
Gemini 2.5 Family
Advanced reasoning models with extended thinking capabilities.
gemini-2.5-pro
Status: Known Issues
Gemini 2.5 Pro is Google's most advanced reasoning model, capable of solving complex problems.
- API Name: gemini-2.5-pro
- Context Window: 1,048,576 tokens
- Provider Documentation: Gemini 2.5 Pro
Best For:
- Complex reasoning tasks
- Multi-step problem-solving
- Tasks requiring deep analysis
Notes:
- May occasionally resist rendering as requested
- Behavior can be inconsistent with formatting
- Supports reasoning via thinking configuration
gemini-2.5-flash
Status: Known Issues
Gemini 2.5 Flash offers Google's best balance of price and performance, with well-rounded capabilities.
- API Name: gemini-2.5-flash
- Context Window: 1,048,576 tokens
- Provider Documentation: Gemini 2.5 Flash
Best For:
- Production workloads requiring speed
- Cost-effective reasoning tasks
- Balanced performance across various use cases
Notes:
- Fast and efficient for production use
- May have formatting quirks occasionally
- Supports reasoning via thinking configuration
Gemini 2.0 Family
Next-generation features designed for the agentic era with superior speed and built-in tool use.
gemini-2.0-flash
Status: Known Issues
Gemini 2.0 Flash delivers next-generation features and improved capabilities designed for the agentic era, including superior speed, built-in tool use, multimodal generation, and a 1M token context window.
- API Name: gemini-2.0-flash
- Context Window: 1,048,576 tokens
- Provider Documentation: Gemini 2.0 Flash
Best For:
- Agentic workflows with built-in tool use
- Applications requiring superior speed
- Multimodal generation tasks
Notes:
- Optimized for the agentic era
- May occasionally have inconsistent rendering behavior
- Strong tool-calling capabilities
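Gemini's built-in tool use is driven by function declarations described in a JSON-schema-like format. The sketch below shows that general shape; the interface and the weather tool are illustrative, not a Tambo or Google SDK API.

```typescript
// Sketch of a function declaration for Gemini-style tool calling.
// The name/description/parameters shape mirrors the JSON-schema-like
// format used for function calling; the weather tool is hypothetical.
interface FunctionDeclaration {
  name: string;
  description: string;
  parameters: {
    type: "object";
    properties: Record<string, { type: string; description: string }>;
    required: string[];
  };
}

const getWeatherDeclaration: FunctionDeclaration = {
  name: "get_weather",
  description: "Look up the current weather for a city",
  parameters: {
    type: "object",
    properties: {
      city: { type: "string", description: "City name, e.g. Berlin" },
    },
    required: ["city"],
  },
};
```

The model uses the `description` fields to decide when and how to call the tool, so keep them specific.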
gemini-2.0-flash-lite
Status: Known Issues
Gemini 2.0 Flash Lite is a model optimized for cost efficiency and low latency.
- API Name: gemini-2.0-flash-lite
- Context Window: 1,048,576 tokens
- Provider Documentation: Gemini 2.0 Flash Lite
Best For:
- High-volume applications
- Cost-sensitive deployments
- Low-latency requirements
Notes:
- Most cost-efficient Gemini model
- Optimized for speed over capability
- May have formatting quirks
Provider-Specific Parameters
Reasoning Configuration
Gemini models support reasoning capabilities through a thinking configuration. All Gemini models can use these parameters to control reasoning behavior.
thinkingConfig
Configure thinking behavior with a JSON object:
| Field | Type | Description |
|---|---|---|
| thinkingBudget | number | Token budget allocated for thinking |
| includeThoughts | boolean | Whether to include thinking in the response |
Example Configuration:
```json
{
  "thinkingBudget": 5000,
  "includeThoughts": true
}
```

Recommended Settings:
- Quick reasoning (1000-2000 tokens): Simple tasks requiring minimal thinking
- Standard reasoning (3000-5000 tokens): Most use cases (recommended)
- Extended reasoning (7000+ tokens): Complex problems requiring deep analysis
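The tiers above can be captured in a small helper. This is a sketch, not a Tambo API: the function and tier names are ours, and the budgets are midpoints of the recommended ranges.

```typescript
// Sketch: pick a thinkingConfig from the recommended budget tiers.
// Tier names and the helper itself are illustrative; the field names
// (thinkingBudget, includeThoughts) come from the table above.
type Tier = "quick" | "standard" | "extended";

function thinkingConfigFor(tier: Tier): { thinkingBudget: number; includeThoughts: boolean } {
  const budgets: Record<Tier, number> = {
    quick: 1500,    // simple tasks (1000-2000 range)
    standard: 4000, // most use cases (3000-5000 range)
    extended: 8000, // deep analysis (7000+)
  };
  return { thinkingBudget: budgets[tier], includeThoughts: true };
}
```

Starting at the standard tier and adjusting based on observed latency and quality is a reasonable default.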
Configuring in the Dashboard
- Navigate to your project in the dashboard
- Go to Settings → LLM Providers
- Select Google Gemini as your provider
- Choose your model
- Under Custom LLM Parameters, click + thinkingConfig
- Enter your configuration as a JSON object:
{ "thinkingBudget": 5000, "includeThoughts": true }
- Click Save to apply the configuration
Dashboard Suggestions
When you select a Gemini reasoning model, the dashboard automatically shows thinkingConfig as a suggested parameter. Just click it to add!
Best Practices
Model Selection
- Gemini 3.0 Pro Preview: Use for cutting-edge multimodal and agentic tasks (still a preview release, so test thoroughly before relying on it)
- Gemini 2.5 Pro: Best for complex reasoning requiring extended thinking
- Gemini 2.5 Flash: Recommended for production workloads needing speed and cost efficiency
- Gemini 2.0 Flash: Choose for agentic workflows with strong tool-calling needs
- Gemini 2.0 Flash Lite: Select for high-volume, cost-sensitive applications
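The selection guidance above can be restated as a simple lookup from use case to API name. The use-case keys here are ours; the model IDs are the API names documented in each section above.

```typescript
// Sketch: map common use cases to Gemini model API names, following
// the recommendations in this guide. Keys are illustrative labels.
const modelFor: Record<string, string> = {
  "multimodal-agentic": "gemini-3-pro-preview",   // cutting-edge, preview
  "complex-reasoning": "gemini-2.5-pro",          // extended thinking
  "production-balanced": "gemini-2.5-flash",      // speed + cost balance
  "tool-calling": "gemini-2.0-flash",             // built-in tool use
  "high-volume": "gemini-2.0-flash-lite",         // cost-sensitive, low latency
};
```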
Performance Optimization
- Start with Gemini 2.5 Flash for balanced performance
- Use lower thinking budgets for simple tasks to reduce latency
- Monitor token usage when using reasoning features
- Test formatting requirements carefully due to known inconsistencies
Cost Considerations
- Gemini 2.0 Flash Lite offers the best cost efficiency
- Reasoning tokens (thinking budget) are billed separately
- Balance thinking budget with task complexity
- Consider caching for repeated queries with large context windows
Troubleshooting
Inconsistent rendering behavior?
- Try clarifying instructions more explicitly
- Use specific formatting directives (e.g., "Return a bulleted list only")
- Test with different prompt phrasings
- Consider using a tested OpenAI model for production-critical formatting
Reasoning not appearing in responses?
- Verify thinkingConfig is added in your dashboard settings
- Ensure includeThoughts is set to true
- Check that you've saved your configuration
- Try increasing the thinkingBudget value
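The checklist above can be partly automated with a small validation helper. The validator is a sketch of ours, not a Tambo utility; only the field names come from the thinkingConfig table.

```typescript
// Sketch: sanity-check a thinkingConfig before saving it, mirroring
// the troubleshooting checklist. Returns a list of problems found.
function validateThinkingConfig(cfg: { thinkingBudget?: number; includeThoughts?: boolean }): string[] {
  const problems: string[] = [];
  if (cfg.thinkingBudget === undefined || cfg.thinkingBudget <= 0) {
    problems.push("thinkingBudget should be a positive token count");
  }
  if (cfg.includeThoughts !== true) {
    problems.push("includeThoughts must be true for thoughts to appear in responses");
  }
  return problems;
}
```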
Model performance issues?
- Lower the thinkingBudget for faster responses
- Use Gemini 2.0 Flash Lite for speed-critical applications
- Consider Gemini 2.5 Flash for balanced performance
- Monitor your context window usage
See Also
- Labels - Understanding model status labels and observed behaviors
- Custom LLM Parameters - Configure additional model parameters
- Reasoning Models - Comprehensive guide to reasoning capabilities