# Google Gemini Models
URL: /models/google
Google's Gemini models provide powerful multimodal capabilities with advanced reasoning features. Tambo supports Gemini models across three generations: Gemini 3, Gemini 2.5, and Gemini 2.0.
Gemini models may occasionally resist rendering as requested; even when they
complete the request, behavior can be inconsistent. Try clarifying
instructions (e.g., "Return a bulleted list only"), and be cautious when
output structure matters, since formatting quirks are common.
## Model Families
### Gemini 3 Family
The latest generation of Gemini models with enhanced multimodal and reasoning capabilities.
#### gemini-3-pro-preview
**Status:** Tested
Google's most powerful model as of November 2025, best for multimodal understanding and agentic use cases.
* **API Name:** `gemini-3-pro-preview`
* **Context Window:** 1,048,576 tokens
* **Provider Documentation:** [Gemini 3.0 Pro](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/3-0-pro)
**Best For:**
* Complex multimodal understanding tasks
* Agentic workflows requiring reasoning
* Advanced problem-solving scenarios
**Notes:**
* Improved performance over the Gemini 2.5 generation
* Supports reasoning via [thinking configuration](#reasoning-configuration)
### Gemini 2.5 Family
Advanced reasoning models with extended thinking capabilities.
#### gemini-2.5-pro
**Status:** Known Issues
Gemini 2.5 Pro is Google's most advanced reasoning model, capable of solving complex problems.
* **API Name:** `gemini-2.5-pro`
* **Context Window:** 1,048,576 tokens
* **Provider Documentation:** [Gemini 2.5 Pro](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-pro)
**Best For:**
* Complex reasoning tasks
* Multi-step problem-solving
* Tasks requiring deep analysis
**Notes:**
* May occasionally resist rendering as requested
* Behavior can be inconsistent with formatting
* Supports reasoning via [thinking configuration](#reasoning-configuration)
#### gemini-2.5-flash
**Status:** Known Issues
Gemini 2.5 Flash offers Google's best balance of price and performance, with well-rounded capabilities.
* **API Name:** `gemini-2.5-flash`
* **Context Window:** 1,048,576 tokens
* **Provider Documentation:** [Gemini 2.5 Flash](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-5-flash)
**Best For:**
* Production workloads requiring speed
* Cost-effective reasoning tasks
* Balanced performance across various use cases
**Notes:**
* Fast and efficient for production use
* May have formatting quirks occasionally
* Supports reasoning via [thinking configuration](#reasoning-configuration)
### Gemini 2.0 Family
Next-generation features designed for the agentic era with superior speed and built-in tool use.
#### gemini-2.0-flash
**Status:** Known Issues
Gemini 2.0 Flash delivers next-generation features and improved capabilities designed for the agentic era, including superior speed, built-in tool use, multimodal generation, and a 1M token context window.
* **API Name:** `gemini-2.0-flash`
* **Context Window:** 1,048,576 tokens
* **Provider Documentation:** [Gemini 2.0 Flash](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-0-flash)
**Best For:**
* Agentic workflows with built-in tool use
* Applications requiring superior speed
* Multimodal generation tasks
**Notes:**
* Optimized for the agentic era
* May occasionally have inconsistent rendering behavior
* Strong tool-calling capabilities
#### gemini-2.0-flash-lite
**Status:** Known Issues
Gemini 2.0 Flash Lite is a model optimized for cost efficiency and low latency.
* **API Name:** `gemini-2.0-flash-lite`
* **Context Window:** 1,048,576 tokens
* **Provider Documentation:** [Gemini 2.0 Flash Lite](https://cloud.google.com/vertex-ai/generative-ai/docs/models/gemini/2-0-flash-lite)
**Best For:**
* High-volume applications
* Cost-sensitive deployments
* Low-latency requirements
**Notes:**
* Most cost-efficient Gemini model
* Optimized for speed over capability
* May have formatting quirks
## Provider-Specific Parameters
### Reasoning Configuration
Gemini models support reasoning capabilities through a thinking configuration. All Gemini models can use these parameters to control reasoning behavior.
**thinkingConfig**
Configure thinking behavior with a JSON object:
| Field | Type | Description |
| ----------------- | ------- | --------------------------------------- |
| `thinkingBudget` | number | Token budget allocated for thinking |
| `includeThoughts` | boolean | Whether to include thinking in response |
**Example Configuration:**
```json
{
"thinkingBudget": 5000,
"includeThoughts": true
}
```
**Recommended Settings:**
* **Quick reasoning (1000-2000 tokens)**: Simple tasks requiring minimal thinking
* **Standard reasoning (3000-5000 tokens)**: Most use cases (recommended)
* **Extended reasoning (7000+ tokens)**: Complex problems requiring deep analysis
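As an illustration, the tiers above can be encoded in a small helper that builds the `thinkingConfig` object. This is a sketch: the function name and tier labels are our own, not part of any Gemini or Tambo API.

```python
# Hypothetical helper mapping the recommended budget tiers above to a
# thinkingConfig object. The tier names are illustrative only.

TIERS = {
    "quick": 2000,     # simple tasks requiring minimal thinking
    "standard": 5000,  # most use cases (recommended default)
    "extended": 8000,  # complex problems requiring deep analysis
}

def thinking_config(tier: str = "standard", include_thoughts: bool = True) -> dict:
    """Return a thinkingConfig dict for the given tier."""
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r} (expected one of {sorted(TIERS)})")
    return {"thinkingBudget": TIERS[tier], "includeThoughts": include_thoughts}

print(thinking_config())  # {'thinkingBudget': 5000, 'includeThoughts': True}
```

The resulting dict serializes directly to the JSON shape shown in the example above, so it can be pasted into the dashboard's **Custom LLM Parameters** field.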
### Configuring in the Dashboard
1. Navigate to your project in the dashboard
2. Go to **Settings** → **LLM Providers**
3. Select Google Gemini as your provider
4. Choose your model
5. Under [**Custom LLM Parameters**](/models/custom-llm-parameters), click **+ thinkingConfig**
6. Enter your configuration as a JSON object:
```json
{
"thinkingBudget": 5000,
"includeThoughts": true
}
```
7. Click **Save** to apply the configuration
When you select a Gemini reasoning model, the dashboard automatically shows
[**thinkingConfig**](#reasoning-configuration) as a suggested parameter. Just
click it to add!
## Best Practices
### Model Selection
* [**Gemini 3 Pro Preview**](#gemini-3-pro-preview): Use for cutting-edge multimodal and agentic tasks (preview release; validate behavior before relying on it in production)
* [**Gemini 2.5 Pro**](#gemini-2-5-pro): Best for complex reasoning requiring extended thinking
* [**Gemini 2.5 Flash**](#gemini-2-5-flash): Recommended for production workloads needing speed and cost efficiency
* [**Gemini 2.0 Flash**](#gemini-2-0-flash): Choose for agentic workflows with strong tool-calling needs
* [**Gemini 2.0 Flash Lite**](#gemini-2-0-flash-lite): Select for high-volume, cost-sensitive applications
### Performance Optimization
* Start with [Gemini 2.5 Flash](#gemini-2-5-flash) for balanced performance
* Use lower [thinking budgets](#reasoning-configuration) for simple tasks to reduce latency
* Monitor token usage when using [reasoning features](#reasoning-configuration)
* Test formatting requirements carefully due to [known inconsistencies](#model-families)
### Cost Considerations
* [Gemini 2.0 Flash Lite](#gemini-2-0-flash-lite) offers the best cost efficiency
* Reasoning tokens ([thinking budget](#reasoning-configuration)) are billed separately
* Balance [thinking budget](#reasoning-configuration) with task complexity
* Consider caching for repeated queries with large context windows
## Troubleshooting
**Inconsistent rendering behavior?**
* Try clarifying instructions more explicitly
* Use specific formatting directives (e.g., "Return a bulleted list only")
* Test with different prompt phrasings
* Consider using a tested OpenAI model for production-critical formatting
**Reasoning not appearing in responses?**
* Verify [`thinkingConfig`](#reasoning-configuration) is added in your [dashboard settings](#configuring-in-the-dashboard)
* Ensure `includeThoughts` is set to `true`
* Check that you've saved your [configuration](#configuring-in-the-dashboard)
* Try increasing the `thinkingBudget` value
**Model performance issues?**
* Lower the `thinkingBudget` for faster responses
* Use [Gemini 2.0 Flash Lite](#gemini-2-0-flash-lite) for speed-critical applications
* Consider [Gemini 2.5 Flash](#gemini-2-5-flash) for balanced performance
* Monitor your context window usage
## See Also
* [Labels](/models/labels) - Understanding model status labels and observed behaviors
* [Custom LLM Parameters](/models/custom-llm-parameters) - Configure additional model parameters
* [Reasoning Models](/models/reasoning-models) - Comprehensive guide to reasoning capabilities