Google Gemini Models

Google's Gemini models for multimodal understanding, reasoning, and agentic use cases.

Google's Gemini models provide powerful multimodal capabilities with advanced reasoning features. Tambo supports Gemini models across three generations: Gemini 3, Gemini 2.5, and Gemini 2.0.

Known Issues

Gemini models may occasionally resist rendering output as requested. They often complete the request, but behavior can be inconsistent. If this happens, clarify your instructions with explicit formatting directives (e.g., "Return a bulleted list only"). Outputs may also have formatting quirks, so be cautious when exact structure matters.
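The workaround above can be sketched as a small prompt helper. The function name is hypothetical, not part of any SDK; it simply prepends an explicit formatting directive so the model is less likely to drift from the requested shape:

```typescript
// Hypothetical helper: prepend an explicit formatting directive to a
// system prompt, making the required output structure hard to miss.
function withFormatDirective(systemPrompt: string, directive: string): string {
  return `${directive}\n\n${systemPrompt}`;
}

const prompt = withFormatDirective(
  "You are a helpful assistant for a project dashboard.",
  "Return a bulleted list only. Do not add prose before or after the list.",
);
```

Testing with different phrasings of the directive, as suggested in Troubleshooting below, is often enough to stabilize formatting.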

Model Families

Gemini 3 Family

The latest generation of Gemini models with enhanced multimodal and reasoning capabilities.

gemini-3-pro-preview

Status: Tested

Google's most powerful model as of November 2025, best for multimodal understanding and agentic use cases.

  • API Name: gemini-3-pro-preview
  • Context Window: 1,048,576 tokens
  • Provider Documentation: Gemini 3.0 Pro

Best For:

  • Complex multimodal understanding tasks
  • Agentic workflows requiring reasoning
  • Advanced problem-solving scenarios

Gemini 2.5 Family

Advanced reasoning models with extended thinking capabilities.

gemini-2.5-pro

Status: Known Issues

Gemini 2.5 Pro is Google's most advanced reasoning model, capable of solving complex problems.

  • API Name: gemini-2.5-pro
  • Context Window: 1,048,576 tokens
  • Provider Documentation: Gemini 2.5 Pro

Best For:

  • Complex reasoning tasks
  • Multi-step problem-solving
  • Tasks requiring deep analysis

Notes:

  • May occasionally resist rendering as requested
  • Behavior can be inconsistent with formatting
  • Supports reasoning via thinking configuration

gemini-2.5-flash

Status: Known Issues

Gemini 2.5 Flash is Google's best price-performance model, offering well-rounded capabilities.

  • API Name: gemini-2.5-flash
  • Context Window: 1,048,576 tokens
  • Provider Documentation: Gemini 2.5 Flash

Best For:

  • Production workloads requiring speed
  • Cost-effective reasoning tasks
  • Balanced performance across various use cases

Notes:

  • Fast and efficient for production use
  • May have formatting quirks occasionally
  • Supports reasoning via thinking configuration

Gemini 2.0 Family

Next-generation features designed for the agentic era with superior speed and built-in tool use.

gemini-2.0-flash

Status: Known Issues

Gemini 2.0 Flash delivers next-generation features and improved capabilities designed for the agentic era, including superior speed, built-in tool use, multimodal generation, and a 1M token context window.

  • API Name: gemini-2.0-flash
  • Context Window: 1,048,576 tokens
  • Provider Documentation: Gemini 2.0 Flash

Best For:

  • Agentic workflows with built-in tool use
  • Applications requiring superior speed
  • Multimodal generation tasks

Notes:

  • Optimized for the agentic era
  • May occasionally have inconsistent rendering behavior
  • Strong tool-calling capabilities

gemini-2.0-flash-lite

Status: Known Issues

Gemini 2.0 Flash Lite is a model optimized for cost efficiency and low latency.

  • API Name: gemini-2.0-flash-lite
  • Context Window: 1,048,576 tokens
  • Provider Documentation: Gemini 2.0 Flash Lite

Best For:

  • High-volume applications
  • Cost-sensitive deployments
  • Low-latency requirements

Notes:

  • Most cost-efficient Gemini model
  • Optimized for speed over capability
  • May have formatting quirks

Provider-Specific Parameters

Reasoning Configuration

Gemini models support reasoning capabilities through a thinking configuration. All Gemini models can use these parameters to control reasoning behavior.

thinkingConfig

Configure thinking behavior with a JSON object:

Field           | Type    | Description
thinkingBudget  | number  | Token budget allocated for thinking
includeThoughts | boolean | Whether to include thinking in the response

Example Configuration:

{
  "thinkingBudget": 5000,
  "includeThoughts": true
}

Recommended Settings:

  • Quick reasoning (1000-2000 tokens): Simple tasks requiring minimal thinking
  • Standard reasoning (3000-5000 tokens): Most use cases (recommended)
  • Extended reasoning (7000+ tokens): Complex problems requiring deep analysis
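The recommended tiers above can be captured in a small helper that builds a thinkingConfig object. The tier names and the helper itself are illustrative, not part of the Gemini API or Tambo SDK:

```typescript
type ThinkingConfig = { thinkingBudget: number; includeThoughts: boolean };

// Illustrative budgets drawn from the recommended settings above.
const TIERS = {
  quick: 1500,    // 1000-2000 tokens: simple tasks
  standard: 5000, // 3000-5000 tokens: most use cases (recommended)
  extended: 8000, // 7000+ tokens: complex problems
} as const;

// Hypothetical helper: build a thinkingConfig for a given tier.
function makeThinkingConfig(
  tier: keyof typeof TIERS,
  includeThoughts = true,
): ThinkingConfig {
  return { thinkingBudget: TIERS[tier], includeThoughts };
}

const config = makeThinkingConfig("standard");
// { thinkingBudget: 5000, includeThoughts: true }
```

The resulting object matches the JSON shape shown in the example configuration, so it can be pasted directly into the dashboard's Custom LLM Parameters field.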

Configuring in the Dashboard

  1. Navigate to your project in the dashboard
  2. Go to Settings → LLM Providers
  3. Select Google Gemini as your provider
  4. Choose your model
  5. Under Custom LLM Parameters, click + thinkingConfig
  6. Enter your configuration as a JSON object:
    {
      "thinkingBudget": 5000,
      "includeThoughts": true
    }
  7. Click Save to apply the configuration

Dashboard Suggestions

When you select a Gemini reasoning model, the dashboard automatically shows thinkingConfig as a suggested parameter. Just click it to add!

Troubleshooting

Inconsistent rendering behavior?

  • Try clarifying instructions more explicitly
  • Use specific formatting directives (e.g., "Return a bulleted list only")
  • Test with different prompt phrasings
  • Consider using a tested OpenAI model for production-critical formatting

Reasoning not appearing in responses?

  • Ensure includeThoughts is set to true in your thinkingConfig
  • Confirm a thinkingBudget is set and large enough for the task
  • Verify the selected model supports reasoning via thinking configuration

Model performance issues?

  • Lower the thinkingBudget for faster responses
  • Use Gemini 2.0 Flash Lite for speed-critical applications
  • Consider Gemini 2.5 Flash for balanced performance
  • Monitor your context window usage
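Monitoring context window usage can be sketched with a character-count heuristic. The roughly-4-characters-per-token ratio is an assumption for English text, not an official tokenizer; for exact counts use the provider's token-counting endpoint:

```typescript
const GEMINI_CONTEXT_WINDOW = 1_048_576; // tokens, per the model specs above

// Rough heuristic: ~4 characters per token for English text.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Check that a prompt leaves room for the model's output.
function fitsInContext(text: string, reservedForOutput = 8_192): boolean {
  return estimateTokens(text) + reservedForOutput <= GEMINI_CONTEXT_WINDOW;
}
```

A check like this is cheap enough to run on every request before sending it to the model.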
