
Mastering Google's Interactions API: A Unified Gateway to Gemini Models and Deep Research Agent

· 12 min read
Raphaël MANSUY
Elitizon Ltd

The AI development landscape is shifting from stateless request-response patterns to stateful, multi-turn agentic workflows. Google's new Interactions API provides a unified interface designed specifically for this new era—offering a single gateway to both raw Gemini models and the fully managed Deep Research Agent.

In one sentence: The Interactions API is a unified endpoint for interacting with Gemini models and agents, featuring server-side state management, background execution for long-running tasks, and native support for the Deep Research Agent.

Interactions API Overview

Why the Interactions API Matters

The Evolution from generateContent to Interactions

The original generateContent API was designed for stateless request-response text generation—perfect for chatbots and simple completion tasks. But as AI applications evolve toward agentic patterns, developers need more sophisticated capabilities:

Challenge | generateContent | Interactions API
--- | --- | ---
State Management | Client-side only | Server-side with previous_interaction_id
Long-running Tasks | Timeouts | Background execution with polling
Agent Access | Models only | Models AND built-in agents
Tool Orchestration | Basic function calling | Native MCP, Google Search, Code Execution
Conversation History | Manual management | Automatic with session IDs

Key Benefits

  1. Server-Side State Management: Offload conversation history to the server, reducing client complexity
  2. Background Execution: Run multi-hour research tasks without maintaining client connections
  3. Unified Endpoint: Same API for models (gemini-3-pro-preview) and agents (deep-research-pro-preview-12-2025)
  4. Remote MCP Support: Models can directly call Model Context Protocol servers
  5. Improved Cache Hits: Server-managed state enables better context caching, reducing costs

Getting Started

Prerequisites

# Install the latest google-genai SDK (1.55.0+ required)
pip install "google-genai>=1.55.0"

# Set your API key
export GOOGLE_API_KEY="your-api-key-here"

Basic Interaction

The simplest way to use the Interactions API:

from google import genai

client = genai.Client()

# Create an interaction
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Tell me a short joke about programming."
)

print(interaction.outputs[-1].text)

SDK Requirements
  • Python: google-genai>=1.55.0
  • JavaScript: @google/genai>=1.33.0

Stateful Conversations

One of the most powerful features is server-side state management. Instead of sending the entire conversation history with each request, you reference the previous interaction:

from google import genai

client = genai.Client()

# First turn
interaction1 = client.interactions.create(
    model="gemini-2.5-flash",
    input="Hi, my name is Alex."
)
print(f"Model: {interaction1.outputs[-1].text}")

# Second turn - context preserved automatically!
interaction2 = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is my name?",
    previous_interaction_id=interaction1.id
)
print(f"Model: {interaction2.outputs[-1].text}")
# Output: "Your name is Alex."
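
Chained this way, a full chat loop only ever sends the newest user message; the server holds the rest of the history. Here is a minimal sketch (the loop structure and variable names are illustrative, not part of the SDK):

from google import genai

client = genai.Client()
previous_id = None  # no history yet on the first turn

while True:
    user_input = input("You: ")
    if user_input.lower() in {"quit", "exit"}:
        break

    kwargs = {"model": "gemini-2.5-flash", "input": user_input}
    if previous_id:
        # Chain this turn onto the server-side conversation state
        kwargs["previous_interaction_id"] = previous_id

    interaction = client.interactions.create(**kwargs)
    previous_id = interaction.id
    print(f"Model: {interaction.outputs[-1].text}")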

Benefits of Server-Side State

  • Reduced Token Costs: No need to resend full history
  • Improved Cache Hits: Server can cache context more efficiently
  • Simpler Client Code: No local state management needed
  • Reliable Context: Server ensures consistency

Retrieving Past Interactions

# Get a previous interaction by ID
previous = client.interactions.get("<YOUR_INTERACTION_ID>")
print(previous.outputs[-1].text)
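
Because interactions are stored server-side, a retrieved ID can also seed a new turn, for example to resume a conversation from a different process. A minimal sketch that only combines the get and create calls shown above (the placeholder ID and prompt are illustrative):

# Resume a stored conversation by chaining a new turn onto it
previous = client.interactions.get("<YOUR_INTERACTION_ID>")

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Picking up where we left off: what did I last ask you?",
    previous_interaction_id=previous.id
)
print(interaction.outputs[-1].text)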

Deep Research Agent

The Deep Research Agent (deep-research-pro-preview-12-2025) is built for autonomous research tasks. Powered by Gemini 3 Pro, it plans, executes, and synthesizes multi-step research on its own.

When to Use Deep Research

Use Case | Deep Research | Standard Model
--- | --- | ---
Latency | Minutes (async) | Seconds
Process | Plan → Search → Read → Iterate → Output | Generate → Output
Output | Detailed reports with citations | Conversational text
Best For | Market analysis, due diligence, literature reviews | Chat, extraction, creative writing

Basic Deep Research

import time
from google import genai

client = genai.Client()

# Start research in background
interaction = client.interactions.create(
    input="Research the competitive landscape of AI code assistants in 2025.",
    agent="deep-research-pro-preview-12-2025",
    background=True  # Required for agents
)

print(f"Research started: {interaction.id}")

# Poll for completion
while True:
    interaction = client.interactions.get(interaction.id)
    print(f"Status: {interaction.status}")

    if interaction.status == "completed":
        print("\n📊 Research Report:\n")
        print(interaction.outputs[-1].text)
        break
    elif interaction.status == "failed":
        print(f"Research failed: {interaction.error}")
        break

    time.sleep(10)  # Poll every 10 seconds

Streaming Deep Research with Progress Updates

For real-time progress updates during research:

from google import genai

client = genai.Client()

stream = client.interactions.create(
    input="Research the history of Google TPUs.",
    agent="deep-research-pro-preview-12-2025",
    background=True,
    stream=True,
    agent_config={
        "type": "deep-research",
        "thinking_summaries": "auto"  # Enable thought streaming
    }
)

interaction_id = None
last_event_id = None

for chunk in stream:
    if chunk.event_type == "interaction.start":
        interaction_id = chunk.interaction.id
        print(f"🚀 Research started: {interaction_id}")

    if chunk.event_id:
        last_event_id = chunk.event_id

    if chunk.event_type == "content.delta":
        if chunk.delta.type == "text":
            print(chunk.delta.text, end="", flush=True)
        elif chunk.delta.type == "thought_summary":
            print(f"💭 Thought: {chunk.delta.content.text}", flush=True)

    elif chunk.event_type == "interaction.complete":
        print("\n✅ Research Complete")
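
If the stream is interrupted (network drop, client restart), the research keeps running in the background; the interaction_id captured above lets you fall back to the polling pattern from the previous section. A minimal sketch under that assumption (the helper name is illustrative):

import time

def resume_after_disconnect(client, interaction_id, poll_seconds=10):
    """Fall back to polling a background research task if the stream drops."""
    while True:
        interaction = client.interactions.get(interaction_id)
        if interaction.status == "completed":
            return interaction.outputs[-1].text
        if interaction.status == "failed":
            raise RuntimeError(interaction.error)
        time.sleep(poll_seconds)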

Research with Custom Formatting

You can steer the agent's output with specific formatting instructions:

prompt = """
Research the competitive landscape of EV batteries.

Format the output as a technical report with:
1. Executive Summary (max 200 words)
2. Key Players (include a comparison table with columns: Company, Capacity, Chemistry, Market Share)
3. Supply Chain Risks (bullet points)
4. Future Outlook (2025-2030)

Use clear headers and include citations for all claims.
"""

interaction = client.interactions.create(
    input=prompt,
    agent="deep-research-pro-preview-12-2025",
    background=True
)

Follow-up Questions

Continue conversations after the research completes:

# After research is complete; `completed_interaction` is the finished
# Deep Research interaction returned by the polling loop above
follow_up = client.interactions.create(
    input="Can you elaborate on the third key player you mentioned?",
    model="gemini-3-pro-preview",  # Can use a model for follow-ups
    previous_interaction_id=completed_interaction.id
)
print(follow_up.outputs[-1].text)

Function Calling with Interactions API

The Interactions API provides robust function calling capabilities:

from google import genai

client = genai.Client()

# Define the tool
def get_weather(location: str) -> str:
    """Gets current weather for a location."""
    # Your implementation here
    return f"The weather in {location} is sunny and 72°F."

weather_tool = {
    "type": "function",
    "name": "get_weather",
    "description": "Gets the weather for a given location.",
    "parameters": {
        "type": "object",
        "properties": {
            "location": {
                "type": "string",
                "description": "The city and state, e.g. San Francisco, CA"
            }
        },
        "required": ["location"]
    }
}

# Send request with tool
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is the weather in Paris?",
    tools=[weather_tool]
)

# Handle tool call
for output in interaction.outputs:
    if output.type == "function_call":
        print(f"Tool Call: {output.name}({output.arguments})")

        # Execute the tool
        result = get_weather(**output.arguments)

        # Send result back
        interaction = client.interactions.create(
            model="gemini-2.5-flash",
            previous_interaction_id=interaction.id,
            input=[{
                "type": "function_result",
                "name": output.name,
                "call_id": output.id,
                "result": result
            }]
        )
        print(f"Response: {interaction.outputs[-1].text}")
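
When the model can call several functions, the same pattern generalizes to a small dispatch loop. The sketch below builds only on the shapes shown above (function_call outputs with name, arguments, and id; function_result inputs); the FUNCTIONS registry and handle_tool_calls helper are illustrative names, not part of the SDK:

# Hypothetical helper: execute every function_call output locally and return
# the follow-up interaction that carries the tool results back to the model
FUNCTIONS = {"get_weather": get_weather}  # registry of locally defined tools

def handle_tool_calls(client, interaction, model="gemini-2.5-flash"):
    results = []
    for output in interaction.outputs:
        if output.type == "function_call":
            fn = FUNCTIONS[output.name]
            results.append({
                "type": "function_result",
                "name": output.name,
                "call_id": output.id,
                "result": fn(**output.arguments)
            })
    if not results:
        return interaction  # nothing to execute
    return client.interactions.create(
        model=model,
        previous_interaction_id=interaction.id,
        input=results
    )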

Built-in Tools

The Interactions API provides access to powerful built-in tools:

Google Search Grounding

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Who won the 2024 Super Bowl?",
    tools=[{"type": "google_search"}]
)

# Get the text output (filtering search results)
text_output = next((o for o in interaction.outputs if o.type == "text"), None)
if text_output:
    print(text_output.text)

Code Execution

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Calculate the 50th Fibonacci number.",
    tools=[{"type": "code_execution"}]
)
print(interaction.outputs[-1].text)

URL Context

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Summarize the content of https://google.github.io/adk-docs/",
    tools=[{"type": "url_context"}]
)
print(interaction.outputs[-1].text)

Remote MCP Servers

mcp_server = {
    "type": "mcp_server",
    "name": "weather_service",
    "url": "https://your-mcp-server.example.com/mcp"
}

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="What is the weather like in New York?",
    tools=[mcp_server]
)
print(interaction.outputs[-1].text)

Integration with Google ADK

The Interactions API integrates seamlessly with the Agent Development Kit (ADK):

ADK Agent with Interactions Backend

from google.adk.agents import Agent
from google.adk.models.google_llm import Gemini
from google.adk.tools.google_search_tool import GoogleSearchTool

def get_current_weather(location: str) -> dict:
    """Get weather for a location."""
    return {
        "status": "success",
        "location": location,
        "temperature": "72°F",
        "conditions": "Sunny"
    }

# Create agent with Interactions API enabled
root_agent = Agent(
    model=Gemini(
        model="gemini-2.5-flash",
        use_interactions_api=True  # Enable Interactions API!
    ),
    name="interactions_enabled_agent",
    description="An agent powered by the Interactions API",
    instruction="""You are a helpful assistant with access to:
- Google Search for current information
- Weather data for location queries

Always provide accurate, well-sourced information.""",
    tools=[
        GoogleSearchTool(bypass_multi_tools_limit=True),
        get_current_weather,
    ],
)

Benefits for ADK Developers

  1. Automatic State Management: ADK handles previous_interaction_id for you
  2. Background Task Support: Long-running agents don't timeout
  3. Native Thought Handling: Access to model reasoning chains
  4. Unified Tool Experience: Same tools work across models and agents

Multimodal Capabilities

The Interactions API supports multimodal inputs:

Image Understanding

import base64
from pathlib import Path

with open("image.png", "rb") as f:
base64_image = base64.b64encode(f.read()).decode('utf-8')

interaction = client.interactions.create(
model="gemini-2.5-flash",
input=[
{"type": "text", "text": "Describe what you see in this image."},
{"type": "image", "data": base64_image, "mime_type": "image/png"}
]
)
print(interaction.outputs[-1].text)

Image Generation

import base64  # if not already imported above

interaction = client.interactions.create(
    model="gemini-3-pro-image-preview",
    input="Generate an image of a futuristic AI research lab.",
    response_modalities=["IMAGE"]
)

for output in interaction.outputs:
    if output.type == "image":
        with open("generated_lab.png", "wb") as f:
            f.write(base64.b64decode(output.data))
        print("Image saved!")

Structured Output

Enforce specific JSON output schemas:

from pydantic import BaseModel, Field
from typing import Literal

class ContentModeration(BaseModel):
    is_safe: bool = Field(description="Whether the content is safe")
    category: Literal["safe", "spam", "inappropriate", "harmful"]
    confidence: float = Field(ge=0, le=1, description="Confidence score")
    reason: str = Field(description="Explanation for the classification")

interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Moderate: 'Free money! Click here to claim your prize!'",
    response_format=ContentModeration.model_json_schema()
)

result = ContentModeration.model_validate_json(interaction.outputs[-1].text)
print(f"Safe: {result.is_safe}, Category: {result.category}")

Data Storage and Retention

Important considerations for stored interactions:

Tier | Retention Period
--- | ---
Paid | 55 days
Free | 1 day

Opting Out of Storage

# Disable storage (cannot use with background=True)
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Process this privately",
    store=False  # Opt out of storage
)

Deleting Interactions

# Delete a specific interaction
client.interactions.delete(interaction_id="<INTERACTION_ID>")

Best Practices

1. Use Server-Side State for Conversations

# ✅ Good: Server manages history
interaction2 = client.interactions.create(
    model="gemini-2.5-flash",
    input="Continue our discussion",
    previous_interaction_id=interaction1.id
)

# ❌ Avoid: Sending full history each time
interaction2 = client.interactions.create(
    model="gemini-2.5-flash",
    input=[...entire_conversation_history...]  # Expensive!
)

2. Always Use background=True for Agents

# ✅ Required for agents like Deep Research
interaction = client.interactions.create(
    agent="deep-research-pro-preview-12-2025",
    input="Research task",
    background=True
)

3. Handle Long-Running Tasks with Resilience

import time

def run_research_with_retry(prompt: str, max_retries: int = 3):
    """Run research with automatic retry on failure."""
    interaction = client.interactions.create(
        agent="deep-research-pro-preview-12-2025",
        input=prompt,
        background=True
    )

    retries = 0
    while retries < max_retries:
        try:
            while True:
                status = client.interactions.get(interaction.id)
                if status.status == "completed":
                    return status.outputs[-1].text
                elif status.status == "failed":
                    raise Exception(status.error)
                time.sleep(10)
        except Exception as e:
            retries += 1
            if retries >= max_retries:
                raise
            time.sleep(30)
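
Calling the helper is then a single line (the prompt here is just an example):

report = run_research_with_retry("Compare the leading vector database options for RAG workloads.")
print(report)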

4. Mix Models and Agents in Conversations

# Start with Deep Research
research = client.interactions.create(
    agent="deep-research-pro-preview-12-2025",
    input="Research quantum computing applications",
    background=True
)
# ... poll for completion ...

# Follow up with a standard model
summary = client.interactions.create(
    model="gemini-2.5-flash",
    input="Summarize the key points for a non-technical audience",
    previous_interaction_id=research.id
)

Supported Models and Agents

Name | Type | Identifier
--- | --- | ---
Gemini 2.5 Pro | Model | gemini-2.5-pro
Gemini 2.5 Flash | Model | gemini-2.5-flash
Gemini 2.5 Flash-Lite | Model | gemini-2.5-flash-lite
Gemini 3 Pro Preview | Model | gemini-3-pro-preview
Deep Research Preview | Agent | deep-research-pro-preview-12-2025

Current Limitations

Beta Status

The Interactions API is in public beta. Features and schemas may change.

  1. Not Yet Supported:

    • Grounding with Google Maps
    • Computer Use
    • Combining MCP, function calling, and built-in tools in a single request
  2. Deep Research Specific:

    • Maximum research time: 60 minutes (most complete in ~20 minutes)
    • No custom function calling tools
    • No structured output or plan approval
    • Audio inputs not supported
  3. Storage Requirements:

    • background=True requires store=True

Migration Guide

When to Use Interactions API vs generateContent

Scenario | Recommended API
--- | ---
Simple text completion | generateContent
Standard chatbot | generateContent
Production critical | generateContent
Agentic workflows | Interactions API
Long-running research | Interactions API
Complex tool orchestration | Interactions API
Multi-hour background tasks | Interactions API
MCP server integration | Interactions API
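
For a concrete feel of the switch, here is a minimal before/after sketch of a simple completion, assuming the google-genai SDK's models.generate_content method for the generateContent path:

from google import genai

client = genai.Client()

# Before: stateless generateContent
response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Explain HTTP/3 in two sentences."
)
print(response.text)

# After: the same request through the Interactions API
interaction = client.interactions.create(
    model="gemini-2.5-flash",
    input="Explain HTTP/3 in two sentences."
)
print(interaction.outputs[-1].text)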

Try It Yourself

We've created complete working examples in the tutorial_blog_implementation/ directory:

  1. Basic Interactions: Simple conversations and state management

  2. Deep Research Agent: Autonomous research with progress streaming

  3. ADK Integration: Using Interactions API with Google ADK

Each example includes:

  • ✅ Working Python code
  • ✅ Comprehensive tests (make test)
  • ✅ Interactive demos (make demo)
  • ✅ Makefile with setup and development commands


Conclusion

The Interactions API represents a significant evolution in how we build AI applications. By providing:

  • Server-side state management for simpler, more reliable conversations
  • Background execution for long-running agentic tasks
  • Unified access to both models and specialized agents like Deep Research
  • Native tool integration with MCP, Google Search, and more

...developers can now build sophisticated AI systems with less boilerplate and better reliability.

Whether you're building a research assistant, a multi-turn customer support agent, or a complex agentic workflow, the Interactions API provides the foundation for next-generation AI applications.


💬 Join the Discussion

Have questions or feedback? Discuss this tutorial with the community on GitHub Discussions.