
23. Production Deployment Strategies

Goal: Understand ADK deployment options and implement production-grade agents with custom authentication, monitoring, and reliability patterns.

Prerequisites:

  • Tutorial 01 (Hello World Agent)
  • Google Cloud Platform account
  • Basic Docker knowledge (helpful)
  • Understanding of FastAPI (helpful)

What You'll Learn:

  • ✅ Deploy agents using ADK's built-in server (5 minutes)
  • 🏗️ Build production FastAPI servers with custom patterns (when needed)
  • 📊 Implement custom monitoring and observability
  • 🔐 Add authentication and security patterns
  • 📈 Auto-scale across platforms
  • 🛡️ Understand when to use ADK vs custom server

Quick Decision Framework:

  • 5 minutes to production? → Cloud Run ✅
  • Need FedRAMP compliance? → Agent Engine ✅✅
  • Have Kubernetes? → GKE ✅
  • Need custom auth? → Tutorial 23 + Cloud Run ⚙️
  • Just testing locally? → Local Dev ⚡

Time to Complete: 5 minutes (Cloud Run) to 2+ hours (custom patterns)


🎯 DECISION FRAMEWORK: Choose Your Platform​

What's Your Situation?​

┌─────────────────────────────────────────────────────────────────┐
│ 1. QUICK MVP / MOVING FAST?                                     │
├─────────────────────────────────────────────────────────────────┤
│ Setup: 5 minutes | Cost: ~$40/mo | Security: Auto ✅            │
│ → Use: CLOUD RUN ✅                                             │
│ Best for: Startups, MVPs, most production apps                  │
│ Deploy: adk deploy cloud_run --project ID --region us-central1  │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ 2. NEED COMPLIANCE (FedRAMP, HIPAA, PCI-DSS)?                   │
├─────────────────────────────────────────────────────────────────┤
│ Setup: 10 minutes | Cost: ~$50/mo | Security: Auto ✅✅         │
│ → Use: AGENT ENGINE ✅✅                                        │
│ Best for: Enterprise, government, compliance-heavy              │
│ Why: Only platform with FedRAMP compliance                      │
│ Deploy: adk deploy agent_engine --project ID --region us-center │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ 3. HAVE KUBERNETES / NEED FULL CONTROL?                         │
├─────────────────────────────────────────────────────────────────┤
│ Setup: 20 minutes | Cost: $200-500/mo | Security: Configure ⚙️  │
│ → Use: GKE ✅                                                   │
│ Best for: Complex deployments, existing Kubernetes shops        │
│ Deploy: kubectl apply -f deployment.yaml                        │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ 4. NEED CUSTOM AUTH (LDAP, KERBEROS)?                           │
├─────────────────────────────────────────────────────────────────┤
│ Setup: 2 hours | Cost: ~$60/mo | Security: Custom + Platform ⚙️ │
│ → Use: TUTORIAL 23 + CLOUD RUN ⚙️                               │
│ Best for: Custom authentication requirements                    │
│ Why: Platform doesn't support these auth methods natively       │
│ Note: Most users don't need this - use Cloud Run IAM instead    │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ 5. JUST DEVELOPING LOCALLY?                                     │
├─────────────────────────────────────────────────────────────────┤
│ Setup: < 1 min | Cost: Free | Security: Add before deploy ⚡    │
│ → Use: LOCAL DEV ⚡                                             │
│ Best for: Development, prototyping, testing                     │
│ Deploy: adk api_server                                          │
└─────────────────────────────────────────────────────────────────┘

→ Pick the box that matches your situation. That's your platform.


⚠️ Important: Understanding ADK's Deployment Model​

Key Insight: Security is Platform-First​

ADK's built-in server is intentionally minimal. Here's why:

  • ✅ ADK provides: Input validation, session management, error handling
  • ✅ Platform provides: TLS/HTTPS, DDoS protection, authentication, compliance
  • ✅ Result: Secure production deployment with zero custom security code

See: Security Research Summary for complete analysis of what each platform secures automatically.

Custom Server (Tutorial 23) is ADVANCED & OPTIONAL​

You only need the custom FastAPI server if:

  • You need custom authentication (LDAP, Kerberos, etc.)
  • You need advanced logging beyond platform defaults
  • You have specific business logic endpoints
  • You're not using Google Cloud infrastructure

Most production deployments use Cloud Run with ADK's built-in server. No custom server is needed.

Platform Comparison​

Platform | Security | Setup | Cost | Best For | Needs Custom Server?
-------- | -------- | ----- | ---- | -------- | --------------------
Cloud Run | Auto ✅ | 5 min | Pay-per-use | Most apps | ❌ No
Agent Engine | Auto ✅✅ | 10 min | Pay-per-use | Enterprise | ❌ No
GKE | Configure ⚙️ | 20 min | Hourly | Complex | ❌ No
Custom + Cloud Run | Hybrid ⚙️ | 2 hrs | Pay-per-use | Special needs | ✅ Yes
Local Dev | Minimal | < 1 min | Free | Development | ✅ Yes (add locally)

See: Complete Security Analysis for detailed security breakdown per platform.


🔐 Security First: What's Automatic vs Manual

Important Discovery: Each platform provides different levels of automatic security.

Security by Platform (Quick Reference)​

Security Feature | Cloud Run | Agent Engine | GKE | Local
---------------- | --------- | ------------ | --- | -----
HTTPS/TLS | ✅ Auto | ✅ Auto | ✅ Manual | ❌
DDoS Protection | ✅ Auto | ✅ Auto | ❌ | ❌
Authentication | ✅ Auto (IAM) | ✅ Auto (OAuth) | ⚙️ Manual | ❌
Encryption at Rest | ✅ Auto | ✅ Auto | ✅ Manual | ❌
Audit Logging | ✅ Auto | ✅ Auto | ✅ Manual | ❌
Compliance Ready | ✅ HIPAA, PCI | ✅✅ FedRAMP | ✅ All | ❌

Key Message: Cloud Run and Agent Engine provide the platform-level protections in this table with no custom security code. Application-level settings (secrets, CORS, who may invoke the service) are still yours to configure.

Read the Full Security Analysis​

For comprehensive details on what's secure across all platforms:

  • 📄 SECURITY_RESEARCH_SUMMARY.md - Executive summary (5 min read)

    • What ADK provides vs what platforms provide
    • When you actually need custom authentication
    • Platform security capabilities comparison
    • Real-world use case recommendations
  • 📋 SECURITY_ANALYSIS_ALL_DEPLOYMENT_OPTIONS.md - Comprehensive (20 min read)

    • Detailed security breakdown per platform
    • Compliance certifications
    • Platform-specific security checklists
    • Security verification steps
    • When to use custom server

Bottom Line: "ADK's built-in server is secure by design because platform security is the foundation."


Quick Reference: Understanding ADK's Deployment​

What Happens When You Run adk deploy cloud_run?​

Your Agent Code
↓
[ADK Generates]
├── Dockerfile
├── main.py (using get_fast_api_app() from ADK)
└── requirements.txt
↓
[Builds Container]
↓
[Deploys to Cloud Run]
↓
✅ Live FastAPI Server
   (with basic endpoints only)

What's Inside ADK's Built-In Server?​

Provided by get_fast_api_app():

  • ✅ GET / - API info
  • ✅ GET /health - Health check
  • ✅ GET /agents - List agents
  • ✅ POST /invoke - Run agent
  • ✅ Session management

NOT Provided:

  • ❌ Custom authentication
  • ❌ Custom logging
  • ❌ Custom metrics
  • ❌ Rate limiting
  • ❌ Circuit breakers
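
Rate limiting is one of the items the built-in server leaves to you. The following is a minimal token-bucket sketch, assuming a single-process server with in-memory state. The class name `TokenBucket` and its parameters are illustrative and are not part of ADK.

```python
import time


class TokenBucket:
    """Allow `rate` requests per second with bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self._clock = clock  # injectable so tests do not need to sleep
        self._tokens = float(capacity)
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token and return True if the request may proceed."""
        now = self._clock()
        # Refill in proportion to elapsed time, capped at capacity.
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        self._last = now
        if self._tokens >= 1:
            self._tokens -= 1
            return True
        return False
```

A server would typically keep one bucket per API key or client IP and answer with HTTP 429 when `allow()` returns False. This sketch is not thread-safe and does not share state across instances, so an auto-scaled deployment would need a shared store instead.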

When You Need a Custom Server​

The custom server in this repository (Tutorial 23) adds:

  • ✅ Custom authentication
  • ✅ Structured logging with request tracing
  • ✅ Health checks with real metrics
  • ✅ Request timeouts and circuit breaking
  • ✅ Custom error handling
  • ✅ Full observability

See: DEPLOYMENT_OPTIONS_EXPLAINED.md for complete details
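
To make "circuit breaking" concrete, here is a small sketch of the idea. It is not the Tutorial 23 implementation; the names `CircuitBreaker` and `CircuitOpenError` and the default thresholds are assumptions chosen for illustration.

```python
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Stop calling a failing dependency until a cool-down has passed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; skipping call")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success closes the circuit fully
        return result
```

Wrapping the model call in `breaker.call(...)` lets the server fail fast while the upstream service is unhealthy instead of queuing requests that will time out.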

Time to Complete: 45 minutes


🌍 Real-World Scenarios: Which Platform for Which Situation?​

Scenario 1: Startup Building MVP​

Your Situation: Moving fast, limited resources, want to deploy this week.

What You Need:

  • Deployment in < 5 minutes
  • Automatic security (don't want to manage this)
  • Pay only for what you use
  • Can iterate quickly

Recommendation: ✅ Cloud Run

Why:

  • Fastest time to market (5 minutes!)
  • Secure by default (HTTPS, DDoS, IAM)
  • Cost-effective (~$40/mo for 1M requests)
  • No infrastructure to manage

Deploy:

adk deploy cloud_run \
--project your-project-id \
--region us-central1

Cost: ~$40/month (1M requests) + $0.30/CPU-month

Next Step: As you grow, consider Agent Engine for better compliance.


Scenario 2: Enterprise System (Need Compliance)​

Your Situation: Building for enterprise customers, need FedRAMP or HIPAA compliance.

What You Need:

  • FedRAMP compliance (government-ready)
  • HIPAA/PCI-DSS certifications
  • Zero infrastructure management
  • Immutable audit logs
  • Sandboxed execution

Recommendation: ✅✅ Agent Engine (ONLY PLATFORM WITH FedRAMP)

Why:

  • Only platform with FedRAMP compliance built-in
  • Google manages all security/compliance
  • Zero configuration needed
  • Best for highly regulated industries

Deploy:

adk deploy agent_engine \
--project your-project-id \
--region us-central1 \
--agent-name my-agent

Cost: ~$50/month (1M requests) + usage

Benefits:

  • FedRAMP compliance
  • SOC 2 Type II certified
  • Automatic audit logging
  • Content safety filters
  • No ops burden

Next Step: Already production-ready. Focus on agent safety.


Scenario 3: Kubernetes Shop​

Your Situation: Your company runs Kubernetes infrastructure, you want ADK in that environment.

What You Need:

  • Deploy in existing Kubernetes cluster
  • Full control over configuration
  • NetworkPolicy for traffic control
  • Workload Identity integration
  • Pod resource limits

Recommendation: ✅ GKE (or any Kubernetes)

Why:

  • Leverage existing infrastructure
  • Full control over security config
  • Support for complex networking
  • Advanced observability

Deploy:

kubectl apply -f deployment.yaml

Cost: $200-500+/month (based on cluster size)

Requires:

  • Kubernetes expertise
  • Manual security configuration
  • Pod security setup
  • RBAC configuration

Next Step: Use GKE Autopilot to simplify security.


Scenario 4: Custom Authentication Required​

Your Situation: You need LDAP, Kerberos, or other custom authentication not available on platforms.

What You Need:

  • Custom authentication provider
  • Custom API endpoints
  • Advanced logging
  • Specific business logic

Recommendation: βš™οΈ Tutorial 23 Custom Server + Cloud Run

Why:

  • Cloud Run provides platform security
  • Tutorial 23 provides custom authentication
  • Combined = secure + custom

Deploy:

# 1. Use custom server from Tutorial 23
cd tutorial_implementation/tutorial23

# 2. Deploy to Cloud Run
adk deploy cloud_run \
--project your-project-id \
--region us-central1

Cost: ~$60/month (on Cloud Run) + custom server complexity

Note: MOST USERS DON'T NEED THIS

  • Use Cloud Run IAM for standard authentication
  • Use Agent Engine OAuth for standards-based authentication
  • Only use this if platforms don't support your auth method

Effort: 2+ hours to implement custom server


Scenario 5: Local Development​

Your Situation: Building and testing locally before deploying.

What You Need:

  • Fast iteration loop
  • Hot reload on code changes
  • Easy testing
  • No infrastructure needed

Recommendation: ⚡ Local Dev (add security before deploy)

Why:

  • Zero setup time
  • Instant feedback
  • Free
  • Perfect for development

Run Locally:

# Start dev server
adk api_server

# Or use custom server
python -m uvicorn production_agent.server:app --reload

Before Production:

  • Add authentication layer
  • Test with HTTPS (use ngrok)
  • Verify security settings
  • Move to Cloud Run

Cost: Free (local)

Next Step: Deploy to Cloud Run when ready for production.


5-Minute Quick Start with ADK's Built-In Server​

Want to deploy NOW? Use this command:

# Cloud Run
adk deploy cloud_run \
--project your-project-id \
--region us-central1 \
./your_agent_directory

# GKE
adk deploy gke \
--project your-project-id \
--cluster_name my-cluster \
--region us-central1 \
./your_agent_directory

# Agent Engine
adk deploy agent_engine \
--project your-project-id \
--region us-central1 \
./your_agent_directory

✅ That's it! Your agent is live in 5 minutes.

What you get:

  • Automatic container build
  • FastAPI server with basic endpoints
  • Auto-scaling
  • Public HTTPS URL
  • Session management
  • /health endpoint
  • No custom code needed

🏗️ Advanced: When You Need a Custom FastAPI Server

⚠️ Important: Most Users Don't Need This​

First Check: Do you actually need a custom server?

  • ✅ Use Cloud Run + ADK's built-in server if you need standard authentication (IAM, OAuth)
  • ✅ Use Agent Engine if you need compliance/security
  • ✅ Use GKE if you need Kubernetes control
  • ⚙️ Use a custom server ONLY if you have one of the special needs below

When Custom Server is Actually Needed​

You need Tutorial 23's custom server IF:

  1. Custom authentication (LDAP, Kerberos, API keys)

    • Cloud Run IAM doesn't support it
    • Agent Engine OAuth doesn't work for you
    • You have proprietary auth system
  2. Advanced logging/observability beyond platform defaults

    • Custom request correlation IDs
    • Business event tracking
    • Custom metrics
  3. Additional API endpoints for business logic

    • Webhooks
    • Custom health checks
    • Integration endpoints
  4. Non-Google infrastructure

    • Running on AWS, Azure, on-premises
    • Portable solution needed

If none of these apply: Use Cloud Run or Agent Engine. Much simpler.

What Tutorial 23 Provides​

This tutorial includes a complete, production-ready implementation:

tutorial23/
├── production_agent/
│   ├── agent.py              # Agent with 3 tools
│   └── server.py             # FastAPI server (488 lines)
├── tests/                    # 40 comprehensive tests
├── Makefile                  # Commands: setup, dev, test, demo
├── FASTAPI_BEST_PRACTICES.md # 7 core patterns guide
└── README.md                 # Complete documentation

Key Features (If You Need Custom Server):

  • ✅ Custom authentication with API keys
  • ✅ Structured logging with request tracing
  • ✅ Health checks with real metrics
  • ✅ Error handling and validation
  • ✅ Request timeouts and circuit breaking
  • ✅ 40 passing tests (93% coverage)
  • ✅ Production-ready patterns

📖 Full Implementation: View on GitHub →

Security Note: Tutorial 23 is an advanced pattern. It adds application-layer features but depends on platform-layer security from Cloud Run or your own infrastructure.


Quick Start (5 minutes)​

cd tutorial_implementation/tutorial23

# Setup
make setup

# Run development server
export GOOGLE_API_KEY=your_key
make dev

# Run tests
make test

# See demos
make demo-info

Open http://localhost:8000 and select production_deployment_agent from dropdown.


Deployment Strategies​

ADK supports multiple deployment paths. Choose based on your needs:

Comparison Matrix​

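Strategy | Setup Time | Scaling | Cost | Best For
-------- | ---------- | ------- | ---- | --------
Local | < 1 min | Manual | Free | Development
Cloud Run | 5 mins | Auto | Pay-per-use | Most apps
Agent Engine | 10 mins | Auto | Pay-per-use | Enterprise
GKE | 20 mins | Manual | Hourly | Complex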

1. Local Development​

Perfect for: Quick testing and iteration

# Start FastAPI server
adk api_server

# Custom port
adk api_server --port 8090

Test it:

# adk api_server listens on port 8000 by default; substitute your --port value if you set one
curl http://localhost:8000/health
curl -X POST http://localhost:8000/invoke \
  -H "Content-Type: application/json" \
  -d '{"query": "Hello!"}'

Features:

  • 🔄 Hot reload during development
  • 📖 Auto-generated API docs at /docs
  • ⚡ Instant feedback loop

See tutorial implementation for custom server code.


2. Cloud Run

Perfect for: Serverless auto-scaling with minimal ops

# Deploy in one command
adk deploy cloud_run \
  --project your-project-id \
  --region us-central1 \
  --service_name my-agent \
  ./your_agent_directory

That's it! ADK handles:

  • ✅ Building container image
  • ✅ Pushing to Container Registry
  • ✅ Deploying to Cloud Run
  • ✅ Setting up auto-scaling

Manual Alternative:

# 1. Build
gcloud builds submit --tag gcr.io/YOUR_PROJECT/agent

# 2. Deploy
gcloud run deploy agent \
--image gcr.io/YOUR_PROJECT/agent \
--platform managed \
--region us-central1 \
--memory 2Gi \
--max-instances 100

Cost: ~$0.40 per million requests + compute


3. Vertex AI Agent Engine​

Perfect for: Managed agent infrastructure with built-in versioning

# Deploy to managed service
adk deploy agent_engine \
--project your-project-id \
--region us-central1 \
--agent-name my-agent

Benefits:

  • 📦 Managed infrastructure
  • 🎯 Version control
  • 🔄 A/B testing
  • 📊 Built-in monitoring
  • 🔐 Enterprise security

4. Google Kubernetes Engine (GKE)​

Perfect for: Complex deployments needing full control

# Create cluster
gcloud container clusters create agent-cluster \
--region us-central1 \
--machine-type n1-standard-2 \
--num-nodes 3

# Get credentials
gcloud container clusters get-credentials agent-cluster \
--region us-central1

# Deploy
kubectl apply -f deployment.yaml

When to use GKE:

  • Need advanced networking
  • Running multiple services
  • Existing Kubernetes expertise
  • Custom orchestration requirements

See tutorial implementation for full Kubernetes manifests.


Deployment Flow Diagram​

                      YOUR AGENT CODE
                             |
                             v
                   +-------------------+
                   |  adk deploy XXXX  |
                   +-------------------+
                             |
     +-----------+-----------+-----------+-----------+
     |           |           |           |           |
     v           v           v           v           v
   LOCAL     CLOUD-RUN   AGENT-ENG      GKE       CUSTOM
     |           |           |           |           |
     v           v           v           v           v
 localhost  serverless    managed       k8s     your-infra

Production Setup​

Environment Configuration​

Create .env file (never commit!):

# Google Cloud
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_GENAI_USE_VERTEXAI=1

# Application
MODEL=gemini-2.0-flash
TEMPERATURE=0.5
MAX_TOKENS=2048

# Security
API_KEY=your-secret-key
ALLOWED_ORIGINS=https://yourdomain.com

# Monitoring
LOG_LEVEL=INFO
ENABLE_TRACING=true
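
One way to consume these variables is to parse them once at startup into a typed object, so that a missing or malformed value fails fast. This is a sketch under that assumption; the `Settings` class and `load_settings` function are hypothetical names, and the defaults simply mirror the example values above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    project: str
    location: str
    model: str
    temperature: float
    max_tokens: int
    allowed_origins: tuple
    log_level: str


def load_settings(env=None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    origins = env.get("ALLOWED_ORIGINS", "")
    return Settings(
        project=env["GOOGLE_CLOUD_PROJECT"],  # required: KeyError if missing
        location=env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        model=env.get("MODEL", "gemini-2.0-flash"),
        temperature=float(env.get("TEMPERATURE", "0.5")),
        max_tokens=int(env.get("MAX_TOKENS", "2048")),
        # Comma-separated list; empty entries are dropped.
        allowed_origins=tuple(o.strip() for o in origins.split(",") if o.strip()),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )
```

Passing a plain dict instead of `os.environ` keeps the loader easy to unit test.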

Health Checks​

All deployments should expose /health endpoint:

GET /health

{
  "status": "healthy",
  "uptime_seconds": 3600,
  "request_count": 1250,
  "error_count": 3,
  "error_rate": 0.0024,
  "metrics": {
    "successful_requests": 1247,
    "timeout_count": 0
  }
}

Configure in orchestrator:

  • Cloud Run: Automatically detected
  • GKE: Set as liveness probe
  • Agent Engine: Built-in
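
The payload above can be produced by a small in-process counter. The sketch below assumes a single worker and in-memory state; `HealthMetrics`, the "degraded" status value, and the 5% cut-off are illustrative choices, not something ADK defines.

```python
import time


class HealthMetrics:
    """Track request outcomes and render a /health style payload."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started = clock()
        self.request_count = 0
        self.error_count = 0
        self.timeout_count = 0

    def record(self, ok: bool, timed_out: bool = False) -> None:
        """Record one finished request."""
        self.request_count += 1
        if not ok:
            self.error_count += 1
        if timed_out:
            self.timeout_count += 1

    def snapshot(self, max_error_rate: float = 0.05) -> dict:
        """Return the current metrics as a JSON-serializable dict."""
        rate = self.error_count / self.request_count if self.request_count else 0.0
        return {
            "status": "healthy" if rate <= max_error_rate else "degraded",
            "uptime_seconds": int(self._clock() - self._started),
            "request_count": self.request_count,
            "error_count": self.error_count,
            "error_rate": round(rate, 4),
            "metrics": {
                "successful_requests": self.request_count - self.error_count,
                "timeout_count": self.timeout_count,
            },
        }
```

A health handler would return `metrics.snapshot()` as its response body. With several workers or instances, each process reports only its own counts.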

Secrets Management​

Never commit API keys to code. Use Google Secret Manager:

import os

from google.cloud import secretmanager


def get_secret(secret_id: str) -> str:
    """Fetch the latest version of a secret from Secret Manager."""
    client = secretmanager.SecretManagerServiceClient()
    project = os.environ["GOOGLE_CLOUD_PROJECT"]
    name = f"projects/{project}/secrets/{secret_id}/versions/latest"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")


# Usage
api_key = get_secret("api-key")

Monitoring & Observability​

Key Metrics to Track​

Metric | Target | Alert Threshold
------ | ------ | ---------------
Error Rate | < 0.5% | > 5%
P99 Latency | < 2 sec | > 5 sec
Availability | > 99.9% | < 99%
Request Count | Track | N/A
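
As a rough illustration of how the alert thresholds in this table could be evaluated in code, the sketch below computes a nearest-rank P99 and returns the names of any breached thresholds. The function names are hypothetical; in production these checks would normally live in Cloud Monitoring alert policies rather than in the application.

```python
def p99(latencies_s):
    """Nearest-rank 99th percentile of a non-empty list of latencies."""
    ordered = sorted(latencies_s)
    # Integer ceiling of 0.99 * n, written without floating point.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def fired_alerts(error_rate, p99_latency_s, availability):
    """Return the names of metrics that crossed their alert threshold."""
    fired = []
    if error_rate > 0.05:       # alert above 5% errors
        fired.append("error_rate")
    if p99_latency_s > 5.0:     # alert above 5 seconds
        fired.append("p99_latency")
    if availability < 0.99:     # alert below 99% availability
        fired.append("availability")
    return fired
```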

Structured Logging​

All production servers should log JSON to stdout:

{
  "timestamp": "2025-01-17T10:30:45Z",
  "severity": "INFO",
  "message": "invoke_agent.success",
  "request_id": "550e8400-e29b",
  "tokens": 245,
  "latency_ms": 1230
}

Cloud Logging automatically parses and indexes these fields.
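
A log line of this shape can be emitted with the standard `logging` module alone. The sketch below is one possible formatter, not the one used in Tutorial 23; the `JsonFormatter` class, the logger name "agent", and the set of extra fields are assumptions for illustration.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    # Optional attributes copied from the record when present.
    EXTRA_FIELDS = ("request_id", "tokens", "latency_ms")

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "severity": record.levelname,
            "message": record.getMessage(),
        }
        for key in self.EXTRA_FIELDS:
            if hasattr(record, key):
                entry[key] = getattr(record, key)
        return json.dumps(entry)


def configure_logging(level: str = "INFO") -> logging.Logger:
    """Send JSON logs for the "agent" logger to stdout."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger("agent")
    logger.handlers = [handler]
    logger.setLevel(level)
    logger.propagate = False
    return logger
```

Per-request fields are attached through `extra`, for example `logger.info("invoke_agent.success", extra={"request_id": rid, "latency_ms": 1230})`.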


💰 Cost Breakdown: Choose Based on Budget

Monthly Cost Estimates (at 1M requests/month)​

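Platform | Base | Per-Request | Setup | Monthly Total | Best For
-------- | ---- | ----------- | ----- | ------------- | --------
Cloud Run | $0 | ~$0.40 | 5 min | ~$40 | Most apps
Agent Engine | $0 | ~$0.50 | 10 min | ~$50 | Enterprise
GKE | $50+ | Varies | 20 min | $200-500+ | Complex
Custom + Cloud Run | $0 | ~$0.40 | 2 hrs | ~$60 | Special needs
Local Dev | $0 | $0 | < 1 min | $0 | Development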

Notes:

  • Costs based on US pricing (may vary by region)
  • Includes compute + storage estimates
  • Actual costs depend on model, memory, CPU usage
  • Agent Engine includes managed infrastructure overhead
  • GKE includes cluster base cost + node costs
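
For a quick estimate at a different request volume, the arithmetic can be written down directly. The sketch below assumes the request price is charged per million requests (as in the Cloud Run figure of ~$0.40 per million requests + compute quoted earlier) and treats compute as a separate number you supply. The function name and the $39.60 compute value in the usage note are placeholders, not published prices.

```python
def monthly_cost(requests: int, price_per_million: float,
                 compute: float = 0.0, base: float = 0.0) -> float:
    """Estimate a monthly bill in dollars.

    requests          -- number of requests in the month
    price_per_million -- request charge per 1,000,000 requests
    compute           -- estimated CPU/memory charge for the month
    base              -- fixed platform cost, e.g. a cluster fee
    """
    request_charge = price_per_million * requests / 1_000_000
    return round(base + request_charge + compute, 2)
```

For example, `monthly_cost(1_000_000, 0.40, compute=39.60)` gives 40.0, which shows that at this volume nearly all of a ~$40 total would come from compute rather than from the request charge.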

ROI Analysis:

  • Startup: Start with Cloud Run ($40/mo), move to Agent Engine ($50/mo) if compliance needed
  • Enterprise: Start with Agent Engine ($50/mo), includes compliance
  • Existing K8s: Use GKE ($200+/mo), leverages existing infrastructure

✅ Deployment Verification: How to Verify It Works

After Deploying to Cloud Run​

# 1. Get your service URL
SERVICE_URL=$(gcloud run services describe my-agent \
--region us-central1 \
--format 'value(status.url)')

# 2. Test health endpoint
curl $SERVICE_URL/health

# 3. Test agent invocation
curl -X POST $SERVICE_URL/invoke \
-H "Content-Type: application/json" \
-d '{"query": "Hello agent!", "temperature": 0.5}'

# 4. Check metrics
curl $SERVICE_URL/health | jq '.metrics'

After Deploying to Agent Engine​

# Agent Engine dashboard: https://console.cloud.google.com/vertex-ai/
# Check:
# - ✅ Agent deployed
# - ✅ Endpoints responding
# - ✅ Invocation successful
# - ✅ Audit logs appearing

Security Verification Checklist​

  • HTTPS/TLS working (curl shows https://)
  • Authentication enabled (get 401 on unauthenticated call)
  • CORS configured (check headers)
  • Health check responding (GET /health)
  • Logging to Cloud Logging (check console)
  • No API keys in logs (verify secrets not exposed)
  • Request timeouts working (test long-running query)
  • Error handling working (test invalid input)

See: DEPLOYMENT_CHECKLIST.md for complete verification steps.


✨ Best Practices for Production Deployment​

🔐 Security (Platform Provides Most of This Automatically)

What Cloud Run/Agent Engine Provides Automatically:

  • ✅ HTTPS/TLS encryption (handled by platform)
  • ✅ DDoS protection (included)
  • ✅ Encryption at rest (Google-managed)
  • ✅ Non-root container execution (enforced)
  • ✅ Binary vulnerability scanning (included)

What You Must Configure:

  • Store API keys and other secrets in Secret Manager (never hardcode them or deploy a .env file)
  • Enable authentication in the Cloud Run console
  • Configure CORS with specific origins (never use wildcard *)
  • Set resource limits (memory, CPU)
  • Monitor error rates and latency

For Custom Server:

  • Implement request authentication (see Tutorial 23 examples)
  • Use Bearer token validation
  • Implement timeout protection
  • Validate input sizes
  • Handle errors securely (don't expose internals)

📊 Observability

  • Export logs to Cloud Logging
  • Set up error tracking with Error Reporting
  • Monitor metrics with Cloud Monitoring
  • Use request IDs for tracing
  • Log important business events

⚡ Reliability

  • Set request timeouts (30s recommended)
  • Implement health checks
  • Configure auto-scaling appropriately
  • Use load balancing
  • Plan for disaster recovery
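
The request timeout in the first bullet can be enforced around any coroutine with `asyncio.wait_for`. The sketch below uses the 30-second recommendation as its default; the wrapper name `run_with_timeout` and its return shape are illustrative assumptions.

```python
import asyncio


async def run_with_timeout(coro_func, *args, timeout_s: float = 30.0):
    """Await coro_func(*args) and return (ok, result_or_message)."""
    try:
        result = await asyncio.wait_for(coro_func(*args), timeout=timeout_s)
        return True, result
    except asyncio.TimeoutError:
        # wait_for cancels the inner task before raising.
        return False, f"request exceeded {timeout_s}s"
```

A handler can map the `(False, message)` case to an HTTP 504 response and increment a timeout counter for the health endpoint.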

📈 Performance

  • Use connection pooling
  • Stream responses when possible
  • Cache agent configuration
  • Monitor memory usage
  • Use multiple workers

FastAPI Best Practices​

This implementation demonstrates 7 core production patterns:

  1. Configuration Management - Environment-based settings
  2. Authentication & Security - Bearer token validation
  3. Health Checks - Real metrics-based status
  4. Request Lifecycle - Timeout protection
  5. Error Handling - Typed exceptions
  6. Logging & Observability - Request tracing
  7. Metrics & Monitoring - Observable systems

📖 Full Guide: FastAPI Best Practices for ADK Agents →

This guide includes:

  • ✅ Code examples for each pattern
  • ✅ ASCII diagrams showing flows
  • ✅ Production checklist
  • ✅ Common pitfalls (❌ Don't / ✅ Do)
  • ✅ Deployment examples

Common Patterns​

Pattern: Gradual Rollout​

Deploy to Cloud Run
        |
        v
Traffic: 5% (canary)
        |
        v
Monitor for 1 hour
        |
        +------ Error Rate High? -----> ROLLBACK
        |
        +------ Healthy? -------> 25% traffic
                                       |
                                       v
                                    Monitor
                                       |
                                       +---> 100% traffic
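
The decision at each stage of this rollout can be sketched as a pure function. The traffic steps come from the diagram; the 5% error-rate cut-off is borrowed from the alert threshold in the metrics table and is an assumption here, as are the names `ROLLOUT_STEPS` and `next_traffic`.

```python
ROLLOUT_STEPS = (5, 25, 100)  # percent of traffic sent to the new revision


def next_traffic(current: int, error_rate: float,
                 max_error_rate: float = 0.05) -> int:
    """Return the next traffic percentage, or 0 to signal a rollback."""
    if error_rate > max_error_rate:
        return 0  # unhealthy: send all traffic back to the old revision
    later = [step for step in ROLLOUT_STEPS if step > current]
    # Stay at the current level once the final step has been reached.
    return later[0] if later else current
```

A deployment script would call this after each monitoring window and apply the result with the platform's traffic-splitting command.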

Pattern: Zero-Downtime Deployment​

Blue-Green Deployment:

CURRENT (Blue)             NEW (Green)
      |                         |
      +----> BOTH ACTIVE <------+
      |            |            |
      +--- LB routes traffic ---+
      |                         |
      +-- Health checks OK? ----+
      |                         |
     YES                        NO
      |                         |
      v                         v
  Blue OFF             Rollback (Blue ON)
  Green ON             Green OFF

Troubleshooting​

Agent Not Found in Dropdown​

Problem: adk web agent_name fails

Solution: Install as package first

pip install -e .
adk web # Then select from dropdown

GOOGLE_API_KEY Not Set​

export GOOGLE_API_KEY=your_key
# Or in Cloud Run: Set env var in Cloud Console

High Latency​

Check:

  1. Request timeout setting
  2. Agent complexity (use streaming)
  3. Resource limits (increase CPU)
  4. Model selection (try gemini-2.0-flash)

Memory Issues​

  • Reduce max_tokens
  • Enable request streaming
  • Use connection pooling
  • Monitor with Cloud Profiler

Quick Reference​

CLI Commands​

# Local
adk api_server --port 8080

# Deploy
adk deploy cloud_run --project PROJECT --region REGION
adk deploy agent_engine --project PROJECT --region REGION
adk deploy gke

# List deployments
adk list deployments

Environment Variables​

GOOGLE_CLOUD_PROJECT       # GCP project ID
GOOGLE_CLOUD_LOCATION # Region (us-central1)
GOOGLE_GENAI_USE_VERTEXAI # Use Vertex AI (1 or 0)
MODEL # Model name
API_KEY # Secret key for auth
REQUEST_TIMEOUT # Timeout in seconds

Endpoints​

GET  /                  # API info
GET /health # Health check + metrics
POST /invoke # Agent invocation
GET /docs # OpenAPI docs

Summary​

You now know:

  • ✅ Deploy locally for development
  • ✅ Deploy to Cloud Run for most production apps
  • ✅ Use Agent Engine for managed infrastructure
  • ✅ Use GKE for complex deployments
  • ✅ Configure and secure production systems
  • ✅ Monitor and observe agent systems
  • ✅ Implement reliability patterns

Deployment Checklist:

  • Environment variables configured
  • Secrets in Secret Manager
  • Health checks working
  • Monitoring/logging setup
  • Auto-scaling configured
  • CORS properly configured
  • Rate limiting enabled
  • Error handling tested
  • Disaster recovery planned

🎉 Tutorial 23 Complete! You're now ready to deploy agents to production. Proceed to Tutorial 24 for advanced observability.
