23. Production Deployment Strategies
Goal: Understand ADK deployment options and implement production-grade agents with custom authentication, monitoring, and reliability patterns.
Prerequisites:
- Tutorial 01 (Hello World Agent)
- Google Cloud Platform account
- Basic Docker knowledge (helpful)
- Understanding of FastAPI (helpful)
What You'll Learn:
- Deploy agents using ADK's built-in server (5 minutes)
- Build production FastAPI servers with custom patterns (when needed)
- Implement custom monitoring and observability
- Add authentication and security patterns
- Auto-scale across platforms
- Understand when to use ADK's server vs. a custom server
Quick Decision Framework:
- 5 minutes to production? → Cloud Run
- Need FedRAMP compliance? → Agent Engine
- Have Kubernetes? → GKE
- Need custom auth? → Tutorial 23 + Cloud Run
- Just testing locally? → Local Dev
Time to Complete: 5 minutes (Cloud Run) to 2+ hours (custom patterns)
DECISION FRAMEWORK: Choose Your Platform
What's Your Situation?
+--------------------------------------------------------------------+
| 1. QUICK MVP / MOVING FAST?                                        |
+--------------------------------------------------------------------+
| Setup: 5 minutes | Cost: ~$40/mo | Security: Auto                  |
| -> Use: CLOUD RUN                                                  |
| Best for: Startups, MVPs, most production apps                     |
| Deploy: adk deploy cloud_run --project ID --region us-central1     |
+--------------------------------------------------------------------+
+--------------------------------------------------------------------+
| 2. NEED COMPLIANCE (FedRAMP, HIPAA, PCI-DSS)?                      |
+--------------------------------------------------------------------+
| Setup: 10 minutes | Cost: ~$50/mo | Security: Auto                 |
| -> Use: AGENT ENGINE                                               |
| Best for: Enterprise, government, compliance-heavy                 |
| Why: Only platform with FedRAMP compliance                         |
| Deploy: adk deploy agent_engine --project ID --region us-central1  |
+--------------------------------------------------------------------+
+--------------------------------------------------------------------+
| 3. HAVE KUBERNETES / NEED FULL CONTROL?                            |
+--------------------------------------------------------------------+
| Setup: 20 minutes | Cost: $200-500/mo | Security: Configure        |
| -> Use: GKE                                                        |
| Best for: Complex deployments, existing Kubernetes shops           |
| Deploy: kubectl apply -f deployment.yaml                           |
+--------------------------------------------------------------------+
+--------------------------------------------------------------------+
| 4. NEED CUSTOM AUTH (LDAP, KERBEROS)?                              |
+--------------------------------------------------------------------+
| Setup: 2 hours | Cost: ~$60/mo | Security: Custom + Platform       |
| -> Use: TUTORIAL 23 + CLOUD RUN                                    |
| Best for: Custom authentication requirements                       |
| Why: Platform doesn't support these auth methods natively          |
| Note: Most users don't need this - use Cloud Run IAM instead       |
+--------------------------------------------------------------------+
+--------------------------------------------------------------------+
| 5. JUST DEVELOPING LOCALLY?                                        |
+--------------------------------------------------------------------+
| Setup: < 1 min | Cost: Free | Security: Add before deploy          |
| -> Use: LOCAL DEV                                                  |
| Best for: Development, prototyping, testing                        |
| Deploy: adk api_server                                             |
+--------------------------------------------------------------------+
Pick the box that matches your situation. That's your platform.
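The five boxes above can also be read as a small decision function. The sketch below is only an illustration: the `Situation` fields, the `choose_platform` name, and the order in which the questions are checked are assumptions made for this example, not part of ADK.

```python
from dataclasses import dataclass


@dataclass
class Situation:
    """Answers to the questions in the decision framework (hypothetical fields)."""
    local_only: bool = False         # just developing locally
    needs_custom_auth: bool = False  # e.g. LDAP or Kerberos
    needs_fedramp: bool = False      # compliance-driven deployment
    has_kubernetes: bool = False     # existing cluster or need for full control


def choose_platform(s: Situation) -> str:
    """Return the platform the framework points to.

    The check order is an assumption: the most specific requirement wins,
    and Cloud Run is the default when nothing special applies.
    """
    if s.local_only:
        return "Local Dev"
    if s.needs_custom_auth:
        return "Tutorial 23 + Cloud Run"
    if s.needs_fedramp:
        return "Agent Engine"
    if s.has_kubernetes:
        return "GKE"
    return "Cloud Run"


print(choose_platform(Situation(needs_fedramp=True)))
```

With no flags set the function falls through to Cloud Run, which matches the tutorial's advice that most production apps start there.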
Important: Understanding ADK's Deployment Model
Key Insight: Security is Platform-First
ADK's built-in server is intentionally minimal. Here's why:
- ADK provides: Input validation, session management, error handling
- Platform provides: TLS/HTTPS, DDoS protection, authentication, compliance
- Result: Secure production deployment with zero custom security code
See: Security Research Summary for complete analysis of what each platform secures automatically.
Custom Server (Tutorial 23) is ADVANCED & OPTIONAL
You only need the custom FastAPI server if:
- You need custom authentication (LDAP, Kerberos, etc.)
- You need advanced logging beyond platform defaults
- You have specific business logic endpoints
- You're not using Google Cloud infrastructure
Most production deployments use Cloud Run + ADK's built-in server. No custom server needed.
Platform Comparison
| Platform | Security | Setup | Cost | Best For | Needs Custom Server? |
|---|---|---|---|---|---|
| Cloud Run | Auto | 5 min | Pay-per-use | Most apps | No |
| Agent Engine | Auto | 10 min | Pay-per-use | Enterprise | No |
| GKE | Configure | 20 min | Hourly | Complex | No |
| Custom + Cloud Run | Hybrid | 2 hrs | Pay-per-use | Special needs | Yes |
| Local Dev | Minimal | < 1 min | Free | Development | Yes (add locally) |
See: Complete Security Analysis for detailed security breakdown per platform.
Security First: What's Automatic vs Manual
Important Discovery: Each platform provides different levels of automatic security.
Security by Platform (Quick Reference)
| Security Feature | Cloud Run | Agent Engine | GKE | Local |
|---|---|---|---|---|
| HTTPS/TLS | Auto | Auto | Manual | No |
| DDoS Protection | Auto | Auto | Yes | No |
| Authentication | Auto (IAM) | Auto (OAuth) | Manual | No |
| Encryption at Rest | Auto | Auto | Manual | No |
| Audit Logging | Auto | Auto | Manual | No |
| Compliance Ready | HIPAA, PCI | FedRAMP | All | No |
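Key Message: Cloud Run and Agent Engine apply the protections in this table by default, with no security code on your side. You still decide who may call the service and how CORS is configured (see Best Practices later in this tutorial).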
Read the Full Security Analysis
For comprehensive details on what's secure across all platforms:
- SECURITY_RESEARCH_SUMMARY.md - Executive summary (5 min read)
  - What ADK provides vs what platforms provide
  - When you actually need custom authentication
  - Platform security capabilities comparison
  - Real-world use case recommendations
- SECURITY_ANALYSIS_ALL_DEPLOYMENT_OPTIONS.md - Comprehensive (20 min read)
  - Detailed security breakdown per platform
  - Compliance certifications
  - Platform-specific security checklists
  - Security verification steps
  - When to use custom server
Bottom Line: "ADK's built-in server is secure by design because platform security is the foundation."
Quick Reference: Understanding ADK's Deployment
What Happens When You Run adk deploy cloud_run?
Your Agent Code
      ↓
[ADK Generates]
├── Dockerfile
├── main.py (using get_fast_api_app() from ADK)
└── requirements.txt
      ↓
[Builds Container]
      ↓
[Deploys to Cloud Run]
      ↓
Live FastAPI Server
(with basic endpoints only)
What's Inside ADK's Built-In Server?
Provided by get_fast_api_app():
- GET / - API info
- GET /health - Health check
- GET /agents - List agents
- POST /invoke - Run agent
- Session management
NOT Provided:
- Custom authentication
- Custom logging
- Custom metrics
- Rate limiting
- Circuit breakers
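To give a sense of what one of these missing pieces involves, here is a minimal token-bucket rate limiter using only the standard library. It is a sketch under assumptions: the `TokenBucket` class, its parameters, and the injectable `clock` are hypothetical names for this example, and a multi-instance deployment would need shared state (which this in-process version does not provide).

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rate = rate
        self.capacity = capacity
        self._clock = clock              # injectable so tests can control time
        self._tokens = float(capacity)   # start with a full bucket
        self._last = clock()

    def allow(self) -> bool:
        """Consume one token if available; False means the caller should reply 429."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        # Refill in proportion to elapsed time, never above capacity.
        self._tokens = min(self.capacity, self._tokens + elapsed * self.rate)
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False


bucket = TokenBucket(rate=5.0, capacity=10)
accepted = sum(bucket.allow() for _ in range(12))
print(f"accepted {accepted} of 12 immediate requests")
```

In a custom server, one bucket per API key or client IP would typically be consulted before the agent is invoked.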
When You Need a Custom Server
The custom server in this repository (Tutorial 23) adds:
- Custom authentication
- Structured logging with request tracing
- Health checks with real metrics
- Request timeouts and circuit breaking
- Custom error handling
- Full observability
See: DEPLOYMENT_OPTIONS_EXPLAINED.md for complete details
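Circuit breaking is the least familiar item in that list, so a small sketch may help. The version below is a generic illustration and not the Tutorial 23 implementation: `CircuitBreaker`, `CircuitOpenError`, and the threshold and cool-down values are all assumptions for this example.

```python
import time
from typing import Callable, Optional


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Fail fast after repeated failures, then retry after a cool-down."""

    def __init__(self, failure_threshold: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at: Optional[float] = None  # None means the circuit is closed

    def call(self, func: Callable[[], object]) -> object:
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let this one trial call through.
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self.failure_threshold:
                # Open the circuit, or re-open it after a failed trial call.
                self._opened_at = self._clock()
            raise
        # A success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result


breaker = CircuitBreaker(failure_threshold=2, reset_after=10.0)
print(breaker.call(lambda: "agent response"))
```

Wrapping the model or tool call in `breaker.call(...)` keeps a failing dependency from tying up every request while it is down.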
Time to Complete: 45 minutes
Real-World Scenarios: Which Platform for Which Situation?
Scenario 1: Startup Building MVP
Your Situation: Moving fast, limited resources, want to deploy this week.
What You Need:
- Deployment in < 5 minutes
- Automatic security (don't want to manage this)
- Pay only for what you use
- Can iterate quickly
Recommendation: Cloud Run
Why:
- Fastest time to market (5 minutes!)
- Secure by default (HTTPS, DDoS, IAM)
- Cost-effective (~$40/mo for 1M requests)
- No infrastructure to manage
Deploy:
adk deploy cloud_run \
--project your-project-id \
--region us-central1
Cost: ~$40/month (1M requests) + $0.30/CPU-month
Next Step: As you grow, consider Agent Engine for better compliance.
Scenario 2: Enterprise System (Need Compliance)
Your Situation: Building for enterprise customers, need FedRAMP or HIPAA compliance.
What You Need:
- FedRAMP compliance (government-ready)
- HIPAA/PCI-DSS certifications
- Zero infrastructure management
- Immutable audit logs
- Sandboxed execution
Recommendation: Agent Engine (ONLY PLATFORM WITH FedRAMP)
Why:
- Only platform with FedRAMP compliance built-in
- Google manages all security/compliance
- Zero configuration needed
- Best for highly regulated industries
Deploy:
adk deploy agent_engine \
--project your-project-id \
--region us-central1 \
--agent-name my-agent
Cost: ~$50/month (1M requests) + usage
Benefits:
- FedRAMP compliance
- SOC 2 Type II certified
- Automatic audit logging
- Content safety filters
- No ops burden
Next Step: Already production-ready. Focus on agent safety.
Scenario 3: Kubernetes Shop
Your Situation: Your company already runs Kubernetes infrastructure, and you want to run ADK in that environment.
What You Need:
- Deploy in existing Kubernetes cluster
- Full control over configuration
- NetworkPolicy for traffic control
- Workload Identity integration
- Pod resource limits
Recommendation: GKE (or any Kubernetes)
Why:
- Leverage existing infrastructure
- Full control over security config
- Support for complex networking
- Advanced observability
Deploy:
kubectl apply -f deployment.yaml
Cost: $200-500+/month (based on cluster size)
Requires:
- Kubernetes expertise
- Manual security configuration
- Pod security setup
- RBAC configuration
Next Step: Use GKE Autopilot to simplify security.
Scenario 4: Custom Authentication Required
Your Situation: You need LDAP, Kerberos, or another authentication method that the managed platforms do not offer.
What You Need:
- Custom authentication provider
- Custom API endpoints
- Advanced logging
- Specific business logic
Recommendation: Tutorial 23 Custom Server + Cloud Run
Why:
- Cloud Run provides platform security
- Tutorial 23 provides custom authentication
- Combined = secure + custom
Deploy:
# 1. Use custom server from Tutorial 23
cd tutorial_implementation/tutorial23
# 2. Deploy to Cloud Run
adk deploy cloud_run \
--project your-project-id \
--region us-central1
Cost: ~$60/month (on Cloud Run) + custom server complexity
Note: MOST USERS DON'T NEED THIS
- Use Cloud Run IAM for standard authentication
- Use Agent Engine OAuth for standards-based authentication
- Only use this if platforms don't support your auth method
Effort: 2+ hours to implement custom server
Scenario 5: Local Development
Your Situation: Building and testing locally before deploying.
What You Need:
- Fast iteration loop
- Hot reload on code changes
- Easy testing
- No infrastructure needed
Recommendation: Local Dev (add security before deploy)
Why:
- Zero setup time
- Instant feedback
- Free
- Perfect for development
Run Locally:
# Start dev server
adk api_server
# Or use custom server
python -m uvicorn production_agent.server:app --reload
Before Production:
- Add authentication layer
- Test with HTTPS (use ngrok)
- Verify security settings
- Move to Cloud Run
Cost: Free (local)
Next Step: Deploy to Cloud Run when ready for production.
Path 1: Simple Deployment (Recommended)
5-Minute Quick Start with ADK's Built-In Server
Want to deploy NOW? Use the command for your platform:
# Cloud Run
adk deploy cloud_run \
--project your-project-id \
--region us-central1 \
./your_agent_directory
# GKE
adk deploy gke \
--project your-project-id \
--cluster_name my-cluster \
--region us-central1 \
./your_agent_directory
# Agent Engine
adk deploy agent_engine \
--project your-project-id \
--region us-central1 \
./your_agent_directory
That's it! Your agent is live in 5 minutes.
What you get:
- Automatic container build
- FastAPI server with basic endpoints
- Auto-scaling
- Public HTTPS URL
- Session management
- /health endpoint
- No custom code needed
Advanced: When You Need a Custom FastAPI Server
Important: Most Users Don't Need This
First Check: Do you actually need a custom server?
- Use Cloud Run + ADK's built-in server if you need standard authentication (IAM, OAuth)
- Use Agent Engine if you need compliance/security
- Use GKE if you need Kubernetes control
- Use Custom Server ONLY if you have the special needs below
When Custom Server is Actually Needed
You need Tutorial 23's custom server IF you have:
- Custom authentication (LDAP, Kerberos, API keys)
  - Cloud Run IAM doesn't support it
  - Agent Engine OAuth doesn't work for you
  - You have a proprietary auth system
- Advanced logging/observability beyond platform defaults
  - Custom request correlation IDs
  - Business event tracking
  - Custom metrics
- Additional API endpoints for business logic
  - Webhooks
  - Custom health checks
  - Integration endpoints
- Non-Google infrastructure
  - Running on AWS, Azure, on-premises
  - Portable solution needed
If none of these apply: Use Cloud Run or Agent Engine. Much simpler.
What Tutorial 23 Provides
This tutorial includes a complete, production-ready implementation:
tutorial23/
├── production_agent/
│   ├── agent.py                 # Agent with 3 tools
│   └── server.py                # FastAPI server (488 lines)
├── tests/                       # 40 comprehensive tests
├── Makefile                     # Commands: setup, dev, test, demo
├── FASTAPI_BEST_PRACTICES.md    # 7 core patterns guide
└── README.md                    # Complete documentation
Key Features (If You Need Custom Server):
- Custom authentication with API keys
- Structured logging with request tracing
- Health checks with real metrics
- Error handling and validation
- Request timeouts and circuit breaking
- 40 passing tests (93% coverage)
- Production-ready patterns
Full Implementation: View on GitHub
Security Note: Tutorial 23 is an ADVANCED pattern. It adds application-layer features but depends on platform-layer security from Cloud Run or your own infrastructure.
Quick Start (5 minutes)
cd tutorial_implementation/tutorial23
# Setup
make setup
# Run development server
export GOOGLE_API_KEY=your_key
make dev
# Run tests
make test
# See demos
make demo-info
Open http://localhost:8000 and select production_deployment_agent from the dropdown.
Deployment Strategies
ADK supports multiple deployment paths. Choose based on your needs:
Comparison Matrix
| Strategy | Setup Time | Scaling | Cost | Best For |
|---|---|---|---|---|
| Local | < 1 min | Manual | Free | Development |
| Cloud Run | 5 mins | Auto | Pay-per-use | Most apps |
| Agent Engine | 10 mins | Auto | Pay-per-use | Enterprise |
| GKE | 20 mins | Manual | Hourly | Complex |
1. Local Development
Perfect for: Quick testing and iteration
# Start FastAPI server
adk api_server
# Custom port
adk api_server --port 8090
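Test it (the calls below assume the server is listening on port 8080; adjust to the port you started it with):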
curl http://localhost:8080/health
curl -X POST http://localhost:8080/invoke \
-H "Content-Type: application/json" \
-d '{"query": "Hello!"}'
Features:
- Hot reload during development
- Auto-generated API docs at /docs
- Instant feedback loop
See tutorial implementation for custom server code.
2. Cloud Run (Recommended for Most Apps)
Perfect for: Serverless auto-scaling with minimal ops
# Deploy in one command
adk deploy cloud_run \
--project your-project-id \
--region us-central1 \
--service_name my-agent
That's it! ADK handles:
- Building container image
- Pushing to Container Registry
- Deploying to Cloud Run
- Setting up auto-scaling
Manual Alternative:
# 1. Build
gcloud builds submit --tag gcr.io/YOUR_PROJECT/agent
# 2. Deploy
gcloud run deploy agent \
--image gcr.io/YOUR_PROJECT/agent \
--platform managed \
--region us-central1 \
--memory 2Gi \
--max-instances 100
Cost: ~$0.40 per million requests + compute
3. Vertex AI Agent Engine
Perfect for: Managed agent infrastructure with built-in versioning
# Deploy to managed service
adk deploy agent_engine \
--project your-project-id \
--region us-central1 \
--agent-name my-agent
Benefits:
- Managed infrastructure
- Version control
- A/B testing
- Built-in monitoring
- Enterprise security
4. Google Kubernetes Engine (GKE)
Perfect for: Complex deployments needing full control
# Create cluster
gcloud container clusters create agent-cluster \
--region us-central1 \
--machine-type n1-standard-2 \
--num-nodes 3
# Get credentials
gcloud container clusters get-credentials agent-cluster \
--region us-central1
# Deploy
kubectl apply -f deployment.yaml
When to use GKE:
- Need advanced networking
- Running multiple services
- Existing Kubernetes expertise
- Custom orchestration requirements
See tutorial implementation for full Kubernetes manifests.
Deployment Flow Diagram
YOUR AGENT CODE
|
v
+-------------------+
| adk deploy XXXX |
+-------------------+
|
+-------+-------+-------+-------+
| | | | |
v v v v v
LOCAL CLOUD-RUN AGENT-ENG GKE CUSTOM
| | | | |
v v v v v
localhost serverless managed k8s your-infra
Production Setup
Environment Configuration
Create a .env file (never commit it!):
# Google Cloud
GOOGLE_CLOUD_PROJECT=your-project-id
GOOGLE_CLOUD_LOCATION=us-central1
GOOGLE_GENAI_USE_VERTEXAI=1
# Application
MODEL=gemini-2.0-flash
TEMPERATURE=0.5
MAX_TOKENS=2048
# Security
API_KEY=your-secret-key
ALLOWED_ORIGINS=https://yourdomain.com
# Monitoring
LOG_LEVEL=INFO
ENABLE_TRACING=true
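One way to consume these variables is to load and validate them once at startup, so a missing value fails the deploy instead of the first request. The sketch below uses only the standard library; `Settings`, `load_settings`, the choice of which variables are required, and the default values are assumptions for this example rather than anything ADK prescribes.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Tuple


@dataclass(frozen=True)
class Settings:
    project: str
    location: str
    model: str
    temperature: float
    max_tokens: int
    api_key: str
    allowed_origins: Tuple[str, ...]
    log_level: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables, failing fast on bad input."""
    # Assumption: only the project ID and API key are treated as mandatory.
    missing = [k for k in ("GOOGLE_CLOUD_PROJECT", "API_KEY") if not env.get(k)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    origins = tuple(
        o.strip() for o in env.get("ALLOWED_ORIGINS", "").split(",") if o.strip()
    )
    if "*" in origins:
        raise RuntimeError("ALLOWED_ORIGINS must list explicit origins, not '*'")
    return Settings(
        project=env["GOOGLE_CLOUD_PROJECT"],
        location=env.get("GOOGLE_CLOUD_LOCATION", "us-central1"),
        model=env.get("MODEL", "gemini-2.0-flash"),
        temperature=float(env.get("TEMPERATURE", "0.5")),
        max_tokens=int(env.get("MAX_TOKENS", "2048")),
        api_key=env["API_KEY"],
        allowed_origins=origins,
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


# Example with an explicit mapping instead of the real process environment.
settings = load_settings({
    "GOOGLE_CLOUD_PROJECT": "your-project-id",
    "API_KEY": "your-secret-key",
    "ALLOWED_ORIGINS": "https://yourdomain.com",
})
print(settings.model, settings.allowed_origins)
```

Passing the mapping as a parameter keeps the loader easy to unit test without touching the real environment.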
Health Checks
All deployments should expose a /health endpoint:
GET /health
{
"status": "healthy",
"uptime_seconds": 3600,
"request_count": 1250,
"error_count": 3,
"error_rate": 0.0024,
"metrics": {
"successful_requests": 1247,
"timeout_count": 0
}
}
Configure in orchestrator:
- Cloud Run: Automatically detected
- GKE: Set as liveness probe
- Agent Engine: Built-in
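The fields in the sample response can be derived from a handful of counters. The following sketch shows one way to do that; `HealthMetrics`, the `degraded` status, and the 5% cut-off (borrowed from the alert threshold used later in this tutorial) are assumptions for illustration, not the Tutorial 23 code.

```python
import time
from dataclasses import dataclass, field


@dataclass
class HealthMetrics:
    """In-process counters backing a /health response (hypothetical helper)."""
    started_at: float = field(default_factory=time.monotonic)
    request_count: int = 0
    error_count: int = 0
    timeout_count: int = 0

    def record(self, ok: bool, timed_out: bool = False) -> None:
        """Call once per finished request."""
        self.request_count += 1
        if not ok:
            self.error_count += 1
        if timed_out:
            self.timeout_count += 1

    def snapshot(self, degraded_above: float = 0.05) -> dict:
        """Return a payload shaped like the sample /health response."""
        rate = self.error_count / self.request_count if self.request_count else 0.0
        return {
            "status": "healthy" if rate <= degraded_above else "degraded",
            "uptime_seconds": int(time.monotonic() - self.started_at),
            "request_count": self.request_count,
            "error_count": self.error_count,
            "error_rate": round(rate, 4),
            "metrics": {
                "successful_requests": self.request_count - self.error_count,
                "timeout_count": self.timeout_count,
            },
        }


metrics = HealthMetrics()
for i in range(1250):
    metrics.record(ok=i >= 3)  # the first three requests fail
print(metrics.snapshot())
```

With 3 errors in 1250 requests the snapshot reports an error rate of 0.0024 and 1247 successful requests, the same figures as the sample payload.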
Secrets Management
Never commit API keys to code. Use Google Secret Manager:
import os

from google.cloud import secretmanager


def get_secret(secret_id: str) -> str:
    """Fetch the latest version of a secret from Google Secret Manager."""
    client = secretmanager.SecretManagerServiceClient()
    project = os.environ["GOOGLE_CLOUD_PROJECT"]
    name = f"projects/{project}/secrets/{secret_id}/versions/latest"
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("UTF-8")


# Usage
api_key = get_secret("api-key")
Monitoring & Observability
Key Metrics to Track
| Metric | Target | Alert Threshold |
|---|---|---|
| Error Rate | < 0.5% | > 5% |
| P99 Latency | < 2 sec | > 5 sec |
| Availability | > 99.9% | < 99% |
| Request Count | Track | N/A |
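As a rough illustration of how the alert thresholds in this table could be evaluated, the sketch below computes a nearest-rank P99 and checks each metric. The function names are hypothetical, and in practice Cloud Monitoring alerting policies would do this work; the code only makes the arithmetic concrete.

```python
from typing import List, Sequence


def p99(latencies_ms: Sequence[float]) -> float:
    """Nearest-rank 99th percentile of the observed latencies."""
    if not latencies_ms:
        raise ValueError("no latency samples")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.99 * n), which avoids floating-point rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def fired_alerts(error_rate: float, p99_ms: float, availability: float) -> List[str]:
    """Return the metrics that cross the alert thresholds from the table."""
    fired = []
    if error_rate > 0.05:       # alert above 5% errors
        fired.append("error_rate")
    if p99_ms > 5000:           # alert above 5 seconds
        fired.append("p99_latency")
    if availability < 0.99:     # alert below 99% availability
        fired.append("availability")
    return fired


samples = list(range(1, 101))   # 1 ms .. 100 ms
print(p99(samples), fired_alerts(error_rate=0.0024, p99_ms=p99(samples), availability=0.999))
```

Note the gap between target and alert threshold: an error rate of 1% misses the 0.5% target but does not page anyone until it passes 5%.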
Structured Logging
All production servers should log JSON to stdout:
{
"timestamp": "2025-01-17T10:30:45Z",
"severity": "INFO",
"message": "invoke_agent.success",
"request_id": "550e8400-e29b",
"tokens": 245,
"latency_ms": 1230
}
Cloud Logging automatically parses and indexes these fields.
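A log line in that shape can be produced with the standard `logging` module and a small formatter. This is a minimal sketch: the `JsonFormatter` class and the list of extra fields it copies are assumptions for this example, not the formatter used in the tutorial implementation.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    # Extra attributes copied into the output when present on the record.
    EXTRA_FIELDS = ("request_id", "tokens", "latency_ms")

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "severity": record.levelname,
            "message": record.getMessage(),
        }
        for name in self.EXTRA_FIELDS:
            if hasattr(record, name):
                entry[name] = getattr(record, name)
        return json.dumps(entry)


handler = logging.StreamHandler(sys.stdout)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("agent")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False  # avoid duplicate lines via the root logger

logger.info(
    "invoke_agent.success",
    extra={"request_id": "550e8400-e29b", "tokens": 245, "latency_ms": 1230},
)
```

Fields passed through `extra` become attributes on the log record, which is how the formatter picks them up.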
Cost Breakdown: Choose Based on Budget
Monthly Cost Estimates (at 1M requests/month)
| Platform | Base | Request Fee (per 1M requests) | Setup | Monthly Total (incl. compute) | Best For |
|---|---|---|---|---|---|
| Cloud Run | $0 | ~$0.40 | 5 min | ~$40 | Most apps |
| Agent Engine | $0 | ~$0.50 | 10 min | ~$50 | Enterprise |
| GKE | $50+ | Varies | 20 min | $200-500+ | Complex |
| Custom + Cloud Run | $0 | ~$0.40 | 2 hrs | ~$60 | Special needs |
| Local Dev | $0 | $0 | < 1 min | $0 | Development |
Notes:
- Costs based on US pricing (may vary by region)
- Includes compute + storage estimates
- Actual costs depend on model, memory, CPU usage
- Agent Engine includes managed infrastructure overhead
- GKE includes cluster base cost + node costs
ROI Analysis:
- Startup: Start with Cloud Run ($40/mo), move to Agent Engine ($50/mo) if compliance needed
- Enterprise: Start with Agent Engine ($50/mo), includes compliance
- Existing K8s: Use GKE ($200+/mo), leverages existing infrastructure
Deployment Verification: How to Verify It Works
After Deploying to Cloud Run
# 1. Get your service URL
SERVICE_URL=$(gcloud run services describe my-agent \
--region us-central1 \
--format 'value(status.url)')
# 2. Test health endpoint
curl $SERVICE_URL/health
# 3. Test agent invocation
curl -X POST $SERVICE_URL/invoke \
-H "Content-Type: application/json" \
-d '{"query": "Hello agent!", "temperature": 0.5}'
# 4. Check metrics
curl $SERVICE_URL/health | jq '.metrics'
After Deploying to Agent Engine
# Agent Engine dashboard: https://console.cloud.google.com/vertex-ai/
# Check:
# - Agent deployed
# - Endpoints responding
# - Invocation successful
# - Audit logs appearing
Security Verification Checklist
- HTTPS/TLS working (curl shows https://)
- Authentication enabled (get 401 on unauthenticated call)
- CORS configured (check headers)
- Health check responding (GET /health)
- Logging to Cloud Logging (check console)
- No API keys in logs (verify secrets not exposed)
- Request timeouts working (test long-running query)
- Error handling working (test invalid input)
See: DEPLOYMENT_CHECKLIST.md for complete verification steps.
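The "No API keys in logs" item is easier to pass if secrets are scrubbed before a record is emitted. The sketch below shows one approach with a `logging.Filter`; the two regular expressions and the `RedactSecrets` name are assumptions for this example and will not catch every secret format.

```python
import logging
import re

# Assumed patterns: "Bearer <token>" header values and key=value style secrets.
_SECRET_PATTERNS = [
    re.compile(r"(Bearer\s+)[A-Za-z0-9._~+/=-]+"),
    re.compile(r"((?:api[_-]?key|token|secret)\s*[=:]\s*)\S+", re.IGNORECASE),
]


def redact(text: str) -> str:
    """Replace anything that looks like a credential with a placeholder."""
    for pattern in _SECRET_PATTERNS:
        text = pattern.sub(r"\1[REDACTED]", text)
    return text


class RedactSecrets(logging.Filter):
    """Scrub the fully formatted message before any handler sees it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = ()  # arguments are already merged into msg
        return True


logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("agent.security")
log.addFilter(RedactSecrets())
log.info("upstream call failed, headers: Authorization: Bearer %s", "abc.def-123")
```

Treat this as a safety net; the primary control is still not passing secrets to the logger in the first place.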
Best Practices for Production Deployment
Security (Platform Provides Most of This Automatically)
What Cloud Run/Agent Engine Provides Automatically:
- HTTPS/TLS encryption (handled by platform)
- DDoS protection (included)
- Encryption at rest (Google-managed)
- Non-root container execution (enforced)
- Binary vulnerability scanning (included)
What You Must Configure:
- Store API keys and other secrets in Secret Manager (never hardcode them, and do not ship them in .env)
- Enable authentication in Cloud Run console
- Configure CORS with specific origins (never use the wildcard *)
- Set resource limits (memory, CPU)
- Monitor error rates and latency
For Custom Server:
- Implement request authentication (see Tutorial 23 examples)
- Use Bearer token validation
- Implement timeout protection
- Validate input sizes
- Handle errors securely (don't expose internals)
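The first two and the fourth items in that list can be sketched without any web framework. The helper names, the `AuthError` type, and the 8,000-character limit below are assumptions for illustration; in a FastAPI server the same checks would sit in a dependency or middleware and map `AuthError` to an HTTP response.

```python
import hmac
from typing import Optional

MAX_QUERY_CHARS = 8_000  # assumed limit for this sketch


class AuthError(Exception):
    """Carries the HTTP status a server would return for a rejected request."""

    def __init__(self, status_code: int, detail: str) -> None:
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def require_bearer(authorization: Optional[str], expected_key: str) -> None:
    """Validate an 'Authorization: Bearer <token>' header value."""
    if not authorization:
        raise AuthError(401, "Missing Authorization header")
    scheme, _, token = authorization.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise AuthError(401, "Expected 'Authorization: Bearer <token>'")
    # compare_digest runs in time independent of where the values differ.
    if not hmac.compare_digest(token.strip().encode(), expected_key.encode()):
        raise AuthError(401, "Invalid token")  # generic message, no internals


def validate_query(query: str) -> str:
    """Reject empty or oversized input before it reaches the agent."""
    cleaned = query.strip()
    if not cleaned:
        raise ValueError("query must not be empty")
    if len(cleaned) > MAX_QUERY_CHARS:
        raise ValueError(f"query exceeds {MAX_QUERY_CHARS} characters")
    return cleaned


require_bearer("Bearer your-secret-key", expected_key="your-secret-key")
print(validate_query("  Hello agent!  "))
```

Returning the same 401 for a malformed header and a wrong token avoids telling a caller which part of the credential was wrong.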
Observability
- Export logs to Cloud Logging
- Set up error tracking with Error Reporting
- Monitor metrics with Cloud Monitoring
- Use request IDs for tracing
- Log important business events
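Request IDs are only useful if every log line written while handling a request carries the same ID. A common way to do that in Python is a context variable plus a logging filter, sketched below; `request_id_var`, `start_request`, and `RequestIdFilter` are hypothetical names for this example.

```python
import contextvars
import logging
import uuid
from typing import Optional

# Holds the ID of the request currently being handled in this context.
request_id_var: contextvars.ContextVar = contextvars.ContextVar("request_id", default="-")


class RequestIdFilter(logging.Filter):
    """Attach the current request ID to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.request_id = request_id_var.get()
        return True


def start_request(incoming_id: Optional[str] = None) -> str:
    """Reuse the caller's ID if one was sent, otherwise generate a new one."""
    rid = incoming_id or uuid.uuid4().hex
    request_id_var.set(rid)
    return rid


handler = logging.StreamHandler()
handler.setFormatter(logging.Formatter("%(levelname)s [%(request_id)s] %(message)s"))
handler.addFilter(RequestIdFilter())
log = logging.getLogger("agent.tracing")
log.addHandler(handler)
log.setLevel(logging.INFO)
log.propagate = False

start_request("550e8400-e29b")
log.info("invoke_agent.start")
log.info("invoke_agent.success")
```

Because context variables are copied per asyncio task, concurrent requests handled by the same worker keep separate IDs.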
Reliability
- Set request timeouts (30s recommended)
- Implement health checks
- Configure auto-scaling appropriately
- Use load balancing
- Plan for disaster recovery
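For the request timeout in particular, `asyncio.wait_for` is usually enough. The sketch below wraps an awaitable agent call and converts the timeout into an application-level error; `invoke_with_timeout`, `AgentTimeout`, and `slow_agent` are hypothetical names, and the real agent call would replace the sleep.

```python
import asyncio
from typing import Any, Awaitable


class AgentTimeout(Exception):
    """Raised when the agent does not answer within the allowed time."""


async def invoke_with_timeout(call: Awaitable[Any], timeout_s: float = 30.0) -> Any:
    """Await `call`, cancelling it if it runs longer than `timeout_s` seconds."""
    try:
        return await asyncio.wait_for(call, timeout=timeout_s)
    except asyncio.TimeoutError:
        raise AgentTimeout(f"agent did not respond within {timeout_s:g}s") from None


async def slow_agent() -> str:
    """Stand-in for a real agent invocation."""
    await asyncio.sleep(0.2)
    return "done"


async def main() -> None:
    print(await invoke_with_timeout(slow_agent(), timeout_s=1.0))
    try:
        await invoke_with_timeout(slow_agent(), timeout_s=0.05)
    except AgentTimeout as exc:
        print("timed out:", exc)


asyncio.run(main())
```

A server would typically translate `AgentTimeout` into a 504 response and increment the timeout counter exposed by its health endpoint.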
Performance
- Use connection pooling
- Stream responses when possible
- Cache agent configuration
- Monitor memory usage
- Use multiple workers
FastAPI Best Practices
This implementation demonstrates 7 core production patterns:
- Configuration Management - Environment-based settings
- Authentication & Security - Bearer token validation
- Health Checks - Real metrics-based status
- Request Lifecycle - Timeout protection
- Error Handling - Typed exceptions
- Logging & Observability - Request tracing
- Metrics & Monitoring - Observable systems
Full Guide: FastAPI Best Practices for ADK Agents
This guide includes:
- Code examples for each pattern
- ASCII diagrams showing flows
- Production checklist
- Common pitfalls (Don't / Do)
- Deployment examples
Common Patterns
Pattern: Gradual Rollout
Deploy to Cloud Run
|
v
Traffic: 5% (canary)
|
v
Monitor for 1 hour
|
+------ Error Rate High? -----> ROLLBACK
|
+------ Healthy? -------> 25% traffic
|
v
Monitor
|
+---> 100% traffic
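The rollout diagram reduces to a small step function: roll back when the error rate is too high, otherwise move to the next traffic stage. The sketch below is an illustration only; `next_traffic` is a hypothetical helper, and the 5% rollback threshold is an assumption borrowed from the alert threshold used elsewhere in this tutorial. Applying the returned percentage (for example through Cloud Run traffic splitting) is left out.

```python
STAGES = (5, 25, 100)  # traffic percentages from the diagram


def next_traffic(current: int, error_rate: float, max_error_rate: float = 0.05) -> int:
    """Return the traffic percentage the new revision should receive next.

    0 means roll back; returning `current` unchanged means the rollout is complete.
    """
    if error_rate > max_error_rate:
        return 0  # error rate too high: send all traffic back to the old revision
    for stage in STAGES:
        if stage > current:
            return stage
    return current  # already at 100%


traffic = 5
for observed_error_rate in (0.002, 0.004):
    traffic = next_traffic(traffic, observed_error_rate)
    print(f"healthy window, moving to {traffic}% traffic")
```

Each call corresponds to one monitoring window in the diagram, so the function would be invoked after every observation period.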
Pattern: Zero-Downtime Deployment
Blue-Green Deployment:
CURRENT (Blue) NEW (Green)
| |
+----> BOTH ACTIVE <-----+
| | |
+--- LB routes traffic ---+
| |
+-- Health checks OK? ---|
| |
YES NO
| |
v v
Blue OFF Rollback (Blue ON)
Green ON Green OFF
Troubleshooting
Agent Not Found in Dropdown
Problem: adk web agent_name fails
Solution: Install as package first
pip install -e .
adk web # Then select from dropdown
GOOGLE_API_KEY Not Set
export GOOGLE_API_KEY=your_key
# Or in Cloud Run: Set env var in Cloud Console
High Latency
Check:
- Request timeout setting
- Agent complexity (use streaming)
- Resource limits (increase CPU)
- Model selection (try gemini-2.0-flash)
Memory Issues
- Reduce max_tokens
- Enable request streaming
- Use connection pooling
- Monitor with Cloud Profiler
Quick Reference
CLI Commands
# Local
adk api_server --port 8080
# Deploy
adk deploy cloud_run --project PROJECT --region REGION
adk deploy agent_engine --project PROJECT --region REGION
adk deploy gke
# List deployments
adk list deployments
Environment Variables
GOOGLE_CLOUD_PROJECT # GCP project ID
GOOGLE_CLOUD_LOCATION # Region (us-central1)
GOOGLE_GENAI_USE_VERTEXAI # Use Vertex AI (1 or 0)
MODEL # Model name
API_KEY # Secret key for auth
REQUEST_TIMEOUT # Timeout in seconds
Endpoints
GET / # API info
GET /health # Health check + metrics
POST /invoke # Agent invocation
GET /docs # OpenAPI docs
Summary
You now know how to:
- Deploy locally for development
- Deploy to Cloud Run for most production apps
- Use Agent Engine for managed infrastructure
- Use GKE for complex deployments
- Configure and secure production systems
- Monitor and observe agent systems
- Implement reliability patterns
Deployment Checklist:
- Environment variables configured
- Secrets in Secret Manager
- Health checks working
- Monitoring/logging setup
- Auto-scaling configured
- CORS properly configured
- Rate limiting enabled
- Error handling tested
- Disaster recovery planned
Next Steps:
- Tutorial 24: Advanced Observability - Deep observability patterns
- Tutorial 25: Best Practices & Patterns - Production patterns
- Deploy your own agent to production!
Supporting Resources
Comprehensive Guides
- Security Verification Guide - Step-by-step verification for each platform
- Migration Guide - Safe migration between all platforms
- Cost Breakdown Analysis - Detailed pricing for budget planning
- Deployment Checklist - Pre/during/post deployment verification
Security Research
- Security Research Summary - Executive summary of platform security
- Detailed Security Analysis - Per-platform security breakdown
Additional Resources
- Tutorial Implementation
- FastAPI Best Practices Guide
- Cloud Run Docs
- Agent Engine Docs
- GKE Docs
- Secret Manager
Tutorial 23 Complete! You're now ready to deploy agents to production. Proceed to Tutorial 24 for advanced observability.