
Pilot Program Design

Structure your CDE pilot for success with team selection criteria, success metrics, 90-day evaluation scorecards, and go/no-go decision frameworks.

Pilot Team Selection Criteria

Choose teams that will set your pilot up for success

Ideal Pilot Team Characteristics

  • Enthusiastic team lead

    Manager willing to champion the change and handle resistance

  • Modern tech stack

    Teams using containers, VS Code, or standard tooling (not legacy)

  • Team size 5-15 developers

    Large enough for valid data, small enough to manage closely

  • Stable roadmap

Not in the middle of a critical deadline or major refactoring

  • Recent or planned hires

    Can demonstrate onboarding improvements immediately

  • AI agent readiness

    Teams already using AI coding assistants or planning to adopt AI agent workflows

Avoid for Initial Pilot

  • Teams under deadline pressure

Any friction will be blamed on the CDE rather than given a fair evaluation

  • Specialized hardware needs

    GPU workloads, embedded development, iOS builds (for first pilot)

  • Known skeptics/blockers

Don't start with teams that are vocally opposed; win them over later with pilot successes

  • Legacy monolith teams

    Complex local dependencies increase pilot complexity

  • High-latency regions

Teams far from your cloud region will have a poor experience

  • Uncontrolled AI agent usage

    Teams running AI agents without guardrails add unpredictable cost and security risk to the pilot

Team Selection Scorecard

| Criteria | Weight | Team A | Team B | Team C |
|---|---|---|---|---|
| Manager enthusiasm (1-5) | 3x | | | |
| Tech stack compatibility (1-5) | 2x | | | |
| Roadmap stability (1-5) | 2x | | | |
| Team size (5-15 = 5, else lower) | 1x | | | |
| Onboarding needs (1-5) | 2x | | | |
| AI agent readiness (1-5) | 2x | | | |
| Weighted Total (max 60) | | - | - | - |
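The weighted total in the scorecard can be computed with a short script. A minimal sketch; the criterion keys and Team A's sample ratings below are hypothetical, not data from the guide:

```python
# Weights mirror the scorecard above; each rating is 1-5,
# so the maximum weighted total is 5 * (3+2+2+1+2+2) = 60.
WEIGHTS = {
    "manager_enthusiasm": 3,
    "tech_stack": 2,
    "roadmap_stability": 2,
    "team_size": 1,
    "onboarding_needs": 2,
    "ai_agent_readiness": 2,
}

def weighted_total(ratings):
    """Sum of weight * rating across all criteria (max 60)."""
    return sum(WEIGHTS[c] * ratings[c] for c in WEIGHTS)

# Hypothetical ratings for Team A:
team_a = {"manager_enthusiasm": 5, "tech_stack": 4, "roadmap_stability": 3,
          "team_size": 5, "onboarding_needs": 4, "ai_agent_readiness": 3}
print(weighted_total(team_a))  # 15 + 8 + 6 + 5 + 8 + 6 = 48
```

Ranking candidate teams by this total makes the selection defensible when several teams volunteer.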

Pilot Success Metrics

Define what success looks like before you start

Productivity

  • Onboarding time < 4 hours
  • Workspace startup < 5 min
  • Env issues/week < 2

Experience

  • Developer NPS > +30
  • Would recommend > 70%
  • Satisfaction score > 4.0/5

Reliability

  • Platform uptime > 99.5%
  • P95 latency < 100ms
  • Support tickets < 5/week

Adoption

  • Daily active users > 80%
  • Local dev usage < 10%
  • Return to local < 5%

AI Agents

  • Agent task success > 75%
  • Agent cost/task < budget
  • Sandbox escapes = 0

Cost Control

  • Cost per dev/month < target
  • LLM cost attribution = 100%
  • Idle workspace waste < 10%
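The targets above can be encoded as a simple pass/fail check so weekly reviews are mechanical rather than ad hoc. A sketch; the metric keys and the week-6 sample values are illustrative:

```python
# Each target pairs a comparison direction with a threshold,
# matching the success metrics listed above.
TARGETS = {
    "onboarding_hours":       ("<", 4),
    "workspace_startup_min":  ("<", 5),
    "developer_nps":          (">", 30),
    "platform_uptime_pct":    (">", 99.5),
    "daily_active_pct":       (">", 80),
    "agent_task_success_pct": (">", 75),
    "idle_waste_pct":         ("<", 10),
}

def evaluate(metrics):
    """Return {metric: passed?} for every defined target."""
    results = {}
    for name, (op, target) in TARGETS.items():
        value = metrics[name]
        results[name] = value < target if op == "<" else value > target
    return results

# Hypothetical week-6 snapshot:
week6 = {"onboarding_hours": 3.5, "workspace_startup_min": 4.2,
         "developer_nps": 34, "platform_uptime_pct": 99.7,
         "daily_active_pct": 82, "agent_task_success_pct": 78,
         "idle_waste_pct": 7}
print(all(evaluate(week6).values()))  # True
```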

90-Day Evaluation Scorecard

Weekly checkpoints for pilot evaluation

| Phase | Week | Milestone | Success Criteria | Status |
|---|---|---|---|---|
| Setup | 1 | Infrastructure deployed | Platform accessible, SSO working | - |
| Setup | 2 | Training complete | 100% of pilot team attended | - |
| Setup | 2 | AI agent sandbox configured | Agent workspaces isolated, cost limits set | - |
| Active Pilot | 3 | First sprint on CDE | No blockers, sprint completed | - |
| Active Pilot | 4 | First pulse survey | Satisfaction > 3.5/5 | - |
| Active Pilot | 5-6 | Steady state | DAU > 70%, < 3 support tickets | - |
| Active Pilot | 6 | AI agent workflow validation | Agent tasks completing in sandbox, no escapes | - |
| Active Pilot | 7-8 | New hire onboarding | Onboarding < 4 hours | - |
| Active Pilot | 9-10 | Edge case testing | Complex workflows and AI agent edge cases validated | - |
| Active Pilot | 11 | Final survey | NPS > +30, recommend > 70% | - |
| Decision | 12 | Data analysis | All metrics compiled, AI agent cost review | - |
| Decision | 13 | Go/No-Go decision | Executive presentation | - |

AI Agent Pilot Considerations

In 2026, any CDE pilot should account for AI agent workloads alongside human developers

Why AI agents change CDE pilots

AI coding agents like Claude Code, GitHub Copilot, and Cursor autonomously spin up workspaces, generate code, run tests, and submit pull requests. CDE platforms such as Coder, Ona (formerly Gitpod), and GitHub Codespaces now serve both human developers and autonomous agents. Your pilot must evaluate how well the platform handles both workload types, including sandboxing, cost attribution, and lifecycle management for unattended agent sessions.

Sandbox Isolation

AI agents must run in isolated workspaces with no access to production data or other developer environments. Validate that your CDE enforces strict network and filesystem boundaries for agent sessions.

  • Network egress restrictions per workspace
  • Filesystem isolation between agent and human workspaces
  • Automatic workspace termination on timeout
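One way to reason about per-workspace egress restrictions is as a policy function evaluated for each outbound request. A sketch only; the allowlisted hosts and the policy shape are assumptions, not any specific CDE platform's API:

```python
# Hosts an agent workspace is permitted to reach (hypothetical examples).
AGENT_EGRESS_ALLOWLIST = {
    "api.anthropic.com",             # the LLM API the agent calls
    "github.internal.example.com",   # source control
}

def egress_allowed(workspace_type, host):
    """Agent workspaces may only reach allowlisted hosts; human
    workspaces follow normal corporate policy (modeled as allow here)."""
    if workspace_type == "agent":
        return host in AGENT_EGRESS_ALLOWLIST
    return True

print(egress_allowed("agent", "api.anthropic.com"))    # True
print(egress_allowed("agent", "prod-db.example.com"))  # False
```

In practice this policy would be enforced at the network layer (for example, Kubernetes NetworkPolicy or a proxy), not in application code; the function just makes the rule explicit for pilot validation.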

Cost Attribution

AI agents can run unattended workloads around the clock, making cost tracking critical. Your pilot needs to separate agent compute and LLM API costs from human developer workspace costs.

  • Per-agent and per-developer cost tagging
  • LLM token usage tracked per workspace
  • Budget alerts before runaway spend
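Cost attribution amounts to tagging every cost record with a workspace and rolling the records up per tag. A minimal sketch, assuming an illustrative record shape and agent budget:

```python
from collections import defaultdict

# Hypothetical cost records: compute and LLM-token spend tagged per workspace.
records = [
    {"workspace": "agent-42", "kind": "compute",    "usd": 18.50},
    {"workspace": "agent-42", "kind": "llm_tokens", "usd": 31.20},
    {"workspace": "dev-alice", "kind": "compute",   "usd": 12.00},
]

def cost_by_workspace(records):
    """Roll up total spend per workspace tag."""
    totals = defaultdict(float)
    for r in records:
        totals[r["workspace"]] += r["usd"]
    return dict(totals)

AGENT_BUDGET_USD = 40.0  # illustrative per-agent budget
totals = cost_by_workspace(records)
over_budget = [w for w, usd in totals.items()
               if w.startswith("agent-") and usd > AGENT_BUDGET_USD]
print(over_budget)  # ['agent-42']  (18.50 + 31.20 = 49.70 > 40.00)
```

The same rollup separates agent from human spend (here via the workspace name prefix), which is exactly the split the pilot needs to report.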

Agent Lifecycle

Unlike human developers who close their laptops, AI agents can run indefinitely. Your pilot should define how agent workspaces are created, monitored, and automatically cleaned up.

  • Maximum session duration limits
  • Auto-shutdown on idle or task completion
  • Workspace cleanup and artifact retention policies
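The lifecycle rules above reduce to a termination check run periodically against the workspace fleet. A sketch with illustrative limits and fleet data:

```python
from dataclasses import dataclass

MAX_SESSION_HOURS = 8   # illustrative maximum session duration
IDLE_TIMEOUT_MIN = 30   # illustrative idle timeout

@dataclass
class Workspace:
    name: str
    session_hours: float  # time since the workspace was created
    idle_minutes: float   # time since the last agent/developer action

def should_terminate(ws):
    """Terminate on max-session or idle-timeout breach."""
    return ws.session_hours > MAX_SESSION_HOURS or ws.idle_minutes > IDLE_TIMEOUT_MIN

fleet = [
    Workspace("agent-7", session_hours=9.0, idle_minutes=2),   # over max session
    Workspace("agent-8", session_hours=1.5, idle_minutes=45),  # idle too long
    Workspace("dev-bob", session_hours=3.0, idle_minutes=5),   # fine
]
print([w.name for w in fleet if should_terminate(w)])  # ['agent-7', 'agent-8']
```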

Agent Observability

You need visibility into what AI agents are doing inside CDE workspaces. Evaluate whether the platform provides audit logs, action traces, and output review for autonomous agent sessions.

  • Full audit trail of agent actions
  • Human-in-the-loop review gates
  • Anomaly detection for unexpected behavior
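A simple form of anomaly detection is comparing audit-log actions against the set of actions an agent is expected to take. A sketch; the event shape and expected-action set are assumptions for illustration:

```python
# Actions a coding agent is expected to perform (hypothetical set).
EXPECTED_ACTIONS = {"read_file", "write_file", "run_tests", "open_pr"}

# Hypothetical audit trail from an agent session.
audit_log = [
    {"agent": "agent-7", "action": "read_file",   "target": "src/app.py"},
    {"agent": "agent-7", "action": "run_tests",   "target": "tests/"},
    {"agent": "agent-7", "action": "open_socket", "target": "10.0.0.5:22"},
]

def anomalies(log):
    """Return events whose action falls outside the expected set."""
    return [e for e in log if e["action"] not in EXPECTED_ACTIONS]

for event in anomalies(audit_log):
    print(f"ALERT {event['agent']}: {event['action']} -> {event['target']}")
```

During the pilot, every alert like this should be triaged; a raw socket to an SSH port, as in the sample, is exactly the kind of behavior the human-in-the-loop review gate exists to catch.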

Platform Compatibility

Not all CDE platforms handle AI agent workloads equally. During your pilot, evaluate how well your chosen platform (Coder, Ona, Codespaces, DevPod, Daytona) supports headless agent sessions.

  • Headless workspace creation via API
  • Template support for agent-specific images
  • Resource quotas separate from human workspaces

Governance Policies

Define clear guardrails for AI agent behavior during the pilot. Establish which repos agents can access, what actions they can take autonomously, and what requires human approval.

  • Repository access allowlists for agents
  • PR merge requires human approval
  • No direct production deployment by agents
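These guardrails can be expressed as a single authorization function consulted before each agent action. A sketch mirroring the three policies above; the repo names and rule sets are illustrative:

```python
AGENT_REPO_ALLOWLIST = {"web-frontend", "billing-service"}  # hypothetical repos
REQUIRES_HUMAN_APPROVAL = {"merge_pr"}       # autonomous action needs sign-off
FORBIDDEN = {"deploy_production"}            # never allowed for agents

def authorize(repo, action, human_approved=False):
    """Apply allowlist, forbidden-action, and approval-gate rules in order."""
    if repo not in AGENT_REPO_ALLOWLIST or action in FORBIDDEN:
        return False
    if action in REQUIRES_HUMAN_APPROVAL:
        return human_approved
    return True

print(authorize("web-frontend", "open_pr"))                        # True
print(authorize("web-frontend", "merge_pr"))                       # False
print(authorize("web-frontend", "merge_pr", human_approved=True))  # True
print(authorize("payments-core", "open_pr"))                       # False
```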

AI Agent Pilot Readiness Checklist

Before launch

  • Agent workspace templates created and tested
  • Network isolation rules configured
  • Cost budget and alerts set for agent workloads
  • Maximum session duration defined
  • Audit logging enabled for agent workspaces

During pilot

  • Monitor agent compute costs weekly
  • Review agent output quality and accuracy
  • Track sandbox escape attempts (should be zero)
  • Measure developer trust in agent-generated code
  • Validate idle workspace auto-shutdown is working

Go/No-Go Decision Framework

Objective criteria for the expansion decision

GO

All must be true

  • Developer satisfaction > 4.0/5
  • Platform uptime > 99.5%
  • No critical blockers unresolved
  • DAU > 80% of pilot team
  • Onboarding < 4 hours achieved
  • Manager recommends expansion
  • AI agent costs within budget
  • Zero sandbox escape incidents

CONDITIONAL

Extend pilot 30 days

  • Satisfaction 3.5-4.0 (trending up)
  • Uptime 99-99.5% (fixable issues)
  • 1-2 blockers with clear fix path
  • DAU 60-80% (adoption growing)
  • Mixed manager feedback
  • Agent costs over budget but trending down

NO-GO

Any single criterion triggers a stop

  • Developer satisfaction < 3.5/5
  • Platform uptime < 99%
  • Security incident occurred
  • DAU < 60% (low adoption)
  • > 3 critical unfixed blockers
  • Manager recommends rollback
  • AI agent sandbox breach detected
  • Uncontrolled agent cost overruns
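The framework above can be encoded as a decision function so the week-13 call is mechanical. A sketch with illustrative metric names; NO-GO triggers are checked first because any one of them overrides everything else:

```python
def decide(m):
    """Return GO, CONDITIONAL, or NO-GO per the framework above."""
    # Any single NO-GO trigger stops the pilot.
    if (m["satisfaction"] < 3.5 or m["uptime_pct"] < 99.0
            or m["security_incident"] or m["dau_pct"] < 60
            or m["critical_blockers"] > 3 or m["sandbox_breach"]
            or m["manager_recommends_rollback"]
            or m["uncontrolled_agent_costs"]):
        return "NO-GO"
    # GO requires every criterion to hold.
    if (m["satisfaction"] > 4.0 and m["uptime_pct"] > 99.5
            and m["critical_blockers"] == 0 and m["dau_pct"] > 80
            and m["onboarding_hours"] < 4
            and m["manager_recommends_expansion"]
            and m["agent_cost_within_budget"] and not m["sandbox_breach"]):
        return "GO"
    # Anything in between: extend the pilot 30 days.
    return "CONDITIONAL"

# Hypothetical end-of-pilot metrics:
pilot = {"satisfaction": 4.3, "uptime_pct": 99.8, "security_incident": False,
         "dau_pct": 86, "critical_blockers": 0, "sandbox_breach": False,
         "manager_recommends_rollback": False, "uncontrolled_agent_costs": False,
         "onboarding_hours": 3.2, "manager_recommends_expansion": True,
         "agent_cost_within_budget": True}
print(decide(pilot))  # GO
```

Encoding the rules this way also forces the team to agree on exact thresholds before the data is in, which removes room for post-hoc rationalization at the executive presentation.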