What are the main risks of CDE adoption?

Key risks include vendor lock-in, platform outages affecting all developers, network dependency, migration complexity, change resistance, and unexpected costs. Mitigate them with a multi-cloud strategy, DR planning, and a phased rollout.

How do I plan for CDE vendor exit?

Exit planning includes using standard DevContainers (portable across platforms), avoiding proprietary extensions, maintaining the ability to run locally, documenting all configurations, and negotiating data export rights in contracts.

What is the rollback plan if CDE adoption fails?

Maintain local development capability during transition. Keep laptop configurations current, document pre-CDE setup procedures, set clear go/no-go criteria for each rollout phase, and be prepared to revert teams that have issues.

CDE Risk Management - Rollback, Exit Plans & DR

CDE Risk Assessment Matrix

Identify, assess, and prioritize risks before implementation

Technical Risks

HIGH Control plane single point of failure
HIGH Network latency affecting developer experience
MED Storage performance degradation
MED IDE plugin compatibility issues
LOW Template drift across environments

Organizational Risks

HIGH Developer resistance to workflow change
HIGH Insufficient platform engineering resources
MED Knowledge concentration in few individuals
MED Lack of executive sponsorship
LOW Shadow IT local development

Vendor Risks

HIGH Vendor acquisition or product discontinuation
MED Significant pricing changes
MED Feature deprecation without alternatives
MED Support quality degradation
LOW API breaking changes

AI Agent & LLM Risks

HIGH AI agent sandbox escape or privilege escalation
HIGH Uncontrolled LLM token costs from autonomous agents
HIGH Sensitive code or secrets leaked via LLM context windows
MED AI-generated code introducing vulnerabilities at scale
MED Prompt injection via malicious repository content

Compliance & Data Risks

HIGH Source code sent to external LLM APIs without approval
HIGH Regulatory violations from AI training on customer data
MED Audit trail gaps for AI agent actions in workspaces
MED IP ownership disputes over AI-generated code
LOW License contamination from AI-suggested dependencies

Risk Scoring Framework

Risk Factor	Probability (1-5)	Impact (1-5)	Score	Mitigation Priority
Control plane outage	3	5	15	Critical - Immediate
Developer productivity loss	4	4	16	Critical - Immediate
Vendor discontinuation	2	5	10	High - Plan within 30 days
Cost overrun	3	3	9	High - Plan within 30 days
Security breach	2	5	10	High - Plan within 30 days
AI agent sandbox escape	3	5	15	Critical - Immediate
LLM data exfiltration	3	4	12	Critical - Immediate
Uncontrolled AI token spend	4	3	12	Critical - Immediate
AI-generated vulnerability at scale	3	4	12	Critical - Immediate
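
Every score in the table equals Probability x Impact. The sketch below reproduces that calculation in Python. The priority bands (12 and above as Critical, 9 to 11 as High) are inferred from the rows above rather than stated by the framework, and the lowest band is an assumption added for completeness.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    name: str
    probability: int  # 1-5, as in the table
    impact: int       # 1-5, as in the table

    @property
    def score(self) -> int:
        # Score = Probability x Impact, matching every row in the table
        return self.probability * self.impact


def mitigation_priority(score: int) -> str:
    """Map a score to a priority band (bands inferred from the table)."""
    if score >= 12:
        return "Critical - Immediate"
    if score >= 9:
        return "High - Plan within 30 days"
    # Assumption: the table shows no scores below 9, so this band is illustrative
    return "Medium - Review quarterly"


risks = [
    Risk("Control plane outage", 3, 5),
    Risk("Developer productivity loss", 4, 4),
    Risk("Cost overrun", 3, 3),
]

# Print risks from highest to lowest score
for risk in sorted(risks, key=lambda r: r.score, reverse=True):
    print(f"{risk.name}: {risk.score} ({mitigation_priority(risk.score)})")
```

Keeping the bands in one function makes it easy to adjust thresholds when the matrix is re-scored after a pilot.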

Migration Failure Scenarios & Mitigation

Prepare for common migration failures with proven mitigation strategies

Scenario: Control Plane Becomes Unresponsive During Peak Hours

Impact

All developers unable to access workspaces
Active work sessions terminated
Potential data loss in unsaved work

Mitigation

Deploy HA control plane (3+ replicas)
Enable workspace persistence during outages
Configure auto-save intervals (every 30s)

Rollback Trigger

Outage > 4 hours in production
3+ incidents in 7 days
Developer productivity < 50% of baseline
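
The three triggers for this scenario can be checked mechanically. The following is a minimal sketch, assuming incident timestamps and a productivity ratio are already collected elsewhere; the function name and the `productivity_ratio` input (measured productivity divided by a baseline you define) are hypothetical.

```python
from datetime import datetime, timedelta


def control_plane_rollback_triggered(
    longest_outage: timedelta,
    incident_times: list[datetime],
    productivity_ratio: float,
    now: datetime,
) -> bool:
    """Return True if any of the three listed triggers fires."""
    week_ago = now - timedelta(days=7)
    # Count only incidents inside the trailing 7-day window
    recent_incidents = [t for t in incident_times if week_ago <= t <= now]
    return (
        longest_outage > timedelta(hours=4)   # outage > 4 hours
        or len(recent_incidents) >= 3         # 3+ incidents in 7 days
        or productivity_ratio < 0.50          # productivity < 50%
    )


now = datetime(2026, 1, 10, 12, 0)
incidents = [now - timedelta(days=1), now - timedelta(days=3), now - timedelta(days=6)]
print(control_plane_rollback_triggered(timedelta(hours=1), incidents, 0.9, now))
```

Treating the triggers as a logical OR is a design choice; some teams may prefer to require two of three before rolling back.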

Scenario: Network Latency Makes Development Unusable

Impact

Keystroke delays >200ms
IDE features timeout or fail
Developer frustration and workarounds

Mitigation

Deploy in multiple regions
Use WireGuard or Tailscale for direct, lower-latency connections to workspaces
Enable local file sync with Mutagen

Rollback Trigger

P95 latency > 150ms sustained
Developer survey score < 3/5
Local development requests > 20%
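
The P95 trigger depends on how the percentile and "sustained" are defined. The sketch below uses the nearest-rank method and treats "sustained" as every measurement window exceeding the threshold; both choices, and the function names, are assumptions rather than part of the scenario.

```python
def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of latency samples in milliseconds."""
    if not samples_ms:
        raise ValueError("no latency samples")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) computed with integers to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def latency_trigger(windows: list[list[float]], threshold_ms: float = 150.0) -> bool:
    """True when the P95 of every window exceeds the threshold."""
    return bool(windows) and all(p95(w) > threshold_ms for w in windows)


# Two consecutive measurement windows, both above the 150 ms threshold
windows = [[160.0] * 10, [170.0] * 10]
print(latency_trigger(windows))
```

Requiring several consecutive windows avoids rolling back on a single transient spike.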

Scenario: Vendor Announces Product Discontinuation

Impact

12-18 month migration timeline
Template/automation rewrite required
Training and process changes

Mitigation

Use Terraform for infrastructure portability
DevContainers for portable configs
Maintain alternative vendor relationship

Rollback Trigger

Sunset notice with < 18 months' lead time
Acquisition by competitor
Key feature removal announcement

Scenario: AI Agent Escapes Sandbox or Exfiltrates Data

Impact

Source code or secrets sent to external LLM APIs
Autonomous agent modifies production infrastructure
Compliance violation if PII enters model context

Mitigation

Run agents in Firecracker microVM sandboxes
Enforce network egress allowlists per workspace
Require human-in-the-loop for destructive operations

Rollback Trigger

Any confirmed data exfiltration event
Agent action outside approved scope
Sandbox breakout detected in monitoring

Scenario: LLM Token Costs Spike Beyond Budget

Impact

Autonomous agents running overnight burn tokens
Monthly AI spend exceeds entire CDE budget
No per-team or per-project cost attribution

Mitigation

Set per-user and per-team token budgets with hard caps
Auto-terminate agent sessions exceeding time limits
Deploy LLM gateway proxy for cost tracking and limits
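
A hard cap and a session time limit can be enforced with a small guard object that the agent runner consults after each LLM call. This is a minimal sketch; the `SessionBudget` class and the limits shown are hypothetical, and a real deployment would more likely enforce them in the LLM gateway proxy.

```python
from dataclasses import dataclass


@dataclass
class SessionBudget:
    max_tokens: int        # hard cap on tokens for this session
    max_seconds: float     # hard cap on wall-clock session time
    used_tokens: int = 0

    def record(self, tokens: int) -> None:
        """Add the token count of one completed LLM call."""
        if tokens < 0:
            raise ValueError("token count cannot be negative")
        self.used_tokens += tokens

    def should_terminate(self, elapsed_seconds: float) -> bool:
        """True once either the token cap or the time limit is reached."""
        return (
            self.used_tokens >= self.max_tokens
            or elapsed_seconds >= self.max_seconds
        )


# Illustrative limits: 1,000 tokens and 4 hours
budget = SessionBudget(max_tokens=1_000, max_seconds=4 * 3600)
budget.record(400)
print(budget.should_terminate(elapsed_seconds=60))
```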

Rollback Trigger

Token spend > 150% of monthly budget
Single agent session > $500 with no output
No ROI improvement after 90-day evaluation

Rollback Procedures

Step-by-step procedures for different rollback scenarios

Rollback Decision Timeline

0-2h

Immediate Response

Investigate issue, engage platform team, communicate status to affected developers

2-4h

Escalation

Engage vendor support (if applicable), prepare partial rollback, enable local development fallback

4-8h

Partial Rollback Decision

Enable hybrid mode - critical teams return to local, non-critical stay on CDE

8h+

Full Rollback

Execute full rollback procedure, transition all developers to local development
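
The timeline above maps elapsed incident time to a stage. A minimal sketch of that mapping follows, assuming each boundary hour belongs to the later stage (the timeline does not say which side the boundaries fall on); the function name is hypothetical.

```python
def rollback_stage(hours_elapsed: float) -> str:
    """Return the timeline stage for a given number of hours since the incident began."""
    if hours_elapsed < 0:
        raise ValueError("hours_elapsed cannot be negative")
    if hours_elapsed < 2:
        return "Immediate Response"
    if hours_elapsed < 4:
        return "Escalation"
    if hours_elapsed < 8:
        return "Partial Rollback Decision"
    return "Full Rollback"


for hours in (0.5, 3, 6, 9):
    print(hours, rollback_stage(hours))
```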

Full Rollback Procedure

1

Export All Workspace Data

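# Export all user workspace files
# Assumes the Coder v2 CLI (`coder list`, `coder ssh`); verify with `coder --help`
mkdir -p ./backups/templates
coder list --all -o json | jq -r '.[] | "\(.owner_name)/\(.name)"' |
while read -r workspace; do
    # Stream the archive over SSH; </dev/null keeps ssh from consuming the loop's input
    coder ssh "$workspace" "tar -czf - -C ~ projects" </dev/null \
        > "./backups/${workspace//\//_}.tar.gz"
done

# Export configuration and templates
# Pull each template's source; set TEMPLATE to a name from `coder templates list`
coder templates pull "$TEMPLATE" "./backups/templates/$TEMPLATE"
kubectl get configmap -n coder -o yaml > ./backups/k8s-configs.yaml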

2

Notify All Stakeholders

# Send notification via Slack/Teams/Email
Subject: [ACTION REQUIRED] CDE Rollback in Progress

Dear Developers,

Due to [REASON], we are initiating a rollback to local development.

Timeline:
- [TIME]: Begin workspace data export
- [TIME+2h]: Disable new workspace creation
- [TIME+4h]: All workspaces terminated
- [TIME+6h]: Local dev environment required

Action Required:
1. Save all current work immediately
2. Pull local copies of your repositories
3. Set up local development environment per: [WIKI_LINK]

Support: #platform-engineering or page Platform On-Call

3

Restore Local Development

# Re-enable local development permissions
# (Adjust based on your security controls)

# Restore local admin rights (Windows)
Add-LocalGroupMember -Group "Administrators" -Member "DOMAIN\Developers"

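# Re-enable the Windows Containers feature used by Docker Desktop for Windows containers
Enable-WindowsOptionalFeature -FeatureName Containers -Online

# Distribute local development scripts (run from a Bash shell such as Git Bash or WSL)
git clone https://github.com/company/local-dev-setup
cd local-dev-setup && ./setup.sh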

4

Post-Rollback Verification

# Verify developer environment status
# Send survey to all affected developers

curl -X POST "https://forms.company.com/api/submit" \
  -d "survey_id=rollback-verification" \
  -d "questions=local_env_working,data_restored,blockers"

# Schedule retrospective
# Document lessons learned
# Update risk assessment based on actual experience

AI Agent Risk Management in CDEs

In 2026, AI coding agents run autonomously inside CDE workspaces. New risks require new controls.

Why AI Agent Risks Are Different

Unlike traditional CDE risks where humans are in the loop, AI coding agents (Claude Code, Copilot Agent, Cursor, Devin, Windsurf) operate autonomously inside workspaces. They can read files, execute commands, make network requests, and modify code without human approval on every action. A misconfigured agent in a CDE has the same blast radius as a compromised developer account - but it acts faster and at scale.

LLM Data Flow Risks

Every AI agent interaction sends workspace context to an LLM provider. Understand what leaves your CDE.

File contents in context windows
Agents read source files and send them as LLM prompt context. Secrets in .env files, hardcoded credentials, and proprietary algorithms can all be transmitted.
Terminal output in agent loops
Agents execute commands and feed output back to the LLM. Database connection strings, API responses with customer data, and error messages with internal URLs can all leak.
Model provider data retention
Not all LLM providers offer zero-retention agreements. Verify whether your provider trains on inputs or retains conversation logs.
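
One way to reduce what leaves the CDE is to redact likely secrets from file contents and terminal output before they reach the LLM. The sketch below is illustrative only: the three patterns and the `redact` helper are assumptions, they will miss many secret formats, and they do not replace a maintained secret scanner.

```python
import re

# (pattern, replacement) pairs; illustrative, not exhaustive
SECRET_PATTERNS = [
    # AWS access key ID format
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED]"),
    # KEY=value or KEY: value where the key name suggests a secret
    (
        re.compile(r"(?i)(\b[\w-]*(?:password|secret|token|api_key)[\w-]*\s*[=:]\s*)\S+"),
        r"\1[REDACTED]",
    ),
    # URLs with embedded credentials, e.g. database connection strings
    (re.compile(r"[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@]+@\S+"), "[REDACTED]"),
]


def redact(text: str) -> str:
    """Replace likely secrets in text before it is added to LLM context."""
    for pattern, replacement in SECRET_PATTERNS:
        text = pattern.sub(replacement, text)
    return text


print(redact("DB_PASSWORD=hunter2"))
print(redact("connect postgres://app:pw123@db.internal:5432/app"))
```

Redaction is a last line of defense; keeping secrets out of the workspace filesystem in the first place is the stronger control.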

Agent Autonomy Controls

Define boundaries for what AI agents can and cannot do inside CDE workspaces.

Filesystem scope limits
Restrict agents to project directories only. Block access to ~/.ssh, ~/.aws, /etc, and other sensitive paths via workspace policy.
Network egress allowlists
Only allow agent workspaces to reach approved LLM API endpoints, package registries, and internal services. Block all other outbound traffic.
Session time and token budgets
Set hard limits on agent session duration (e.g., 4 hours max) and per-session token budgets to prevent runaway costs.
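
The filesystem and egress controls above amount to two allow/deny checks. The following sketch shows the logic only, under stated assumptions: `PROJECT_ROOT`, `ALLOWED_HOSTS`, and both function names are hypothetical, and real enforcement belongs in the sandbox and network layer rather than in agent-side code.

```python
import posixpath
from pathlib import PurePosixPath

# Hypothetical policy values for illustration
PROJECT_ROOT = PurePosixPath("/home/dev/projects")
ALLOWED_HOSTS = {"api.llm-provider.example", "registry.npmjs.org"}


def path_allowed(path: str, root: PurePosixPath = PROJECT_ROOT) -> bool:
    """True if path stays inside root after collapsing '..' segments.

    Relative paths are resolved against root. '~' is not expanded and
    symlinks are not followed; callers must handle both.
    """
    normalized = PurePosixPath(posixpath.normpath(posixpath.join(str(root), path)))
    return normalized == root or root in normalized.parents


def egress_allowed(host: str) -> bool:
    """True if the destination host is on the allowlist (exact match)."""
    return host.lower().rstrip(".") in ALLOWED_HOSTS


print(path_allowed("src/main.py"), path_allowed("../.ssh/id_rsa"))
print(egress_allowed("registry.npmjs.org"), egress_allowed("evil.example"))
```

Normalizing before comparing matters: a naive prefix check would accept `/home/dev/projects/../.ssh`.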

AI Agent Governance Framework for CDEs

Control Area	Minimum Requirement	Recommended (2026)	Priority
Workspace isolation	Container per agent session	Firecracker microVM per agent session	P0
Network egress	Allowlist for LLM API endpoints	Zero-trust network with per-request auth	P0
Secret management	No secrets in workspace filesystem	Vault-injected, auto-rotating, agent-scoped tokens	P0
Cost controls	Per-team monthly token budgets	Per-session limits with LLM gateway proxy	P1
Audit logging	Log all agent commands and file changes	Full prompt/response logging with retention policy	P1
Code review gates	Human review before merge	Automated SAST/DAST scan on all AI-generated code	P1

Vendor Exit Strategy

Ensure portability and reduce lock-in from day one

Portability Checklist

Use Terraform for all infrastructure
Avoid vendor-proprietary template formats
DevContainer specification for configs
Works across VS Code, Codespaces, Ona (formerly Gitpod), Coder
Standard container images
No vendor-specific base images or extensions
Document all vendor-specific features used
Maintain migration notes for each feature
Regular data export testing
Quarterly validation of export/restore procedures
Maintain alternative vendor evaluation
Annual review of market alternatives
AI agent configurations as code
Store agent rules, allowlists, and token budgets in version control - not vendor dashboards
LLM provider portability plan
Use LLM gateway proxies to abstract provider APIs and enable rapid LLM switching
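
The last checklist item relies on callers depending on one interface instead of a provider SDK. A minimal sketch of that idea follows; `LLMProvider`, `Gateway`, and `EchoProvider` are hypothetical names, and the echo class stands in for real provider clients.

```python
from typing import Protocol


class LLMProvider(Protocol):
    """Interface every provider adapter must satisfy."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in adapter; a real one would call a provider API."""

    def __init__(self, name: str) -> None:
        self.name = name

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] {prompt}"


class Gateway:
    """Routes all requests to the active provider; switching is one call."""

    def __init__(self, providers: dict[str, LLMProvider], active: str) -> None:
        self.providers = providers
        self.switch(active)

    def switch(self, name: str) -> None:
        if name not in self.providers:
            raise KeyError(f"unknown provider: {name}")
        self.active = name

    def complete(self, prompt: str) -> str:
        return self.providers[self.active].complete(prompt)


gateway = Gateway(
    {"provider-a": EchoProvider("provider-a"), "provider-b": EchoProvider("provider-b")},
    active="provider-a",
)
print(gateway.complete("hi"))
gateway.switch("provider-b")
print(gateway.complete("hi"))
```

Because agents call only `Gateway.complete`, changing providers does not require touching agent code, which is what keeps this migration path low effort.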

Migration Paths

Coder to Ona

Terraform templates need rewriting into Ona's configuration format, but DevContainers work as-is

Moderate Effort

Any CDE to GitHub Codespaces

DevContainers fully compatible, but requires GitHub Enterprise

Low Effort

CDE to Local Development

DevContainers run locally, but security controls may need adjustment

High Effort

Self-Hosted to Managed

Offload operations but may lose some customization

Moderate Effort

Switch AI Agent / LLM Provider

LLM gateway proxy makes switching providers straightforward; agent rule files need adaptation

Low Effort (with proxy)

Continue Your Planning

Related resources for comprehensive CDE implementation

Pilot Program Design

High Availability & DR

Vendor Evaluation

Agentic Engineering