Agentic Engineering
The emerging discipline of designing, deploying, and supervising AI agents that autonomously perform software development tasks within Cloud Development Environments
What is Agentic Engineering?
A new discipline for a new era of software development
Beyond the Technology
While agentic AI refers to the technology itself - autonomous systems that plan, execute, and iterate - agentic engineering is the human discipline of putting that technology to work effectively. It encompasses the practices, workflows, organizational patterns, and skills needed to design agent systems, set meaningful guardrails, and supervise autonomous development at scale.
Agentic engineering is the practice of designing, deploying, and supervising AI agents that perform software development tasks with varying degrees of autonomy. It represents a fundamental shift in how software teams operate. Instead of writing every line of code by hand, agentic engineers define objectives, design agent workflows, establish quality gates, and oversee systems that can independently plan, code, test, and deliver software features.
The role of an agentic engineer is distinct from traditional software engineering. Where a conventional developer focuses on writing and debugging code, an agentic engineer focuses on orchestration - defining what work agents should do, how they should do it, what constraints they must follow, and how their output gets validated. Think of it as the difference between a musician playing an instrument and a conductor directing an orchestra. Both require deep technical skill, but the nature of the work is fundamentally different.
As AI agents become more capable, every engineering team will need practitioners who understand how to harness autonomous systems safely and productively. Agentic engineering is not a replacement for software engineering - it is an evolution of it. The best agentic engineers are experienced developers who combine deep technical knowledge with new skills in prompt design, agent orchestration, and quality assurance for AI-generated output.
Traditional Development
- Engineer writes code directly, line by line
- Manual testing and debugging cycles
- One task at a time per developer
- Quality depends on individual skill
- Output limited by human typing speed and cognitive bandwidth
Agentic Engineering
- Engineer designs workflows and delegates tasks to agents
- Automated validation pipelines verify agent output
- Multiple agents work in parallel across many tasks
- Quality enforced by systematic guardrails and gates
- Output scales with infrastructure, not headcount
The Delegation and Supervision Model
How engineers delegate tasks to AI agents and maintain appropriate oversight
Effective agentic engineering requires a clear model for how work gets delegated to agents and how human oversight is maintained. Not every task warrants the same level of autonomy, and not every agent interaction needs the same degree of supervision. The delegation model defines a spectrum from fully supervised to fully autonomous operation, with well-defined handoff points and escalation paths at each level.
The key insight is that delegation is not binary - it is a gradient. Mature agentic engineering teams calibrate the autonomy level to match the risk, complexity, and reversibility of each task. A routine dependency update might run fully autonomously, while a security-critical API change requires human approval at every step.
Fully Supervised
The engineer reviews and approves every action the agent takes before it executes. The agent suggests changes, but the human decides whether to accept, modify, or reject each one. This mode is ideal for onboarding new agent workflows, working in unfamiliar codebases, or handling high-stakes changes.
Semi-Autonomous
The agent executes routine sub-tasks independently but pauses at predefined checkpoints for human review. Engineers define the boundaries - agents can write code and run tests freely, but must get approval before modifying APIs, changing database schemas, or opening pull requests.
Fully Autonomous
The agent operates independently from task assignment through to pull request creation with no human intervention during execution. Validation happens through automated pipelines - tests, linting, security scans, and code quality checks. Human review occurs only at the final PR stage, if at all.
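To make the calibration idea concrete, here is a minimal sketch in Python of how a team might map task risk and reversibility to one of these autonomy levels. The `Task` fields, thresholds, and level names are illustrative assumptions, not part of any specific platform.

```python
from dataclasses import dataclass
from enum import Enum

class Autonomy(Enum):
    FULLY_SUPERVISED = "fully_supervised"   # human approves every action
    SEMI_AUTONOMOUS = "semi_autonomous"     # human approves at checkpoints
    FULLY_AUTONOMOUS = "fully_autonomous"   # automated validation only

@dataclass
class Task:
    name: str
    risk: str          # "low", "medium", or "high"
    reversible: bool   # can the change be rolled back trivially?

def autonomy_for(task: Task) -> Autonomy:
    """Calibrate autonomy to the risk and reversibility of a task."""
    if task.risk == "high" or not task.reversible:
        return Autonomy.FULLY_SUPERVISED
    if task.risk == "medium":
        return Autonomy.SEMI_AUTONOMOUS
    return Autonomy.FULLY_AUTONOMOUS

print(autonomy_for(Task("dependency update", risk="low", reversible=True)))
print(autonomy_for(Task("auth API change", risk="high", reversible=False)))
```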
Human-in-the-Loop Patterns
Even when agents operate with high autonomy, well-designed human touchpoints are essential. These patterns define when and how humans intervene in agent workflows to maintain quality and safety.
Gate-Based Review
Define explicit gates where agent work pauses for human inspection. Common gates include pre-merge review, pre-deployment approval, and architecture decision points. Agents queue their work for review and move on to other tasks while waiting.
Exception-Based Escalation
Agents operate freely until they encounter a situation outside their defined boundaries - a failing test they cannot fix, a merge conflict, or a task that exceeds complexity thresholds. At that point, they escalate to a human engineer with full context about what they tried and what failed.
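The value of an escalation lies in the context it carries. The sketch below shows one possible shape for that handoff; the `Escalation` fields and the notification mechanism are hypothetical and would vary by team.

```python
from dataclasses import dataclass, field

@dataclass
class Escalation:
    """Context an agent hands to a human when it hits a boundary."""
    task_id: str
    reason: str                                          # e.g. "failing test I cannot fix"
    attempts: list[str] = field(default_factory=list)    # what the agent tried
    artifacts: list[str] = field(default_factory=list)   # branches, logs, diffs

def escalate(esc: Escalation) -> None:
    # A real system would notify a reviewer via a ticket, chat message, or queue;
    # this sketch just prints the structured context.
    print(f"[ESCALATION] task={esc.task_id} reason={esc.reason}")
    for attempt in esc.attempts:
        print(f"  tried: {attempt}")
    for artifact in esc.artifacts:
        print(f"  see: {artifact}")

escalate(Escalation(
    task_id="TASK-412",
    reason="integration test still failing after three fix attempts",
    attempts=["patched null check", "regenerated fixture", "reverted refactor"],
    artifacts=["branch agent/task-412", "ci-run-9182.log"],
))
```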
Sampling-Based Audit
For high-volume autonomous work, randomly sample a percentage of agent outputs for detailed human review. This provides statistical confidence in agent quality without requiring review of every individual change. Increase sampling rates for newer agents or riskier code areas.
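A sampling policy can be expressed in a few lines. The following sketch assumes illustrative rates, path prefixes, and an agent-age heuristic; the exact thresholds are a team decision.

```python
import random

def should_audit(agent_age_days: int, path: str, base_rate: float = 0.05) -> bool:
    """Decide whether a completed change is pulled for detailed human review.

    Newer agents and riskier code areas get a higher sampling rate.
    Rates and path prefixes here are illustrative, not prescriptive.
    """
    rate = base_rate
    if agent_age_days < 30:
        rate = max(rate, 0.25)   # new agent: review a quarter of its work
    if path.startswith(("auth/", "billing/", "migrations/")):
        rate = max(rate, 0.50)   # sensitive areas: review half
    return random.random() < rate

print(should_audit(agent_age_days=7, path="auth/session.py"))
print(should_audit(agent_age_days=180, path="docs/readme_update.md"))
```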
Retrospective Review
Allow agents to merge low-risk changes automatically, then review batches of completed work periodically. This maximizes throughput while still maintaining human oversight. Roll back any changes that do not meet standards and refine agent instructions accordingly.
Multi-Agent Coordination
Orchestrating multiple AI agents to collaborate on complex development tasks
As development tasks grow in complexity, single agents hit practical limits. Multi-agent coordination addresses this by allowing specialized agents to collaborate on different aspects of a problem - one agent handling frontend changes, another managing backend logic, a third writing tests, and a fourth performing security reviews. This mirrors how human development teams divide work, but agents can coordinate at machine speed.
Cloud Development Environments are uniquely suited for multi-agent work because they can provision isolated workspaces for each agent while providing shared access to repositories, artifacts, and communication channels. Each agent gets its own sandbox to work in, preventing interference while enabling collaboration through well-defined interfaces.
Sequential Pipeline
Agents execute in a defined order, each one building on the output of the previous agent. Agent A writes the implementation, Agent B writes tests for that implementation, Agent C reviews the combined output for security issues, and Agent D generates documentation. Each stage validates the work before passing it forward.
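A sequential pipeline is easiest to see as a chain of stages with a validation step between them. In this sketch the agents are stubbed as plain functions and the gate is a placeholder; a real orchestrator would dispatch each stage to its own agent session and run genuine checks.

```python
from typing import Callable

# Each "agent" is stubbed as a function from one work product to the next.
def implement(spec: str) -> str: return spec + " -> implementation"
def write_tests(code: str) -> str: return code + " + tests"
def security_review(code: str) -> str: return code + " (security reviewed)"
def document(code: str) -> str: return code + " + docs"

def validated(stage: Callable[[str], str], artifact: str) -> str:
    result = stage(artifact)
    # Placeholder gate: a real pipeline would run tests, linters, or scanners here
    # and stop if the stage's output fails validation.
    assert result, f"stage {stage.__name__} produced no output"
    return result

artifact = "feature spec"
for stage in (implement, write_tests, security_review, document):
    artifact = validated(stage, artifact)
print(artifact)
```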
Parallel Execution
Multiple agents work simultaneously on independent aspects of a larger task. A frontend agent, a backend agent, and a database migration agent each work in their own CDE workspace at the same time. An integration step at the end merges their outputs and runs end-to-end tests to verify everything works together.
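The fan-out/fan-in shape of parallel execution looks roughly like the sketch below, where stubbed agents stand in for sessions running in separate CDE workspaces and the integration step is a placeholder.

```python
from concurrent.futures import ThreadPoolExecutor

# Stub agents; in practice each call would drive an agent in its own workspace.
def frontend_agent() -> str: return "frontend changes on branch agent/frontend"
def backend_agent() -> str: return "backend changes on branch agent/backend"
def migration_agent() -> str: return "schema migration on branch agent/db"

def integrate(outputs: list[str]) -> None:
    # Placeholder for merging branches and running end-to-end tests.
    print("integrating:", "; ".join(outputs))

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(agent) for agent in (frontend_agent, backend_agent, migration_agent)]
    integrate([f.result() for f in futures])
```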
Hierarchical Orchestration
A planning agent decomposes a high-level objective into sub-tasks and assigns them to specialized worker agents. The planner monitors progress, resolves conflicts between agents, re-plans when sub-tasks fail, and assembles the final output. Worker agents can themselves spawn sub-agents for specialized work.
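At its core, hierarchical orchestration is a plan-dispatch-collect loop. This minimal sketch stubs the planner and workers as functions; in production the planner would call a model to decompose the objective and the retry branch would re-plan or escalate rather than simply retrying.

```python
def plan(objective: str) -> list[str]:
    # A planning agent would call an LLM here; this stub returns fixed sub-tasks.
    return [f"{objective}: API endpoint", f"{objective}: UI form", f"{objective}: tests"]

def worker(subtask: str) -> dict:
    # Worker agents execute in their own workspaces and report a result.
    return {"subtask": subtask, "status": "done"}

def orchestrate(objective: str) -> list[dict]:
    results = []
    for subtask in plan(objective):
        result = worker(subtask)
        if result["status"] != "done":
            # The planner would re-plan or escalate on failure.
            result = worker(subtask)
        results.append(result)
    return results

print(orchestrate("add password reset"))
```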
Agent-to-Agent Communication
For multi-agent systems to function effectively, agents need well-defined protocols for sharing context, passing artifacts, and signaling status. Unlike human communication, agent communication should be structured, versioned, and auditable.
Artifact Passing
Agents share work products through version control branches, shared filesystems, or artifact registries. Agent A commits code to a branch, and Agent B checks out that branch in its own workspace to continue. CDE platforms manage the underlying infrastructure for seamless cross-workspace access.
Context Protocols
Structured messages that convey task requirements, constraints, and relevant context between agents. Protocols like the Model Context Protocol (MCP) provide standardized formats for sharing tool access, code context, and conversational history across different agent systems.
Status Signaling
Agents report their progress, completion status, or failure conditions through event systems. Orchestrators subscribe to these events to track overall progress, trigger dependent tasks, or escalate issues. This enables non-blocking coordination where agents do not wait idle for each other.
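Non-blocking coordination typically rests on a simple publish/subscribe pattern. The in-process event bus below is a sketch only; production systems would use a message queue or the CDE platform's own event stream, and the event names are assumptions.

```python
from collections import defaultdict
from typing import Callable

# Minimal in-process event bus standing in for a real message queue.
subscribers: dict[str, list[Callable[[dict], None]]] = defaultdict(list)

def subscribe(event_type: str, handler: Callable[[dict], None]) -> None:
    subscribers[event_type].append(handler)

def emit(event_type: str, payload: dict) -> None:
    for handler in subscribers[event_type]:
        handler(payload)

# The orchestrator reacts to agent status events instead of polling or blocking.
subscribe("task.completed", lambda e: print("trigger dependent task for", e["task_id"]))
subscribe("task.failed", lambda e: print("escalate", e["task_id"], "-", e["reason"]))

emit("task.completed", {"task_id": "TASK-101", "agent": "backend-agent"})
emit("task.failed", {"task_id": "TASK-102", "agent": "test-agent", "reason": "flaky suite"})
```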
CDE Workspaces as Sandboxes
Each agent operates in its own isolated CDE workspace with defined resource limits and network policies. This prevents a misbehaving agent from affecting others while still allowing controlled communication through shared repositories and messaging queues. Workspaces are ephemeral - created for a task and destroyed when complete.
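The kinds of limits a sandboxed agent workspace carries can be summarized in a small spec. The field names and values below are purely illustrative and do not map to any specific CDE platform's API.

```python
from dataclasses import dataclass

@dataclass
class AgentWorkspaceSpec:
    """Illustrative per-agent sandbox settings; names are hypothetical."""
    image: str = "registry.example.com/agent-runtime:latest"
    cpu_cores: int = 2
    memory_gb: int = 4
    disk_gb: int = 20
    max_runtime_minutes: int = 60                 # auto-terminate long-running agents
    network_allowlist: tuple = ("git.internal.example.com", "pypi.org")
    ephemeral: bool = True                        # destroy the workspace when the task completes

print(AgentWorkspaceSpec())
```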
Skills for Agentic Engineers
The capabilities developers need to effectively design, deploy, and supervise AI agent workflows
Agentic engineering requires a blend of traditional software development expertise and new competencies specific to working with AI systems. Developers do not need to become machine learning engineers, but they do need to understand how large language models behave, how to communicate intent effectively to agents, and how to build systems that validate and constrain autonomous behavior.
The most effective agentic engineers combine strong coding fundamentals with systems thinking and a rigorous approach to quality assurance. They think about agent workflows the same way platform engineers think about CI/CD pipelines - as automated systems that need clear inputs, defined stages, quality gates, and observability.
Prompt Engineering
The ability to communicate intent clearly to AI agents through well-structured prompts, system instructions, and context documents. Effective prompt engineering goes beyond simple instructions - it includes providing relevant code context, defining output formats, specifying constraints, and anticipating edge cases.
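What "well-structured" means in practice is easiest to show with an example task prompt. The sections below are one possible convention, not a format required by any particular agent platform, and the file names are hypothetical.

```python
# A sketch of a structured task prompt; the exact fields are a matter of team convention.
task_prompt = """
## Objective
Add input validation to the /signup endpoint.

## Context
- Relevant files: api/routes/signup.py, api/schemas/user.py
- Follow the existing error-response format in api/errors.py

## Constraints
- Do not modify the database schema.
- Keep the public API signature unchanged.

## Output format
- A single branch with focused commits and a summary of changes.

## Edge cases to consider
- Empty email, duplicate email, passwords below the minimum length.
""".strip()

print(task_prompt)
```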
Agent System Design
Designing end-to-end workflows where AI agents operate as components in a larger system. This includes defining task decomposition strategies, choosing appropriate autonomy levels, setting up orchestration logic, and building feedback loops that help agents improve over time.
Output Monitoring and QA
Building and maintaining systems that continuously evaluate agent output quality. This goes beyond simple test passing - it includes code quality metrics, security scanning, performance benchmarks, and pattern detection for common agent mistakes like hallucinated APIs or circular logic.
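A validation pipeline for agent output can start as a short script that runs the project's existing quality tools and blocks the merge on any failure. The tool choices below (pytest, ruff, bandit) are examples that assume those tools are installed; the right set depends on the project's toolchain.

```python
import subprocess

# Illustrative checks run against an agent's branch; swap in your project's tools.
CHECKS = [
    ("tests", ["pytest", "-q"]),
    ("lint", ["ruff", "check", "."]),
    ("security", ["bandit", "-r", "src"]),
]

def validate_agent_output() -> dict[str, bool]:
    results = {}
    for name, cmd in CHECKS:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        results[name] = proc.returncode == 0
    return results

if __name__ == "__main__":
    results = validate_agent_output()
    print(results)
    if not all(results.values()):
        print("Agent output failed validation; blocking merge and escalating.")
```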
LLM Literacy
Understanding how large language models work at a practical level - their strengths, weaknesses, and failure modes. Knowing when an agent is likely to hallucinate, why context window limits matter, how temperature affects output consistency, and why certain types of tasks are fundamentally harder for current models.
Guardrail Design
Creating effective constraints that keep agents within safe operational boundaries without being so restrictive that they cannot complete their work. This includes defining file access permissions, network policies, resource limits, blocked operations, and escalation triggers. Good guardrails are invisible when agents behave correctly.
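Guardrails are often easiest to express declaratively and enforce before each agent action. The categories and values in this sketch are illustrative assumptions, and a real enforcement layer would sit in the orchestrator or the workspace runtime rather than in application code.

```python
# Declarative guardrails an orchestrator could check before executing each agent action.
GUARDRAILS = {
    "writable_paths": ["src/", "tests/", "docs/"],
    "blocked_paths": [".github/workflows/", "infrastructure/prod/"],
    "blocked_commands": ["rm -rf", "curl | sh", "kubectl delete"],
    "max_files_changed": 40,
    "escalate_on": ["schema migration", "public API change", "new dependency"],
}

def action_allowed(path: str, command: str) -> bool:
    if any(path.startswith(p) for p in GUARDRAILS["blocked_paths"]):
        return False
    if not any(path.startswith(p) for p in GUARDRAILS["writable_paths"]):
        return False
    if any(blocked in command for blocked in GUARDRAILS["blocked_commands"]):
        return False
    return True

print(action_allowed("src/api/users.py", "pytest -q"))                    # True
print(action_allowed("infrastructure/prod/main.tf", "terraform apply"))   # False
```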
CDE and Cloud Infrastructure
Understanding how Cloud Development Environments provision workspaces, manage resources, and enforce isolation. Agentic engineers need to configure workspace templates optimized for agent workloads, set up networking policies, manage secrets injection, and tune resource allocations for cost efficiency.
CDEs as Agent Infrastructure
Why Cloud Development Environments are the essential foundation for running AI agents in production
Running AI agents on local developer machines introduces serious risks - agents executing arbitrary code, installing unknown packages, and consuming unbounded resources on machines that also contain credentials, personal data, and access to production systems. Cloud Development Environments solve this by providing purpose-built infrastructure where agents operate in isolated, controlled, and observable sandboxes.
CDEs provide the four pillars that make production-grade agentic engineering possible: isolation, observability, scalability, and governance. Without these capabilities, organizations cannot safely move beyond experimental agent use into the kind of scaled autonomous development that delivers real productivity gains.
Isolated Workspaces
Each agent gets its own containerized or VM-based workspace with a complete development environment. Agents cannot access the host system, other workspaces, or production infrastructure. If an agent goes off the rails - infinite loops, excessive resource consumption, or destructive file operations - the blast radius is limited to a single disposable workspace.
Resource Limits and Cost Control
CDEs enforce CPU, memory, disk, and runtime limits on every workspace. This prevents agents from consuming unlimited resources, whether through runaway processes, large dependency installations, or excessively long execution cycles. Budget caps and auto-termination policies keep costs predictable even when running hundreds of agents in parallel.
Comprehensive Audit Trails
Every command an agent executes, every file it modifies, and every API call it makes is logged. These audit trails are essential for debugging agent behavior, demonstrating compliance, and improving agent workflows over time. When an agent produces unexpected output, you can replay the entire session to understand exactly what happened.
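The record behind such an audit trail is simple. A CDE platform would capture this automatically at the workspace level; the sketch below only illustrates the kind of structured entry that makes an agent session replayable, and the log path and field names are assumptions.

```python
import json, subprocess, time

def run_logged(cmd: list[str], log_path: str = "agent_audit.jsonl") -> int:
    """Run an agent command and append a structured audit record."""
    start = time.time()
    proc = subprocess.run(cmd, capture_output=True, text=True)
    record = {
        "timestamp": start,
        "command": cmd,
        "exit_code": proc.returncode,
        "stdout_bytes": len(proc.stdout),
        "stderr_bytes": len(proc.stderr),
    }
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return proc.returncode

run_logged(["git", "status"])  # assumes git is available in the workspace
```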
Rollback Capabilities
When agents produce incorrect or harmful output, CDE-based workflows make rollback straightforward. Since agents work on isolated branches in ephemeral workspaces, reverting their changes is as simple as closing a pull request or deleting a branch. There is no risk of agents corrupting shared state or leaving behind artifacts that affect other developers.
CDE Platforms Supporting Agent Workloads
Coder
Coder's Terraform-based workspace provisioning provides API-driven creation of agent workspaces on any infrastructure - AWS, Azure, GCP, or on-premises Kubernetes. Its Premium tier includes dedicated agent workspace templates with optimized resource profiles, and its open-source foundation means full control over the agent execution environment. Teams can define custom templates for different agent types and task categories.
Ona (formerly Gitpod)
Ona has pivoted to an agent-first architecture, redesigning its platform around headless, API-driven workspaces specifically built for AI agents. Its workspace provisioning is optimized for rapid startup, making it ideal for high-volume agent workflows where thousands of short-lived workspaces need to spin up and tear down throughout the day. The platform provides built-in observability for agent sessions.
Organizational Readiness
Assessing and building your organization's capacity for agentic engineering at scale
Adopting agentic engineering is not just a technology decision - it is an organizational transformation. Teams need new skills, new processes, and new ways of thinking about developer productivity. Organizations that rush to deploy agents without building the right foundations often face resistance, poor results, and abandoned initiatives. A structured readiness assessment and maturity model helps teams adopt agentic practices incrementally and sustainably.
The most successful organizations approach agentic engineering adoption the same way they approached DevOps or cloud migration - as a cultural and process change supported by technology, not the other way around. Start with willing early adopters, demonstrate measurable value, and expand gradually as confidence and capabilities grow.
Agentic Engineering Maturity Model
Exploring
Individual developers experiment with AI coding assistants like GitHub Copilot or Cursor for personal productivity. There is no organizational strategy, governance, or shared practices. Agent use is informal and untracked. The focus is on learning what AI tools can do and identifying potential use cases.
Experimenting
A dedicated team runs structured pilots with agentic tools on specific projects or task types. Basic governance is established - approved tool lists, usage guidelines, and initial security reviews. Teams begin measuring agent effectiveness and identifying high-value use cases. CDEs are evaluated as agent execution infrastructure.
Scaling
Multiple teams run AI agents in production on CDE infrastructure. Standardized agent workflows, templates, and governance policies are in place. An internal platform team manages agent infrastructure, and agentic engineering practices are documented and shared across the organization. Cost tracking, quality metrics, and security controls are mature.
Optimizing
Agentic engineering is a core organizational capability. Multi-agent systems handle significant portions of routine development work. Data from agent operations drives continuous improvement - tuning prompts, refining workflows, and expanding the scope of autonomous tasks. The organization measures ROI at the portfolio level and continuously optimizes the balance between human and agent contributions.
Change Management for AI-Augmented Teams
Introducing AI agents into development workflows changes how people work, how performance is measured, and how teams collaborate. Proactive change management prevents resistance and accelerates adoption.
Address Developer Concerns
Be transparent about how AI agents will change roles and workflows. Emphasize that agents handle routine work so developers can focus on higher-value architecture, design, and problem-solving. Frame agentic engineering as a career growth opportunity, not a threat. Involve developers in agent workflow design from the start.
Redefine Productivity Metrics
Traditional metrics like lines of code or commits per day become meaningless when agents generate code. Shift to outcome-based metrics: features delivered, bugs resolved, time to production, customer impact, and quality scores. Measure the combined output of human-agent teams rather than individual contributor metrics.
Invest in Training
Provide structured training on prompt engineering, agent supervision, and the specific tools your organization has adopted. Pair experienced agentic engineers with teams that are just getting started. Create internal documentation, playbooks, and shared prompt libraries. Allocate dedicated time for experimentation and learning.
Next Steps
Continue exploring related topics to build a complete understanding of AI-powered development
Agentic AI and Autonomous Development
Deep dive into the technology behind autonomous coding agents - autonomy levels, leading platforms, and workspace-per-agent architecture
AI Coding Assistants
How GitHub Copilot, Cursor, Claude Code, and other AI assistants integrate with Cloud Development Environments for governed AI-assisted development
CDE Governance and Policy
Establish governance frameworks for CDEs - define policies, enforce standards, manage resource quotas, and maintain control at scale
