You have a platform problem. Every EC2 session across your AWS org runs as a shared account. Your auditors are flagging it. Your security team knows it’s wrong. Your engineers know how to fix it — in theory.
In practice, the fix requires deep expertise across six different systems simultaneously, and whoever you assign it to will disappear into it for weeks.
This is the story of how that problem got solved in a day. And what it means for how you staff and run engineering work going forward.
The Problem Is Harder Than It Looks
AWS Systems Manager Session Manager is the right way to access EC2. No SSH keys, no bastions, IAM-governed, fully audited. Your teams are probably already using it.
The issue: every session lands as a shared ssm-user account. When your auditor asks who ran a command on a production server, the answer is a role name, not a person. That fails SOC 2. It fails incident response. It costs remediation cycles and delays certification.
AWS has a documented fix for this. It involves wiring an identity attribute from your IdP through SCIM provisioning into IAM Identity Center’s ABAC engine, which stamps a session tag on STS tokens that SSM uses to pick the Linux user. The documentation covers this end-to-end — for Okta.
Most enterprises run Microsoft Entra ID. The Entra-specific integration wasn’t documented anywhere.
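In the per-user model, the business end of that chain reduces to a tag lookup on the instance side. A minimal sketch of the resolution logic — assuming the standard `SSMSessionRunAs` principal tag name that Session Manager's Run As feature reads; your ABAC attribute mapping may use a different source attribute:

```python
def resolve_os_user(session_tags: dict, default_user: str = "ssm-user") -> str:
    """Pick the Linux user for an SSM session from STS session tags.

    With Run As support enabled, Session Manager looks for the
    SSMSessionRunAs tag on the session; if it is absent, the session
    falls back to the shared ssm-user account.
    """
    return session_tags.get("SSMSessionRunAs", default_user)

# Per-user model: the IdP attribute flows via SCIM into the ABAC
# engine, which stamps it onto the STS session as a tag.
print(resolve_os_user({"SSMSessionRunAs": "jane.doe"}))  # jane.doe
print(resolve_os_user({}))                               # ssm-user
```

Everything upstream of this lookup — the IdP attribute, SCIM provisioning, the ABAC mapping — exists to get the right value into that one tag.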
What It Takes to Solve This Without AI
One senior engineer, working sequentially:
- Learn IAM Identity Center ABAC mechanics
- Learn Entra ID SCIM provisioning via Graph API
- Learn SAML federation metadata format requirements
- Debug silent failures across two cloud platforms simultaneously
- Navigate CloudFormation StackSet constraints for org-wide deployment
- Design an SCP guardrail that doesn’t break break-glass access
- Write and test deployment scripts across two deployment models
- Document everything so the next person isn’t starting from scratch
Realistic estimate: 3–6 weeks, assuming no other work. Heavy context-switching throughout. Knowledge that lives only in the engineer’s head unless documentation is prioritised — which it won’t be under time pressure.
This is the kind of work that drains your best people and produces institutional knowledge that leaves when they do.
What It Takes With an Agentic AI System
The same engineer, acting as Conductor rather than implementer:
- Frames the problem and defines the requirements
- Directs two hours of parallel research across the full internet — not sequentially, simultaneously
- Reviews a formal specification (SAFETY / LIVENESS / INVARIANT properties) before approving implementation
- Reviews a peer-validated plan and dependency graph before a line of code is written
- Makes architectural judgment calls at each decision point
- Validates the end-to-end result against real infrastructure
Total elapsed: one day (approximately 8 business hours). Output: working infrastructure, formal specifications, test plans, deployment scripts, verification tooling, and full documentation — captured as a natural output of the process, not written afterward.
The Quality Question
The natural concern: if AI wrote it, is it trustworthy?
The methodology answers this directly. Before any code was written, the system produced formal property specifications — typed as SAFETY (must never happen), LIVENESS (must eventually happen), and INVARIANT (must always hold) — each with a concrete, runnable proof. A test plan was derived from those properties. The plan itself was peer-reviewed by a validation subagent before implementation started. Implementation was written to pass the tests. Code was reviewed adversarially across security, correctness, performance, maintainability, and reliability. The engineer reviewed the final output before anything touched a real environment.
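As an illustration of the shape of these checks — not the project's actual specification — a SAFETY property and its runnable proof might look like:

```python
def check_no_shared_login(audit_events: list[dict]) -> bool:
    """SAFETY: no session may ever resolve to the shared ssm-user
    account at the OS level. Proven by asserting it never occurs in
    a captured audit trail."""
    return all(e.get("os_user") != "ssm-user" for e in audit_events)

# Hypothetical audit events for illustration.
events = [
    {"session_id": "s-1", "os_user": "jane.doe"},
    {"session_id": "s-2", "os_user": "admin"},
]
assert check_no_shared_login(events)
```

LIVENESS and INVARIANT properties follow the same pattern: a typed statement, then a concrete check that can fail loudly.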
That is more structured rigour than most sprint tickets receive. The code that shipped had been through formal specification, test-driven development, peer-reviewed planning, adversarial code review, and human sign-off.
The Staffing Implication
This is the part that should interest you most.
When an engineer is implementing, their cognitive bandwidth is split: syntax, dependencies, logic, and the business requirement all at once. They are simultaneously implementer, reviewer, and architect — losing altitude on every context switch. The work is slower and the architecture is shallower than it should be.
When agentic AI handles implementation, that overhead disappears. The engineer operates purely at the architectural level — thinking about system boundaries, failure modes, security posture, and operational implications. The result is not just faster delivery. It is better-architected systems, because the person making architectural decisions is not simultaneously exhausted by implementation details.
One engineer, directed well, can now do the work that used to require a team — and produce better-documented, more rigorously tested output than that team typically would.
The question for engineering leaders is not “should we use AI?” It is “are we using it in a way that promotes our engineers, or just in a way that makes them faster at the same level they were already at?”
Two Models, One Pipeline
The solution produced two deployment approaches that cover the full enterprise spectrum:
Per-Role — Users land on shared role accounts (admin, developer, oracle). Zero AD dependency. Works in any environment. CloudTrail still shows the real person.
Per-User — Users land as their personal AD identity. Full individual accountability at the OS level. Entitlements managed entirely in Active Directory.
Both are org-wide: a CloudFormation StackSet pushes the configuration to every member account, current and future. An SCP prevents member account admins from disabling it. New accounts are covered automatically.
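The guardrail is a deny-by-default SCP with a break-glass carve-out. This is an illustrative sketch, not the policy from the repo — the break-glass role name is a placeholder, and the `SSM-SessionManagerRunShell` document is the default session-preferences document, which your deployment may replace with its own:

```python
import json

# Placeholder: substitute your organisation's break-glass role.
BREAK_GLASS_ROLE = "arn:aws:iam::*:role/BreakGlass"

scp = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ProtectSessionPreferences",
            "Effect": "Deny",
            # Block member-account admins from editing or removing
            # the session preferences document.
            "Action": ["ssm:UpdateDocument", "ssm:DeleteDocument"],
            "Resource": "arn:aws:ssm:*:*:document/SSM-SessionManagerRunShell",
            "Condition": {
                # Exempt the break-glass role so emergency access
                # is never locked out by the guardrail itself.
                "ArnNotLike": {"aws:PrincipalArn": BREAK_GLASS_ROLE}
            },
        }
    ],
}

print(json.dumps(scp, indent=2))
```

The carve-out is the part worth reviewing hardest: an SCP that protects the configuration but strands your break-glass path has traded one audit finding for an outage risk.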
The Takeaway
Your engineers are not blocked by lack of skill. They are blocked by bandwidth. The problem this post describes was always solvable — it just required holding context across six systems simultaneously, when no single person or team owns all six end to end.
Agentic AI removes that bandwidth constraint. The engineer who already understood the problem space directed a day of parallel investigation, specification, implementation, and review — and shipped something that would have taken weeks.
That is the model. Conductor and Orchestra. The engineer sets the tempo and makes every meaningful call. The AI plays all the instruments simultaneously.
In this engineer’s opinion, agentic AI should not be used to reduce headcount. It should be used to raise the ceiling on what your best people can deliver. Whether your organisation chooses to realise that potential is a decision that will define the quality of your systems for years.
Full technical details, architecture diagrams, and all 17 gotchas are in the white paper. The repo has working CloudFormation templates, deploy scripts, and verification tooling for both deployment models.
Even this post was written in the Conductor and Orchestra model.