Every SSM Session Manager session in your AWS org lands as ssm-user.
That’s a shared local account, not a person. That’s a liability.
When your compliance auditor asks “who ran that command on that server at 2am?” the answer is “someone with the SRE permission set.” That’s not an answer. That’s a finding. A finding that delays your SOC 2. A finding that costs remediation cycles. A finding that, in a real incident, means your mean time to answer “who did this?” is measured in hours of log correlation instead of one CloudTrail query.
AWS has a fix for this. It’s called runAsEnabled. When you enable it, SSM reads an SSMSessionRunAs tag off the caller’s STS token and starts the session as that Linux user. Alice gets a shell as alice. Bob gets a shell as bob. Full accountability, zero SSH keys, no bastion.
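Under the hood, runAsEnabled is a flag in the account’s Session Manager preferences, which are stored as an SSM document (named SSM-SessionManagerRunShell by default). Trimmed to the relevant inputs — your real preferences will also carry logging settings — the document looks roughly like this:

```json
{
  "schemaVersion": "1.0",
  "description": "Session Manager preferences: start sessions as the tagged OS user",
  "sessionType": "Standard_Stream",
  "inputs": {
    "runAsEnabled": true,
    "runAsDefaultUser": ""
  }
}
```

With runAsEnabled on and no default user, sessions fail closed: if the caller’s SSMSessionRunAs tag doesn’t name an OS user that exists on the instance, the session refuses to start.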
The problem? AWS only documented how to wire this up with Okta.
I run Microsoft Entra ID. Like most enterprises.
Weeks of searching. Nothing.
The requirement was clear. The destination was clear. The path was not.
I opened an AWS Support case. They tried. Genuinely. But this integration touches IAM Identity Center, Entra ID, SCIM provisioning, SAML federation, SSM Session Manager, CloudFormation StackSets, and AWS Organizations — simultaneously. No single team owns all of that. The Entra-specific behavior lives on Microsoft’s side of the fence.
I started wondering: are we crazy? Are we the only ones who want individual Linux accountability tied to the corporate directory?
We were not crazy. We were just early.
One Day with Agentic AI. Done.
I didn’t open a chat window and copy-paste responses into a text editor. That’s not what this was.
Claude Code ran directly in my terminal. It read my CloudFormation templates, wrote new ones to disk, ran scripts, read the error output, diagnosed the failure, fixed the template, and redeployed — without me transcribing anything between steps. The AI worked where the code lives.
And it didn’t work alone. It worked as a team.
But here’s the thing people skip: before a single line of code was written, before any planning happened, I spent two hours doing nothing but deep research.
Two hours of parallel investigative subagents scouring the internet. Not just the AWS docs. Not just the Microsoft docs. Community forums, GitHub issues, Stack Overflow threads, re:Post questions, blog posts, obscure comment sections where someone half-answered a related question three years ago and never followed up. Everything. Cross-referenced. Synthesized into a complete picture of a problem that no single source covered end to end.
Shout out to Context7 — an MCP server that pulls current, version-accurate library and API documentation directly into the agent’s context. No hallucinated method signatures. No docs from two major versions ago. The research was grounded in what the APIs actually do today.
That research phase is what made everything else fast. You cannot build well on ground you haven’t surveyed.
After that, the methodology was closer to formal software engineering than to “AI wrote some code”:
A specification subagent produced formal property definitions before a line of code was written — SAFETY properties (things that must never happen), LIVENESS properties (things that must eventually happen), and INVARIANTS (things that must always hold). Each property had a concrete, runnable observable. A test plan was derived from those properties. A dependency graph mapped every task, its prerequisites, and its acceptance criteria.
Then the plan itself was peer-reviewed — a validation pipeline challenged the task list. Testable acceptance criteria? Missing dependencies? Gaps in property coverage? The plan had to pass before anyone wrote a line of code.
Implementation subagents wrote code to pass the tests. Review subagents tore it apart across five dimensions. Nothing shipped without passing the spec.
This is test-driven development. It just happened to run across parallel agents instead of a single developer’s terminal.
One day. Research, planning, implementation, review, validation, documentation. All of it.
The difference isn’t speed. It’s width. And it’s discipline — research before building, always. A single engineer working sequentially is bottlenecked by attention. An agentic system running parallel subagents removes that bottleneck entirely.
The Solution: Four Layers, Two Options
The architecture is clean once you see it.
Entra ID → SCIM → IAM Identity Center → ABAC Session Tag → STS → SSM → EC2 (as the right Linux user)
Every user has an attribute in Entra. That attribute flows through SCIM into IAM Identity Center. ABAC turns it into an SSMSessionRunAs session tag on every login. SSM reads the tag and starts the session as the right Linux user.
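The ABAC piece is IAM Identity Center’s “Attributes for access control” feature, which turns mapped attributes into session tags on every federated login. A sketch of the mapping (the shape passed to `aws sso-admin create-instance-access-control-attribute-configuration`), assuming the Linux username arrives on the SCIM userName path — your directory may carry it on a different attribute:

```json
{
  "AccessControlAttributes": [
    {
      "Key": "SSMSessionRunAs",
      "Value": { "Source": ["${path:userName}"] }
    }
  ]
}
```

Whatever attribute you pick, the Key must be exactly SSMSessionRunAs — that is the tag name SSM looks for.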
A CloudFormation StackSet with autoDeployment pushes the SSM preferences document to every member account — current and future. An SCP prevents any member account admin from disabling it. The whole org is covered automatically.
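The SCP side is small. A sketch, assuming the preferences live in the default SSM-SessionManagerRunShell document; in practice you would add a Condition exempting the StackSet execution role so autoDeployment can still update the document it owns:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenySessionPrefsTampering",
      "Effect": "Deny",
      "Action": [
        "ssm:UpdateDocument",
        "ssm:UpdateDocumentDefaultVersion",
        "ssm:DeleteDocument"
      ],
      "Resource": "arn:aws:ssm:*:*:document/SSM-SessionManagerRunShell"
    }
  ]
}
```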
Here’s what it looks like in practice. Alice needs to debug something in production. She logs into the IAM Identity Center portal, picks her account and permission set, starts an SSM session. No SSH key. No bastion. No ticket. She gets a shell — as alice. CloudTrail shows alice@contoso.com. The OS shows alice. Same answer, both places, automatically, from the corporate directory.
Two deployment models:
Per-Role — Users land on shared role accounts (admin, developer, oracle, support). No Active Directory required. Zero additional cost. CloudTrail still shows the real person. Works everywhere.
Per-User — Users land as their personal AD username. Full individual accountability at the OS level. Entitlements managed entirely through AD group memberships. For organizations without existing AD in AWS, the repo includes a self-contained Windows Server 2022 Domain Controller deployed entirely via CloudFormation — no pre-existing infrastructure needed to prove the pattern.
The Stuff Nobody Documented
This is the section that cost weeks. All of it Entra-specific. None of it written down anywhere.
Entra’s federation metadata has 12 signing certificates. IAM Identity Center silently fails with “Retry Failed Steps” when you upload it. Strip it to one cert and two SSO endpoints. Nothing else.
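The cleanup itself is mechanical. A small Python sketch, assuming the standard SAML 2.0 metadata namespace — the real Entra file also carries signature blocks and WS-Fed RoleDescriptor elements you would drop the same way:

```python
import xml.etree.ElementTree as ET

MD = "urn:oasis:names:tc:SAML:2.0:metadata"
ET.register_namespace("", MD)  # serialize without a namespace prefix

def strip_extra_signing_certs(metadata_xml: str) -> str:
    """Keep only the first KeyDescriptor in the IDPSSODescriptor; drop the rest."""
    root = ET.fromstring(metadata_xml)
    idp = root.find(f"{{{MD}}}IDPSSODescriptor")
    key_descriptors = [kd for kd in idp.findall(f"{{{MD}}}KeyDescriptor")
                       if kd.get("use") != "encryption"]
    for extra in key_descriptors[1:]:
        idp.remove(extra)
    return ET.tostring(root, encoding="unicode")
```

Run the trimmed file through an XML validator before uploading — IAM Identity Center’s error message won’t tell you what it disliked.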
The SCIM template ID is aWSSingleSignon, not aws. The Microsoft Graph API documentation shows aws — but that example is for the legacy AWS Single-Account Access gallery app, which doesn’t provision users at all. The AWS IAM Identity Center app uses a different template entirely. Using aws against an IAM Identity Center service principal returns BadRequest. Query GET /servicePrincipals/{id}/synchronization/templates on your own service principal to confirm the value.
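The shape of the Graph calls, sketched with placeholder IDs — `{sp-id}` is your IAM Identity Center enterprise app’s service principal object ID:

```http
# Confirm the template exposed by *your* service principal
GET https://graph.microsoft.com/v1.0/servicePrincipals/{sp-id}/synchronization/templates

# Create the provisioning job with the template that actually exists
POST https://graph.microsoft.com/v1.0/servicePrincipals/{sp-id}/synchronization/jobs
Content-Type: application/json

{ "templateId": "aWSSingleSignon" }
```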
The ACS URL uses signin.aws, not signin.aws.amazon.com. The gallery app wildcard covers the wrong domain. You have to explicitly add the specific ACS URL or federation silently fails.
Changing the IAM Identity Center identity source deletes your IAM SAML provider. Your permission-set role trust policies still reference the old ARN, so federation returns 403. Fix: delete and recreate the account assignment to force Identity Center to regenerate the provider.
Fn::Sub eats PowerShell variables. CloudFormation substitutes ${anything} it finds in a string. PowerShell uses the same syntax. Your UserData script gets silently mangled. Capture CFN values into plain variables at the top and use unbraced $var everywhere else.
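A sketch of both escapes, assuming a DomainName template parameter — `${DomainName}` is substituted by CloudFormation before the instance ever sees the script, while `${!Var}` is Fn::Sub’s escape for emitting a literal `${Var}`:

```yaml
UserData:
  Fn::Base64: !Sub |
    <powershell>
    # CloudFormation fills this in; capture it once, up top:
    $domainName = "${DomainName}"
    # Fn::Sub escape: the instance receives the literal ${env:COMPUTERNAME}
    Write-Host "Host ${!env:COMPUTERNAME} joining $domainName"
    # Safest habit: below the capture block, use only unbraced $vars
    </powershell>
```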
Windows EC2 UserData is two boots, not one. Install-ADDSForest reboots the instance during forest promotion. Everything after that call in your UserData is dead code. Use <persist>true</persist> and a registry phase flag. Resume on the second boot.
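The shape of the fix, as a sketch — the registry path and domain name are illustrative, and `$dsrmPassword` stands in for a SecureString you would retrieve yourself (e.g. from Secrets Manager):

```yaml
UserData:
  Fn::Base64: |
    <powershell>
    $phaseKey = "HKLM:\SOFTWARE\MyOrg\DCSetup"   # illustrative phase marker
    if (-not (Test-Path $phaseKey)) {
      New-Item -Path $phaseKey -Force | Out-Null
      # Boot 1: forest promotion reboots the instance.
      # Nothing after this call executes on this boot.
      Install-ADDSForest -DomainName "corp.example.internal" `
        -SafeModeAdministratorPassword $dsrmPassword -InstallDns -Force
    }
    else {
      # Boot 2: <persist>true</persist> re-runs this script after the reboot,
      # and the registry flag routes execution here.
      # ... post-promotion configuration goes here ...
    }
    </powershell>
    <persist>true</persist>
```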
There are nine more where those came from. They’re all in the full white paper.
The Cognitive Load Thing Nobody Talks About
When you’re writing the CloudFormation yourself, your brain is split: syntax, resource dependencies, IAM logic, and the actual business requirement — all at once. You’re an implementer, a reviewer, and an architect simultaneously, switching contexts constantly, losing altitude on every switch.
When the AI handles implementation, that overhead disappears.
You’re operating purely at the architectural level. Thinking about system boundaries, failure modes, security posture, operational implications. The person making architectural decisions is no longer exhausted by implementation details.
Agentic AI doesn’t just make you faster. It promotes you.
When the Answer Changes Halfway Through
This is the part that surprised me most.
Three times during that day, the right approach changed. A tiered CloudFormation stack replaced a monolithic one. A domain alignment issue forced a different SSSD configuration. Entra Connect Cloud Sync replaced a manual attribute approach.
In traditional development, each of those is a rework cycle. Hours of context-rebuilding, updated documentation that nobody writes, review that gets skipped because you’re already behind.
With the agentic pipeline, each change was a feedback cycle. Re-plan from the existing base, re-implement against the updated spec, re-review automatically. Minutes, not days. And the documentation stayed current because it was generated, not written.
For a problem this exploratory — where the right answer genuinely was not known at the start — the ability to iterate quickly without accumulating technical debt wasn’t just useful. It was what made a day-long solution possible at all.
Conductor and Orchestra
The right mental model is not “AI did it.” It’s Conductor and Orchestra.
I framed the problem. I scoped the research. I made every architectural call — which attribute to repurpose, how wide to set the SCP, whether to build one deployment model or two. The AI did not bring the weeks of domain knowledge I had going in. That context is what made the session productive rather than chaotic.
The AI brought speed, breadth, parallelism, and enforced rigor at every step. I brought judgment, context, and accountability.
In this engineer’s opinion, agentic AI should not make engineers redundant. It should promote them — freeing them from implementation overhead to operate at the architectural level where engineering judgment actually creates value. Whether organizations choose to realize that potential, or choose instead to chase headcount reduction, is a decision that will define the quality of the systems they build for years to come.
The Repo
CloudFormation templates, deploy scripts, PowerShell, and verification tooling — all of it is in the repo. Two deployment models, fully parameterized, no hardcoded values.
The full white paper covers every gotcha in detail, the complete architecture, operational lifecycle, and security properties.
Even this blog post was written in the Conductor and Orchestra model. The engineer set the direction. The agentic system drafted and refined. You’re reading the result.