What a Verified Research Process Actually Looks Like

Windows servers were burning CPU in production. The cause was identified. The fix — adding folder exclusions to Microsoft Defender — was well-understood in principle. But the specific configuration details were scattered across four vendors’ documentation. Getting them right mattered: a wrong path would leave the problem unsolved, and an overly broad exclusion could create a security gap.

The question wasn’t whether to fix it. The question was: how do you know what you find is actually correct?

The Deliverable

What came back from the research phase:

36 findings across the relevant vendor documentation
23 cited sources, each linked and traceable
Specific folder paths, organised by agent
Security tradeoffs explicitly named — what each exclusion permits and what risks it introduces
Known attacker techniques cross-referenced against the proposed configuration
Open questions flagged for follow-up, not buried

That’s not a summary. That’s a structured output you can act on, hand off, or file for audit.

The Quality Gate

Here’s the part that matters for anyone signing off on work like this: the first draft had errors.

Not vague errors. Specific, consequential errors. One vendor’s log path was wrong by a single folder name — the exclusion would have been configured, tested as “applied,” and would have done nothing. A security detail about a specific malware variant was attributed to the wrong technique.

A second agent — a Validator — ran through every source independently. Its only job was to check. Not summarise, not add findings, not recommend. Check. It found both errors before anything reached production.

Two agents, separated by role. The Investigator finds. The Validator checks. Neither does both. This is the same separation-of-concerns principle as a developer who doesn't review their own code.

What Made It Repeatable

A one-off investigation done carefully is useful. A system that runs every future investigation at the same quality bar is an asset.

The difference is documentation. Before the second investigation ran, the rules were written down:

Every investigation starts with a scope gate — one clear question, stated in a single sentence, with explicit out-of-scope boundaries
Validation is not optional, ever
The answer goes first in every output — not after pages of context
One source of truth per investigation; everything else is generated from it

This isn’t process for its own sake. Each rule exists because its absence produced a real problem.

What It Produced in One Afternoon

At the end of the build:

A scope gate that forces clear questions before any research starts
An Investigator that researches from live sources, not memory
A Validator that fact-checks independently before anything ships
A verification script that catches structural errors automatically
Answer-first output — the decision is at the top, the evidence follows
Open questions converted to concrete follow-up tasks
Everything stored, findable, and reproducible

The system has no memory of being built in an afternoon. Every investigation it runs gets the same structure, the same validation, the same output format — regardless of who’s asking the question or what the topic is.

Research that arrives pre-validated, sourced, and structured is research you can act on. Research that arrives as a summary from someone's memory is research you have to verify yourself — which means you're doing the work twice, or you're not doing it at all.

The Lesson That Applies Everywhere

The Validator caught errors that a human reviewer probably wouldn’t have caught — not because humans are careless, but because the Validator went back to primary sources for every claim, which is not how humans review under time pressure.

The scope gate prevents the most common research failure mode: wide-and-shallow investigations that cover everything and answer nothing.

The answer-first structure means the person reading it doesn’t have to excavate the finding from the context.

These are not AI-specific lessons. They’re good research practice. The system just enforces them mechanically, every time, without being asked.