Agentic Architecture Recovery

Status: in development. Tracked as Milestone L on the public roadmap.

Squeaky Clean today generates Clean-Architecture projects from a ProblemSpec. Agentic Architecture Recovery is the inverse path: ingest an existing brownfield project and rebuild it from scratch through the same generation pipeline, with a human-reviewable Squib in the middle.

What it does

Point the framework at an existing project. The pipeline runs in six stages.

1. Ingest the codebase. A deterministic AST extractor walks every source file and produces a class catalog (FQNs, base classes, method signatures, fields, imports, decorators) plus a per-class import graph. No LLM in this stage; the front half is reproducible across runs.
2. Assign each class to a Clean-Architecture layer. The framework reuses its existing infer_category() verb-set classifier. Classes whose method verbs hit an infrastructure category route to INFRASTRUCTURE. Classes whose imports stay sibling-domain route to DOMAIN. Classes that orchestrate domain + infrastructure route to APPLICATION. Classes carrying framework decorators (@app.route, @RestController) route to INTERFACE.
3. Classify each class against the 34-pattern catalog. Deterministic fingerprints score each candidate pattern. An LLM is consulted only on two-or-more-way ties, and only with the candidate set plus the class skeleton as context. Bounded, cacheable, recordable.
4. Decompose the catalog into modules. Strongly-connected components of the Domain-layer subgraph become module candidates. Application, Infrastructure, and Interface classes attach to the modules whose dependencies they target. The result is serialized as a Squib validated against the same grammar the architect produces in the greenfield path.
5. Stop at a human review gate. You read the emitted Squib, edit it, and sign off before any regeneration runs. If your edits no longer parse, the pipeline reports the parse error with line context and waits for a corrected version.
6. Re-enter the standard generation pipeline. The framework synthesizes a thin ProblemSpec (with acceptance criteria auto-derived from the legacy tests/ directory) and injects your signed-off Squib through a new --squib-file CLI flag. The flag short-circuits the architect step so the architecture you reviewed is the architecture that gets regenerated.
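To make the ingest stage concrete, here is a minimal sketch of a deterministic extractor built on Python's standard ast module. It is an illustration under assumptions, not the framework's extractor: ClassRecord, extract_classes, and the record fields are hypothetical names chosen for this example, and the real catalog schema is not specified here.

```python
import ast
from dataclasses import dataclass, field


@dataclass
class ClassRecord:
    """Hypothetical catalog entry (the real schema is an assumption)."""
    fqn: str
    bases: list[str] = field(default_factory=list)
    methods: list[str] = field(default_factory=list)
    decorators: list[str] = field(default_factory=list)
    imports: list[str] = field(default_factory=list)


def extract_classes(source: str, module_name: str) -> list[ClassRecord]:
    """Parse one module and return a record per top-level class."""
    tree = ast.parse(source)

    # Collect every import in the module; sorting keeps output stable across runs.
    imports: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            imports.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            imports.add(node.module)

    records = []
    for node in tree.body:
        if not isinstance(node, ast.ClassDef):
            continue
        records.append(ClassRecord(
            fqn=f"{module_name}.{node.name}",
            bases=[ast.unparse(base) for base in node.bases],
            methods=[
                f"{item.name}({', '.join(arg.arg for arg in item.args.args)})"
                for item in node.body
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ],
            decorators=[ast.unparse(dec) for dec in node.decorator_list],
            imports=sorted(imports),
        ))
    return records


# The sample is only parsed, never executed, so undefined names are harmless.
SAMPLE = '''
import sqlite3
from domain.user import User

class UserRepository(BaseRepository):
    def save(self, user):
        pass

    def find_by_id(self, user_id):
        pass
'''

records = extract_classes(SAMPLE, "infra.user_repository")
```

Because the function depends only on the source text, two runs over the same file yield equal records, which is the property the ingest stage relies on.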

The round-trip is the unit of value: messy legacy code in, freshly generated Clean-Architecture project out, with a human checkpoint in the middle.
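The module decomposition in stage 4 rests on strongly-connected components. The sketch below uses Tarjan's algorithm over a small, made-up Domain-layer import graph; the function name, the graph shape (class name mapped to the set of classes it imports), and the sample classes are assumptions for illustration, and the framework may compute components differently.

```python
def strongly_connected_components(graph: dict[str, set[str]]) -> list[list[str]]:
    """Tarjan's algorithm. Returns components as sorted lists, in sorted order."""
    index_of: dict[str, int] = {}
    lowlink: dict[str, int] = {}
    stack: list[str] = []
    on_stack: set[str] = set()
    components: list[list[str]] = []
    counter = 0

    def visit(node: str) -> None:
        nonlocal counter
        index_of[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)

        # Sorted iteration keeps the traversal order stable across runs.
        for dep in sorted(graph.get(node, ())):
            if dep not in index_of:
                visit(dep)
                lowlink[node] = min(lowlink[node], lowlink[dep])
            elif dep in on_stack:
                lowlink[node] = min(lowlink[node], index_of[dep])

        # A node whose lowlink equals its own index is the root of a component.
        if lowlink[node] == index_of[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            components.append(sorted(component))

    for node in sorted(graph):
        if node not in index_of:
            visit(node)
    return sorted(components)


# Hypothetical Domain-layer graph: Order and OrderLine import each other.
domain_graph = {
    "Order": {"OrderLine", "Customer"},
    "OrderLine": {"Order"},
    "Customer": set(),
    "Invoice": {"Order"},
}

module_candidates = strongly_connected_components(domain_graph)
# module_candidates == [["Customer"], ["Invoice"], ["Order", "OrderLine"]]
```

Mutually dependent classes such as Order and OrderLine land in one candidate, while classes with only one-way dependencies stay separate. The recursive form is fine for a sketch; a very deep dependency chain would call for an iterative variant.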

What "lossless" means here

Stages 1, 2, and 4 contain no LLM call and are reproducible by construction. Stage 3's pattern tie-break LLM call is routed through a content-addressed disk cache, so repeated runs over the same project return cached responses. On a cold cache, the Anthropic API may produce different responses across runs even at temperature=0; the cache provides replay stability, not first-call determinism.
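The difference between replay stability and first-call determinism is easiest to see in code. The following is a minimal sketch of a content-addressed disk cache, assuming the key is a SHA-256 hash over the model name and prompt; cache_key, cached_call, the JSON file layout, and the fake_llm stand-in are all hypothetical, and the framework's actual key material and storage format are not described here.

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable


def cache_key(prompt: str, model: str) -> str:
    """Derive the cache address from the request content alone."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def cached_call(
    prompt: str,
    model: str,
    cache_dir: Path,
    call_llm: Callable[[str, str], str],
) -> str:
    """Return the stored response if present; otherwise call the LLM and store it."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    entry = cache_dir / f"{cache_key(prompt, model)}.json"
    if entry.exists():
        return json.loads(entry.read_text(encoding="utf-8"))["response"]
    response = call_llm(prompt, model)
    entry.write_text(json.dumps({"response": response}), encoding="utf-8")
    return response


# Stand-in for a real API client: it deliberately answers differently each time,
# mimicking a backend that is not deterministic on a cold cache.
calls: list[str] = []


def fake_llm(prompt: str, model: str) -> str:
    calls.append(prompt)
    return f"answer-{len(calls)}"


with tempfile.TemporaryDirectory() as tmp:
    first = cached_call("classify Foo", "some-model", Path(tmp), fake_llm)
    second = cached_call("classify Foo", "some-model", Path(tmp), fake_llm)
```

Here the second call never reaches fake_llm, so it replays the first response exactly. Nothing in the cache constrains what the first call returns, which is the limit the paragraph above describes.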

What "agentic" means here

Agents enter the pipeline in two places.

Pattern-classification tie-break (stage 3). When deterministic fingerprints score two or more candidate patterns equally, an LLM picks from the candidate set with the class skeleton as context. If the LLM names a pattern outside the candidate set, its output is rejected and the class falls back to SimpleClass. Repeated tie-break calls return cached responses; first-call output depends on the LLM API's runtime behavior at temperature=0.
Regeneration (stage 6). Once your Squib is signed off, the standard greenfield generation pipeline takes over: Architect, Manager, atomic agents. The architect step is short-circuited because you've already given it the architecture; the rest of the pipeline runs end-to-end as it would for any greenfield problem.
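The guard on the stage 3 tie-break can be sketched in a few lines. This is an illustration only: resolve_tie and the ask_llm callable are hypothetical names, and the handling of an empty candidate list is an assumption, since the text above only specifies the behavior for an out-of-set answer.

```python
from typing import Callable, Sequence

FALLBACK_PATTERN = "SimpleClass"


def resolve_tie(
    candidates: Sequence[str],
    class_skeleton: str,
    ask_llm: Callable[[Sequence[str], str], str],
) -> str:
    """Pick one pattern from tied candidates, never trusting an out-of-set answer."""
    if not candidates:
        # Assumption: with nothing to choose from, use the fallback pattern.
        return FALLBACK_PATTERN
    if len(candidates) == 1:
        # No tie, so no LLM call is needed.
        return candidates[0]
    answer = ask_llm(candidates, class_skeleton).strip()
    return answer if answer in candidates else FALLBACK_PATTERN


skeleton = "class PaymentClient:\n    def charge(self, amount): ..."
tied = ["Repository", "Gateway"]

accepted = resolve_tie(tied, skeleton, lambda cands, skel: "Gateway")
rejected = resolve_tie(tied, skeleton, lambda cands, skel: "Factory")
# accepted == "Gateway"; rejected == "SimpleClass"
```

Validating the answer against the candidate set keeps the LLM's influence bounded: it can choose among options the deterministic fingerprints already produced, but it cannot introduce a new one.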

Why "architecture recovery"

"Architecture recovery" is the established academic term for inferring software architecture from existing code. Squeaky Clean's variant adds two things to the academic baseline: it emits a machine-checkable Squib rather than a human-only diagram, and it closes the loop by regenerating a buildable project from the recovered architecture.

What ships day-1

Python projects in their entirety.
7 of the 34 patterns classified directly: SimpleClass, Entity, ValueObject, Repository, Gateway, UseCase, Strategy. The remaining 27 fall back to SimpleClass with a warning.
Auto-derived Gherkin from the legacy tests/ directory. Pytest names like test_user_can_login are normalized into Given/When/Then scenarios (here, "user can login").
Cache-stable replay of stages 1–4 on a populated response cache.
Human review gate with re-parse on edit and clear violation reporting.

What's deferred

Tree-sitter extractors for Java, Go, Rust, JavaScript, and TypeScript. Once the Python round-trip is solid, the same extractor abstraction lifts to the other five languages.
Side-effect inventory. Undocumented behaviors with no test coverage (a log line, a metric emission, a startup-time third-party call) may not survive the round-trip. Preservation strategy is an open question on the roadmap.
Test-discovery vs ground-truth reconciliation. When auto-derived Gherkin disagrees with what the legacy code actually does, the review gate will need to surface the diff explicitly.