Boback
Python
Clean Architecture
AST Processing

ContextStrategies

CLI tool for flattening code repositories to markdown, AI-assisted code focusing, patching, and static analysis.

~5,000 LOC
Active
High

Executive Summary

Challenge: Prepare massive codebases for LLM consumption. Raw repos overwhelm context windows with noise (node_modules, build artifacts, redundant code).

Approach: Built a comprehensive analysis toolkit following Clean Architecture. The Domain layer handles scanning and flattening with gitignore intelligence, the Application layer orchestrates processing pipelines, and the Infrastructure layer abstracts AI clients and storage. A content cleaning pipeline strips comments, normalizes whitespace, and deduplicates code blocks. AI-powered scope selection uses pydantic_ai to focus on the files relevant to a specific query. An AST-based patching workflow extracts target scopes, generates modifications via LLM, and applies the changes safely. Multi-faceted analysis covers architecture validation (layer violations), code smell detection, cyclomatic complexity metrics, duplicate code identification, security pattern scanning, and dependency mapping. A golden testing framework captures AI responses for regression validation.

Interfaces: Unified Typer CLI with server fallback, Textual TUI with split panes, Streamlit dashboard for history visualization, and FastAPI HTTP server for remote access, backed by DuckDB for analytics.

Innovation: BaseAnalyzer with caching prevents redundant processing; a unified interface abstracts deployment modes (local/server/TUI).
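The content cleaning pipeline described above (strip comments, normalize whitespace, deduplicate code blocks) could look roughly like the sketch below. It illustrates the idea and is not the project's actual code: every function name is hypothetical, comment stripping is simplified to whole-line `#` comments (so a shebang line would be dropped too), and a "block" is assumed to be any run of lines separated by a blank line.

```python
import hashlib
import re
from typing import Callable, Iterable

# A cleaner is any function that takes text and returns cleaned text.
Cleaner = Callable[[str], str]


def strip_line_comments(text: str) -> str:
    """Drop lines that hold only a `#` comment; inline comments are left alone."""
    lines = text.split("\n")
    return "\n".join(line for line in lines if not line.lstrip().startswith("#"))


def normalize_whitespace(text: str) -> str:
    """Trim trailing spaces and collapse runs of blank lines into one."""
    lines = [line.rstrip() for line in text.split("\n")]
    collapsed = re.sub(r"\n{3,}", "\n\n", "\n".join(lines))
    return collapsed.strip("\n") + "\n"


def deduplicate_blocks(text: str) -> str:
    """Keep only the first copy of each blank-line-separated block."""
    seen: set[str] = set()
    kept: list[str] = []
    for block in text.strip("\n").split("\n\n"):
        digest = hashlib.sha256(block.strip().encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            kept.append(block)
    return "\n\n".join(kept) + "\n"


def run_pipeline(text: str, cleaners: Iterable[Cleaner]) -> str:
    """Apply each cleaner in order, feeding the output of one into the next."""
    for cleaner in cleaners:
        text = cleaner(text)
    return text


# Assumed default ordering: comments first, then whitespace, then duplicates.
DEFAULT_CLEANERS = (strip_line_comments, normalize_whitespace, deduplicate_blocks)
```

Because each step is a plain `str -> str` function, steps can be reordered or swapped per file type without touching the runner.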

The Challenge

Automating codebase flattening, analysis, and AI-assisted refinement for large repositories.

The Solution

The solution combines content cleaners, BaseAnalyzer caching, and golden tests for AI responses. The Domain layer is independent of infrastructure, and a unified CLI with server fallback ties the pieces together.
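To show how BaseAnalyzer caching can prevent redundant processing, here is a minimal sketch. Only the `BaseAnalyzer` name comes from the project description. The cache key (path plus a SHA-256 of the content), the `_analyze` hook, and the `LineCountAnalyzer` subclass are assumptions made for this example.

```python
import hashlib
from abc import ABC, abstractmethod
from typing import Any


class BaseAnalyzer(ABC):
    """Base class that memoizes results per (path, content hash)."""

    def __init__(self) -> None:
        self._cache: dict[str, Any] = {}
        self.cache_hits = 0

    def analyze(self, path: str, content: str) -> Any:
        # Hashing the content means an edited file misses the cache automatically.
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        key = f"{path}:{digest}"
        if key in self._cache:
            self.cache_hits += 1
            return self._cache[key]
        result = self._analyze(path, content)
        self._cache[key] = result
        return result

    @abstractmethod
    def _analyze(self, path: str, content: str) -> Any:
        """Subclasses implement the actual (possibly expensive) analysis."""


class LineCountAnalyzer(BaseAnalyzer):
    """Hypothetical analyzer used only to demonstrate the caching behavior."""

    def __init__(self) -> None:
        super().__init__()
        self.runs = 0

    def _analyze(self, path: str, content: str) -> dict[str, Any]:
        self.runs += 1
        return {"path": path, "lines": len(content.splitlines())}
```

Calling `analyze` twice with identical content runs `_analyze` once; changing the content produces a new key and a fresh run.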

System Architecture

Key Features

1. Repository Flattening
2. AI Scope Selection & Focusing
3. Code Patching
4. Static Code Analysis
5. Duplicate Code Detection
6. Dependency Mapping
7. History Dashboard
8. TUI Interface
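As an illustration of the Static Code Analysis feature, cyclomatic complexity can be approximated with Python's standard `ast` module. This is a simplified sketch and not the tool's actual metric: the function names are hypothetical, the set of counted node types is an assumption, and branches inside nested functions are also counted toward the enclosing function.

```python
import ast

# Node types treated as one decision point each in this simplified sketch.
_BRANCH_NODES = (
    ast.If,
    ast.For,
    ast.AsyncFor,
    ast.While,
    ast.ExceptHandler,
    ast.IfExp,
    ast.comprehension,
)


def cyclomatic_complexity(func: ast.AST) -> int:
    """Return 1 plus the number of decision points found under `func`."""
    score = 1
    for node in ast.walk(func):
        if isinstance(node, _BRANCH_NODES):
            score += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths: one per additional operand.
            score += len(node.values) - 1
    return score


def complexity_by_function(source: str) -> dict[str, int]:
    """Map each function name in `source` to its approximate complexity."""
    tree = ast.parse(source)
    return {
        node.name: cyclomatic_complexity(node)
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
```

A function with no branches scores 1; each `if`, loop, exception handler, or extra boolean operand raises the score by one.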

Technical Skills Matrix

Python
Clean Architecture
AST Processing
AST Manipulation
Typer CLI
FastAPI
RESTful APIs
HTTP Servers
Textual TUI
Terminal UI Development
Split Panes
Interactive Terminals
Pydantic AI
AI/LLM Integration
OpenAI
OpenRouter
Streamlit
Data Visualization
Dashboard Development
DuckDB
SQLite
Data Storage
SQL Querying
Static Code Analysis
Architecture Validation
Layer Violation Detection
Code Smell Detection
Cyclomatic Complexity
Complexity Metrics
Duplicate Code Detection
Security Scanning
Pattern Recognition
Golden Testing
Integration Testing
pytest
Test Automation
Content Sanitization
Path Traversal Protection
Security Best Practices
Gitignore Parsing
Ruff Integration
Code Formatting
BaseAnalyzer Pattern
Caching Strategies
Performance Optimization
Markdown Rendering
Token Estimation
Content Processing
Custom Exceptions
Error Handling
Logging
Configuration Management
CLI Argument Merging
Unified Interfaces