Case Study
Clear11y
Containerized accessibility scanner for static site builds that catches what browser extensions cannot.
I built Clear11y because browser extensions cannot scan static site builds. Chrome blocks extensions from file:// URLs, which means you cannot check accessibility on your dist/ folder before deploying. Clear11y takes a ZIP of your build output, spins up an ephemeral HTTP server, runs axe-core and a custom keyboard navigation engine inside Docker, and produces a scored report.
Why this matters
Proof first, implementation second.
I want the reader to see the product, understand the operator workflow, and then dig into the architecture and tradeoffs behind it.
Overview
What the product does and why I built it that way.
Browser extensions like axe DevTools run inside the browser's extension context. Chrome and Firefox block extensions from accessing file:// URLs as a security measure. The restriction is correct. It also means no extension can scan a static site sitting in your dist/ folder.
I ran into this building a static site. The build output sat on disk. The tools wanted a URL. Clear11y fills that gap by treating the ZIP artifact as the input, not a live URL. It extracts the archive safely, starts a throwaway HTTP server on a kernel-assigned port, runs axe-core and a custom keyboard testing engine via Playwright, and produces a scored report with screenshots.
Highlights
Quick read
Scans build artifacts from ZIP archives, bypassing the file:// restriction that blocks every browser-based accessibility extension.
Two testing engines: axe-core for static violations and a custom KeyboardAccessibilityService that detects focus issues axe-core cannot see.
IS_FOCUS_VISIBLE_SCRIPT checks eight computed style properties across five categories to detect missing focus indicators.
Ephemeral HTTP server binds to port 0 for kernel-assigned ports. No port conflicts in CI.
Zip Slip protection runs two separate checks, before and after os.path.normpath(), because normalization itself can introduce traversal sequences.
Non-root containers (pwuser, UID 1000) with Zip Bomb protection capped at 500 MB uncompressed.
Architecture
How the system is structured.
Clear11y has one Pipeline class that knows how to run a scan. It accepts a ZipService, HtmlDiscoveryService, HttpService, PlaywrightAxeService, and an optional KeyboardAccessibilityService. It knows nothing about how input arrived or where output goes. Three interfaces call it: a CLI, a FastAPI server, and a GitHub Action.
Pipeline
The core scan orchestrator. Accepts a file path, returns violation dicts. No knowledge of which interface called it.
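A minimal sketch of that shape, assuming each service is injected as a small object. The Protocol classes and the method names (extract, discover, start, stop, scan) are hypothetical and are not Clear11y's actual signatures.

```python
from pathlib import Path
from typing import Optional, Protocol


class Extractor(Protocol):
    def extract(self, zip_path: Path) -> Path: ...


class Discovery(Protocol):
    def discover(self, root: Path) -> list: ...


class Server(Protocol):
    def start(self, root: Path) -> str: ...
    def stop(self) -> None: ...


class Scanner(Protocol):
    def scan(self, url: str) -> list: ...


class Pipeline:
    """Runs one scan; knows nothing about the CLI, API, or Action."""

    def __init__(self, zips: Extractor, discovery: Discovery, http: Server,
                 axe: Scanner, keyboard: Optional[Scanner] = None):
        self.zips, self.discovery, self.http = zips, discovery, http
        self.axe, self.keyboard = axe, keyboard

    def run(self, zip_path: Path) -> list:
        root = self.zips.extract(zip_path)
        pages = self.discovery.discover(root)
        base_url = self.http.start(root)
        violations: list = []
        try:
            for page in pages:
                url = f"{base_url}/{page}"
                violations += self.axe.scan(url)
                if self.keyboard is not None:  # keyboard engine is optional
                    violations += self.keyboard.scan(url)
        finally:
            self.http.stop()  # always release the port, even on failure
        return violations
```

Because the services arrive through the constructor, each interface can build the same Pipeline and differ only in how it collects input and presents output.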
ZipService
Extracts archives with Zip Slip protection, Zip Bomb detection (500 MB cap, 100x expansion ratio), and path traversal blocking at two normalization stages.
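The Zip Bomb limits could be enforced from the archive's own headers before extraction starts. This is a sketch under assumptions: the function name is hypothetical, 500 MB is read as 500 * 1024 * 1024 bytes, and the 100x ratio is applied to the archive as a whole rather than per file.

```python
import io
import zipfile

MAX_UNCOMPRESSED_BYTES = 500 * 1024 * 1024  # 500 MB cap (binary MB is an assumption)
MAX_EXPANSION_RATIO = 100                   # 100x expansion ratio


def check_zip_bomb(archive: zipfile.ZipFile) -> None:
    """Raise RuntimeError if declared sizes exceed the cap or the ratio."""
    infos = archive.infolist()
    total_uncompressed = sum(info.file_size for info in infos)
    total_compressed = sum(info.compress_size for info in infos)
    if total_uncompressed > MAX_UNCOMPRESSED_BYTES:
        raise RuntimeError("archive exceeds uncompressed size cap")
    # Skip the ratio check when nothing is compressed (avoids dividing by zero).
    if total_compressed and total_uncompressed / total_compressed > MAX_EXPANSION_RATIO:
        raise RuntimeError("archive exceeds expansion ratio")
```

Header sizes are declared by whoever built the archive, so a stricter extractor would also count bytes while streaming each member to disk.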
HttpService
Ephemeral ThreadingHTTPServer bound to port 0. Kernel assigns a free port. Runs in a daemon thread, cleaned up in a finally block with a 5-second join timeout.
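The standard library is enough to sketch this service. The class name EphemeralServer and its method names are hypothetical; the port 0 binding, daemon thread, and 5-second join follow the description above.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class EphemeralServer:
    def __init__(self, directory: str):
        handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
        # Port 0 asks the kernel for any free port.
        self._server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
        self._thread = threading.Thread(
            target=self._server.serve_forever, daemon=True
        )

    @property
    def base_url(self) -> str:
        host, port = self._server.server_address[:2]
        return f"http://{host}:{port}"

    def start(self) -> str:
        self._thread.start()
        return self.base_url

    def stop(self) -> None:
        self._server.shutdown()       # ends the serve_forever() loop
        self._server.server_close()   # releases the listening socket
        self._thread.join(timeout=5)  # 5-second join timeout


if __name__ == "__main__":
    server = EphemeralServer(".")
    url = server.start()
    try:
        print("serving at", url)
    finally:
        server.stop()
```

The assigned port is only known after the socket is bound, which is why the URL is read back from server_address instead of being configured up front.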
PlaywrightAxeService
Injects axe-core via axe_playwright_python, blocks heavy resources (analytics, ad networks), captures screenshots with violation overlays.
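The resource blocking can be reduced to a URL predicate. The host list below is a hypothetical example, since the real block list is not given here, and should_block is an illustrative name.

```python
from urllib.parse import urlsplit

# Hypothetical block list of analytics and ad hosts.
BLOCKED_HOST_SUFFIXES = (
    "google-analytics.com",
    "googletagmanager.com",
    "doubleclick.net",
)


def should_block(url: str) -> bool:
    """True when the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(
        host == suffix or host.endswith("." + suffix)
        for suffix in BLOCKED_HOST_SUFFIXES
    )
```

With Playwright's sync API, a predicate like this could be attached through page.route("**/*", handler), where the handler calls route.abort() when should_block(route.request.url) is true and route.continue_() otherwise.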
KeyboardAccessibilityService
Custom engine that simulates Tab traversal (capped at 150 keypresses), checks focus visibility across 8 CSS properties, and renders an SVG tab-journey overlay.
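The capped traversal can be sketched independently of the browser. Here press_tab is a hypothetical callable that presses Tab and returns an identifier for the newly focused element (or None when focus leaves the page). The stop conditions other than the 150-press cap are assumptions.

```python
from typing import Callable, Hashable, Optional

MAX_TAB_PRESSES = 150  # traversal cap


def walk_tab_order(press_tab: Callable[[], Optional[Hashable]],
                   max_presses: int = MAX_TAB_PRESSES) -> list:
    """Press Tab until focus wraps around, leaves the page, or the cap is hit."""
    seen: list = []
    for _ in range(max_presses):
        focused = press_tab()
        if focused is None:   # focus left the document
            break
        if focused in seen:   # wrapped back to an element already visited
            break
        seen.append(focused)
    return seen
```

The cap matters on pages that generate focusable elements endlessly or trap focus in a way the wrap-around check does not catch. Without it, the loop would never end.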
Features
What it does and how each piece works.
axe-core scanning
Runs 90+ WCAG 2.1 rules against the live DOM via Playwright. Supports configurable rule sets, disabled rules, and conformance levels.
Keyboard navigation testing
Simulates full Tab traversal and detects missing focus indicators by comparing computed styles before and after element.focus({ focusVisible: true }).
Focus visibility detection
IS_FOCUS_VISIBLE_SCRIPT checks outline, box-shadow, border, background-color, and text-decoration changes. Snapshots properties as primitives to avoid the live CSSStyleDeclaration comparison bug.
Self-validation
KeyboardAccessibilityService runs _self_validate() before any real scan. Creates an inline test page with known-good and known-bad focus styles. Caught a real regression when a Playwright version changed focusVisible behavior.
ZIP artifact scanning
Accepts ZIP archives of static site builds. Extracts safely, discovers HTML pages, serves them locally, scans them. No live deployment required.
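The discovery step can be as small as a recursive glob. The function name is hypothetical; returning paths relative to the extraction root is an assumption that makes them easy to append to the local server's base URL.

```python
from pathlib import Path


def discover_html_pages(root: Path) -> list:
    """Return sorted POSIX-style paths of .html files, relative to root."""
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*.html")
        if path.is_file()
    )
```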
SSRF prevention
For URL scans, _validate_public_http_url() resolves hostnames via socket.getaddrinfo() and requires every resolved address, parsed with ipaddress.ip_address(), to report is_global as true.
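A sketch of that check using only the standard library. It stands in for the real _validate_public_http_url(), whose exact signature and error types are not shown here, so the ValueError choice and the return value are assumptions.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def validate_public_http_url(url: str) -> str:
    """Return url if every address it resolves to is globally routable."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("only http(s) URLs with a host are allowed")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    infos = socket.getaddrinfo(parts.hostname, port, proto=socket.IPPROTO_TCP)
    for *_, sockaddr in infos:
        ip = ipaddress.ip_address(sockaddr[0])
        if not ip.is_global:  # rejects loopback, private, link-local, and similar
            raise ValueError(
                f"{parts.hostname} resolves to non-public address {ip}"
            )
    return url
```

Requiring every resolved address to pass, rather than any one of them, closes the case where a hostname returns one public record and one internal record. A check like this still leaves a gap if the hostname resolves differently when the browser connects later.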
Three interfaces
CLI with Rich progress bars, FastAPI REST API with job queue pattern (POST returns 202), and GitHub Action that fails workflows on configurable thresholds.
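The threshold gate for the GitHub Action can be sketched as a pure function. The function names and the shape of the thresholds mapping (impact level to maximum allowed count) are assumptions for illustration.

```python
def failed_thresholds(counts: dict, thresholds: dict) -> list:
    """Return the impact levels whose count exceeds the configured maximum."""
    return [
        level
        for level, limit in thresholds.items()
        if counts.get(level, 0) > limit
    ]


def exit_code(counts: dict, thresholds: dict) -> int:
    """Non-zero exit status makes the workflow step fail."""
    return 1 if failed_thresholds(counts, thresholds) else 0
```

For example, thresholds of {"critical": 0, "serious": 5} would fail the workflow on any critical violation but tolerate up to five serious ones.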
Data flow
How data moves through the system.
Input arrives as a ZIP path or URL. The Pipeline extracts the archive, discovers HTML pages, starts the ephemeral server, runs both scanning engines, consolidates results, and produces a scored report.
ZIP extracted with Zip Slip and Zip Bomb protection
HtmlDiscoveryService finds all .html files in the extracted directory
HttpService starts on port 0, serves extracted files over localhost
PlaywrightAxeService navigates to each page, injects axe-core, captures violations and screenshots
KeyboardAccessibilityService runs Tab traversal, checks focus visibility, renders SVG overlay
Results consolidated into a scored report with violation counts by impact level
HttpService stopped in finally block, temporary files cleaned up
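The consolidation step in the list above can be sketched as a count per impact level. The four level names are axe-core's impact values. The function name and the "unknown" bucket are assumptions, and the scoring formula itself is not described here, so it is left out.

```python
from collections import Counter

IMPACT_LEVELS = ("critical", "serious", "moderate", "minor")  # axe-core impact values


def count_by_impact(violations: list) -> dict:
    """Count violation dicts per impact level, most severe first."""
    counts = Counter(v.get("impact") or "unknown" for v in violations)
    return {level: counts.get(level, 0) for level in (*IMPACT_LEVELS, "unknown")}
```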
Tradeoffs
Decisions that had real alternatives.
Ephemeral HTTP server instead of Playwright file:// handling
Browser inconsistency. Chromium's file:// restrictions change between versions. Firefox requires different flags. WebKit has its own behavior. The HTTP server works identically across all three browsers because it uses standard HTTP.
Custom keyboard testing instead of relying on axe-core
axe-core does not simulate keyboard input. It checks static DOM properties. Focus visibility, tab order, focus traps, and unreachable interactive elements are all invisible without pressing Tab and observing what happens.
Python instead of Node.js
axe_playwright_python gives direct access to axe-core results without a subprocess call. Python's http.server is in the standard library. Node would work, but Python reduces the dependency surface for this specific job.
SQLite for the API instead of PostgreSQL only
SQLite works out of the box with zero configuration for single-instance deployments. PostgreSQL is available for concurrent writers. The database layer is abstracted behind DatabaseJobStore, so switching backends requires no changes to the code that calls it.
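A sketch of what a SQLite-backed job store could look like, using the standard library's sqlite3 module. SqliteJobStore, its methods, and the two-column schema are hypothetical and are not the actual DatabaseJobStore interface.

```python
import sqlite3
import uuid
from typing import Optional


class SqliteJobStore:
    """Hypothetical stand-in for one DatabaseJobStore backend."""

    def __init__(self, path: str = ":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS jobs "
            "(id TEXT PRIMARY KEY, status TEXT NOT NULL)"
        )

    def create(self) -> str:
        job_id = uuid.uuid4().hex
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT INTO jobs (id, status) VALUES (?, ?)", (job_id, "queued")
            )
        return job_id

    def set_status(self, job_id: str, status: str) -> None:
        with self._db:
            self._db.execute(
                "UPDATE jobs SET status = ? WHERE id = ?", (status, job_id)
            )

    def get_status(self, job_id: str) -> Optional[str]:
        row = self._db.execute(
            "SELECT status FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        return row[0] if row else None
```

A PostgreSQL backend would expose the same three methods, which is what keeps the API handlers unaware of the storage engine.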
Challenges
Problems that required specific solutions.
Problem
getComputedStyle() returns a live CSSStyleDeclaration object. Holding a reference to it and reading it again after focus compares the object against its own updated state, so the before and after values would always be equal.
Solution
IS_FOCUS_VISIBLE_SCRIPT snapshots all eight property values as primitive strings into a before object before calling focus(), then reads them again into an after object. test_keyboard_focus.py has an explicit test verifying that "const before" appears ahead of "element.focus()" in the script source.
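A sketch of the pattern, with the script held as a Python string and an ordering test in the style described above. The JavaScript body and the eight property names are a hypothetical reconstruction, not the real IS_FOCUS_VISIBLE_SCRIPT.

```python
# Hypothetical reconstruction; the actual script and property list may differ.
IS_FOCUS_VISIBLE_SCRIPT = """
(element) => {
  const PROPS = ['outlineStyle', 'outlineWidth', 'outlineColor', 'boxShadow',
                 'borderTopColor', 'borderTopWidth', 'backgroundColor',
                 'textDecorationLine'];
  const snapshot = () => {
    const style = getComputedStyle(element);
    // Copy each value out as a string; the style object itself stays live.
    return Object.fromEntries(PROPS.map((p) => [p, style[p]]));
  };
  const before = snapshot();
  element.focus({ focusVisible: true });
  const after = snapshot();
  return PROPS.some((p) => before[p] !== after[p]);
}
"""


def test_snapshot_precedes_focus() -> None:
    """Guard against reordering that would snapshot after focus is applied."""
    src = IS_FOCUS_VISIBLE_SCRIPT
    assert src.index("const before") < src.index("element.focus(")
```

The ordering test is a plain string check, so it runs without a browser and fails fast if someone moves the focus call above the snapshot.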
Problem
Playwright version updates occasionally change how focusVisible behaves on headless Chromium builds.
Solution
_self_validate() creates an inline test page with three buttons (proper outline, outline:none, box-shadow) and checks that the script correctly distinguishes them before scanning anything real. This caught a real regression.
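The self-check can be sketched as a fixture plus a table of expected answers. The HTML, the selectors, and the expectation that the box-shadow button counts as visible are assumptions. Here is_focus_visible is a hypothetical callable that loads the fixture and runs the heuristic on one selector.

```python
from typing import Callable

# Hypothetical fixture: three buttons with known focus styles.
SELF_TEST_HTML = """
<style>
  #good:focus   { outline: 2px solid blue; }
  #bad:focus    { outline: none; }
  #shadow:focus { outline: none; box-shadow: 0 0 0 3px orange; }
</style>
<button id="good">proper outline</button>
<button id="bad">outline:none</button>
<button id="shadow">box-shadow</button>
"""

# Expected verdicts (assumed): only the outline:none button lacks an indicator.
EXPECTED = {"#good": True, "#bad": False, "#shadow": True}


def self_validate(is_focus_visible: Callable[[str], bool]) -> None:
    """Raise RuntimeError if the detector misjudges any known case."""
    wrong = {
        selector: expected
        for selector, expected in EXPECTED.items()
        if is_focus_visible(selector) != expected
    }
    if wrong:
        raise RuntimeError(f"focus heuristic failed self-validation: {wrong}")
```

Failing loudly before the scan starts is what keeps a silent browser behavior change from producing a report full of false positives or false negatives.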
Problem
ZIP files can be malicious: bombs, path traversal, too many files, NUL-byte filenames.
Solution
_sanitize_archive_member() rejects absolute paths and .. in any component. After normalization, _is_safe_path() resolves the extraction root and the target path to absolute form and checks os.path.commonpath(). Total uncompressed size is capped at 500 MB. A 42-kilobyte zip bomb that expands to 4.5 petabytes raises RuntimeError before a single byte hits disk.
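The two path checks can be sketched as follows. These functions stand in for _sanitize_archive_member() and _is_safe_path(); their signatures, the backslash handling, and the NUL-byte rejection are assumptions.

```python
import os


def sanitize_archive_member(name: str) -> str:
    """First check: reject suspicious names before any normalization."""
    unified = name.replace("\\", "/")  # treat Windows separators the same way
    if "\x00" in name or unified.startswith("/") or os.path.isabs(name):
        raise ValueError(f"unsafe archive member: {name!r}")
    if ".." in unified.split("/"):
        raise ValueError(f"path traversal in archive member: {name!r}")
    return os.path.normpath(unified)


def is_safe_path(base_dir: str, member: str) -> bool:
    """Second check: the resolved target must stay inside base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, member))
    return os.path.commonpath([base, target]) == base
```

The second check works on fully resolved paths, so it still holds if the first check misses an encoding of the traversal or if a symlink inside the extraction directory points elsewhere.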
Outcomes
What the work produced.
Static site teams can run scanner scan build.zip and get a scored accessibility report before any deployment exists.
The ephemeral server approach eliminates browser-specific file:// workarounds and produces consistent results across Chromium, Firefox, and WebKit.
IS_FOCUS_VISIBLE_SCRIPT detects outline:none suppressions and five other focus indicator patterns that axe-core's static analysis cannot see.
The self-validation in _self_validate() means the heuristic tests itself on a known-good/known-bad page before scanning anything real.
Zip Slip protection runs two separate checks because normalization itself can introduce traversal sequences. One check is not enough.
12 runtime dependencies. The standard library covers HTTP serving, ZIP handling, socket operations, IP validation, and threading.