Complete, 2025-01 to 2025-03, Solo

Case Study

Clear11y

Containerized accessibility scanner for static site builds that catches what browser extensions cannot.

I built Clear11y because browser extensions cannot scan static site builds. Chrome blocks extensions from file:// URLs, which means you cannot check accessibility on your dist/ folder before deploying. Clear11y takes a ZIP of your build output, spins up an ephemeral HTTP server, runs axe-core and a custom keyboard navigation engine inside Docker, and produces a scored report.

90+ axe-core Rules

8 Focus Properties

3 Interfaces

500MB Zip Bomb Cap

Python, FastAPI, Playwright, axe-core, Docker, WCAG 2.1, GitHub Action, CLI


Why this matters

Proof first, implementation second.

I want the reader to see the product, understand the operator workflow, and then dig into the architecture and tradeoffs behind it.

Overview

What the product does and why I built it that way.

Browser extensions like axe DevTools run inside the browser's extension context. By default, Chrome and Firefox block extensions from accessing file:// URLs as a security measure. The restriction is correct. It also means no extension can scan a static site sitting in your dist/ folder out of the box.

I ran into this building a static site. The build output sat on disk. The tools wanted a URL. Clear11y fills that gap by treating the ZIP artifact as the input, not a live URL. It extracts the archive safely, starts a throwaway HTTP server on a kernel-assigned port, runs axe-core and a custom keyboard testing engine via Playwright, and produces a scored report with screenshots.

Highlights

Quick read

Scans build artifacts from ZIP archives, bypassing the file:// restriction that blocks every browser-based accessibility extension.

Two testing engines: axe-core for static violations and a custom KeyboardAccessibilityService that detects focus issues axe-core cannot see.

IS_FOCUS_VISIBLE_SCRIPT checks eight computed style properties across five categories to detect missing focus indicators.

Ephemeral HTTP server binds to port 0 for kernel-assigned ports. No port conflicts in CI.

Zip Slip protection runs two separate checks, before and after os.path.normpath(), because normalization itself can introduce traversal sequences.

Non-root containers (pwuser, UID 1000) with Zip Bomb protection capped at 500 MB uncompressed.

Architecture

How the system is structured.

Clear11y has one Pipeline class that knows how to run a scan. It accepts a ZipService, HtmlDiscoveryService, HttpService, PlaywrightAxeService, and an optional KeyboardAccessibilityService. It knows nothing about how input arrived or where output goes. Three interfaces call it: a CLI, a FastAPI server, and a GitHub Action.

Pipeline

The core scan orchestrator. Accepts a file path, returns violation dicts. No knowledge of which interface called it.

ZipService

Extracts archives with Zip Slip protection, Zip Bomb detection (500MB cap, 100x expansion ratio), and path traversal blocking at two normalization stages.
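A minimal sketch of the size checks described above, assuming they read the sizes declared in the archive's central directory before extracting anything. The function and constant names are illustrative, not the project's own, and a hostile archive can declare false sizes, so a real extractor would also count bytes as it writes.

```python
import zipfile

MAX_UNCOMPRESSED_BYTES = 500 * 1024 * 1024  # 500 MB cap, as stated above
MAX_EXPANSION_RATIO = 100                   # 100x ratio, as stated above


def check_zip_bomb(zip_path: str) -> int:
    """Return the declared uncompressed size, or raise before extracting.

    Hypothetical helper: only the archive metadata is read, no member
    is decompressed.
    """
    with zipfile.ZipFile(zip_path) as archive:
        infos = archive.infolist()
        total_uncompressed = sum(info.file_size for info in infos)
        total_compressed = sum(info.compress_size for info in infos)

    if total_uncompressed > MAX_UNCOMPRESSED_BYTES:
        raise RuntimeError(
            f"archive declares {total_uncompressed} bytes, over the cap"
        )
    # Guard against division by zero for empty archives.
    if total_compressed > 0:
        ratio = total_uncompressed / total_compressed
        if ratio > MAX_EXPANSION_RATIO:
            raise RuntimeError(f"expansion ratio {ratio:.0f}x is suspicious")
    return total_uncompressed
```

Checking the ratio as well as the absolute size catches small archives that stay under the cap but compress implausibly well.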

HttpService

Ephemeral ThreadingHTTPServer bound to port 0. Kernel assigns a free port. Runs in a daemon thread, cleaned up in a finally block with a 5-second join timeout.
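The port-0 pattern can be sketched with the standard library alone. This is an illustration under assumptions rather than the HttpService source: the function names and the scan callback are hypothetical, while ThreadingHTTPServer, the daemon thread, the finally block, and the 5-second join come from the description above.

```python
import functools
import threading
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


def serve_directory(directory: str):
    """Start a throwaway server for `directory` on a kernel-assigned port."""
    handler = functools.partial(SimpleHTTPRequestHandler, directory=directory)
    # Port 0 asks the kernel for any free port, so parallel CI jobs never clash.
    server = ThreadingHTTPServer(("127.0.0.1", 0), handler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    return server, thread


def with_server(directory: str, scan):
    """Run `scan(base_url)` against the served directory, then always clean up."""
    server, thread = serve_directory(directory)
    try:
        port = server.server_address[1]  # the port the kernel actually chose
        return scan(f"http://127.0.0.1:{port}")
    finally:
        server.shutdown()        # stop the serve_forever loop
        server.server_close()    # release the socket
        thread.join(timeout=5)   # do not hang the run if the thread is stuck
```

Reading the port back from server_address after binding is what makes port 0 usable: the caller never has to guess or reserve a number.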

PlaywrightAxeService

Injects axe-core via axe_playwright_python, blocks heavy resources (analytics, ad networks), captures screenshots with violation overlays.

KeyboardAccessibilityService

Custom engine that simulates Tab traversal (capped at 150 keypresses), checks focus visibility across 8 CSS properties, and renders an SVG tab-journey overlay.
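A rough sketch of a capped Tab walk, assuming a Playwright sync-API page object (page.keyboard.press and page.evaluate are real Playwright calls). The identity key built in the JavaScript snippet and the stop conditions are assumptions made for illustration; the real engine may track elements differently.

```python
MAX_TAB_PRESSES = 150  # cap stated above; prevents infinite loops in focus traps

# Assumption: tag name plus id is a good-enough identity key for a sketch.
ACTIVE_ELEMENT_JS = """
() => {
  const el = document.activeElement;
  if (!el || el === document.body) return null;
  return el.tagName + '#' + el.id;
}
"""


def walk_tab_order(page, max_presses: int = MAX_TAB_PRESSES):
    """Press Tab repeatedly and return the focus keys in visiting order.

    `page` is expected to behave like a Playwright sync Page.
    Stops when focus leaves the document, when it wraps around to an
    element already seen, or when the keypress cap is reached.
    """
    visited = []
    for _ in range(max_presses):
        page.keyboard.press("Tab")
        key = page.evaluate(ACTIVE_ELEMENT_JS)
        if key is None or key in visited:
            break
        visited.append(key)
    return visited
```

Hitting the cap without wrapping around is itself a signal worth reporting, since it usually means a focus trap or an extremely long tab sequence.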

Features

What it does and how each piece works.

Scanning

axe-core scanning

Runs 90+ WCAG 2.1 rules against the live DOM via Playwright. Rule sets, disabled rules, and conformance levels are configurable.

Scanning

Keyboard navigation testing

Simulates a full Tab traversal and detects missing focus indicators by comparing computed styles before and after element.focus({ focusVisible: true }).

Scanning

Focus visibility detection

IS_FOCUS_VISIBLE_SCRIPT checks outline, box-shadow, border, background-color, and text-decoration changes. Snapshots properties as primitives to avoid the live CSSStyleDeclaration comparison bug.

Reliability

Self-validation

KeyboardAccessibilityService runs _self_validate() before any real scan. Creates an inline test page with known-good and known-bad focus styles. Caught a real regression when a Playwright version changed focusVisible behavior.

Input

ZIP artifact scanning

Accepts ZIP archives of static site builds. Extracts safely, discovers HTML pages, serves them locally, scans them. No live deployment required.

Security

SSRF prevention

For URL scans, _validate_public_http_url() resolves hostnames via socket.getaddrinfo() and requires every resolved address to pass the is_global check on the object returned by ipaddress.ip_address().
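The resolve-then-check idea can be sketched as follows. This is a stand-in written for illustration, not the project's _validate_public_http_url(); the helper names and error messages are assumptions, while socket.getaddrinfo() and the is_global property are standard library.

```python
import ipaddress
import socket
from urllib.parse import urlsplit


def all_addresses_global(addresses) -> bool:
    """True only if there is at least one address and every one is public."""
    return bool(addresses) and all(
        ipaddress.ip_address(addr).is_global for addr in addresses
    )


def validate_public_http_url(url: str) -> str:
    """Reject URLs whose host resolves to any non-public address."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.hostname:
        raise ValueError("only absolute http(s) URLs are allowed")
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # getaddrinfo returns every A/AAAA record; each must pass the check.
    infos = socket.getaddrinfo(parts.hostname, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    if not all_addresses_global(addresses):
        raise ValueError("host resolves to a non-public address")
    return url
```

Requiring every resolved address to be global, not just the first, matters because a hostname can return a public and a private record together. A check like this does not by itself stop DNS rebinding between validation and the actual request.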

Interfaces

Three interfaces

CLI with Rich progress bars, FastAPI REST API with job queue pattern (POST returns 202), and GitHub Action that fails workflows on configurable thresholds.
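One way the threshold gate for the GitHub Action could work is sketched below. The threshold format (a per-impact maximum) and the function names are assumptions made for illustration; the four impact levels are the ones axe-core reports.

```python
from collections import Counter

IMPACT_LEVELS = ("critical", "serious", "moderate", "minor")


def count_by_impact(violations):
    """Tally violation dicts by their axe-core impact level."""
    counts = Counter(v.get("impact") for v in violations)
    return {level: counts.get(level, 0) for level in IMPACT_LEVELS}


def exceeded_levels(counts, thresholds):
    """Return the impact levels whose count is above the allowed maximum.

    `thresholds` maps impact level to the highest count still allowed;
    levels missing from the mapping are not enforced.
    """
    return [
        level
        for level in IMPACT_LEVELS
        if level in thresholds and counts.get(level, 0) > thresholds[level]
    ]


def exit_code(violations, thresholds) -> int:
    """Non-zero when any enforced threshold is exceeded, which fails the job."""
    return 1 if exceeded_levels(count_by_impact(violations), thresholds) else 0
```

For example, a threshold mapping of {"critical": 0, "serious": 2} would fail the workflow on any critical violation but tolerate up to two serious ones.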

Data flow

How data moves through the system.

Input arrives as a ZIP path or URL. The Pipeline extracts the archive, discovers HTML pages, starts the ephemeral server, runs both scanning engines, consolidates results, and produces a scored report.

ZIP extracted with Zip Slip and Zip Bomb protection

HtmlDiscoveryService finds all .html files in the extracted directory

HttpService starts on port 0, serves extracted files over localhost

PlaywrightAxeService navigates to each page, injects axe-core, captures violations and screenshots

KeyboardAccessibilityService runs Tab traversal, checks focus visibility, renders SVG overlay

Results consolidated into a scored report with violation counts by impact level

HttpService stopped in finally block, temporary files cleaned up
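The steps above can be sketched as a single orchestrator with injected services. The class name matches the description, but every method name on the services (extract, find_pages, start, stop, scan) is hypothetical and chosen only to make the flow readable.

```python
from typing import Any, Dict, List, Optional


class Pipeline:
    """Sketch of the scan orchestrator; service method names are assumed."""

    def __init__(self, zip_service, html_discovery, http_service,
                 axe_service, keyboard_service: Optional[Any] = None):
        self.zip_service = zip_service
        self.html_discovery = html_discovery
        self.http_service = http_service
        self.axe_service = axe_service
        self.keyboard_service = keyboard_service  # optional second engine

    def run(self, zip_path: str) -> List[Dict[str, Any]]:
        site_dir = self.zip_service.extract(zip_path)      # safe extraction
        pages = self.html_discovery.find_pages(site_dir)   # relative .html paths
        base_url = self.http_service.start(site_dir)       # port 0 server
        try:
            violations: List[Dict[str, Any]] = []
            for page in pages:
                url = f"{base_url}/{page}"
                violations.extend(self.axe_service.scan(url))
                if self.keyboard_service is not None:
                    violations.extend(self.keyboard_service.scan(url))
            return violations
        finally:
            # Runs on success and on failure, so the server never leaks.
            self.http_service.stop()
```

Because the orchestrator only sees the service objects it was handed, the CLI, the API, and the Action can each build a Pipeline the same way and differ only in how they deliver input and output.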

Tradeoffs

Decisions that had real alternatives.

Ephemeral HTTP server instead of Playwright file:// handling

Browser inconsistency drove this choice. Chromium's file:// restrictions change between versions. Firefox requires different flags. WebKit has its own behavior. The HTTP server works identically across all three browsers because it uses standard HTTP.

Custom keyboard testing instead of relying on axe-core

axe-core does not simulate keyboard input. It checks static DOM properties. Focus visibility, tab order, focus traps, and unreachable interactive elements are all invisible without pressing Tab and observing what happens.

Python instead of Node.js

axe_playwright_python gives direct access to axe-core results without a subprocess call. Python's http.server is in the standard library. Node would work, but Python reduces the dependency surface for this specific job.

SQLite for the API instead of PostgreSQL only

SQLite works out of the box with zero configuration for single-instance deployments. PostgreSQL is available for concurrent writers. The database layer is abstracted behind DatabaseJobStore, so switching backends does not touch the calling code.
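The store abstraction might look roughly like the sketch below. It is not the project's DatabaseJobStore: the interface, the table layout, and the status strings are all assumptions, shown only to illustrate why callers do not care which database sits underneath.

```python
import sqlite3
import uuid
from abc import ABC, abstractmethod
from typing import Optional


class JobStore(ABC):
    """Hypothetical interface the API layer would depend on."""

    @abstractmethod
    def create(self) -> str: ...

    @abstractmethod
    def set_status(self, job_id: str, status: str) -> None: ...

    @abstractmethod
    def get_status(self, job_id: str) -> Optional[str]: ...


class SqliteJobStore(JobStore):
    """Zero-configuration backend for single-instance deployments."""

    def __init__(self, path: str = ":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS jobs "
            "(id TEXT PRIMARY KEY, status TEXT NOT NULL)"
        )

    def create(self) -> str:
        job_id = uuid.uuid4().hex
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT INTO jobs (id, status) VALUES (?, ?)", (job_id, "queued")
            )
        return job_id

    def set_status(self, job_id: str, status: str) -> None:
        with self._conn:
            self._conn.execute(
                "UPDATE jobs SET status = ? WHERE id = ?", (status, job_id)
            )

    def get_status(self, job_id: str) -> Optional[str]:
        row = self._conn.execute(
            "SELECT status FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        return row[0] if row else None
```

A PostgreSQL-backed class would implement the same three methods, and the POST handler that returns 202 would keep calling create() unchanged.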

Challenges

Problems that required specific solutions.

Problem

getComputedStyle() returns a live CSSStyleDeclaration object. Holding on to that object and comparing it after focus means comparing the updated object against itself, so the values would always look equal.

Solution

The IS_FOCUS_VISIBLE_SCRIPT copies all eight property values as primitive strings into a before object before calling focus(), then reads them again into an after object. test_keyboard_focus.py has an explicit test verifying that const before appears before element.focus() in the script source.
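A sketch of the snapshot pattern and the ordering test, written as a Python module holding the JavaScript source. The eight property names are a guess that fits the five categories named earlier, not the project's actual list, and the helper name is hypothetical.

```python
# Illustrative script; the real IS_FOCUS_VISIBLE_SCRIPT may differ.
FOCUS_SNAPSHOT_SCRIPT = """
(element) => {
  // Assumed property list: eight properties across five categories.
  const PROPS = [
    'outlineStyle', 'outlineWidth', 'outlineColor',
    'boxShadow',
    'borderColor', 'borderWidth',
    'backgroundColor',
    'textDecorationLine',
  ];
  // Copy values out as strings so later style changes cannot alter them.
  const snapshot = () => {
    const style = getComputedStyle(element);
    const values = {};
    for (const prop of PROPS) values[prop] = String(style[prop]);
    return values;
  };
  const before = snapshot();
  element.focus({ focusVisible: true });
  const after = snapshot();
  return PROPS.some((prop) => before[prop] !== after[prop]);
}
"""


def snapshot_precedes_focus(script: str) -> bool:
    """Source-level check: the `before` snapshot must be taken before focus()."""
    before_at = script.find("const before")
    focus_at = script.find("element.focus(")
    return 0 <= before_at < focus_at
```

A source-order test like this is cheap insurance: if a refactor moves the snapshot after the focus call, the comparison silently becomes a no-op, and only a check on the script text will notice without a browser.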

Problem

Playwright version updates occasionally change how focusVisible behaves on headless Chromium builds.

Solution

_self_validate() creates an inline test page with three buttons (proper outline, outline:none, box-shadow) and checks that the script correctly distinguishes them before scanning anything real. This caught a real regression.
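The self-check can be sketched as a small fixture table run through whatever checker the scan will use. The fixture CSS, the is_focus_visible hook, and the function name are assumptions for illustration; in the real service the checker would load each rule into a browser page and run the focus script.

```python
# Three fixtures mirroring the buttons described above:
# (name, CSS applied on focus, whether a focus indicator should be detected)
FIXTURES = [
    ("outline", "button:focus { outline: 2px solid blue; }", True),
    ("outline-none", "button:focus { outline: none; }", False),
    ("box-shadow",
     "button:focus { outline: none; box-shadow: 0 0 0 3px blue; }", True),
]


def self_validate(is_focus_visible) -> None:
    """Fail fast if the checker misclassifies any known fixture.

    `is_focus_visible` is a hypothetical callable taking the fixture CSS
    and returning True when a focus indicator is detected.
    """
    failures = [
        name
        for name, css, expected in FIXTURES
        if is_focus_visible(css) != expected
    ]
    if failures:
        raise RuntimeError(
            "focus heuristic failed self-validation on: " + ", ".join(failures)
        )
```

Running this before every scan turns a silent change in browser behavior into a loud startup error instead of a report full of false results.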

Problem

ZIP files can be malicious: bombs, path traversal, too many files, NUL-byte filenames.

Solution

_sanitize_archive_member() rejects absolute paths and .. in any component. After normalization, _is_safe_path() resolves both paths to absolute form and checks os.path.commonpath(). Total uncompressed size is capped at 500MB. A 42-kilobyte bomb in the style of 42.zip, which expands to roughly 4.5 petabytes, raises RuntimeError before a single byte hits disk.
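The two-stage path check can be sketched like this. The functions are simplified stand-ins for _sanitize_archive_member() and _is_safe_path(), written from the description above rather than from the source, and the drive-letter test is an added assumption.

```python
import os


def sanitize_archive_member(name: str) -> str:
    """Stage one: reject suspicious names before any normalization."""
    if "\x00" in name:
        raise ValueError("NUL byte in member name")
    unified = name.replace("\\", "/")  # treat both separators alike
    # Assumption: a second-character colon is treated as a drive letter.
    if unified.startswith("/") or (len(unified) > 1 and unified[1] == ":"):
        raise ValueError("absolute path in archive")
    if ".." in unified.split("/"):
        raise ValueError("parent-directory component in archive")
    return os.path.normpath(unified)


def is_safe_path(base_dir: str, member: str) -> bool:
    """Stage two: after normalization, confirm the target stays inside base."""
    base = os.path.abspath(base_dir)
    target = os.path.abspath(os.path.join(base, member))
    # commonpath equals base only when target is base or sits beneath it.
    return os.path.commonpath([base, target]) == base
```

The second check looks redundant when the first one passes, which is the point: it validates the final on-disk location rather than the spelling of the name, so a gap in the first check does not become a write outside the extraction directory.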

Outcomes

What the work produced.

Static site teams can run scanner scan build.zip and get a scored accessibility report before any deployment exists.

The ephemeral server approach eliminates browser-specific file:// workarounds and produces consistent results across Chromium, Firefox, and WebKit.

IS_FOCUS_VISIBLE_SCRIPT detects outline:none suppressions and five other focus indicator patterns that axe-core's static analysis cannot see.

The self-validation in _self_validate() means the heuristic tests itself on a known-good/known-bad page before scanning anything real.

Zip Slip protection runs two separate checks because normalization itself can introduce traversal sequences. One check is not enough.

12 runtime dependencies. The standard library covers HTTP serving, ZIP handling, socket operations, IP validation, and threading.
