Case Study
DisplayAnalysis
Python tool for measuring invisible display artifacts that cause eye strain and headaches.
Modern displays lie to you. They use PWM to dim the backlight by switching it on and off faster than the eye can follow. They use temporal dithering to simulate colors the panel cannot actually produce. None of this is visible in a single frame, but the symptoms are real: eye strain, headaches, fatigue. DisplayAnalysis makes these artifacts measurable. Input: a video recording of the display. Output: a PDF report that says what is wrong and how bad it is.
Why this matters
Proof first, implementation second.
I want the reader to see the product, understand the operator workflow, and then dig into the architecture and tradeoffs behind it.
Overview
What the product does and why I built it that way.
DisplayAnalysis detects imperceptible screen artifacts: PWM flicker, temporal dithering, brightness non-uniformity. It processes video recordings of displays using FFT analysis, Welford's online algorithm, and CIELAB color science, then produces a risk-assessed PDF report.
The core design choice is that every measurement is numeric, every threshold is explicit, and every risk rating has a specific number behind it. No subjective "this screen feels better" comparisons. The tool quantifies flicker frequency and modulation depth, tracks per-pixel variance across thousands of frames, measures spatial brightness uniformity, and converts everything into a risk assessment.
Highlights
Quick read
PixelStatsAccumulator cuts per-pixel variance memory from O(N×H×W) to O(H×W) using Welford's algorithm. A 10-minute 4K video uses the same RAM as a 5-second clip.
FFT flicker detection applies a Hanning window, corrects amplitude attenuation, and detects synthetic 10 Hz signals within 1 Hz and 15% amplitude tolerance.
TestSummaryKeyContract uses 13.83 as a sentinel value to prevent silent .get(key, 0) fallbacks from producing wrong risk ratings.
Nine-page PDF with heatmaps, box plots, FFT spectrum, worst-case frame extraction, and per-pixel temporal stability maps.
CIELAB color space for sub-perceptual color shift measurement. RGB is not perceptually uniform; a delta of 10 in blue means something different from a delta of 10 in green.
Conservative overall risk: any HIGH in a single category makes the overall rating HIGH. Eye strain does not average out.
Architecture
How the system is structured.
The analysis logic lives entirely in run_analysis() in analyze_display.py. All four interfaces call it directly. The measurement code has no knowledge of how it was invoked.
AnalysisConfig
Common interface dataclass. Has a from_namespace() class method for backward compatibility with argparse callers.
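A minimal sketch of what this dataclass might look like; the field names and defaults are illustrative, not the project's actual ones:

```python
from dataclasses import dataclass, fields
from typing import Optional

@dataclass
class AnalysisConfig:
    """Shared configuration for all four front-ends (fields illustrative)."""
    video_path: str
    roi: Optional[tuple] = None       # (x, y, w, h), or None for the full frame
    output_pdf: str = "report.pdf"
    max_frames: Optional[int] = None

    @classmethod
    def from_namespace(cls, ns):
        """Build a config from an argparse.Namespace, ignoring extra attributes."""
        names = {f.name for f in fields(cls)}
        return cls(**{k: v for k, v in vars(ns).items() if k in names})

# An argparse caller can carry extra flags; from_namespace() only takes what it needs
import argparse
ns = argparse.Namespace(video_path="clip.mp4", max_frames=120, verbose=True)
cfg = AnalysisConfig.from_namespace(ns)   # 'verbose' is ignored; defaults fill the rest
```

Filtering on `fields(cls)` is what makes the backward compatibility cheap: old argparse callers can keep whatever flags they have without breaking construction.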
PixelStatsAccumulator
Welford's online algorithm for per-pixel mean and variance. Three arrays (count, mean, m2) updated per frame. Memory is O(H×W) regardless of video length.
Flicker Detection
FFT via scipy.fft.rfft with Hanning windowing and amplitude correction. Returns frequency, amplitude, and modulation depth.
Risk Assessment
Four categories (Temporal Dithering, PWM Flicker, Text/Edge Stability, Brightness Uniformity) with explicit numeric thresholds. Overall risk uses a conservative any-HIGH rule.
PDF Report
Nine pages rendered via matplotlib PdfPages. Heatmaps, box plots, FFT spectrum, worst-case frames, CIELAB channel plots, and per-pixel stability maps.
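A two-page sketch of the headless rendering pattern, assuming illustrative page content (the real report has nine pages and different figures):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend: no display needed, CI-friendly
import matplotlib.pyplot as plt
from matplotlib.backends.backend_pdf import PdfPages
import numpy as np

def write_report(path, pixel_std_map, spectrum_freqs, spectrum_amps):
    """Render a multi-page PDF: one savefig() call per page."""
    with PdfPages(path) as pdf:
        # Page: per-pixel temporal stability heatmap
        fig, ax = plt.subplots()
        im = ax.imshow(pixel_std_map, cmap="inferno")
        fig.colorbar(im, ax=ax, label="temporal std (gray levels)")
        ax.set_title("Per-pixel temporal stability")
        pdf.savefig(fig)
        plt.close(fig)
        # Page: flicker spectrum
        fig, ax = plt.subplots()
        ax.plot(spectrum_freqs, spectrum_amps)
        ax.set_xlabel("frequency (Hz)")
        ax.set_ylabel("amplitude")
        ax.set_title("Flicker spectrum")
        pdf.savefig(fig)
        plt.close(fig)

rng = np.random.default_rng(0)
write_report("report.pdf", rng.random((32, 32)),
             np.linspace(0, 30, 64), rng.random(64))
```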
Features
What it does and how each piece works.
Welford's online algorithm
Per-pixel variance computed in a single pass. Each frame updates delta, mean, and m2 arrays. No frames stored in memory. Verified within 1% of batch np.std().
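A minimal sketch of the accumulator, with the cross-check against batch `np.std()` described above (the real class lives in analyze_display.py; this is a simplification):

```python
import numpy as np

class PixelStatsAccumulator:
    """Per-pixel running mean/variance via Welford's algorithm; O(H×W) memory."""
    def __init__(self, shape):
        self.count = 0
        self.mean = np.zeros(shape, dtype=np.float64)
        self.m2 = np.zeros(shape, dtype=np.float64)   # sum of squared deviations

    def update(self, frame):
        self.count += 1
        delta = frame - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (frame - self.mean)        # note: uses the updated mean

    def finalize(self):
        std = np.sqrt(self.m2 / self.count)           # population std, matches np.std
        return self.mean, std

# Cross-check against the batch computation on a small synthetic frame stack
rng = np.random.default_rng(0)
frames = rng.random((100, 8, 8))
acc = PixelStatsAccumulator((8, 8))
for f in frames:
    acc.update(f)
mean, std = acc.finalize()
```

The key detail is the `delta * (frame - self.mean)` term: one factor uses the old mean, one the updated mean, which is what makes the single-pass variance numerically stable.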
FFT flicker detection
Transforms brightness time series into frequency domain. Hanning window reduces spectral leakage. Amplitude correction compensates for window attenuation. Detects PWM frequency and modulation depth.
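A sketch of the detection path, validated against the synthetic 10 Hz signal the integration test uses. The project calls scipy.fft.rfft; NumPy's equivalent is used here to keep the sketch dependency-light, and the amplitude correction divides by the Hann window's coherent gain:

```python
import numpy as np

def detect_flicker(brightness, fps):
    """Return (dominant_freq_hz, amplitude, modulation_depth) of a brightness series."""
    n = len(brightness)
    x = brightness - brightness.mean()           # drop the DC component
    window = np.hanning(n)                       # reduces spectral leakage
    spectrum = np.abs(np.fft.rfft(x * window))
    # Single-sided amplitude, corrected for the window's attenuation
    amps = 2.0 * spectrum / window.sum()
    freqs = np.fft.rfftfreq(n, d=1.0 / fps)
    peak = np.argmax(amps[1:]) + 1               # skip the 0 Hz bin
    mod_depth = amps[peak] / brightness.mean()
    return freqs[peak], amps[peak], mod_depth

# Synthetic check: 10 Hz flicker, amplitude 8 on a baseline of 100, sampled at 60 fps
t = np.arange(600) / 60.0
series = 100.0 + 8.0 * np.sin(2 * np.pi * 10.0 * t)
freq, amp, depth = detect_flicker(series, 60.0)
```

Without the `2.0 / window.sum()` correction, the Hann window would report roughly half the true amplitude, which is exactly the attenuation the tolerance check would catch.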
Risk assessment engine
Four categories with explicit thresholds. Temporal dithering: pixel change count as percentage of ROI. PWM: frequency and modulation depth. Text stability: mean pixel temporal std. Uniformity: block brightness std.
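The shape of the engine can be sketched as follows; the threshold numbers below are illustrative placeholders, not the project's actual values:

```python
LEVELS = ["LOW", "MEDIUM", "HIGH"]

def rate(value, medium, high):
    """Map a metric to a risk level using explicit numeric thresholds."""
    if value >= high:
        return "HIGH"
    if value >= medium:
        return "MEDIUM"
    return "LOW"

def assess(summary):
    # Thresholds are illustrative, not the project's real numbers
    risks = {
        "temporal_dithering": rate(summary["dither_pixel_pct"], 1.0, 5.0),
        "pwm_flicker": rate(summary["modulation_depth"], 0.05, 0.20),
        "text_stability": rate(summary["mean_pixel_temporal_std"], 1.0, 3.0),
        "brightness_uniformity": rate(summary["block_brightness_std"], 2.0, 6.0),
    }
    # Conservative rule: any single HIGH category makes the overall rating HIGH
    risks["overall"] = max(risks.values(), key=LEVELS.index)
    return risks

result = assess({
    "dither_pixel_pct": 0.2,          # LOW
    "modulation_depth": 0.35,         # HIGH: heavy PWM dominates
    "mean_pixel_temporal_std": 0.4,   # LOW
    "block_brightness_std": 2.5,      # MEDIUM
})
```

The `max(..., key=LEVELS.index)` line is the whole any-HIGH rule: the overall rating is simply the worst category, never an average.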
Contract testing
TestSummaryKeyContract asserts that AnalysisSummary.to_dict() contains every key the risk function and PDF report look up. Uses 13.83 as a sentinel to catch silent .get(key, 0) fallbacks.
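The mechanics can be sketched with stand-ins (the key names and helper functions here are hypothetical; the real test targets AnalysisSummary.to_dict() and its consumers):

```python
SENTINEL = 13.83  # deliberately non-round: a .get(key, 0) fallback cannot mimic it

# Keys the risk function and PDF report look up (names illustrative)
CONSUMED_KEYS = ["modulation_depth", "dither_pixel_pct", "block_brightness_std"]

def summary_to_dict():
    """Stand-in for AnalysisSummary.to_dict(): every field set to the sentinel."""
    return {key: SENTINEL for key in CONSUMED_KEYS}

def check_contract():
    """Return the consumed keys whose lookup does NOT round-trip the sentinel.

    A renamed or missing key falls back to .get(key, 0), which can never
    equal 13.83, so the rename shows up here instead of as a wrong rating."""
    summary = summary_to_dict()
    return [k for k in CONSUMED_KEYS if summary.get(k, 0) != SENTINEL]
```

An empty list from `check_contract()` means every consumer key survives the round trip; any rename on either side lands in the failure list.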
CIELAB color analysis
Converts BGR to CIELAB via skimage.color.rgb2lab. Computes block-mean standard deviation for L*, a*, and b* separately. Perceptually uniform color space for sub-perceptual shift measurement.
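For illustration, here is the standard sRGB-to-CIELAB math that skimage.color.rgb2lab implements (D65 white point), written out in plain NumPy; the project itself calls the library function rather than this reimplementation:

```python
import numpy as np

# D65 reference white and the standard sRGB-to-XYZ matrix
WHITE = np.array([0.95047, 1.0, 1.08883])
M = np.array([[0.4124564, 0.3575761, 0.1804375],
              [0.2126729, 0.7151522, 0.0721750],
              [0.0193339, 0.1191920, 0.9503041]])

def rgb_to_lab(rgb):
    """sRGB in [0, 1] -> CIELAB (..., 3) array of L*, a*, b*."""
    rgb = np.asarray(rgb, dtype=np.float64)
    # Undo the sRGB gamma curve
    linear = np.where(rgb > 0.04045, ((rgb + 0.055) / 1.055) ** 2.4, rgb / 12.92)
    xyz = linear @ M.T / WHITE                    # normalize by the reference white
    f = np.where(xyz > (6 / 29) ** 3,
                 np.cbrt(xyz),
                 xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)
```

Equal RGB deltas map to unequal Lab deltas, which is the point: a shift in the green channel moves L* far more than the same shift in blue, so Lab distances track what the eye actually registers.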
Four interfaces
CLI via argparse, interactive wizard with prompts, Tkinter GUI with ROI selection via cv2.selectROI(), and Docker image. All call run_analysis() directly.
Data flow
How data moves through the system.
Video frames are processed one at a time. Each frame updates the online statistics accumulators, computes per-frame metrics, and optionally extracts worst-case frames. After all frames, the accumulators finalize and the report is generated.
Video opened with OpenCV, ROI selected (full frame or user-defined region)
Each frame converted to grayscale, fed to PixelStatsAccumulator.update()
Per-frame metrics computed: MAD, RMS, StdDev, dither pixel count, spatial uniformity
Worst-case frames tracked for temporal instability and spatial non-uniformity
After all frames: accumulator.finalize() returns mean, std, and pixel_std_map
FFT analysis on the brightness time series for flicker detection
Risk assessment computed from summary metrics with explicit thresholds
Nine-page PDF generated via matplotlib PdfPages
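The loop above can be sketched as a single pass; the function name mirrors the case study but the bodies are heavily simplified, and `frames` is any iterable of grayscale arrays (the real tool reads them through OpenCV):

```python
import numpy as np

def run_analysis(frames, fps):
    """Single-pass pipeline sketch: per-frame stats, then FFT on the results."""
    count, mean, m2 = 0, None, None
    brightness = []
    worst_frame, worst_score = None, -1.0
    for frame in frames:
        if mean is None:                              # lazily size the accumulators
            mean = np.zeros_like(frame)
            m2 = np.zeros_like(frame)
        count += 1
        delta = frame - mean
        mean += delta / count
        m2 += delta * (frame - mean)                  # Welford update, O(H×W) memory
        brightness.append(frame.mean())               # time series for the FFT
        score = np.abs(delta).mean()                  # crude instability score
        if score > worst_score:
            worst_score, worst_frame = score, frame   # track the worst-case frame
    pixel_std_map = np.sqrt(m2 / count)               # finalize the accumulator
    spectrum = np.abs(np.fft.rfft(np.array(brightness) - np.mean(brightness)))
    return {"mean": mean, "pixel_std_map": pixel_std_map,
            "spectrum": spectrum, "worst_frame": worst_frame}

rng = np.random.default_rng(1)
result = run_analysis((rng.random((16, 16)) for _ in range(50)), fps=60)
```

Because the loop accepts a generator, nothing forces the frames to exist in memory at once; only the accumulators, the scalar brightness series, and the tracked worst-case frames persist.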
Tradeoffs
Decisions that had real alternatives.
Welford's algorithm instead of shorter videos
The artifacts I'm looking for sometimes only appear after minutes of observation. PWM frequency might drift. Dithering patterns might change with content. Cutting videos short means missing the interesting cases. The online algorithm removes the constraint.
FFT instead of a simpler periodicity detector
I need the frequency, not just whether flicker exists. A display with PWM at 2000 Hz is different from one at 200 Hz, even if both show periodic brightness changes. FFT gives frequency and amplitude. Frame differencing gives a boolean.
Conservative overall risk (any HIGH = overall HIGH)
Eye strain does not average out. A display with perfect uniformity and zero dithering but brutal PWM flicker is still unusable. One bad dimension ruins the experience.
Template-based explanations instead of LLM-generated text
The explanations need to be deterministic, fast, and offline. Templates are boring and reliable. An LLM introduces an external dependency and makes the output non-reproducible, which breaks testing.
AnalysisConfig dataclass instead of threading argparse.Namespace
The original code passed the namespace object all the way from argument parsing to deep in the analysis functions. The dataclass gives IDE autocompletion, type checking, and a single place to see what a run requires.
Challenges
Problems that required specific solutions.
Problem
The first version stored every grayscale ROI frame in a list. A 30-second 60fps video at 1920x1080 produces 1800 frames at ~2MB each. That is 3.6 GB of RAM.
Solution
PixelStatsAccumulator uses Welford's online algorithm. Each update() call computes delta, updates mean, updates m2. Memory is O(H×W) regardless of video length. Verified within 1% of batch np.std().
Problem
Frame differencing catches obvious changes between consecutive frames but misses periodic flicker that repeats every few frames. PWM near 240 Hz sampled at 60 fps aliases into a slow beat pattern: a whole-series FFT resolves it, but consecutive-frame differencing cannot separate it from noise.
Solution
FFT transforms the brightness time series into the frequency domain. Hanning window reduces spectral leakage. Amplitude correction compensates for window attenuation. Integration test validates against a synthetic 10 Hz signal.
Problem
Dictionary key mismatches between AnalysisSummary.to_dict() and the consumers (risk function, PDF report) pass every unit test and fail silently in production.
Solution
TestSummaryKeyContract asserts every key exists. The sentinel value 13.83 is chosen because it is unlikely to appear by accident: if the sentinel were a round number like 0 or 1, a silent .get(key, 0) fallback could return the same value and the rename bug would slip through.
Outcomes
What the work produced.
Memory usage for pixel statistics is constant regardless of video length. A 10-minute 4K video uses the same RAM as a 5-second clip.
Flicker detection accuracy validated against synthetic signals: frequency within 1 Hz, amplitude within 15%.
TestSummaryKeyContract prevents the most common class of silent failures in the data pipeline: key rename bugs that produce wrong risk ratings without raising an exception.
Nine-page PDF generated headlessly via matplotlib PdfPages, suitable for CI/CD integration and automated display testing.
154 tests across 10 files. test_eye_strain_risk.py alone has 41 tests covering every threshold boundary.
No external service dependencies. The tool runs fully offline and produces deterministic output.
Continue reading
See the rest of the work.
Each case study covers architecture, tradeoffs, and delivery detail. The skills page shows how these technologies connect across projects.