Test harness (prototype)
This is the one part of DoesItARM that is built, not just specced: a zero-dependency Swift CLI
(apps/doesitarm-harness) that takes a Mac app from “we have a copy” to “here’s the verdict, with
evidence.” Everything below reflects what runs today; planned pieces are marked.
System architecture — current vs planned
flowchart TB
subgraph HOST["macOS host · Apple M2 Max · macOS 26.3 — ACTIVE TODAY"]
H["Harness CLI<br/>swift run harness"]
subgraph DRV["Driver seams (swappable)"]
UI["UIDriver<br/>ax ✓"]
REC["EvidenceRecorder<br/>screencapturekit ✓ · screencapture · ffmpeg · none"]
ISO["IsolationBackend<br/>host ✓"]
end
APP["App under test<br/>VLC · HandBrake · Audacity"]
H --> DRV
DRV --> APP
end
subgraph VM["macOS VM · tart / Virtualization.framework — PLANNED (stub, not active)"]
VAPP["App in clean-room VM"]
end
HOST -. "--isolation tart (not wired up yet)" .-> VM
classDef planned stroke-dasharray:6 4,opacity:0.5;
class VM,VAPP planned;
Build status
| Capability | Status |
|---|---|
| Static arch / Rosetta classification (lipo/otool) | ✅ built |
| Bulk scan of all installed apps (no permissions) | ✅ built (113 apps classified) |
| Acquire via Homebrew cask + locate | ✅ built |
| Launch + window-ready + crash detection | ✅ built |
| Drive UI via Accessibility (menus, buttons, files) | ✅ built (needs Accessibility grant) |
| Audio Unit lane via auval (zero-UI) | ✅ built |
| Record video + screenshots | ✅ built (needs Screen Recording grant to capture) |
| Local store (SQLite = D1 mirror, r2/ = R2 mirror) | ✅ built |
| Swappable recorder + isolation backends | ✅ built (host + capture backends); tart/user = stubs |
| macOS VM isolation (tart) | ◌ stub |
| Live D1/R2 sync | ◌ dry-run only |
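To make the static classification row concrete, here is a minimal sketch of how the output of `lipo -archs <binary>` could be turned into a label. The `ArchClass` names and the `classify(lipoArchs:)` function are assumptions for illustration; the harness's real types and verdict names are not shown on this page.

```swift
import Foundation

/// Hypothetical labels, not the harness's actual verdict names.
enum ArchClass: String {
    case universal    // has both an Apple silicon and an x86_64 slice
    case nativeOnly   // Apple silicon slice only
    case rosettaOnly  // x86_64 only, so it needs Rosetta 2 on Apple silicon
    case unknown      // no recognised slice (for example i386 only, or empty output)
}

/// Classifies the whitespace-separated architecture list printed by `lipo -archs`.
func classify(lipoArchs output: String) -> ArchClass {
    let archs = Set(output.split(whereSeparator: { $0.isWhitespace }).map(String.init))
    let hasARM = archs.contains("arm64") || archs.contains("arm64e")
    let hasIntel = archs.contains("x86_64")
    switch (hasARM, hasIntel) {
    case (true, true): return .universal
    case (true, false): return .nativeOnly
    case (false, true): return .rosettaOnly
    case (false, false): return .unknown
    }
}

print(classify(lipoArchs: "x86_64 arm64\n").rawValue)
```

Keeping the parser separate from the process invocation means the classification can be unit-tested without a Mac app on disk.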
Test run pipeline
One invocation of harness test <app> runs the whole pipeline below, start to finish:
flowchart LR
A["acquire<br/>brew cask / locate"] --> B["static arch<br/>lipo · otool"]
B --> C["isolation.prepare<br/>host: clean state"]
C --> D["launch<br/>NSWorkspace"]
D --> E["record<br/>ScreenCaptureKit"]
E --> F["drive profile<br/>Accessibility API"]
F --> G["evidence<br/>video · shots · crash · logs"]
G --> H["classify<br/>passed / native …"]
H --> I[("store<br/>SQLite + r2/")]
I -. "sync (later)" .-> J["D1 + R2"]
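The pipeline above can be sketched as an ordered list of stages with a single runner. This is an illustration only: the `Stage` cases mirror the diagram, but `runPipeline`, `StageError`, and the fail-fast behaviour are assumptions, not the harness's confirmed design.

```swift
import Foundation

/// Stage names follow the diagram; declaration order is execution order.
enum Stage: String, CaseIterable {
    case acquire, staticArch, prepareIsolation, launch, record, drive, evidence, classify, store
}

/// Hypothetical error type carrying the stage that failed.
struct StageError: Error {
    let stage: Stage
    let reason: String
}

/// Runs every stage in order and stops at the first one that throws (assumed fail-fast).
func runPipeline(step: (Stage) throws -> Void) -> (completed: [Stage], failure: StageError?) {
    var completed: [Stage] = []
    for stage in Stage.allCases {
        do {
            try step(stage)
            completed.append(stage)
        } catch let error as StageError {
            return (completed, error)
        } catch {
            // Wrap unexpected errors so the caller always learns which stage failed.
            return (completed, StageError(stage: stage, reason: "\(error)"))
        }
    }
    return (completed, nil)
}

// Usage: simulate a launch that never produces a ready window.
let result = runPipeline { stage in
    if stage == .launch {
        throw StageError(stage: stage, reason: "window never became ready")
    }
}
print(result.completed.map(\.rawValue), result.failure?.reason ?? "ok")
```

A real runner would probably still collect crash logs after a failed launch; the sketch leaves that out to keep the control flow visible.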
The driver pattern
Three seams are pluggable at runtime, so a single flag is enough to A/B the original tooling against alternatives or to run permission-free. ✓ = working; (stub) = seam is wired in, backend not implemented.
flowchart TB
E["AutomationEngine"]
E --> U{{"UIDriver"}}
E --> R{{"EvidenceRecorder"}}
E --> S{{"IsolationBackend"}}
U --> u1["ax ✓"]
U -.-> u2["cgevent (stub)"]
U -.-> u3["xcuitest (stub)"]
R --> r1["screencapturekit ✓"]
R --> r2["screencapture ✓"]
R --> r3["ffmpeg ✓"]
R --> r4["none ✓"]
S --> s1["host ✓"]
S -.-> s2["tart VM (stub)"]
S -.-> s3["user account (stub)"]
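One way to picture a seam is a protocol plus a lookup keyed by the flag value. The sketch below does this for the isolation seam; the protocol requirement `prepare()`, the `StubIsolation` type, and `makeIsolation(named:)` are hypothetical names, and only the backend names (host, tart, user) come from this page.

```swift
import Foundation

/// Hypothetical seam; the real protocol's requirements are not documented here.
protocol IsolationBackend {
    var name: String { get }
    func prepare() throws
}

/// Thrown by backends whose seam exists but whose implementation does not.
struct BackendUnavailable: Error {
    let name: String
}

struct HostIsolation: IsolationBackend {
    let name = "host"
    func prepare() throws {
        // A real implementation would reset the app's state on the host here.
    }
}

struct StubIsolation: IsolationBackend {
    let name: String
    func prepare() throws { throw BackendUnavailable(name: name) }
}

/// Resolves the value of an `--isolation` flag to a backend, or nil if the name is unknown.
func makeIsolation(named flag: String) -> (any IsolationBackend)? {
    switch flag {
    case "host": return HostIsolation()
    case "tart", "user": return StubIsolation(name: flag)
    default: return nil
    }
}

// Usage: a stub resolves fine but refuses to run.
if let backend = makeIsolation(named: "tart") {
    do { try backend.prepare() } catch { print("\(backend.name) is a stub: \(error)") }
}
```

Registering stubs as real values, rather than leaving the names out, keeps the flag surface stable while the backends are filled in.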
Isolation options
| Backend | GPU | Isolation | Status |
|---|---|---|---|
| host (default) | ✅ real Metal | prefs/containers reset | active |
| user (dedicated account) | ✅ real Metal | separate account state | stub |
| tart (macOS VM) | ⚠️ paravirtual (capped) | full clean-room | stub |
| bare-metal worker (prod) | ✅✅ full | one app per box | planned |
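As a rough illustration of what "prefs/containers reset" could cover on the host backend, the sketch below computes the standard per-app locations under the user's Library folder. Which paths the harness actually resets is an assumption here, and `hostStatePaths(bundleID:home:)` is a hypothetical helper.

```swift
import Foundation

/// Returns the conventional macOS locations of an app's preferences file and
/// sandbox container. Assumed targets only; the real host backend may reset more or less.
func hostStatePaths(bundleID: String, home: URL) -> [URL] {
    let library = home.appendingPathComponent("Library", isDirectory: true)
    return [
        library
            .appendingPathComponent("Preferences", isDirectory: true)
            .appendingPathComponent("\(bundleID).plist"),
        library
            .appendingPathComponent("Containers", isDirectory: true)
            .appendingPathComponent(bundleID, isDirectory: true),
    ]
}

// Usage with an example bundle identifier and a made-up home directory.
let statePaths = hostStatePaths(
    bundleID: "org.videolan.vlc",
    home: URL(fileURLWithPath: "/Users/tester")
)
for url in statePaths { print(url.path) }
```

The function only computes paths and deletes nothing, so it can be reviewed or logged before anything is removed.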
Data model → D1 + R2
Results map 1:1 onto the data model. The local store is built so that “connect Cloudflare later” is a flat upload: the SQLite database mirrors D1, and file paths under r2/ are the R2 object keys.
flowchart LR
subgraph M["Domain model"]
T["Title"] --> SG["Signal"]
TR["TestRun"] --> SG
SG --> V["Verdict"]
end
subgraph L["Local store · .darm-data"]
DB[("SQLite<br/>doesitarm.db")]
FS["r2/ mirror<br/>recordings · logs · diagnostics"]
end
M --> DB
TR -. "evidence keys" .-> FS
DB -.->|"wrangler d1 execute"| D1[("Cloudflare D1")]
FS -.->|"wrangler r2 object put"| CR[("Cloudflare R2")]
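Because paths under r2/ double as object keys, deriving a key is just a matter of stripping the mirror's root. A minimal sketch, assuming the mirror lives at `<store root>/r2` and using a hypothetical helper name and file name:

```swift
import Foundation

/// Maps a file inside `<storeRoot>/r2/` to its object key (the path relative to the mirror).
/// Returns nil for files outside the mirror, so nothing unexpected gets uploaded.
func r2ObjectKey(for file: URL, storeRoot: URL) -> String? {
    let mirrorParts = storeRoot.appendingPathComponent("r2").pathComponents
    let fileParts = file.pathComponents
    guard fileParts.count > mirrorParts.count,
          Array(fileParts.prefix(mirrorParts.count)) == mirrorParts else {
        return nil
    }
    return fileParts.dropFirst(mirrorParts.count).joined(separator: "/")
}

// Usage: the store root and recording name are illustrative.
let storeRoot = URL(fileURLWithPath: "/work/.darm-data")
let recording = URL(fileURLWithPath: "/work/.darm-data/r2/recordings/vlc/run-1.mov")
print(r2ObjectKey(for: recording, storeRoot: storeRoot) ?? "outside mirror")
```

Comparing path components rather than raw string prefixes avoids treating a sibling folder such as r2-old/ as part of the mirror.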
Component tree
mindmap
root(("doesitarm-harness"))
CLI
setup
arch
scan
test
plugins
sync
Core["CompatHarnessCore"]
Acquirer
ArchInspect
Launcher
EvidenceRecorder
IsolationBackend
Accessibility
Profiles["Base + VLC/HandBrake/Audacity"]
Storage["Storage + Schema"]
VerdictResolver
AutomationEngine
Store[".darm-data"]
SQLite["SQLite (D1 mirror)"]
r2["r2/ evidence (R2 mirror)"]
What works without permissions
macOS gates UI driving (Accessibility) and capture (Screen Recording) behind TCC grants that can’t be set from the CLI. The arch, scan, and plugins lanes need neither, so we collect real data today regardless.
flowchart TB
subgraph FREE["Works now — no permissions"]
sc["scan · 113 apps classified"]
pl["plugins · auval"]
ar["arch · static Mach-O"]
end
subgraph AXG["Accessibility ✓ granted"]
dr["drive UI · menus · buttons"]
end
subgraph SRG["Screen Recording ✗ NOT granted"]
vid["video clips"]
shot["screenshots"]
end
classDef ok fill:#0f3d2e,stroke:#5ee0a8,color:#dffaf0;
classDef warn fill:#3a2f12,stroke:#e5c463,color:#fff7e0;
class FREE,sc,pl,ar,AXG,dr ok;
class SRG,vid,shot warn;
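The split in the diagram can be expressed as a small gate that maps the current grants to the lanes that can run. In this sketch the `Grants` and `Lane` types and both functions are hypothetical; `AXIsProcessTrusted()` and `CGPreflightScreenCaptureAccess()` are the system calls for querying the two grants, and the non-macOS branch exists only so the file compiles elsewhere.

```swift
import Foundation
#if canImport(ApplicationServices)
import ApplicationServices
#endif

/// Hypothetical snapshot of the two TCC grants the harness cares about.
struct Grants {
    var accessibility: Bool
    var screenRecording: Bool
}

/// Hypothetical lane names, grouped as in the diagram.
enum Lane: String, CaseIterable {
    case arch, scan, plugins, driveUI, capture
}

/// Returns the lanes that can run with the given grants.
func availableLanes(_ grants: Grants) -> [Lane] {
    return Lane.allCases.filter { lane in
        switch lane {
        case .arch, .scan, .plugins: return true          // no permissions needed
        case .driveUI: return grants.accessibility        // Accessibility grant
        case .capture: return grants.screenRecording      // Screen Recording grant
        }
    }
}

/// Reads the current process's grants without prompting the user.
func currentGrants() -> Grants {
    #if canImport(ApplicationServices)
    var screen = false
    if #available(macOS 10.15, *) {
        screen = CGPreflightScreenCaptureAccess()
    }
    return Grants(accessibility: AXIsProcessTrusted(), screenRecording: screen)
    #else
    return Grants(accessibility: false, screenRecording: false)
    #endif
}

print(availableLanes(currentGrants()).map(\.rawValue))
```

With Accessibility granted and Screen Recording not granted, as in the diagram, every lane except capture would be reported as available.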
Running it
cd apps/doesitarm-harness
swift build
swift run harness scan   # classify every installed app (no permissions)
swift run harness test vlc   # acquire, launch, drive, record, store
swift run harness test vlc --recorder ffmpeg --isolation host   # swap backends

The full options matrix and decisions live in the repo at
apps/doesitarm-harness/docs/driver-architecture-and-tooling-options.md.