This morning I shared a URL from Safari using a Shortcut. Then I asked AI to verify it worked. It grepped the CloudFront logs, found the request, confirmed 202. That was a test. I just didn't call it one.
The test I ran
- Fire a request (the shortcut)
- Check the logs for the expected entry
- Confirm 202, confirm the URL was captured
That's an integration test. The system under test is production. The test output is the access log. The assertion is: did the expected line show up?
Why it's a regression test
The git note from this session says:
Verified: shortcuts hit CloudFront, logs show 202s from Mac (IAD) and phone (DFW)
That's the test case. If I change the CF function tomorrow, the same check works: fire a request, grep the logs, did it show up? If yes, no regression. If no, something broke.
The note is the spec. The log is the test output. The AI is the test runner.
Async testing
CloudFront standard logs take 5-10 minutes to land in S3. You can't run this in a CI pipeline that expects instant feedback. But who said tests need to be synchronous?
Two scheduled actions:
- Fire (every hour):
curl https://thetube.today/create/heartbeat?origin=ci - Check (15 minutes later): grep the logs for
heartbeat&origin=ci. If it's not there, something's broken.
The test is decoupled from the assertion. Same architecture as everything else — write now, read later. And if the check fails, it writes its own event: /create/alert?test=heartbeat&status=failed. The system monitors itself using itself.
Test in prod
There's no staging environment because there's nothing to stage. The CF function either returns 202 or it doesn't. A test request with type=heartbeat&origin=ci goes through the same path as a real request. It is a real request. The test is production traffic.
No mock, no fake, no "staging that's almost like prod but not quite." You can't corrupt data by testing because there's no data to corrupt — just logs that accumulate. One more log entry doesn't hurt anything. That's the luxury of stateless + append-only.
vs. traditional testing
Traditional: write code → write tests → run tests → tests produce output → check output → maintain tests when code changes
This: use the system → system produces logs → check logs
The logs are the test report. The notes are the test cases. The AI is the test runner. No pytest, no Jest, no test framework. Just files and a reader.
The gap: notes aren't executable. You can't run them in CI. An AI has to interpret them. That's weaker as an automated gate but stronger as a regression check — the intent survives implementation changes that would break brittle unit tests. "The shortcut fires and the request appears in the logs" doesn't care whether the handler is a CF function or a Lambda or a different path. The assertion is about behavior, not implementation.
The journey
Walk thought → conversation. The verification I did this morning (grep logs for the shortcut request) is literally a regression test. The git note records what should work. The logs prove it does. Two cron jobs make it automated. No framework needed — the system already records everything.