Why I built web-tarpit · crumrine.net

Every site I run gets the same bot probes. /wp-login.php, /.env, /.git/config, /admin.php, /phpmyadmin/, on and on. Same IP hits sixty paths in thirty seconds, doesn’t find anything, moves on. They’re looking for soft targets, fast.

Two options for me. Keep returning 404s (costs me nothing, but costs the scanner nothing either: one quick request and they’re done). Or waste their time.

The second option is more interesting. Every second a scanner spends on me is a second it’s not spending on someone who might actually be vulnerable. It’s a small contribution per site, but I run about 100 of them, and each one holding a scanner for a few extra seconds of slow-dripped fake response data adds up. Multiply across everyone who installs the package and it starts to bend the economics of mass scanning.

So I built web-tarpit. It’s middleware. Drop it in front of your Worker (or your Express app, whatever) and it intercepts the bot paths before they hit your real routes. The probe gets a fake WordPress login page, or a slow-dripping fake .env with plausible-looking AWS keys and a fake DB password. The scanner has to parse it. Has to try the creds. Has to wait. Real users never see any of this because real users don’t request /wp-config.bak.
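To make the interception idea concrete, here is a minimal sketch of how a path check can sit in front of real routes. This is not web-tarpit's actual code or API. The names PROBE_PATTERNS, isProbePath, withTarpit, and RouteHandler are made up for illustration, and the pattern list is just the paths mentioned in this post.

```typescript
// Illustrative sketch only; not the web-tarpit API.
// Hypothetical probe list, taken from the paths named in this post.
const PROBE_PATTERNS: RegExp[] = [
  /^\/wp-login\.php$/,
  /^\/wp-config\.bak$/,
  /^\/\.env$/,
  /^\/\.git\/config$/,
  /^\/admin\.php$/,
  /^\/phpmyadmin(\/|$)/,
];

type RouteHandler = (request: Request) => Response | Promise<Response>;

// True when the path looks like one of the known scanner probes.
function isProbePath(pathname: string): boolean {
  const p = pathname.toLowerCase();
  return PROBE_PATTERNS.some((re) => re.test(p));
}

// Wrap the real app: probe paths go to the trap, everything else passes through.
function withTarpit(trap: RouteHandler, app: RouteHandler): RouteHandler {
  return (request) => {
    const { pathname } = new URL(request.url);
    return isProbePath(pathname) ? trap(request) : app(request);
  };
}
```

In a Worker, the handler returned by withTarpit would be what the fetch handler calls. Requests for real pages never touch the trap, which is the whole point.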

The hardest part wasn’t the middleware. It was making the fakes convincing enough to be worth attacking. A fake login page that immediately 404s on submit is an obvious dead end. One that accepts your credentials with a slight delay, logs the attempt, and returns “incorrect password, try again” makes the scanner want to keep trying. The slow-drip .env is the same trick. Connection stays open, bytes trickle out, the scanner can’t tell if the file is real or just being served by something broken. Most time out before getting the whole thing.
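The slow drip is easier to picture with code. Below is a rough sketch of one way to do it with a ReadableStream, assuming a runtime that has the standard Response, ReadableStream, and TextEncoder globals (Workers and recent Node both do). The function name slowDrip, the chunk size, the delay, and the file contents are all placeholders I picked for the example, not what the package ships.

```typescript
// Illustrative sketch only; names and defaults are assumptions.
const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Obviously fake placeholder contents for the bait file.
const FAKE_ENV =
  [
    "APP_ENV=production",
    "DB_PASSWORD=not-a-real-password",
    "AWS_ACCESS_KEY_ID=AKIAFAKEFAKEFAKEFAKE",
    "AWS_SECRET_ACCESS_KEY=fake/fake/fake/fake/fake/fake/fake/fake0",
  ].join("\n") + "\n";

// Stream `body` a few bytes at a time, pausing before each chunk.
function slowDrip(body: string, chunkSize = 8, delayMs = 1000): Response {
  const bytes = new TextEncoder().encode(body);
  let offset = 0;

  const stream = new ReadableStream<Uint8Array>({
    // pull() runs each time the consumer is ready for more data,
    // so a client that disconnects stops the drip.
    async pull(controller) {
      if (offset >= bytes.length) {
        controller.close();
        return;
      }
      await sleep(delayMs);
      controller.enqueue(bytes.slice(offset, offset + chunkSize));
      offset += chunkSize;
    },
  });

  return new Response(stream, {
    headers: { "content-type": "text/plain" },
  });
}
```

With these defaults, slowDrip(FAKE_ENV) hands out 8 bytes per second. How long a connection can actually be held open depends on the platform's limits, so treat the numbers as knobs to tune rather than recommendations.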

There’s also a logging piece I find quietly useful. Every probe writes a row to D1: path, method, payload, IP, UA. Across 100 sites this becomes a real-time view of which scanner campaigns are active, which paths they’re prioritizing, which ASNs are sourcing them. Free threat intel. Mostly it’s the same fifty CVEs being rechecked. Occasionally something weird.
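For the logging side, here is a sketch of what writing one row per probe might look like. The prepare / bind / run chain is how D1 bindings are used from a Worker, and CF-Connecting-IP is the header Cloudflare sets with the client address. Everything else is an assumption for illustration: the probes table, its columns, the 4096-character payload cap, and the logProbe name. The small interfaces exist only so the snippet stands alone without Cloudflare's type package.

```typescript
// Illustrative sketch only; schema and names are assumptions.
// Minimal structural slice of the D1 binding API used below.
interface PreparedStatementLike {
  bind(...values: unknown[]): PreparedStatementLike;
  run(): Promise<unknown>;
}
interface DatabaseLike {
  prepare(query: string): PreparedStatementLike;
}

// Hypothetical table:
//   CREATE TABLE probes (
//     ts INTEGER, path TEXT, method TEXT, payload TEXT, ip TEXT, ua TEXT
//   );
async function logProbe(db: DatabaseLike, request: Request): Promise<void> {
  const url = new URL(request.url);
  const hasBody = request.method !== "GET" && request.method !== "HEAD";
  // Clone before reading so the original body stays usable; cap the size.
  const payload = hasBody
    ? (await request.clone().text()).slice(0, 4096)
    : "";

  await db
    .prepare(
      "INSERT INTO probes (ts, path, method, payload, ip, ua) " +
        "VALUES (?, ?, ?, ?, ?, ?)",
    )
    .bind(
      Date.now(),
      url.pathname,
      request.method,
      payload,
      request.headers.get("cf-connecting-ip") ?? "",
      request.headers.get("user-agent") ?? "",
    )
    .run();
}
```

Bound parameters matter here more than usual, since every field in the row is attacker-controlled.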

If you want it: npm install web-tarpit. Drop it into your Worker’s fetch handler, point it at a D1 binding, done. The repo covers the architecture and how to add new fake-response handlers. It’s also a fun read if you’re learning Workers: single file, no deps, short code path.