Executive Summary

In early February 2026, Stage Partners' Magento 2 site came under sustained attack from what appeared to be a sophisticated residential proxy botnet. The botnet was systematically enumerating the product catalog through the site's faceted navigation, driving database load to dangerous levels and degrading page performance.

Over the course of a month, Assisted Innovations worked through five phases of escalating defense, ultimately developing a novel solution that combines a CloudFront Function with Cloudflare Turnstile to perform and validate browser challenges directly from the AWS edge. The result is a system that definitively separates human visitors from automated requests without impacting the user experience.

Key result: Attack traffic is now absorbed at the edge. Page loads hold steady below 0.5 seconds even during active attack periods, and the database is no longer affected.

The Challenge

History of Attacks

Stage Partners has been targeted by automated attacks multiple times. In late 2024, a DDoS-style attack originating from Hong Kong hammered the site with upwards of 25,000 requests per minute. That was resolved by blocking the country at the firewall level. Throughout 2025, similar spikes appeared from scattered locations around the world, leading to a country-based allowlist that restricted access to English-speaking countries where Stage Partners' audience actually lives. By February 2026, that system was in place and working well.

A New Kind of Attack

In early February 2026, during routine weekly development work, I noticed anomalies in the CloudWatch dashboard. Page load times were spiking. CPU was cycling between normal and near-max. Database connections were surging toward their ceiling and then dropping off. None of it was severe enough to cause a full outage, but the pattern was clearly not typical load. I flagged it to Stage Partners and started investigating.

The initial theory was a performance problem, not a security problem. The traffic was mostly coming from the United States, looked fairly normal at a glance, and request volumes were not dramatically elevated. We reviewed custom Magento modules, identified some inefficiencies, and pushed updates. The site got a bit faster, but the intermittent spikes continued, ruling out a pure code issue.

A deeper review of the access logs revealed the real pattern: a large number of single-use IPs making requests to deeply filtered category pages. Magento provides faceted navigation that lets users combine filters (genre, theme, length, casting requirements) to narrow down the catalog. The attacker was exploiting this by programmatically combining filter parameters in the URL, requesting pages like /plays/genre-comedy-drama-theme-holiday-length_type-one_act. Each unique combination forced additional load to the Magento database, and the bot was cycling through thousands of them.

The attacker was sophisticated. They rotated through a massive pool of residential proxy IPs. In our worst datasets, 85 to 97 percent of attacking IPs appeared only once. They spoofed legitimate-looking user agents and, critically, set the referer header to yourstagepartners.com to make every request look like normal site navigation. This was not a crude scraper. It was purpose-built for filter enumeration.

The Response

Phase 1: WAF Regex Rules and Their Limits

Our first instinct was AWS WAF. We already had managed rule groups running (Bot Control, SQL injection, common exploits), but they were not designed for this kind of attack. Each individual request appeared perfectly legitimate. It was the aggregate pattern that was malicious.

We started writing custom regex rules to match URLs with excessive filter parameters. Block anything with four or more filter groups. Then three groups with too many values. The problem was that every rule needed reworking within a day or two. The attack adapted, shifting which URL paths it targeted, varying how many filters it combined, and rotating through different catalog entry points.

WAF regex rules proved to be blunt instruments for this problem. We kept hitting false positives on real users running advanced searches. WAF also has limited regex complexity and no ability to evaluate multiple signals together in a single rule.
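To illustrate the shape of those rules, here is one of the patterns reproduced as a plain JavaScript regex. The filter keywords are illustrative, not the production list; the idea is simply "block any path that mentions three or more filter groups":

```javascript
// Illustrative WAF-style pattern: match any URL path containing
// three or more filter group keywords (keyword list is hypothetical).
var g = '(?:genre|theme|length_type|cast)-';
var threeOrMoreGroups = new RegExp(g + '[^/]*' + g + '[^/]*' + g);

threeOrMoreGroups.test('/plays/genre-comedy-theme-holiday-length_type-one_act'); // true
threeOrMoreGroups.test('/plays/genre-comedy'); // false
```

The limitation is visible in the pattern itself: it can only count one dimension at a time. Expressing "three groups AND too many values per group" in a single regex quickly becomes unmaintainable.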

Phase 2: Caching and a CloudFront Function

Next we added caching. A five-minute TTL on the filtered category paths in CloudFront meant repeated hits to the same URL would not reach the origin. This helped initially, but the attack adapted and started requesting even more varied paths to avoid cache hits.

That adaptation gave us our first breakthrough. We moved the detection logic to a CloudFront Function running on viewer-request events. CloudFront Functions execute at the edge before the request reaches the origin or WAF. They run in under a millisecond. And, crucially, they let you write real JavaScript logic instead of being limited to regex patterns.

The first version was straightforward: parse the URI, count how many filter group keywords appeared, count how many values were in each group, and block requests exceeding our thresholds. Even this simple approach was a massive improvement over WAF rules, because we could evaluate the number of groups and the number of values per group together, rather than trying to express that in a single regex.
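A minimal sketch of that first function, in the ES 5.1 style CloudFront Functions require. The keyword list and threshold are illustrative, not the production values:

```javascript
// Sketch of a viewer-request CloudFront Function: count how many known
// filter group keywords appear in the URI and block past a threshold.
// FILTER_KEYS and MAX_GROUPS are illustrative placeholders.
var FILTER_KEYS = ['genre', 'theme', 'length_type', 'cast'];
var MAX_GROUPS = 3;

function countFilterGroups(uri) {
  var count = 0;
  for (var i = 0; i < FILTER_KEYS.length; i++) {
    if (uri.indexOf(FILTER_KEYS[i] + '-') !== -1) {
      count++;
    }
  }
  return count;
}

function handler(event) {
  var request = event.request;
  if (countFilterGroups(request.uri) > MAX_GROUPS) {
    // Generated response: the request never reaches the origin.
    return { statusCode: 403, statusDescription: 'Forbidden' };
  }
  return request;
}
```

Because this is ordinary code, extending it to also count values per group, or to weigh both signals together, is a few lines rather than a new regex.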

But the botnet kept adapting. When we deployed the function on /plays/* and /collections/*, traffic shifted almost entirely to /resources/*. When we covered that path, probes appeared on /catalogsearch/*. We ended up deploying the function on the default CloudFront behavior to cover every path, after verifying through log analysis that no legitimate traffic would be affected.

The attack appeared to stop immediately. Thirty-six hours later, overall request volume on the CloudFront charts actually dropped significantly, leading us to believe the attacker had given up.

Phase 3: Hardening the Application

We were not fully convinced the attack was over, and we were still concerned about site performance in general. So we did another pass through the application code, this time looking specifically for slow and unnecessary database queries. We also reviewed the code under the assumption that the whole thing might be a self-inflicted positive feedback loop, just in case we had missed something in prior reviews.

We found real problems: excess database queries and inefficiencies in how Magento was processing filter combinations. We pushed fixes for both.

The result was Stage Partners averaging under half a second per page load. That was a huge win regardless of the attack, and it meant the site would be far more resilient if the bot came back.

Phase 4: Behavioral Signals

Then we discovered that the CloudFront Function was too strict and was blocking some legitimate traffic. We relaxed the thresholds, nervous that the attack would return.

It did, but with a new pattern. The bot started running during overnight hours and stopping around 6:30 AM each day.

We decided to pull the raw WAF logs for deeper analysis. The real leap came from parsing 2.4 GB of log data covering over 800,000 requests captured during one of those overnight degradations. Instead of just looking at URLs, we examined the full request context: cookies, sec-fetch-mode headers, and referer patterns.

The findings were striking. Of the attack traffic, 99.7 percent carried no cookies at all, while 90 percent of legitimate traffic included Magento session cookies. The attack was 96 percent sec-fetch-mode: navigate (full page loads), while real users browsing with filters primarily generated AJAX requests (cors or no-cors). And 97 percent of the cookie-less filter traffic spoofed a self-referer, claiming to navigate from yourstagepartners.com. That is impossible without session cookies.

These signals gave us a layered detection model. A request with a self-referer but no cookies is almost certainly spoofed, because real users navigating within a site always carry session cookies. A full-page load to a filtered URL without cookies and without a search engine referer is suspicious. And the filter-complexity thresholds still serve as a backstop for anything that slips through.
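The spoofed self-referer check, the strongest of these signals, can be sketched as follows. Header and cookie access mirrors how CloudFront Functions expose them (`headers.referer.value`, cookie objects keyed by name); the session cookie name here is the Magento default and stands in for the real check:

```javascript
// Sketch of the self-referer spoof check. A referer pointing back at
// the site itself, arriving with no session cookie, cannot come from
// real in-site navigation. Cookie name (PHPSESSID) is illustrative.
function isSpoofedSelfReferer(request) {
  var headers = request.headers || {};
  var cookies = request.cookies || {};
  var referer = headers.referer ? headers.referer.value : '';
  var hasSessionCookie = !!cookies.PHPSESSID;
  return referer.indexOf('yourstagepartners.com') !== -1 && !hasSessionCookie;
}
```

The other signals (sec-fetch-mode, cookie presence, filter complexity) combine the same way: each is a cheap boolean or count, and the function weighs them together per request.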

The Hardest Part: Not Blocking Real Users

Every rule we tightened against the bot risked blocking a real customer. Stage Partners' advanced search generates URLs with three filter groups. Users might share filtered links with colleagues. First-time visitors arrive from Google with no cookies yet. Someone bookmarks a filtered page and returns weeks later.

Each of these is a legitimate scenario that shares characteristics with the botnet. The function went through multiple iterations to handle them: exempting search engine referers from cookie checks, allowing cookie-bearing users up to four filter groups, distinguishing shared links (no referer) from spoofed navigation (self-referer without cookies), and permitting low-complexity filter patterns even from unknown sources.

We validated every version against the 800,000-request WAF dataset, running simulations to measure both catch rate and false positive rate before deploying. The final version blocked 95 percent of attack traffic with zero false positives on cookie-bearing legitimate users.

But we still had a problem. Every time we loosened the thresholds to let a legitimate user pattern through, attack traffic seeped in and database load climbed. We needed a way to definitively separate humans from bots, not just guess based on request headers.

Phase 5: Proving Humanity at the Edge

The breakthrough came from a simple insight: a bot making raw HTTP requests is unlikely to execute JavaScript. A real browser loading any page on the site will run the page's scripts. A bot cycling through filter URLs with forged headers will not.

We built a tiered verification system using two JavaScript-set cookies. The first is a lightweight verification cookie set by a small inline script on every page. When a browser loads any page, the script sets a cookie with an encoded date stamp. The value rotates daily and uses a simple hash, so it cannot be easily guessed or replayed from a previous session. If a request arrives with this cookie, we know a real browser loaded at least one page in the last 24 hours.
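The inline script can be sketched like this. The cookie name and hash function are illustrative placeholders; the real scheme is shared with the CloudFront Function so the edge can validate the daily value:

```javascript
// Sketch of the inline browser script that sets the lightweight
// verification cookie. Cookie name and hash are illustrative.
function dailyToken(dateStr) {
  // Tiny non-cryptographic hash of the date string, e.g. "2026-02-14".
  // The value changes every day, so stale values cannot be replayed.
  var h = 0;
  for (var i = 0; i < dateStr.length; i++) {
    h = ((h << 5) - h + dateStr.charCodeAt(i)) | 0;
  }
  return Math.abs(h).toString(36);
}

// Guarded so the snippet also runs outside a browser environment.
if (typeof document !== 'undefined') {
  var today = new Date().toISOString().slice(0, 10);
  document.cookie = 'js_verified=' + dailyToken(today) +
    '; path=/; max-age=86400; SameSite=Lax';
}
```

Any request arriving with a valid `js_verified` value proves a real browser executed the site's JavaScript within the last 24 hours.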

The second layer uses Cloudflare Turnstile, a free, invisible challenge platform that runs a background proof-of-humanity check without showing users a CAPTCHA. We added a Turnstile widget that loads silently on every page. When it succeeds, which it does almost instantly for real browsers, we set a second cookie with the validation timestamp. This cookie proves not just that JavaScript ran, but that an actual human-driven browser session passed Cloudflare's behavioral analysis.

With these two signals feeding into the CloudFront Function, we built a four-tier detection model. Tier 0 is a request carrying the Turnstile cookie: confirmed human, essentially unrestricted. Tier 1 has the JavaScript verification cookie but no Turnstile: very likely real, with generous thresholds. Tier 2 has a PHP session cookie but neither verification cookie: probably a returning user whose JavaScript has not run yet on this visit, so moderate restrictions apply. Tier 3 has no cookies at all: the strictest rules, matching the behavioral detection from Phase 4.
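The tier assignment itself reduces to a short cascade. The cookie names here (`turnstile_ok`, `js_verified`, `PHPSESSID`) are illustrative stand-ins for the production names:

```javascript
// Sketch of the four-tier classification from the cookies on a request.
// Cookie names are illustrative placeholders.
function classifyTier(cookies) {
  if (cookies.turnstile_ok) return 0; // confirmed human: unrestricted
  if (cookies.js_verified) return 1;  // JS ran recently: generous thresholds
  if (cookies.PHPSESSID) return 2;    // session only: moderate restrictions
  return 3;                           // no cookies: strictest rules
}
```

Each tier then maps to its own filter-complexity thresholds, so loosening the rules for verified humans never loosens them for cookie-less traffic.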

The key innovation is what happens when a real user does get caught by the stricter tiers. Instead of returning a hard 403 block, the function serves a lightweight Turnstile challenge page directly from the CloudFront edge. The page loads instantly, runs the invisible Turnstile check, sets the validation cookie, and redirects the user to their original destination. For a real person in a real browser, this takes about a second and requires no interaction. They might briefly see a "Verifying your browser" message before the page they wanted loads normally.
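A simplified sketch of that edge-served challenge response. The Turnstile site key, cookie name, and markup are placeholders, and in production the Turnstile token is validated server-side before its cookie is trusted; only the overall shape is shown here:

```javascript
// Sketch of the challenge page generated directly by the CloudFront
// Function instead of a 403. Site key and cookie name are placeholders.
function challengeResponse(originalUri) {
  var html =
    '<!DOCTYPE html><html><head>' +
    '<script src="https://challenges.cloudflare.com/turnstile/v0/api.js" async defer></script>' +
    '</head><body><p>Verifying your browser…</p>' +
    '<div class="cf-turnstile" data-sitekey="SITE_KEY_PLACEHOLDER"' +
    ' data-callback="onVerified"></div>' +
    '<script>function onVerified(token){' +
    'document.cookie="turnstile_ok="+token+"; path=/; max-age=86400";' +
    'location.replace(' + JSON.stringify(originalUri) + ');' +
    '}</script></body></html>';
  return {
    statusCode: 401,
    statusDescription: 'Unauthorized',
    headers: { 'content-type': { value: 'text/html' } },
    body: html
  };
}
```

A real browser runs the invisible widget, gets the cookie, and is redirected back to the page it asked for; a bot receives the same HTML and goes nowhere.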

For the bots, that challenge page is a dead end. They receive the HTML but cannot execute the JavaScript, cannot complete the Turnstile challenge, and cannot obtain the cookie that would let them through on subsequent requests. The attack traffic hits a wall it cannot adapt around, because the fundamental constraint is not a URL pattern or a threshold they can probe their way past. It is the inability to run a browser.

Why This Combination Was Necessary

This approach is, to our knowledge, a novel use of Cloudflare Turnstile. Turnstile is typically deployed as a form-protection widget or a page-level challenge managed through Cloudflare's own CDN. We do not have access to the Cloudflare CDN from this environment, so that kind of automated page-level challenge was not possible.

AWS WAF can also issue challenges, but only as an action on a WAF rule, which does not give us the flexibility we needed for this detection. The behavioral signals we built (cookie tiers, sec-fetch-mode analysis, referer classification, filter-complexity scoring) all require the programmable logic of a CloudFront Function.

By combining the two platforms, we were able to get the best of both: human-detection intelligence from Cloudflare combined with edge compute power in AWS CloudFront, resulting in a challenge-on-demand system at the edge that neither platform offers on its own.

Results

The defense now operates across multiple layers: the country allowlist filters out irrelevant regions, WAF managed rules handle standard threats, the CloudFront cache absorbs repeat requests, the CloudFront Function evaluates behavioral signals and serves Turnstile challenges at the edge, and the application itself is significantly leaner than it was a month ago.

The botnet is still active. We can see the attack traffic hitting the function daily. But it is being absorbed without impact. The Turnstile challenge pages are serving as intended: real users pass through in about a second with no interaction required, while bot traffic hits the challenge and goes nowhere. Page loads are holding below 0.5 seconds even during peak attack periods, and the database is barely noticing the load at all.

The bot will almost certainly adapt again. They have before, and a sophisticated attacker with a residential proxy network and purpose-built tooling is not going to walk away quietly. But we are in a fundamentally different position now. The detection is based on what a request is, not just what it looks like. No matter how cleverly the bot rotates its IPs or varies its request patterns, it cannot fake being a real browser that executes JavaScript and passes a humanity check. That is an asymmetry we can build on.

Key Takeaways

Don't skip the application-level work. Optimizing the application itself was one of the most impactful steps we took. Cutting page loads from multiple seconds to under half a second meant the site could absorb more abuse before users noticed, and it made every other mitigation more effective.

CloudFront Functions are underrated for bot mitigation. The ability to evaluate cookies, headers, referers, and URL structure together, at the edge, in under a millisecond, with no origin load, is extremely powerful. The constraints (ES 5.1, 1ms compute, no network calls) force tight, efficient code, but that is exactly what you want for request-level checks.

Behavioral signals beat pattern matching. The botnet can rotate IPs, spoof user agents, and vary URL patterns, but it cannot forge session cookies for a site it has not actually visited. That asymmetry is what ultimately makes the defense work.

Log analysis is worth the effort. Parsing 2.4 GB of raw WAF logs was tedious, but it revealed behavioral patterns that no dashboard or summary metric would have shown. The cookie distribution, the sec-fetch-mode breakdown, the self-referer spoofing: all of that came from digging into the raw data.

Attackers adapt, and so must you. Every time we deployed a new defense, the botnet changed tactics within hours. Blocking filter paths? It moved to new paths. Deploying a CloudFront Function? It shifted to overnight-only runs. The attack did not stop until we moved from blocking specific patterns to evaluating request behavior.

Challenge, don't just block. Replacing hard 403 responses with Turnstile challenge pages was a philosophical shift as much as a technical one. Blocking is binary and always risks false positives. Challenging lets real users prove themselves while still stopping bots. We initially thought this was not going to be possible because we were working at the edge instead of in a WAF rule. By combining platforms in unexpected ways (Cloudflare's challenge intelligence inside AWS's edge compute) we were able to produce a solution that neither vendor offers out of the box.
