How does edge caching for bot traffic work?
Reduce the load on backend systems, lower Builderio costs, and speed up first visits by caching sessionless bot traffic at the CloudFront edge.
This guide describes a caching layer in the AI Commerce platform that stores sessionless SSR responses to bot traffic at CloudFront EDGE locations. The benefits include significantly lower load on backend systems (backend and CMS), faster response times for search bots and first-time visitors, and lower impression-based costs from third-party services (e.g. Builderio).
How the feature works (in brief)
- Location: The cache operates in CloudFront EDGE locations (geographically close to the visitor).
- What is cached: Server-Side Rendering (SSR) requests that do not contain session information.
- Path and parameter requirements: The request targets a public visitor-facing path and includes only parameters that the store generally supports or that are separately whitelisted (e.g. pagination and sorting).
- Marketing tracking excluded: Requests with marketing tracking parameters (e.g. gclid, fbclid, utm_source) are not cached.
- Sessions are never cached: A request containing the customer's session information is always routed to the backend systems. This is a security measure: one visitor's personal data must never be served to another.
- Cache age: For bots and sessionless visitors, content can be up to 24 hours old.
Examples of parameter rules
| Parameter / example | Caching | Notes |
|---|---|---|
| page=2 | Allowed | Pagination parameter (categories, etc.). |
| sort=price_desc | Allowed | General sorting parameter. |
| gclid, fbclid, utm_source | Not allowed | Marketing tracking parameters; requests carrying them bypass the cache. |
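The parameter rules in the table can be sketched as a small check on the query string. This is a minimal illustration, not the platform's actual implementation: the names `ALLOWED_PARAMS`, `TRACKING_PARAMS`, `TRACKING_PREFIXES`, and `query_allows_caching`, as well as the `shop.example` URLs, are hypothetical, and a real store may whitelist more parameters than `page` and `sort`.

```python
from urllib.parse import parse_qsl, urlsplit

# Assumed values taken from the table above; a real store may allow more.
ALLOWED_PARAMS = {"page", "sort"}
TRACKING_PARAMS = {"gclid", "fbclid"}
TRACKING_PREFIXES = ("utm_",)


def query_allows_caching(url: str) -> bool:
    """Return True only if every query parameter is whitelisted."""
    query = urlsplit(url).query
    for name, _value in parse_qsl(query, keep_blank_values=True):
        if name in TRACKING_PARAMS or name.startswith(TRACKING_PREFIXES):
            return False  # marketing tracking always bypasses the cache
        if name not in ALLOWED_PARAMS:
            return False  # unknown parameters are treated as non-cacheable
    return True


if __name__ == "__main__":
    for url in (
        "https://shop.example/category/shoes?page=2",
        "https://shop.example/category/shoes?sort=price_desc",
        "https://shop.example/category/shoes?page=2&utm_source=newsletter",
    ):
        print(url, query_allows_caching(url))
```

Treating every unlisted parameter as non-cacheable follows the whitelist rule: anything the store has not explicitly allowed goes to the backend.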
SSR vs. CSR
- SSR (Server-Side Rendering): First page loads and external requests. Only these are eligible for the edge cache, and only when there is no session.
- CSR (Client-Side Rendering): Site navigation and component updates between the browser and the server. CSR requests are not served from the edge cache because at this point the user is verifiably human.
What is never cached
- Requests that contain session information (e.g. a logged-in customer or an active visit session).
- Requests with marketing tracking parameters (e.g. gclid, fbclid, utm_*).
- Control panel views and other non-public paths.
- Developer requests that expose background information about the system.
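The session, path, and developer rules above can be sketched as a single function that names the reason a request skips the cache. Everything specific here is an assumption for illustration: the cookie name `session_id`, the path prefixes `/admin` and `/dashboard`, and the `x-debug` header are hypothetical, since the actual identifiers used by the platform are not given. The tracking-parameter rule is left out to keep the sketch short.

```python
from typing import Mapping, Optional

# Hypothetical names: the source does not specify cookie names, paths, or headers.
SESSION_COOKIES = {"session_id"}
PRIVATE_PREFIXES = ("/admin", "/dashboard")
DEBUG_HEADER = "x-debug"


def cookie_names(cookie_header: str) -> set:
    """Extract cookie names from a raw Cookie header value."""
    return {
        part.split("=", 1)[0].strip()
        for part in cookie_header.split(";")
        if part.strip()
    }


def bypass_reason(path: str, headers: Mapping[str, str]) -> Optional[str]:
    """Return why a request must skip the edge cache, or None if no rule applies."""
    lowered = {name.lower(): value for name, value in headers.items()}
    if SESSION_COOKIES & cookie_names(lowered.get("cookie", "")):
        return "session"
    # Match whole path segments so "/administrator-guide" is not treated as "/admin".
    if any(path == p or path.startswith(p + "/") for p in PRIVATE_PREFIXES):
        return "non-public path"
    if DEBUG_HEADER in lowered:
        return "developer request"
    return None


if __name__ == "__main__":
    print(bypass_reason("/category/shoes", {}))
    print(bypass_reason("/category/shoes", {"Cookie": "session_id=abc"}))
    print(bypass_reason("/admin/orders", {}))
```

The session check comes first because it is the security-critical rule: a request with a session must reach the backend regardless of its path.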
Benefits
- Cost savings: Many services charge per impression (e.g. Builderio). The cache significantly reduces the impressions generated by bots.
- Speed for crawlers and first-time visitors: Without caching, a WordPress CMS API response typically takes about 500–2000 ms. Edge caches often return responses in a few milliseconds.
- Scalability and security: Up to ~90% of requests can come from bots. Edge locations absorb massive traffic spikes (including denial-of-service attacks) without passing the load on to backend systems.
Disadvantages and limitations
- 24-hour freshness for bots: Bots can see content that is up to 24 hours old. Search engines don't usually update their indexes in real time, so the impact is minimal in practice.
- Not suited to time-critical updates for bots: If crawlers must see changes immediately, edge caching is the wrong tool, although search engines rarely update their results immediately anyway. For e-commerce, this is generally an acceptable tradeoff.
Impact on different user groups
- Bots (crawlers): Pages are often served from the cache within milliseconds, which reduces backend load and impression costs.
- First visits / expired sessions: A visitor without a session (e.g. on a first visit, or once the session has expired after ~7 days) benefits equally from faster loading times.
- Admins and logged-in users: Because of the session, they always see up-to-date content. The edge cache does not speed up their requests (sessions are never cached).
Configuration principles (where to edit)
- Location: Settings are applied at the CloudFront EDGE locations of the AI Commerce environment.
- Cache lifetime: TTL up to 24 hours for sessionless SSR requests.
- Whitelisted parameters: Only parameters the store supports or specifically allows, such as page and sort.
- Blacklisted parameters: Requests carrying marketing trackers such as gclid, fbclid, and utm_* bypass the cache.
- Session IDs: If the request contains a session cookie or a similar identifier, the cache is always bypassed.
- Target paths: Public visitor paths (homepage, categories, CMS content), i.e. views that do not require login.
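One common way to express a cache lifetime like this is through the `Cache-Control` header on the origin response, where `s-maxage` sets the lifetime for shared caches such as a CDN. The sketch below assumes that approach purely for illustration; this guide does not state how the AI Commerce environment actually conveys the TTL, and the names `TTL_SECONDS` and `cache_control` are hypothetical.

```python
TTL_SECONDS = 24 * 60 * 60  # the 24-hour upper bound from the list above


def cache_control(cacheable: bool, ttl_seconds: int = TTL_SECONDS) -> str:
    """Build a Cache-Control value for a response (assumed header-based setup)."""
    if cacheable:
        # s-maxage applies to shared caches (the CDN); max-age=0 makes
        # browsers revalidate instead of keeping their own 24-hour copy.
        return f"public, max-age=0, s-maxage={ttl_seconds}"
    # Session-bound or otherwise excluded responses must never be stored.
    return "private, no-store"


if __name__ == "__main__":
    print(cache_control(True))
    print(cache_control(False))
```

Keeping the browser lifetime at zero while the edge holds the page for up to 24 hours means a visitor who later gains a session is not stuck with a stale local copy.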
Example calculation of Builderio costs
| Description | Volume |
|---|---|
| Builderio Basic | 10,000 impressions / month |
| Builderio Pro | 100,000 impressions / month |
| Bot indexing (1,000 requests / day) | ≈ 30,000 impressions / month |
- Without a cache, aggressive indexing can exceed the Basic package limit on its own.
- The edge cache cuts the bot traffic that reaches Builderio and reduces the risk of exceeding package limits.
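The arithmetic behind the table can be reproduced in a few lines. The package limits and the 1,000 requests per day come from the table; the `edge_hit_ratio` parameter and the 90% value used in the example are purely illustrative assumptions, not measured figures.

```python
BASIC_LIMIT = 10_000   # impressions / month, from the table above
PRO_LIMIT = 100_000    # impressions / month, from the table above


def monthly_impressions(
    bot_requests_per_day: int,
    days: int = 30,
    edge_hit_ratio: float = 0.0,
) -> int:
    """Estimate the bot impressions that still reach the CMS each month.

    edge_hit_ratio is the assumed share of bot requests answered by the
    edge cache; 0.0 means no cache at all.
    """
    total = bot_requests_per_day * days
    return round(total * (1.0 - edge_hit_ratio))


if __name__ == "__main__":
    without_cache = monthly_impressions(1_000)
    with_cache = monthly_impressions(1_000, edge_hit_ratio=0.9)  # illustrative ratio
    print(without_cache, without_cache > BASIC_LIMIT)
    print(with_cache, with_cache > BASIC_LIMIT)
```

Without a cache, 1,000 bot requests per day add up to 30,000 impressions per month, three times the Basic limit. How far the cache reduces that number depends on the real hit ratio, which varies with the number of distinct pages and how often bots revisit them.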
Concepts
- AI Commerce: An e-commerce platform with integrated edge caching.
- Builderio: A CMS whose impressions (page views) may incur costs.
- Dashboard: The store maintenance view (logged-in users are not covered by the edge cache because of the session).
- EDGE location: A site in the CloudFront network that caches and serves content close to the visitor.
- Session: An identifier (such as a browser cookie) that uniquely identifies a visitor and whose presence will bypass the cache.
- SSR/CSR: Server-Side Rendering (cacheable without session) vs. Client-Side Rendering (no cache).
Summary and recommended next action
Edge caching serves bot traffic and sessionless first visits from CloudFront, reducing backend load, page load times, and third-party impression costs.
Search terms
- AI Commerce
- CloudFront edge cache
- bot traffic cache
- SSR vs CSR
- Builderio costs
- WordPress CMS API
- gclid fbclid utm
- Control panel
- search engine crawler
- scalability DDoS