What is Bot Traffic? The 51% Your Analytics Miss
TL;DR
- More than half of all internet traffic comes from automated bots, not human visitors.
- Google Analytics and most analytics tools filter out bot traffic entirely, leaving a massive blind spot.
- AI crawlers like GPTBot and ClaudeBot now decide which products get recommended to millions of consumers.
- Ignoring bot traffic means ignoring the fastest-growing distribution channel in ecommerce.
What is Bot Traffic?
Bot traffic is any visit to your website that comes from an automated software program rather than a human being. These programs – called bots, crawlers, spiders, or scrapers – visit web pages for a wide range of purposes: indexing content for search engines, training AI models, monitoring uptime, scraping prices, testing security vulnerabilities, and much more.
Industry studies, including Imperva's annual bot report, consistently find that bots account for between 45% and 55% of all web traffic globally. The proportion is even higher for ecommerce sites, where product data is especially valuable. When you log into Google Analytics and see your visitor count, you are only seeing the human half. The other half – the bot half – is completely invisible.
This was acceptable when bots were mostly search engine crawlers following well-understood rules. But the landscape has changed fundamentally. A new generation of AI crawlers is visiting your site dozens or hundreds of times per day, and the data they collect directly determines whether AI assistants like ChatGPT, Claude, and Perplexity recommend your products – or your competitor's.
Why Bot Traffic Matters for Ecommerce
For most of the internet's history, the relationship between bots and websites was simple: Googlebot crawled your pages, indexed them, and ranked them in search results. If you wanted more traffic, you optimized for Google. Everything else was noise.
That model is breaking down. Consumers increasingly ask AI assistants for product recommendations instead of typing queries into Google. When someone asks ChatGPT for the best running shoes under $150, GPTBot's crawl of your product pages determines whether your brand appears in the answer. No crawl, no recommendation. Bad crawl, wrong recommendation. This is not hypothetical – it is happening right now, at scale.
The ecommerce brands that understand this shift have a window of opportunity. AI-driven product recommendations are still in their early phase. The crawl patterns are not yet optimized for every site. The brands that actively monitor and optimize their bot traffic today will establish an advantage that compounds over time – similar to how early SEO adopters dominated search rankings for years.
There is also a defensive angle. Malicious bots represent a real cost: they inflate server load, distort analytics data, scrape pricing intelligence for competitors, and can execute inventory hoarding attacks during flash sales. Understanding your bot traffic profile is the first step toward blocking the bad actors while welcoming the bots that drive revenue.
Types of Bots Visiting Your Site
Not all bots are created equal. Understanding the different categories helps you decide which ones to welcome, which to monitor, and which to block.
Search Engine Crawlers
Googlebot, Bingbot, YandexBot, and DuckDuckBot are the workhorses of the web. They crawl your pages to build the search index that powers traditional search results. These bots are well-behaved, respect robots.txt directives, and identify themselves clearly. You almost always want them on your site.
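If you want to see what a compliant crawler is allowed to do on your site, Python's standard-library robotparser mirrors the check these bots perform before fetching a page. A minimal sketch – the domain below is a placeholder:

```python
from urllib.robotparser import RobotFileParser

# Well-behaved crawlers fetch robots.txt and honor it before crawling.
# This reproduces that check. "example-store.com" is a placeholder domain.
parser = RobotFileParser()
parser.set_url("https://example-store.com/robots.txt")
parser.read()

for agent in ("Googlebot", "Bingbot"):
    allowed = parser.can_fetch(agent, "https://example-store.com/products/")
    print(f"{agent} may crawl /products/: {allowed}")
```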
AI Crawlers
This is the fastest-growing category. GPTBot (OpenAI), ClaudeBot (Anthropic), PerplexityBot (Perplexity AI), Bytespider (TikTok / ByteDance), and Google-Extended (Google's AI training crawler) visit your site to gather data that feeds large language models and AI-powered answer engines. Their crawl patterns, frequency, and the pages they prioritize directly influence whether AI assistants recommend your products. For a deep dive into each crawler, see our guide to AI crawlers.
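Spotting these crawlers in your own logs is usually a matter of substring-matching the user-agent header. Here is a minimal sketch; the tokens are the agent names listed above, but treat the table as illustrative rather than exhaustive, since operators add new crawlers over time:

```python
# Maps a published agent token to the operator behind it.
# Illustrative, not exhaustive -- new AI crawlers appear regularly.
AI_CRAWLERS = {
    "GPTBot": "OpenAI",
    "ClaudeBot": "Anthropic",
    "PerplexityBot": "Perplexity AI",
    "Bytespider": "ByteDance",
    "Google-Extended": "Google (AI training)",
}

def identify_ai_crawler(user_agent: str) -> str | None:
    """Return the operator name if this user agent is a known AI crawler."""
    for token, operator in AI_CRAWLERS.items():
        if token in user_agent:
            return operator
    return None

# Example user-agent string (simplified):
print(identify_ai_crawler("Mozilla/5.0; compatible; GPTBot/1.0"))  # -> OpenAI
```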
SEO and Monitoring Bots
SemrushBot, AhrefsBot, MJ12bot, and similar crawlers index backlinks, keywords, and site architecture for SEO tools. Uptime monitors like Pingdom and UptimeRobot make periodic requests to verify your site is live. These bots serve legitimate purposes but can be aggressive in crawl frequency if left unchecked.
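One practical way to catch an over-aggressive crawler is to measure its request rate directly. A rough sketch, assuming you have already extracted (timestamp, user-agent) pairs from your logs and that 60 requests per minute is your tolerance – both are assumptions to tune:

```python
from collections import defaultdict
from datetime import datetime

RPM_THRESHOLD = 60  # assumed tolerance; tune to your infrastructure

# (timestamp, user_agent) pairs, pre-extracted from access logs.
# The two sample rows are illustrative.
hits = [
    (datetime(2024, 5, 1, 12, 0, 1), "SemrushBot"),
    (datetime(2024, 5, 1, 12, 0, 2), "SemrushBot"),
]

# Bucket requests into one-minute windows per user agent.
per_minute = defaultdict(int)
for ts, agent in hits:
    per_minute[(agent, ts.replace(second=0, microsecond=0))] += 1

for (agent, minute), count in per_minute.items():
    if count > RPM_THRESHOLD:
        print(f"{agent}: {count} requests in the minute starting {minute}")
```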
Malicious and Unwanted Bots
Scrapers, credential stuffers, inventory hoarders, and DDoS bots represent the darker side of bot traffic. These programs do not identify themselves honestly (many spoof legitimate user-agent strings), do not respect robots.txt, and exist solely to extract value or cause harm. Imperva's annual bot report consistently finds that malicious bots account for roughly 30% of all internet traffic.
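Because the user-agent header can claim anything, user-agent matching alone cannot catch these bots. The standard countermeasure is reverse-DNS verification: resolve the requesting IP back to a hostname, check that the hostname belongs to the operator's published domain, then resolve it forward again and confirm it maps to the same IP. A sketch for a request claiming to be Googlebot (Google publishes googlebot.com and google.com as its verification domains):

```python
import socket

def is_verified_googlebot(ip: str) -> bool:
    """Verify a claimed Googlebot via reverse-then-forward DNS.

    A spoofed user agent fails this check because the attacker's IP
    does not reverse-resolve into Google's domains.
    """
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)  # reverse lookup
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        _, _, forward_ips = socket.gethostbyname_ex(hostname)
        return ip in forward_ips                   # forward confirmation
    except OSError:
        return False
```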
How to Measure Bot Traffic
The challenge with bot traffic measurement is that most analytics platforms are designed to exclude it. Google Analytics filters out known bots by default. Adobe Analytics does the same. This makes sense for measuring human engagement, but it creates an enormous blind spot for anyone trying to understand how bots interact with their site.
To properly measure bot traffic, you need to look at the raw server layer – specifically, your web server access logs. Every HTTP request that hits your server is recorded there, regardless of whether it came from a human browser or an automated crawler. The log entry includes the user-agent string, which bots use to identify themselves, along with the timestamp, requested URL, HTTP status code, and response size.
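Parsing those entries does not require special tooling. Most servers write the Apache/Nginx "combined" format by default, and a single regular expression pulls out every field; adjust the pattern if your log format is customized:

```python
import re

# Combined log format:
# ip - - [timestamp] "METHOD path HTTP/x" status size "referer" "user-agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<timestamp>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<user_agent>[^"]*)"'
)

# Illustrative log line (the IP is a documentation address):
line = ('203.0.113.7 - - [01/May/2024:12:00:01 +0000] '
        '"GET /products/running-shoes HTTP/1.1" 200 5320 '
        '"-" "Mozilla/5.0; compatible; GPTBot/1.0"')

match = LOG_PATTERN.match(line)
if match:
    print(match.group("user_agent"))  # the crawler identifies itself here
    print(match.group("path"), match.group("status"))
```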
The problem is that raw logs are noisy, voluminous, and difficult to analyze at scale. A mid-sized ecommerce site can generate millions of log lines per day. Manually parsing user-agent strings against a known-bot database, correlating crawl patterns across pages, and identifying anomalies is not practical work for a human being or a spreadsheet.
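A first pass at that classification is easy to sketch, and the sketch itself shows why it does not scale: a hardcoded list like the one below covers a handful of the thousands of active bots, and says nothing about spoofed agents or crawl patterns. KNOWN_BOTS here is a tiny illustrative stand-in for a real maintained database:

```python
from collections import Counter

# Tiny illustrative stand-in for a real, maintained known-bot database.
KNOWN_BOTS = ("Googlebot", "Bingbot", "GPTBot", "ClaudeBot",
              "PerplexityBot", "AhrefsBot", "SemrushBot")

def classify(user_agent: str) -> str:
    """Map a user-agent string onto a known bot, else 'human/unknown'."""
    for bot in KNOWN_BOTS:
        if bot in user_agent:
            return bot
    return "human/unknown"

counts = Counter()
with open("access.log") as log:  # stream; real logs are too big to load whole
    for line in log:
        # In combined log format the user agent is the last quoted field.
        parts = line.rsplit('"', 2)
        if len(parts) == 3:
            counts[classify(parts[1])] += 1

for label, n in counts.most_common():
    print(f"{label:>15}: {n:,} requests")
```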
This is exactly the problem botjar solves. The platform ingests your server logs (or captures events via a lightweight edge script), classifies every request against a comprehensive database of known bots, and presents the data in visual dashboards that make bot behavior as intuitive to understand as human behavior in Hotjar. You get datamaps (heatmaps for bots), crawl timelines, page-level AI Visibility Scores, and alerts when crawl patterns change.
Other approaches to bot measurement include CDN-level analytics (Cloudflare, Fastly, and Akamai all provide some bot classification), WAF (Web Application Firewall) logs, and custom middleware that inspects incoming requests. Each has trade-offs in accuracy, granularity, and ease of implementation. The key requirement is capturing traffic at the server layer, before analytics JavaScript has a chance to filter it out.
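As an illustration of the custom-middleware option, a few lines of WSGI middleware will record every request at the server layer, before any analytics JavaScript can run. A minimal sketch, not production code; the log file name is arbitrary:

```python
import logging
from datetime import datetime, timezone

logging.basicConfig(filename="bot_traffic.log", level=logging.INFO)

class BotLoggingMiddleware:
    """WSGI middleware that records every request at the server layer.

    Runs for every HTTP request, so it sees bot traffic that never
    executes client-side analytics JavaScript.
    """

    def __init__(self, app):
        self.app = app

    def __call__(self, environ, start_response):
        logging.info(
            "%s %s %s ua=%r",
            datetime.now(timezone.utc).isoformat(),
            environ.get("REQUEST_METHOD", "-"),
            environ.get("PATH_INFO", "-"),
            environ.get("HTTP_USER_AGENT", "-"),
        )
        return self.app(environ, start_response)

# Usage with any WSGI app (Flask, Django, etc.):
# app.wsgi_app = BotLoggingMiddleware(app.wsgi_app)
```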
The Revenue Impact of Bot Traffic
Bot traffic impacts ecommerce revenue through three distinct channels: discovery, cost, and intelligence.
On the discovery side, AI crawlers determine whether your products appear in AI-generated recommendations. When a consumer asks an AI assistant for product advice, the assistant draws on training data and retrieval-augmented generation (RAG) to formulate its answer. If AI crawlers have not recently visited your product pages, or if they encountered errors, slow load times, or poorly structured content, your products are far less likely to appear in those recommendations. Every missed crawl is a missed opportunity for discovery.
On the cost side, excessive or malicious bot traffic directly inflates infrastructure costs. Bots that aggressively scrape product catalogs consume bandwidth, CPU cycles, and CDN egress. During high-traffic events like Black Friday, bot traffic can compete with human visitors for server resources, causing slowdowns that cost real revenue. Some ecommerce operators report that bot mitigation reduces their cloud hosting bills by 15-30%.
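You can put a rough number on this from the same access logs by summing the response sizes of bot requests and multiplying by your egress rate. A back-of-the-envelope sketch; the $0.09/GB price and the user-agent tokens are assumptions to replace with your own:

```python
EGRESS_PRICE_PER_GB = 0.09  # assumed example rate; check your provider's pricing

def estimate_bot_egress_cost(log_path: str) -> float:
    """Rough egress cost: sum response bytes of requests whose user agent
    contains a bot-like token. Assumes combined log format."""
    bot_tokens = ("bot", "Bot", "spider", "crawler")  # crude heuristic
    total_bytes = 0
    with open(log_path) as log:
        for line in log:
            # Split off the last two quoted fields: referer and user agent.
            parts = line.rsplit('"', 4)
            if len(parts) < 5:
                continue
            user_agent = parts[3]
            if any(tok in user_agent for tok in bot_tokens):
                fields = parts[0].split()  # ends with: ... status size
                if fields and fields[-1].isdigit():
                    total_bytes += int(fields[-1])
    return (total_bytes / 1e9) * EGRESS_PRICE_PER_GB

# print(f"Bot egress over this log: ${estimate_bot_egress_cost('access.log'):.2f}")
```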
On the intelligence side, competitor bots scrape your pricing, inventory levels, and product descriptions to feed competitive intelligence platforms. If you are not aware of which bots are extracting this data, you are handing your competitors a strategic advantage without even knowing it. Monitoring bot traffic gives you visibility into who is scraping what, and the ability to selectively restrict access.
The brands that treat bot traffic as a strategic asset – welcoming the right crawlers, blocking the bad ones, and optimizing the experience for AI – will outperform those that continue to ignore the other 51%. The shift from traditional SEO to Bot CRO is not a question of if, but when. The question is whether you will be ahead of the curve or behind it.
See your bot traffic in action
Botjar analyzes your server logs and shows you exactly which bots visit, what they see, and what they miss. Free audit takes 60 seconds.
See Your Bot Traffic – Demo