Fixing Index Bloat: A Robots.txt & Sitemap Strategy for 1M+ Pages

Client: A large online marketplace with over 1 million product listings.

Industry

E-commerce (Marketplace)

Platform

Custom PHP Framework

Timeline

4 months

Challenges We Faced

Wasted Crawl Budget

Outdated Sitemaps

Index Bloat

Robots.txt Errors

Save Your Crawl Budget

“Is Google crawling the wrong pages? Our Robots.txt Optimization services will guide bots to your money pages.”

Our Approach – How We Solved These Challenges

Robots.txt Optimization

Our robots.txt optimization team audited the client's server logs and updated the robots.txt file to Disallow: /search? and other low-value parameter URLs.
We also unblocked CSS and JS resources to fix rendering issues.
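The resulting file followed this general shape (the paths, parameters, and sitemap URL below are illustrative, not the client's actual rules):

```text
User-agent: *
# Block internal search and low-value parameter URLs
Disallow: /search?
Disallow: /*?sort=
Disallow: /*?filter=
# Keep rendering resources crawlable
Allow: /assets/css/
Allow: /assets/js/

Sitemap: https://www.example.com/sitemap-index.xml
```

The Allow rules matter because blocking CSS/JS prevents Googlebot from rendering pages the way users see them.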

XML Sitemap Creation & Optimization

We built a new, dynamic XML sitemap system that auto-updated daily.
We split the sitemap into smaller “child” sitemaps (e.g., sitemap-products-1.xml) to help Google process them faster.
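The split-sitemap approach can be sketched as follows. This is a minimal illustration, not the client's implementation (which ran on their PHP framework); the sitemap protocol caps each file at 50,000 URLs, and the child filenames mirror the sitemap-products-N.xml convention above:

```python
from xml.sax.saxutils import escape

CHUNK = 50_000  # sitemap protocol limit: 50,000 URLs per file


def chunked(urls, size=CHUNK):
    """Split a flat URL list into sitemap-sized chunks."""
    for i in range(0, len(urls), size):
        yield urls[i:i + size]


def build_child_sitemap(urls):
    """Render one child sitemap (<= 50,000 URLs) as an XML string."""
    entries = "\n".join(f"  <url><loc>{escape(u)}</loc></url>" for u in urls)
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}\n"
            "</urlset>\n")


def build_sitemap_index(base_url, n_children):
    """Render the parent index that points Google at each child sitemap."""
    entries = "\n".join(
        f"  <sitemap><loc>{base_url}/sitemap-products-{i}.xml</loc></sitemap>"
        for i in range(1, n_children + 1))
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}\n"
            "</sitemapindex>\n")
```

In production this would be regenerated daily from the product database, so new and removed listings are reflected within 24 hours.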

Crawl Budget Analysis

We used Log File Analysis to confirm that after the robots.txt update, Googlebot shifted its attention to the product pages.
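A simplified version of that analysis: tally Googlebot requests per top-level URL section from an access log in combined format, then compare the distribution before and after the robots.txt change. The regex and the "Googlebot" substring check are assumptions (a rigorous audit would also verify the bot via reverse DNS):

```python
import re
from collections import Counter

# Matches the request path and the trailing user-agent field of a
# combined-format access log line (field positions are assumptions).
LOG_LINE = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[\d.]+" .* "([^"]*)"$')


def googlebot_hits_by_section(lines):
    """Count Googlebot requests grouped by first path segment."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.search(line)
        if not m:
            continue
        path, user_agent = m.groups()
        if "Googlebot" not in user_agent:
            continue  # note: real audits should also reverse-DNS-verify the IP
        section = "/" + path.lstrip("/").split("/", 1)[0].split("?", 1)[0]
        counts[section] += 1
    return counts
```

A healthy shift looks like /product climbing toward the top of this counter while /search drops to near zero.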

Pruning & Index Bloat Removal

We submitted a removal request for the 2 million low-value search URLs.
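Compiling a removal list at that scale means classifying URLs programmatically rather than by hand. A hedged sketch, assuming the low-value patterns are internal search paths and a few faceted parameters (the actual pattern list would come from the log audit):

```python
from urllib.parse import urlparse

# Assumed low-value patterns; in practice these come from the crawl audit.
LOW_VALUE_PREFIXES = ("/search",)
LOW_VALUE_PARAMS = {"sort", "filter", "page"}


def is_low_value(url):
    """Flag internal-search and faceted-parameter URLs for removal."""
    parts = urlparse(url)
    if parts.path.startswith(LOW_VALUE_PREFIXES):
        return True
    params = {p.split("=", 1)[0] for p in parts.query.split("&") if p}
    return bool(params & LOW_VALUE_PARAMS)


def removal_candidates(urls):
    return [u for u in urls if is_low_value(u)]
```

The resulting list feeds the removal request; pairing it with noindex (and the robots.txt rules above) keeps the URLs from being re-indexed later.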

Results

| Metric | Before | After | Growth |
| --- | --- | --- | --- |
| Valid Pages Indexed (GSC) | 20% of total index | 95% of total index | +375% |
| Crawl Frequency (Product Pages) | Once every 30 days | Once every 3 days | +900% |
| Organic Traffic | 850,000/mo | 1,400,000/mo | +64% |

Free Crawl Budget Audit

“Not sure if you’re wasting crawl budget? Claim a Free Log File Analysis from our sitemap & robots.txt agency!”

DO YOU HAVE A PROJECT? 

If you’ve got a business challenge to solve or want to take your brand to the next level, we’d love to hear from you.

Simply complete this form and one of our experts will be in touch!