Tutorial

How to Scrape Yelp Business Data in 2026

9 min read

Yelp is the largest local business review platform in the US, with data on millions of businesses across every city. Whether you are building a lead generation tool, analyzing customer sentiment, or researching local markets, scraping Yelp gives you structured business data at scale.

Why scrape Yelp?

Yelp data powers a range of business intelligence use cases:

1. Lead generation

Build targeted B2B lead lists with business names, phone numbers, addresses, and websites from Yelp listings.

2. Review analysis

Analyze customer sentiment across thousands of reviews. Identify common complaints and praise for competitor research.

3. Local market research

Map business density, average ratings, and pricing tiers across neighborhoods and cities.

4. Reputation monitoring

Track your own business ratings and review velocity over time. Get alerted to new negative reviews.

The challenge

Yelp is notoriously aggressive about blocking scrapers. The site uses JavaScript rendering, obfuscated CSS class names that rotate regularly, CAPTCHA challenges, and IP-based rate limiting. Yelp has also pursued legal action against scraping operations, making it one of the more contentious targets.

Method 1: DIY with Puppeteer

Launch a headless browser, search Yelp, and extract business data:

scraper.js
// Assumes Puppeteer is installed: npm install puppeteer
const puppeteer = require('puppeteer');

(async () => {
  // Recent Puppeteer releases run the new headless mode for `true`;
  // some older releases selected it with 'new' instead.
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();

    // Present a desktop Chrome user agent instead of the headless default.
    await page.setUserAgent(
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ' +
      'AppleWebKit/537.36 (KHTML, like Gecko) ' +
      'Chrome/124.0.0.0 Safari/537.36'
    );

    await page.goto(
      'https://www.yelp.com/search?find_desc=pizza&find_loc=New+York',
      { waitUntil: 'networkidle2', timeout: 30000 }
    );

    // Wait until at least one search result card has rendered.
    await page.waitForSelector('[data-testid="serp-ia-card"]', { timeout: 10000 });

    // The css-* class names are generated and change between deployments;
    // treat them as placeholders and re-check them before each run.
    const businesses = await page.$$eval(
      '[data-testid="serp-ia-card"]',
      nodes => nodes.slice(0, 10).map(n => ({
        name: n.querySelector('a.css-19v1rkv')?.innerText,
        rating: n.querySelector('[aria-label*="star rating"]')?.getAttribute('aria-label'),
        reviews: n.querySelector('.css-chan6m')?.innerText,
        address: n.querySelector('.css-li27hp')?.innerText,
      }))
    );

    console.log(businesses);
  } finally {
    // Close the browser even if navigation or extraction throws.
    await browser.close();
  }
})();

Pain points

  • Yelp rotates CSS class names on every deployment, so selectors break constantly
  • CAPTCHA challenges appear after just a few requests from the same IP
  • Review content is partially hidden behind "read more" buttons requiring click simulation
  • Phone numbers and websites are sometimes gated behind click-to-reveal elements
  • Yelp has filed lawsuits against scraping operations, so legal risk is higher than for most targets
  • Pagination requires handling both URL parameters and JavaScript-loaded content
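
The URL-parameter half of the pagination problem can be handled by generating page URLs up front. The sketch below is illustrative only: the `find_desc` and `find_loc` parameters come from the search URL used in this tutorial, while the `start` offset and the page size of 10 are assumptions about Yelp's current search URLs that you should confirm in a browser before relying on them. `build_search_urls` is a hypothetical helper name.

```python
from urllib.parse import urlencode

SEARCH_URL = "https://www.yelp.com/search"
PAGE_SIZE = 10  # assumption: number of results shown per search page


def build_search_urls(term, location, pages):
    """Return one search URL per results page."""
    urls = []
    for page in range(pages):
        params = {"find_desc": term, "find_loc": location}
        if page > 0:
            # Assumption: later pages are selected with a `start` offset.
            params["start"] = page * PAGE_SIZE
        urls.append(f"{SEARCH_URL}?{urlencode(params)}")
    return urls


for url in build_search_urls("pizza", "New York", 3):
    print(url)
```

Each URL can then be passed to either method in this tutorial. Content that only loads through JavaScript still needs a rendered page.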

Method 2: SnapRender API

SnapRender handles the browser rendering and anti-bot challenges for you. Use /render for markdown or /extract for structured JSON.

Render as markdown

Get Yelp search results as clean markdown for AI processing or storage.

render.py
import requests

# Render Yelp search results as markdown
render = requests.post(
    "https://api.snaprender.dev/v1/render",
    headers={"x-api-key": "sr_live_YOUR_KEY"},
    json={
        "url": "https://www.yelp.com/search?find_desc=pizza&find_loc=New+York",
        "format": "markdown",
        "use_flaresolverr": True
    },
    timeout=60,  # fail instead of hanging if rendering stalls
)
render.raise_for_status()  # surface HTTP errors before parsing the body
print(render.json()["data"]["markdown"])

Extract structured data

Pull business names, ratings, review counts, and addresses as JSON.

extract.py
import requests

# Extract structured business data
extract = requests.post(
    "https://api.snaprender.dev/v1/extract",
    headers={"x-api-key": "sr_live_YOUR_KEY"},
    json={
        "url": "https://www.yelp.com/search?find_desc=pizza&find_loc=New+York",
        "use_flaresolverr": True,
        # The css-* class names change between Yelp deployments; re-check them before each run.
        "selectors": {
            "names": "a.css-19v1rkv",
            "ratings": "[aria-label*='star rating'] @aria-label",
            "review_counts": ".css-chan6m",
            "addresses": ".css-li27hp"
        }
    },
    timeout=60,  # fail instead of hanging if rendering stalls
)
extract.raise_for_status()  # surface HTTP errors before parsing the body
print(extract.json())

Example response

response.json
{
  "status": "success",
  "data": {
    "names": ["Joe's Pizza", "Prince Street Pizza", "Di Fara Pizza", ...],
    "ratings": ["4.5 star rating", "4.0 star rating", "4.5 star rating", ...],
    "review_counts": ["12,456 reviews", "8,231 reviews", "5,892 reviews", ...],
    "addresses": ["7 Carmine St", "27 Prince St", "1424 Avenue J", ...]
  },
  "url": "https://www.yelp.com/search?find_desc=pizza&find_loc=New+York",
  "elapsed_ms": 2650
}
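
The response returns four parallel lists, and most downstream tools want one record per business with numeric fields. The following is a minimal sketch of that conversion. It assumes the `data` object has the shape shown in the example response, and `to_records` is a hypothetical helper, not part of SnapRender.

```python
import re


def to_records(data):
    """Combine the parallel lists from an /extract response into one dict per business."""
    records = []
    rows = zip(data["names"], data["ratings"], data["review_counts"], data["addresses"])
    for name, rating, reviews, address in rows:
        # "4.5 star rating" -> 4.5
        rating_match = re.search(r"\d+(?:\.\d+)?", rating)
        # "12,456 reviews" -> 12456
        review_digits = re.sub(r"[^\d]", "", reviews)
        records.append({
            "name": name,
            "rating": float(rating_match.group()) if rating_match else None,
            "review_count": int(review_digits) if review_digits else None,
            "address": address,
        })
    return records


sample = {
    "names": ["Joe's Pizza", "Prince Street Pizza"],
    "ratings": ["4.5 star rating", "4.0 star rating"],
    "review_counts": ["12,456 reviews", "8,231 reviews"],
    "addresses": ["7 Carmine St", "27 Prince St"],
}
print(to_records(sample))
```

One caveat: `zip` pairs items by position and stops at the shortest list. If a listing has no address or no reviews, the lists can fall out of alignment, so compare their lengths before trusting the combined records.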

Legal considerations

Yelp scraping carries higher legal risk than most sites:

  1. Yelp has filed multiple lawsuits against data scrapers and is known for aggressive enforcement of its Terms of Service.
  2. Review content may be copyrighted by the individual reviewers. Reproducing reviews in bulk could trigger DMCA claims.
  3. Business contact information (phone, email) scraped from Yelp may be subject to telemarketing and anti-spam regulations if used for outreach.
  4. Rate-limit aggressively and never scrape at a volume that could impact Yelp's service. Consult legal counsel before building a commercial scraping pipeline.
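
One way to apply the rate-limiting advice is to enforce a minimum gap between requests and add random jitter. The `Throttle` class below is a hypothetical sketch, and the gap values are placeholders chosen to keep the demo short. The article does not state what request rate is safe, so choose values far more conservative than these.

```python
import random
import time


class Throttle:
    """Enforce a minimum, randomly jittered gap between consecutive requests."""

    def __init__(self, min_gap, jitter, clock=time.monotonic, sleep=time.sleep):
        self.min_gap = min_gap    # seconds that must pass between requests
        self.jitter = jitter      # upper bound of the extra random delay
        self._clock = clock       # injectable so the class can be tested
        self._sleep = sleep
        self._last = None         # time at which the previous request was released

    def wait(self):
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            target = self._last + self.min_gap + random.uniform(0, self.jitter)
            delay = max(0.0, target - now)
            if delay:
                self._sleep(delay)
        self._last = now + delay
        return delay


if __name__ == "__main__":
    # Demo values only; a real crawl should wait far longer between requests.
    throttle = Throttle(min_gap=0.2, jitter=0.1)
    for n in range(3):
        slept = throttle.wait()
        print(f"request {n}: waited {slept:.2f}s")
```

Call `throttle.wait()` immediately before each request. The first call returns at once, and later calls sleep only for whatever part of the gap has not already elapsed.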

Start free — 100 requests/month

Get your API key in 30 seconds. Scrape Yelp business data with five lines of code. No credit card, no browser fleet, no proxy bills.

Frequently asked questions

Is it legal to scrape Yelp?

Yelp has actively litigated against scrapers and its Terms of Service explicitly prohibit automated data collection. The legality depends on your jurisdiction and use case. Scraping publicly visible business listings is more defensible than scraping user reviews. Always consult a lawyer.

Does Yelp have an official API?

Yes, Yelp offers a Fusion API with business search, details, and reviews endpoints. However, it limits review access to 3 reviews per business and caps at 5,000 API calls/day on the free tier. Scraping fills the gap for bulk data needs.
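
For comparison, a business search through the official API might look like the sketch below. The endpoint, the `term`, `location`, and `limit` parameters, and the Bearer-token header reflect Yelp's public Fusion documentation as best recalled here, not anything stated in this article, so verify them against the current docs. The function names and `YOUR_FUSION_KEY` are placeholders.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Assumption: Fusion v3 business search endpoint; confirm against Yelp's docs.
FUSION_SEARCH = "https://api.yelp.com/v3/businesses/search"


def build_fusion_request(api_key, term, location, limit=10):
    """Prepare (but do not send) a Fusion business search request."""
    query = urlencode({"term": term, "location": location, "limit": limit})
    return urllib.request.Request(
        f"{FUSION_SEARCH}?{query}",
        headers={"Authorization": f"Bearer {api_key}", "Accept": "application/json"},
    )


def search_businesses(api_key, term, location, limit=10):
    """Send the request and return the decoded JSON body."""
    request = build_fusion_request(api_key, term, location, limit)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Building the request needs no network access, so the URL can be inspected first.
request = build_fusion_request("YOUR_FUSION_KEY", "pizza", "New York")
print(request.full_url)
```

Calling `search_businesses` with a real key sends the request. Every call counts against the daily quota mentioned above.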

How does Yelp block scrapers?

Yelp uses rate limiting, CAPTCHA challenges, IP blocking, and behavioral analysis. It also serves different HTML to suspected bots. SnapRender's FlareSolverr integration bypasses these protections by rendering through a real browser session.

Can I scrape Yelp reviews?

Technically yes: publicly visible reviews can be rendered and extracted. However, Yelp has been aggressive about enforcing its ToS around review scraping. Use SnapRender's /render endpoint to get review text as markdown, then feed it into your NLP pipeline.