How to Scrape Indeed Job Listings in 2026

Indeed is the world's largest job board, with millions of active listings across every industry. Scraping it gives you access to salary data, hiring trends, and competitive intelligence, but Indeed's Cloudflare protection makes it one of the hardest sites to scrape. This guide walks through two approaches: a DIY Puppeteer scraper and the SnapRender API.

Why scrape Indeed?

Job listing data is valuable across multiple use cases:

  1. Job market analysis: Track which roles are in demand, which skills appear most frequently, and how job volume changes over time.
  2. Salary research: Benchmark compensation across roles, companies, and locations. Build salary comparison tools with real data.
  3. Competitor hiring: Monitor what positions competitors are hiring for. A flurry of ML engineer postings signals a pivot to AI.
  4. Recruitment automation: Aggregate listings from Indeed and other boards into a unified pipeline. Match candidates to relevant openings automatically.

The Cloudflare challenge

Indeed uses Cloudflare's enterprise bot management, which is one of the most sophisticated anti-scraping systems available. Here is what you are up against:

Protection layers

  • JavaScript challenges that must be solved in a real browser environment
  • TLS fingerprinting that detects non-browser HTTP clients
  • Behavioral analysis that flags automated navigation patterns
  • CAPTCHA challenges triggered by suspicious request velocity
  • Cookie-based session tracking that detects session reuse across IPs
  • Canvas and WebGL fingerprinting to detect headless browsers

Standard Puppeteer or Playwright setups fail against this stack. You either need a purpose-built stealth browser with proxy rotation, or you offload the challenge to a service like SnapRender with FlareSolverr integration.

Method 1: DIY with Puppeteer

Here is a basic Puppeteer scraper for Indeed search results. As written, it shows only the navigation and extraction logic. It needs stealth plugins and residential proxies to get past Cloudflare, which blocks vanilla Puppeteer within seconds.

scraper.js
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    // `true` selects the new headless mode in current Puppeteer releases
    // (older releases used headless: 'new' for the same mode)
    headless: true,
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  });

  try {
    const page = await browser.newPage();

    // Present a desktop Chrome user agent instead of the headless default
    await page.setUserAgent(
      'Mozilla/5.0 (Windows NT 10.0; Win64; x64) ' +
      'AppleWebKit/537.36 (KHTML, like Gecko) ' +
      'Chrome/124.0.0.0 Safari/537.36'
    );

    await page.goto(
      'https://www.indeed.com/jobs?q=software+engineer&l=remote',
      { waitUntil: 'networkidle2', timeout: 30000 }
    );

    // Wait for job cards to render
    await page.waitForSelector('.job_seen_beacon', { timeout: 15000 });

    // Build one record per job card; salary is null when a listing shows none
    const jobs = await page.$$eval('.job_seen_beacon', (cards) =>
      cards.map((card) => ({
        title: card.querySelector('.jobTitle span')?.innerText,
        company: card.querySelector('[data-testid="company-name"]')?.innerText,
        location: card.querySelector('[data-testid="text-location"]')?.innerText,
        salary: card.querySelector('.salary-snippet-container')?.innerText || null,
      }))
    );

    console.log(jobs);
  } finally {
    // Close the browser even if navigation or extraction throws
    await browser.close();
  }
})();

To make this work reliably, you would need puppeteer-extra with the stealth plugin, rotating residential proxies ($15-50/GB), and retry logic for Cloudflare challenges. The infrastructure overhead is significant.
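
The retry logic mentioned above does not depend on the browser library, so here is a minimal sketch of it in Python. It assumes a hypothetical `fetch` callable that raises a hypothetical `ChallengeError` when the response looks like a Cloudflare challenge page; neither name comes from Puppeteer or any other library, and the delays are illustrative rather than tuned.

```python
import random
import time


class ChallengeError(Exception):
    """Hypothetical error: raised by fetch() when a challenge page is returned."""


def fetch_with_retry(fetch, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is exhausted.

    Waits base_delay * 2**attempt seconds, plus up to one second of random
    jitter, between attempts. `sleep` is injectable so the logic can be tested
    without actually waiting.
    """
    for attempt in range(max_attempts):
        try:
            return fetch()
        except ChallengeError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            delay = base_delay * (2 ** attempt) + random.uniform(0, 1)
            sleep(delay)
```

Exponential backoff with jitter keeps retries from arriving in a regular rhythm, which is one of the patterns behavioral analysis tends to flag.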

Method 2: SnapRender API with FlareSolverr

SnapRender's use_flaresolverr: true flag routes requests through a Chromium session that solves Cloudflare challenges automatically. No proxies, no stealth plugins, no CAPTCHA services.

Render as markdown

Get the full Indeed search results page as clean markdown — ideal for LLM analysis or building job aggregator pipelines.

render.py
import requests

# Render Indeed search results as clean markdown
render = requests.post(
    "https://api.snaprender.dev/v1/render",
    headers={"x-api-key": "sr_live_YOUR_KEY"},
    json={
        "url": "https://www.indeed.com/jobs?q=software+engineer&l=remote",
        "format": "markdown",
        "use_flaresolverr": True
    },
    timeout=60,  # requests has no default timeout; avoid hanging indefinitely
)
render.raise_for_status()  # fail loudly on HTTP errors
print(render.json()["data"]["markdown"])
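
The request above hard-codes the search URL. If you want to vary the query or location, a small helper can build the URL with proper escaping. This is a sketch: `build_search_url` is a hypothetical helper, and the `start` offset is an assumption about how Indeed addresses later result pages, so check it against the live site before relying on it.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.indeed.com/jobs"


def build_search_url(query, location, start=0):
    """Return an Indeed search URL for the given query and location.

    `start` is an assumed pagination offset; it is omitted when zero.
    """
    params = {"q": query, "l": location}
    if start:
        params["start"] = start
    # urlencode escapes spaces as "+" and reserved characters as %XX
    return f"{BASE_URL}?{urlencode(params)}"


print(build_search_url("software engineer", "remote"))
# → https://www.indeed.com/jobs?q=software+engineer&l=remote
```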

Extract structured data

Pull job titles, companies, locations, and salaries as structured JSON arrays.

extract.py
import requests

# Extract structured job listing data
extract = requests.post(
    "https://api.snaprender.dev/v1/extract",
    headers={"x-api-key": "sr_live_YOUR_KEY"},
    json={
        "url": "https://www.indeed.com/jobs?q=software+engineer&l=remote",
        "use_flaresolverr": True,
        # Each key becomes an array in the response, one entry per match
        "selectors": {
            "titles": ".jobTitle span",
            "companies": "[data-testid='company-name']",
            "locations": "[data-testid='text-location']",
            "salaries": ".salary-snippet-container"
        }
    },
    timeout=60,  # requests has no default timeout; avoid hanging indefinitely
)
extract.raise_for_status()  # fail loudly on HTTP errors
print(extract.json())

Example response

response.json
{
  "status": "success",
  "data": {
    "titles": [
      "Senior Software Engineer",
      "Full Stack Developer",
      "Backend Engineer - Python"
    ],
    "companies": ["Stripe", "Shopify", "Datadog"],
    "locations": ["Remote", "Remote", "Remote - US"],
    "salaries": ["$180,000 - $220,000 a year", "$150,000 - $190,000 a year", null]
  },
  "url": "https://www.indeed.com/jobs?q=software+engineer&l=remote",
  "elapsed_ms": 4210
}
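
The response returns one array per selector, so you will usually want to combine them into one record per job. Here is a minimal sketch, assuming the `data` object has the shape shown in the example response; `to_records` is a hypothetical helper, not part of any SnapRender client.

```python
def to_records(data):
    """Combine the parallel selector arrays into a list of per-job dicts."""
    fields = ("titles", "companies", "locations", "salaries")
    keys = ("title", "company", "location", "salary")
    columns = [data.get(name, []) for name in fields]

    # If the arrays differ in length, zipping them would pair values
    # from different listings, so refuse to guess.
    if len({len(column) for column in columns}) > 1:
        raise ValueError("selector arrays have different lengths")

    return [dict(zip(keys, row)) for row in zip(*columns)]


sample = {
    "titles": ["Senior Software Engineer", "Full Stack Developer"],
    "companies": ["Stripe", "Shopify"],
    "locations": ["Remote", "Remote"],
    "salaries": ["$180,000 - $220,000 a year", None],
}
for record in to_records(sample):
    print(record)
```

The length check matters because each selector is matched independently: if a listing has no salary element and the array simply comes back shorter, every later salary would be attached to the wrong job.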

Legal considerations

Indeed is more litigious about scraping than most sites. Here is what you need to know:

  1. Indeed's Terms of Service explicitly prohibit automated access, and the company has filed lawsuits against scraping companies (Indeed v. Glassdoor, Indeed v. Mixrank).
  2. In hiQ v. LinkedIn, the Ninth Circuit held that scraping publicly accessible data likely does not violate the Computer Fraud and Abuse Act, which supports scraping public data. Indeed, however, has argued that its data is not truly "public" since listings are submitted by employers under specific terms.
  3. Never scrape applicant data, resumes, or any personally identifiable information. Stick to public job listing details.
  4. Rate-limit aggressively. Beyond the legal risk, overwhelming Indeed's servers could constitute a denial-of-service attack.
  5. Consider whether Indeed's paid API or job posting partnerships might be a better fit for your commercial use case.
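
For the rate-limiting point above, a minimal sketch of a client-side limiter that enforces a fixed gap between requests. `RateLimiter` is a hypothetical helper and the interval you choose is your own judgment call; nothing here reflects a limit published by Indeed or SnapRender.

```python
import time


class RateLimiter:
    """Enforce a minimum interval, in seconds, between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the limiter can be tested
        self._sleep = sleep
        self._last = None    # time of the previous call, if any

    def wait(self):
        """Block until min_interval has passed since the previous call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `limiter.wait()` immediately before each request, for example with `limiter = RateLimiter(min_interval=5.0)` created once and shared across the whole scraping loop.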

Disclaimer

This tutorial is for educational purposes. SnapRender provides the technical capability to render and extract web content, but it is your responsibility to ensure your use case complies with applicable laws and website terms of service.

Frequently asked questions

Is it legal to scrape Indeed?

Indeed's Terms of Service explicitly prohibit scraping and automated access. Indeed actively enforces this through litigation and has sued multiple scraping companies. While the hiQ v. LinkedIn ruling supports scraping public data, Indeed's aggressive legal stance makes it riskier than most targets. Consult a lawyer before scraping Indeed at scale.

Why is Indeed so hard to scrape?

Indeed uses Cloudflare's enterprise-tier bot protection, including JavaScript challenges, browser fingerprinting, and behavioral analysis. Simple HTTP clients and default headless browser configurations are detected instantly. SnapRender's use_flaresolverr flag handles Cloudflare challenges automatically.

Can I extract salary data from Indeed?

Indeed shows salary data on some listings, either as employer-provided ranges or as Indeed's estimated salary. You can extract these with CSS selectors targeting the salary metadata container. Note that not all listings include salary information, so your extraction should handle missing values gracefully.
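
As a sketch of what handling missing values can look like, the helper below turns a salary string into numbers and returns `None` when there is nothing to parse. It assumes strings shaped like the ones in the example response ("$180,000 - $220,000 a year"); `parse_salary` is a hypothetical helper, and real listings may use formats it does not cover.

```python
import re

# A dollar amount such as "$180,000" or "$45.50"
_AMOUNT = re.compile(r"\$(\d[\d,]*(?:\.\d+)?)")
# A pay period such as "a year", "an hour", or "per month"
_PERIOD = re.compile(r"\b(?:a|an|per)\s+(hour|day|week|month|year)\b", re.IGNORECASE)


def parse_salary(text):
    """Return (low, high, period) for a salary string, or None if absent.

    A single amount yields low == high. `period` is None when no pay
    period is recognized.
    """
    if not text:
        return None
    amounts = [float(a.replace(",", "")) for a in _AMOUNT.findall(text)]
    if not amounts:
        return None
    period = _PERIOD.search(text)
    return (min(amounts), max(amounts), period.group(1).lower() if period else None)


print(parse_salary("$180,000 - $220,000 a year"))  # → (180000.0, 220000.0, 'year')
print(parse_salary(None))                          # → None
```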

How much does it cost to scrape Indeed with SnapRender?

SnapRender starts free with 100 requests/month. Paid plans begin at $9/month for 1,500 requests. Requests using FlareSolverr (needed for Indeed's Cloudflare protection) count as one request each, with no credit multipliers.