Why scrape Instagram?
Instagram has over 2 billion monthly active users and is the primary platform for influencer marketing, brand discovery, and visual content. Scraping public data enables powerful business use cases:
Influencer research
Find and vet influencers by follower count, engagement rate, post frequency, and content niche. Build outreach lists at scale.
Competitor analysis
Track competitor posting frequency, engagement rates, and content strategy. Identify what formats and topics drive the most engagement.
Content inspiration
Analyze top-performing posts in your niche. Understand what captions, hashtags, and visual styles generate the most engagement.
Public profiles only
This tutorial covers scraping public Instagram profiles and posts only. Never attempt to access private accounts, stories from non-followers, or direct messages. Respect user privacy and platform terms of service.
Why Instagram is hard to scrape
Anti-scraping measures
- Login walls: Instagram now requires authentication for most profile views
- Aggressive rate limiting with IP and browser fingerprint tracking
- React-based SPA: zero useful data in the initial HTML response
- Frequent DOM structure changes that break CSS selectors
- Meta's legal team actively pursues scraping operations
- GraphQL API endpoints require authentication tokens that expire frequently
- Image URLs use CDN tokens that expire after a few hours
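Because login walls are the first obstacle in the list above, a scraper benefits from noticing when it has been bounced to the login page instead of silently parsing an empty document. The sketch below is a minimal illustration, not Instagram's documented behavior: the `/accounts/login` path prefix and the helper name `hit_login_wall` are assumptions that should be checked against live responses.

```python
from urllib.parse import urlparse

# Assumption: a logged-out request that hits the login wall ends up on a
# path starting with /accounts/login. Verify against real redirects.
LOGIN_PATH_PREFIX = "/accounts/login"


def hit_login_wall(final_url: str) -> bool:
    """Return True if the final URL (after redirects) looks like the login page."""
    path = urlparse(final_url).path
    return path.startswith(LOGIN_PATH_PREFIX)


if __name__ == "__main__":
    print(hit_login_wall("https://www.instagram.com/accounts/login/?next=/natgeo/"))
    print(hit_login_wall("https://www.instagram.com/natgeo/"))
```

Checking the final URL rather than the page body keeps the test independent of DOM changes, which the list above names as a separate source of breakage.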
Method 1: DIY with Puppeteer
The brute-force approach: launch a headless browser with a mobile user-agent (Instagram serves a lighter page to mobile browsers), navigate to a public profile, and extract data:
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({
    headless: 'new',
    args: ['--no-sandbox'],
  });
  const page = await browser.newPage();

  // Set a realistic user-agent
  await page.setUserAgent(
    'Mozilla/5.0 (iPhone; CPU iPhone OS 17_0 like Mac OS X) ' +
    'AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.0 ' +
    'Mobile/15E148 Safari/604.1'
  );

  // Navigate to a public profile
  await page.goto(
    'https://www.instagram.com/natgeo/',
    { waitUntil: 'networkidle2', timeout: 30000 }
  );

  // Wait for profile header to render
  await page.waitForSelector('header section', {
    timeout: 15000,
  });

  // Extract profile data
  const profile = await page.evaluate(() => {
    const header = document.querySelector('header section');
    const stats = header?.querySelectorAll('li span span') || [];
    return {
      name: document.querySelector(
        'header section h2'
      )?.innerText,
      posts: stats[0]?.innerText,
      followers: stats[1]?.innerText,
      following: stats[2]?.innerText,
      bio: document.querySelector(
        'header section > div:last-child'
      )?.innerText,
    };
  });

  console.log(profile);
  await browser.close();
})();

This approach is fragile. Instagram changes its DOM structure frequently, mobile and desktop views differ, and you will hit login walls and CAPTCHAs within a handful of requests. At scale, the maintenance cost exceeds the value of the data.
Method 2: SnapRender API
SnapRender handles the browser session, anti-bot bypass, and JavaScript rendering. Use use_flaresolverr: true to route through a real Chromium session that passes Instagram's detection.
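The requests in this section all send a JSON body containing the profile URL and the `use_flaresolverr` flag. A small helper can build that body from a username so the same shape is reused across many profiles. This is a sketch only: the helper name `build_profile_request` is hypothetical, and the username rule (1 to 30 letters, digits, periods, or underscores) is an assumption about Instagram's format rather than something this tutorial or the API specifies.

```python
import re
from typing import Optional

# Assumption: plausible usernames are 1-30 letters, digits, periods, underscores.
USERNAME_RE = re.compile(r"[A-Za-z0-9._]{1,30}")


def build_profile_request(username: str, fmt: Optional[str] = None) -> dict:
    """Build the JSON body for a public-profile request (hypothetical helper)."""
    if not USERNAME_RE.fullmatch(username):
        raise ValueError(f"not a plausible Instagram username: {username!r}")
    body = {
        "url": f"https://www.instagram.com/{username}/",
        "use_flaresolverr": True,
    }
    if fmt is not None:
        body["format"] = fmt  # e.g. "markdown", as in the render example
    return body


if __name__ == "__main__":
    print(build_profile_request("natgeo", "markdown"))
```

Validating the username before building the URL keeps stray input such as a full URL or a path fragment from turning into a request for an unintended page.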
Render profile as markdown
Get the full public profile as LLM-ready markdown — perfect for feeding into AI analysis or storing in a database.
import requests

# Render a public Instagram profile as markdown
resp = requests.post(
    "https://api.snaprender.dev/v1/render",
    headers={"x-api-key": "sr_live_YOUR_KEY"},
    json={
        "url": "https://www.instagram.com/natgeo/",
        "format": "markdown",
        "use_flaresolverr": True
    }
)
print(resp.json()["data"]["markdown"])

Extract structured profile data
Pull specific fields — username, followers, bio — as clean JSON.
import requests

# Extract structured profile data with CSS selectors
resp = requests.post(
    "https://api.snaprender.dev/v1/extract",
    headers={"x-api-key": "sr_live_YOUR_KEY"},
    json={
        "url": "https://www.instagram.com/natgeo/",
        "use_flaresolverr": True,
        "selectors": {
            "username": "header section h2",
            "posts": "header section li:nth-child(1) span span",
            "followers": "header section li:nth-child(2) span span",
            "following": "header section li:nth-child(3) span span",
            "bio": "header section > div:last-child",
            "profile_pic": "header img"
        }
    }
)
print(resp.json())

Extract individual post data
Scrape specific public posts for captions, likes, and timestamps.
import requests

# Extract data from a single public post
resp = requests.post(
    "https://api.snaprender.dev/v1/extract",
    headers={"x-api-key": "sr_live_YOUR_KEY"},
    json={
        "url": "https://www.instagram.com/p/ABC123/",
        "use_flaresolverr": True,
        "selectors": {
            "caption": "div[role='button'] span",
            "likes": "section span a span",
            "timestamp": "time",
            "author": "header a"
        }
    }
)
print(resp.json())

Example response
{
  "status": "success",
  "data": {
    "username": "natgeo",
    "posts": "32,841",
    "followers": "283M",
    "following": "156",
    "bio": "Experience the world through the eyes of...",
    "profile_pic": "https://instagram.com/..."
  },
  "url": "https://www.instagram.com/natgeo/",
  "elapsed_ms": 3850
}

Practical use cases
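Most of these use cases need numbers, while the example response returns counts as display strings such as "32,841" and "283M". The sketch below shows one way to convert them. The function name `parse_count` and the K/M/B suffix table are assumptions about how counts are abbreviated, and abbreviated values are already rounded by Instagram, so the result is approximate.

```python
import re

# Assumed abbreviations; extend if other suffixes show up in real data.
_SUFFIXES = {"K": 1_000, "M": 1_000_000, "B": 1_000_000_000}


def parse_count(text: str) -> int:
    """Convert a display count such as '32,841' or '283M' to an integer."""
    cleaned = text.strip().replace(",", "").upper()
    match = re.fullmatch(r"(\d+(?:\.\d+)?)([KMB]?)", cleaned)
    if match is None:
        raise ValueError(f"unrecognized count: {text!r}")
    number, suffix = match.groups()
    return round(float(number) * _SUFFIXES.get(suffix, 1))


if __name__ == "__main__":
    for raw in ("32,841", "283M", "156"):
        print(raw, "->", parse_count(raw))
```

Raising on unrecognized input, rather than returning zero, makes a selector that has started matching the wrong element show up immediately.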
1. Influencer vetting: scrape follower counts and engagement rates across hundreds of profiles. Calculate engagement rate, (likes + comments) / followers, to identify authentic vs. bought audiences.
2. Hashtag research: analyze which hashtags top performers in your niche use. Track hashtag performance over time to identify trending topics.
3. Content calendars: study posting frequency and timing of successful accounts. Identify optimal posting windows for your target audience.
4. Brand monitoring: track mentions and tags of your brand across public posts. Feed captions into sentiment analysis to gauge brand perception.
5. Competitive benchmarking: compare your engagement metrics against competitors. Track follower growth rates and content performance over time.
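The engagement rate used for influencer vetting is (likes + comments) / followers, computed per post and then averaged. The following is a minimal sketch of that arithmetic; the function names and the `likes`/`comments` dictionary keys are illustrative choices, not fields defined by any API in this tutorial.

```python
def engagement_rate(likes: int, comments: int, followers: int) -> float:
    """Return (likes + comments) / followers as a fraction, not a percentage."""
    if followers <= 0:
        raise ValueError("followers must be positive")
    return (likes + comments) / followers


def average_engagement(posts: list, followers: int) -> float:
    """Average the per-post engagement rate over a list of post dicts."""
    if not posts:
        raise ValueError("need at least one post")
    rates = [engagement_rate(p["likes"], p["comments"], followers) for p in posts]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    sample = [
        {"likes": 2400, "comments": 100},
        {"likes": 7400, "comments": 100},
    ]
    print(f"{average_engagement(sample, followers=10_000):.2%}")
```

Averaging over several recent posts dampens the effect of a single viral or underperforming post when comparing accounts.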
Legal considerations
Instagram scraping carries higher legal risk than many other platforms. Here is what you need to know:
1. Meta has actively sued scraping companies, including Massroot8 in 2020 and Bright Data in 2023. In the Bright Data case, a federal court ruled in January 2024 that Meta's terms did not bar logged-out scraping of public data, and Meta then dropped its remaining claim.
2. Only scrape public profiles. Attempting to access private accounts or login-gated content crosses clear legal lines.
3. GDPR applies to EU user data. Instagram bios, follower counts tied to usernames, and post content with identifiable information are personal data under EU law.
4. Never use scraped data for advertising targeting, building shadow profiles, or resale of personal information.
5. Rate-limit aggressively. Instagram monitors request patterns and will pursue legal action against scrapers that degrade their service.
Start free — 100 requests/month
Get your API key in 30 seconds. Scrape public Instagram profiles and posts with a single API call. No browser fleet, no login required, no proxy bills.
Frequently asked questions
Is it legal to scrape Instagram?
Scraping publicly available Instagram data is a legal gray area. Meta's Terms of Service prohibit automated data collection, and they actively enforce this through lawsuits (e.g., the 2020 case against Massroot8). Only scrape public profiles, never private accounts. Never collect personal data for advertising or resale. Consult a lawyer for your specific use case.
Can I scrape Instagram without logging in?
Instagram aggressively restricts access for non-authenticated users. However, public profile pages and individual post pages are still partially accessible. SnapRender renders these pages in a real browser session, accessing the publicly visible content without requiring your Instagram credentials.
What data can I extract?
From public profiles: display name, bio, follower count, following count, post count, profile picture URL, verification status, and recent post thumbnails. From individual posts: caption text, like count, comment count, timestamp, image/video URLs, and tagged accounts.
How do I avoid getting blocked?
Rate-limit your requests (2-5 seconds between calls minimum), use SnapRender's use_flaresolverr flag to route through a real browser session, avoid scraping the same profile repeatedly in short intervals, and never attempt to access private or login-gated content. SnapRender handles the browser fingerprinting and anti-bot bypass automatically.
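The pacing advice above (2 to 5 seconds between calls, no repeat hits on the same profile) can be sketched as a small loop. This is an illustration under stated assumptions: `fetch_with_delay` is a hypothetical helper, and `fetch` stands in for whatever function actually performs the request, such as a wrapper around the `requests.post` calls shown earlier. The `sleep` parameter exists only so the delay can be swapped out in tests.

```python
import random
import time
from typing import Callable, Iterable


def fetch_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], dict],
    min_delay: float = 2.0,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch(url) once per unique URL, pausing a random interval between calls."""
    results = {}
    for url in urls:
        if url in results:
            continue  # never re-request a URL already fetched in this run
        if results:
            # Pause before every call except the first.
            sleep(random.uniform(min_delay, max_delay))
        results[url] = fetch(url)
    return results


if __name__ == "__main__":
    demo = fetch_with_delay(
        ["https://www.instagram.com/natgeo/"],
        fetch=lambda url: {"url": url},  # stand-in for a real request
    )
    print(demo)
```

A randomized delay avoids the fixed-interval request pattern that is easy to fingerprint, and de-duplicating URLs covers the advice about not scraping the same profile repeatedly within one run.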