Announcing BrowserQL, our next-gen automation tooling

Extract HTML from protected sites with BrowserQL

Use BrowserQL to unlock the site, then either grab the HTML or generate an endpoint for Playwright or Puppeteer.

We'll get past the detectors and send an image as proof.

Bypass Cloudflare, Datadome and other bot detectors

BrowserQL automatically hides even the most subtle signs that a browser is being automated. It controls browsers at the CDP level, removing typical traces a library leaves behind.

You can extract the HTML to parse, or use the unlocked WebSocket endpoint with Playwright or Puppeteer.

BrowserQL doc

Use the HTML with Scrapy or other tools

Our APIs render and evaluate a page in our browsers, then return the HTML or JSON.

You can then use a library such as Scrapy or Beautiful Soup to extract the data if needed. This gives you the advantages of headless browsers, such as JavaScript rendering and captcha avoidance, without having to run them yourself.
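
As a rough illustration of that flow, the sketch below posts a URL to the /content API and pulls the page title out of the returned HTML. The host is the one used in the examples on this page. The `token` query parameter, the `YOUR_API_TOKEN` placeholder and the helper names are assumptions made for illustration, and the regular expression only stands in for a real parser such as Beautiful Soup or Scrapy.

```javascript
// Sketch only: assumes an API token passed as a `token` query parameter.
const CONTENT_ENDPOINT = 'https://chrome.browserless.io/content';

// Build the fetch() arguments for a /content request (pure, no network access).
function buildContentRequest(targetUrl, token) {
  const endpoint = new URL(CONTENT_ENDPOINT);
  endpoint.searchParams.set('token', token);
  return {
    endpoint: endpoint.toString(),
    options: {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ url: targetUrl }),
    },
  };
}

// Stand-in for a real HTML parser: returns the <title> text, or null if absent.
function extractTitle(html) {
  const match = /<title[^>]*>([\s\S]*?)<\/title>/i.exec(html);
  return match ? match[1].trim() : null;
}

// Fetch the rendered HTML (uses the global fetch available in Node 18+).
async function fetchRenderedHtml(targetUrl, token) {
  const { endpoint, options } = buildContentRequest(targetUrl, token);
  const response = await fetch(endpoint, options);
  if (!response.ok) {
    throw new Error(`Content request failed with status ${response.status}`);
  }
  return response.text();
}
```

A call such as `extractTitle(await fetchRenderedHtml('https://example.com', 'YOUR_API_TOKEN'))` would then run the whole round trip.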

Check out the docs

Three windows, with the rear showing code involving Scrapy and the /content API, the middle representing an ecommerce site, and the front showing scraped data.

Cloud provider and scripting library logos on the left, with the browserless and browser logos on the right

Use the full power of Puppeteer and Playwright

Unlike many scraping tools, you can also use the standard Puppeteer and Playwright libraries to run any script.

You can click buttons, navigate dynamic content, or do anything else a script allows. Just host the scripts on your own servers and connect them to our browsers.
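
With Playwright, the connection step might look like the sketch below. `chromium.connectOverCDP` is Playwright's standard call for attaching to a remote Chromium browser. The `token` query parameter, the `YOUR_API_TOKEN` placeholder and the helper names are assumptions made for illustration, and the endpoint is the one used elsewhere on this page.

```javascript
// Sketch only: assumes the `playwright` package is installed and that the
// service accepts an API token as a `token` query parameter.
const BROWSER_ENDPOINT = 'wss://chrome.browserless.io';

// Append the (assumed) token parameter to the WebSocket endpoint.
function buildWsEndpoint(token) {
  const endpoint = new URL(BROWSER_ENDPOINT);
  endpoint.searchParams.set('token', token);
  return endpoint.toString();
}

// Connect to the remote browser, click the first link, and return the new title.
async function clickFirstLinkAndGetTitle(targetUrl, token) {
  // Loaded lazily so the helper above can be used without Playwright installed.
  const { chromium } = await import('playwright');
  const browser = await chromium.connectOverCDP(buildWsEndpoint(token));
  try {
    // Reuse the default context if the remote browser exposes one.
    const context = browser.contexts()[0] ?? (await browser.newContext());
    const page = await context.newPage();
    await page.goto(targetUrl);
    await page.getByRole('link').first().click();
    await page.waitForLoadState('domcontentloaded');
    return await page.title();
  } finally {
    await browser.close();
  }
}
```

Everything after the connect call is ordinary Playwright, so an existing script should need little more than a changed launch line.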

Getting started docs

Cut proxy usage by around 90% with reconnects

BrowserQL makes it easy to keep a browser alive so that you can reconnect to it.

That lets you keep the session's cache and cookies instead of loading each page in a fresh browser, which reduces proxy usage by around 90%.
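
To see why a warm cache can matter that much, here is a small back-of-the-envelope model. Every number in it is hypothetical and chosen only to show the mechanism: cacheable assets travel through the proxy once per browser, while uncacheable content travels on every page load. Actual savings depend on how cacheable the target site is.

```javascript
// Illustrative model only: all sizes below are hypothetical, not measurements.
function proxyTrafficKb({ pages, cacheableKb, uncacheableKb, reuseBrowser }) {
  if (reuseBrowser) {
    // Cacheable assets are downloaded once, then served from the browser cache.
    return cacheableKb + pages * uncacheableKb;
  }
  // A fresh browser starts with an empty cache on every page load.
  return pages * (cacheableKb + uncacheableKb);
}

// Percentage of proxy traffic avoided by reusing the browser.
function savingsPercent(freshKb, reusedKb) {
  return ((freshKb - reusedKb) / freshKb) * 100;
}

// Hypothetical scenario: 100 page loads, 1800 KB cacheable, 200 KB uncacheable.
const scenario = { pages: 100, cacheableKb: 1800, uncacheableKb: 200 };
const freshKb = proxyTrafficKb({ ...scenario, reuseBrowser: false });
const reusedKb = proxyTrafficKb({ ...scenario, reuseBrowser: true });

console.log(freshKb, reusedKb, savingsPercent(freshKb, reusedKb).toFixed(1));
// prints: 200000 21800 89.1
```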

Check out the docs

Use our API or an unforked library

Unblock API for HTML payloads

Get a page's content after JavaScript has run.

Scraping API to return JSON

Puppeteer or Playwright

See the Docs


# Automatically responds with the page's HTML payload
curl --request POST \
  --url 'https://production-sfo.browserless.io/unblock' \
  --header 'content-type: application/json' \
  --data '{
  "url": "https://example.com",
  "browserWSEndpoint": false,
  "cookies": false,
  "content": true,
  "screenshot": true,
  "ttl": 3000
}'


# Returns the JSON of the elements specified
curl -X POST \
https://chrome.browserless.io/scrape \
-H 'Content-Type: application/json' \
-d '{
  "url": "https://news.ycombinator.com/",
  "elements": [{
    "selector": ".athing .titlelink"
  }]
}'
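
The JSON that comes back can be flattened into plain strings with a few lines of code. The sketch below assumes a response of the form `{ data: [{ selector, results: [{ text, ... }] }] }`. That shape, the helper name and the sample object are assumptions made for illustration, so check them against a real response before relying on them.

```javascript
// Sketch only: the response shape handled here is an assumption.
// Returns the trimmed, non-empty text of every result for one selector.
function extractTexts(scrapeResponse, selector) {
  const entries = Array.isArray(scrapeResponse?.data) ? scrapeResponse.data : [];
  return entries
    .filter((entry) => entry.selector === selector)
    .flatMap((entry) => entry.results ?? [])
    .map((result) => (result.text ?? '').trim())
    .filter((text) => text.length > 0);
}

// Hypothetical sample response, hand-written for this example.
const sampleResponse = {
  data: [
    {
      selector: '.athing .titlelink',
      results: [
        { text: ' First story ' },
        { text: '' },
        { text: 'Second story' },
      ],
    },
    { selector: '.score', results: [{ text: '42 points' }] },
  ],
};

// Logs the two non-empty titles from the sample above.
console.log(extractTexts(sampleResponse, '.athing .titlelink'));
```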


// From inside your Node application
import puppeteer from 'puppeteer';

// Replace puppeteer.launch with puppeteer.connect
const browser = await puppeteer.connect({
  browserWSEndpoint: 'wss://chrome.browserless.io'
});

// The rest of your script remains the same
const page = await browser.newPage();
await page.goto('https://example.com/');
console.log(await page.title());
await browser.close();

Customer Stories

"We started using another scraping company's headless browsers to run Puppeteer scripts. But, it required a Vercel upgrade due to slow fetch times, and the proxies weren't running correctly. I found Browserless and had our Puppeteer code running within an hour. The scrapes are now 5x faster and 1/3rd of the price, plus the support has been excellent."

Nicklas Smit

Full-Stack Developer, Takeoff Copenhagen

"We built a scraping tool to train our chatbots on public website data, but it quickly got complicated due to edge cases and bot detection. I found Browserless and set aside a day for the integration, but it only took a couple of hours. I didn't need to become an expert in managing proxy servers or virtual computers, so now I can stay focused on core parts of the business."

Mike Heap

Founder, My AskAI