Announcing BrowserQL, our next-gen automation tooling
Extract HTML from protected sites with BrowserQL
Use BrowserQL to unlock the site and either grab the HTML, or generate an endpoint for Playwright or Puppeteer.
We'll get past the detectors and send an image as proof.
Bypass Cloudflare, DataDome, and other bot detectors
BrowserQL automatically hides even the most subtle signs that a browser is being automated. It controls browsers at the CDP level, removing typical traces a library leaves behind.
You can extract the HTML to parse, or use the unlocked WebSocket endpoint with Playwright or Puppeteer.
BrowserQL docs
Use the HTML with Scrapy or other tools
Our APIs render and evaluate a page with our browsers, then return the HTML or JSON.
You can then use a library such as Scrapy or Beautiful Soup to extract the data if needed. This gives you the advantages of headless browsers, such as JavaScript rendering and captcha avoidance, without having to run them yourself.
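As a rough illustration of the parsing step, here is a minimal sketch that pulls links out of already-rendered HTML. It uses Python's built-in html.parser as a dependency-free stand-in for Scrapy or Beautiful Soup, and SAMPLE_HTML, LinkExtractor, and extract_links are hypothetical names made up for this example, not part of our API.

```python
from html.parser import HTMLParser

# Hypothetical stand-in for the HTML string returned by a rendering API.
SAMPLE_HTML = (
    '<ul><li><a href="/a">First</a></li>'
    '<li><a href="/b">Second <b>item</b></a></li></ul>'
)


class LinkExtractor(HTMLParser):
    """Collects (href, text) pairs for every <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text while we are inside a link that has an href.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return a list of (href, link text) tuples found in the HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


if __name__ == "__main__":
    for href, text in extract_links(SAMPLE_HTML):
        print(href, text)
```

In a real pipeline you would swap SAMPLE_HTML for the HTML returned by the API and, most likely, swap the parser for Beautiful Soup or a Scrapy selector.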
Check out the docs
Use the full power of Puppeteer and Playwright
Unlike many scraping tools, ours also lets you use the standard Puppeteer and Playwright libraries to run any script.
You can click buttons, navigate dynamic content, or do anything else a script can do. Just host the scripts on your servers and connect them to our browsers.
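Connecting a script you host to a remote browser might look like the sketch below, written with Playwright's Python library. The endpoint URL, the token query parameter, and the helper names build_ws_endpoint and click_and_read_title are assumptions for illustration only; check the docs for the actual endpoint format.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def build_ws_endpoint(base_url: str, token: str) -> str:
    """Add a token query parameter to a WebSocket URL (assumed auth scheme)."""
    parts = urlsplit(base_url)
    query = dict(parse_qsl(parts.query))
    query["token"] = token
    return urlunsplit(parts._replace(query=urlencode(query)))


def click_and_read_title(ws_endpoint: str, url: str, selector: str) -> str:
    """Open a page in the remote browser, click an element, return the title."""
    # Imported here so build_ws_endpoint works without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # Attach to an already-running Chromium over the Chrome DevTools Protocol.
        browser = p.chromium.connect_over_cdp(ws_endpoint)
        try:
            page = browser.new_page()
            page.goto(url)
            page.click(selector)
            return page.title()
        finally:
            browser.close()


if __name__ == "__main__":
    # Placeholder values: replace with your real endpoint and token.
    endpoint = build_ws_endpoint("wss://browser.example.invalid/chromium", "YOUR_TOKEN")
    print(endpoint)
```

The script itself runs on your machine; only the browser is remote, so the same Playwright calls you already use locally should carry over unchanged.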
Getting started docs
Cut proxy usage by around 90% with reconnects
BrowserQL makes it easy to keep a browser alive for reconnecting to.
That lets you keep the session cache and cookies instead of loading each page in a fresh browser, which cuts proxy usage by around 90%.
Check out the docs