Announcing a new Puppeteer benchmarking tool

TL;DR

  • Certain versions of Puppeteer (and their respective Chromium versions) perform better than others for certain types of workloads.
  • We published some results that can help you scrape, automate, test, etc. faster, and we will keep updating them as new releases are made.
  • We open sourced the tool that we built in case you want to test for yourself!
  • Subscribe to our blog if you’d like to get the results for future Puppeteer versions.

One of the most important open source projects for headless automation is Puppeteer. As it stands today, the project has over 82,000 stars, 8,800 forks and hundreds of contributors. We work with Puppeteer a lot at Browserless to ensure we can help companies get the most out of browser automation tasks at scale. If you aren’t familiar with what you can do with Puppeteer, check out our article here.

For those of you who already use Puppeteer day-to-day, you might have noticed performance gains and degradations as new versions of Puppeteer are released, depending on the version of Chromium each one comes bundled with. We’ve heard about this from our users a lot, and wanted to spend some time figuring out which version works best for certain tasks. To that end, we’ve open sourced a tool that you can use to test different common tasks and see which version is ideal for your application.

Here is a quick overview of the tasks that we decided to test and some details about them:

Screenshots:

| Test | Name | Criteria |
| --- | --- | --- |
| Launch | browser-launch | How long Puppeteer takes to launch a Chrome browser. Each version uses its recommended Chromium revision. |
| Page goto | navigation | How long Puppeteer takes to navigate to example.com. Navigation is considered finished when the page’s body element triggers the load event. |
| Page Screenshot | screenshot | How long it takes to screenshot and save the page. |
| Shutdown | browser-close | How long it takes Puppeteer to close and kill the Chrome process. |
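
To make the four measurements above concrete, here is a minimal sketch of how each step could be timed. It is not the code from our benchmark tool: `timeStep` and `StepTiming` are hypothetical names made up for this illustration, and the Puppeteer calls are shown only as comments so the snippet runs without a browser.

```typescript
// One timed step, e.g. { name: "browser-launch", ms: 412 }.
interface StepTiming {
  name: string;
  ms: number;
}

// Runs an async step, records its wall-clock duration in milliseconds,
// and passes the step's return value through to the caller.
// (Hypothetical helper for illustration, not part of Puppeteer.)
async function timeStep<T>(
  name: string,
  step: () => Promise<T>,
  timings: StepTiming[]
): Promise<T> {
  const start = Date.now();
  const result = await step();
  timings.push({ name, ms: Date.now() - start });
  return result;
}

// Hypothetical usage with Puppeteer (assumes `puppeteer` is imported; not run here):
//   const timings: StepTiming[] = [];
//   const browser = await timeStep("browser-launch", () => puppeteer.launch(), timings);
//   const page = await browser.newPage();
//   await timeStep("navigation", () => page.goto("https://example.com"), timings);
//   await timeStep("screenshot", () => page.screenshot({ path: "example.png" }), timings);
//   await timeStep("browser-close", () => browser.close(), timings);
```

By default `page.goto()` resolves once the load event fires, which lines up with the navigation criterion in the table. `Date.now()` only has millisecond resolution, which is enough here since the results are reported in milliseconds.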

PDF Generation:

| Test | Name | Criteria |
| --- | --- | --- |
| Launch | browser-launch | See Screenshots. |
| Page PDF | pdf-generation | How long it takes to render a PDF of the page and save it, using Chromium’s native PDF render util. |

Time to first load:

| Test | Name | Criteria |
| --- | --- | --- |
| Launch | browser-launch | See Screenshots. |
| First Byte | time-to-first-byte | Uses Chrome’s internal window.performance object to get the time of the browser’s first received byte, be it from a server response, cache, or local resource. |
| Content Loaded | dom-content-loaded | Uses Chrome’s internal window.performance object to get the time at which the Document triggered its DOMContentLoaded event. |
| First Paint | first-paint | Uses Chrome’s internal window.performance object to get the time at which the browser started rendering something, like a background color. |
| Contentful | first-contentful-paint | Uses Chrome’s internal window.performance object to get the time at which the browser started rendering DOM objects, like actual text or images. |
| Selector Appear | wait-for-selector-h1 | How long it takes Puppeteer to be able to select an h1 element after the page was created. |
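
As a rough sketch of where these load metrics can come from, the snippet below maps entries from the browser’s Performance API onto the test names in the table. We are assuming the Navigation Timing and Paint Timing entries here; the tool itself may read `window.performance` differently. `extractLoadMetrics` and the `*Like` interfaces are hypothetical names for this example.

```typescript
// Subset of a PerformanceNavigationTiming entry (times in ms since navigation start).
interface NavigationEntryLike {
  responseStart: number;
  domContentLoadedEventStart: number;
}

// Subset of a paint timing entry; the browser names these
// "first-paint" and "first-contentful-paint".
interface PaintEntryLike {
  name: string;
  startTime: number;
}

interface LoadMetrics {
  "time-to-first-byte": number;
  "dom-content-loaded": number;
  "first-paint": number | null;
  "first-contentful-paint": number | null;
}

// Hypothetical helper: turns raw performance entries into the metrics above.
function extractLoadMetrics(
  nav: NavigationEntryLike,
  paints: PaintEntryLike[]
): LoadMetrics {
  // Returns the start time of the named paint entry, or null if the
  // browser did not report one.
  const paintTime = (name: string): number | null => {
    const matches = paints.filter((p) => p.name === name);
    return matches.length > 0 ? matches[0].startTime : null;
  };
  return {
    "time-to-first-byte": nav.responseStart,
    "dom-content-loaded": nav.domContentLoadedEventStart,
    "first-paint": paintTime("first-paint"),
    "first-contentful-paint": paintTime("first-contentful-paint"),
  };
}

// Hypothetical usage with Puppeteer (not run here):
//   const raw = await page.evaluate(() => JSON.stringify({
//     nav: performance.getEntriesByType("navigation")[0],
//     paints: performance.getEntriesByType("paint"),
//   }));
//   const { nav, paints } = JSON.parse(raw);
//   const metrics = extractLoadMetrics(nav, paints);
```

The wait-for-selector-h1 test is a different kind of measurement: it would be a wall-clock timer around something like `page.waitForSelector("h1")` rather than a value read from the page.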

The results of these tests will vary depending on the webpage, network speed, latency, frontend technology and hardware resources. Because of that, our tool lets you run a test numerous times to try to remove some of that noise from the final results. This isn’t perfect, but such things never are when you can’t control every piece of networking in the world. Remember: we just want to get a sense of which version of Chrome/Puppeteer to use for a particular task.
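
To illustrate why repeated runs help, here is a small sketch that summarizes the timings from several runs of the same test. Which statistic the tool actually reports is not covered in this post, so treat the choice of mean and median as an assumption; `summarize` is a hypothetical helper.

```typescript
interface Summary {
  runs: number;
  mean: number;
  median: number;
  min: number;
  max: number;
}

// Summarizes the durations (in ms) collected from repeated runs of one test.
function summarize(samplesMs: number[]): Summary {
  if (samplesMs.length === 0) {
    throw new Error("need at least one sample");
  }
  // Sort a copy so the caller's array is left untouched.
  const sorted = samplesMs.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const median =
    sorted.length % 2 === 0 ? (sorted[mid - 1] + sorted[mid]) / 2 : sorted[mid];
  const mean = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  return {
    runs: sorted.length,
    mean,
    median,
    min: sorted[0],
    max: sorted[sorted.length - 1],
  };
}

// Five runs of the same test, one of them hit by a network hiccup.
const summary = summarize([120, 100, 900, 110, 105]);
console.log(summary.median, summary.mean); // prints 110 267
```

A single slow run drags the mean up to 267 ms while the median stays at 110 ms, which is why a median over many runs is often a steadier number to compare across versions.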

In terms of the versions that we decided to test, we went all the way back to version 1.20.0 of Puppeteer and tested up until 19.3.0 (the latest release at the time). Here is a tabular breakdown of those results (Red = Slow, Green = Fast). Times below are in milliseconds:

[Image: results table for Puppeteer versions 1.20.0 through 19.3.0]

Key Takeaways:

  1. Browser launch tends to take longer with each new release of Puppeteer/Chrome.
    1. While browser launch/close time is getting slower over time (most likely due to the size of Puppeteer/Chrome), DOM traversal/navigation, screenshots and PDF generation are all getting faster. This is good news for the average Chrome user, but it tells us that reusing the same browser instance is good practice whenever possible. We’re working on a feature to e
  2. PDF generation tends to get faster with each version, although the best versions for rendering screenshots appear to be 9 through 13. Version 17 seems like a fair compromise for both PDF and screenshot generation.
  3. Versions 14 through 17 are the best for scraping data, since they have the fastest DOM-loading and First Contentful Paint scores, with pretty decent wait-for-selector scores.
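
Since launch time is the part that keeps getting slower, one way to act on the first takeaway is to launch once and hand the same browser to every task. The sketch below shows that idea in isolation; `reuseBrowser` is a hypothetical helper written for this post, not a Puppeteer or Browserless API.

```typescript
// Wraps a launch function so the browser is started at most once and the
// same instance is handed out on every later call.
// (Hypothetical helper for illustration.)
function reuseBrowser<B>(launch: () => Promise<B>): () => Promise<B> {
  let cached: Promise<B> | null = null;
  return () => {
    if (cached === null) {
      cached = launch().catch((err) => {
        // Forget a failed launch so the next call can try again.
        cached = null;
        throw err;
      });
    }
    return cached;
  };
}

// Hypothetical usage with Puppeteer (assumes `puppeteer` is imported; not run here):
//   const getBrowser = reuseBrowser(() => puppeteer.launch());
//   const browser = await getBrowser(); // launches Chrome on the first call
//   const page = await browser.newPage();
//   await page.goto("https://example.com");
//   await page.close(); // close the page, keep the browser for the next task
```

Each task then pays only for a new page rather than a full browser launch. A long-lived browser does need its own care (closing pages, restarting after crashes), which this sketch leaves out.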

Going forward:

In terms of how this tool will evolve, we are planning to run these tests every time a new version of Puppeteer is released. We are also considering adding Playwright at some point in the future. If you’d like to contribute or use the tool yourself, here is the repo: 

https://github.com/browserless/puppeteer-benchmark

If there is a specific parameter that you’d like to see on here going forward, please reach out or submit a PR. We love any/all feedback!

Thanks!

Joel
