TL;DR
- Certain versions of puppeteer (and their respective Chromium versions) are more performant than others for certain types of workstreams.
- We published some results which can help you scrape, automate, test, etc. faster, and are continuing to update these as releases are made.
- We open sourced the tool that we built in case you want to test for yourself!
- Subscribe to our blog if you’d like to get the results for future Puppeteer versions.
One of the most important open source projects for headless automation is Puppeteer. As of this writing, the project has over 82,000 stars and 8,800 forks, plus hundreds of contributors. We work with Puppeteer a lot at Browserless to ensure we can help companies get the most out of browser automation tasks at scale. If you aren’t familiar with what you can do with Puppeteer, check out our article here.
If you already use Puppeteer day to day, you might have noticed performance gains and degradations from one release to the next, depending on the version of Chromium each release comes bundled with. We’ve heard about this from our users a lot, and wanted to spend some time figuring out which version works best for certain tasks. To that end, we’ve open sourced a tool that you can use to run a set of common tasks and see which version is ideal for your application.
Here is a quick overview of the tasks that we decided to test and some details about them:
Screenshots:
PDF Generation:
Time to first load:
The results of these tests will vary depending on the webpage, network speed, latency, frontend technology and hardware resources. Because of that, our tool allows you to run a test numerous times to smooth out some of that noise in the final results. This isn’t perfect, but such things never are when you can’t control every piece of networking in the world. Remember: we just want to get a sense of which version of Chrome/Puppeteer to use for a particular task.
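To illustrate the idea of repeating a test to reduce noise, here is a minimal sketch. It is not taken from our benchmark tool: `summarize` and `timeRuns` are hypothetical names, and the choice to report the median alongside the mean is an assumption made for this example, since a median is less affected by a single slow run.

```javascript
// Hypothetical helpers for illustration only; not part of the benchmark tool.

// Reduce a list of timings (in milliseconds) to a few summary numbers.
function summarize(samples) {
  if (samples.length === 0) throw new Error('need at least one sample');
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  // For an even number of samples, average the two middle values.
  const median =
    sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  const mean = sorted.reduce((sum, x) => sum + x, 0) / sorted.length;
  return {
    runs: sorted.length,
    min: sorted[0],
    median,
    mean,
    max: sorted[sorted.length - 1],
  };
}

// Run an async task several times, one after another, and time each run.
async function timeRuns(task, runs = 10) {
  const samples = [];
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    await task();
    samples.push(performance.now() - start);
  }
  return summarize(samples);
}
```

Here `task` could be any async function, for example one that opens a page and takes a screenshot. Comparing the median across Puppeteer versions keeps one unusually slow network request from skewing the picture.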
In terms of the versions that we decided to test, we went all the way back to version 1.20.0 of Puppeteer and tested up until 19.3.0 (the latest release at the time). Here is a tabular breakdown of those results (Red = Slow, Green = Fast). Times below are in milliseconds:
Key Takeaways:
- There is a trend of browser launch taking longer with each new release of Puppeteer/Chrome.
- While browser launch/close time is getting slower over time (this is most likely due to the size of Puppeteer/Chrome), the DOM traverse/navigation, screenshots and pdf generation are all getting faster. This is good news for the average Chrome users, but tells us that reusing the same browser instance is always a good practice if possible. We’re working on a feature to e
- PDF generation tends to get faster with each version, whereas the best versions for rendering screenshots seem to be 9 through 13. Version 17 seems like a fair compromise for both PDF and screenshot generation.
- Versions 14 through 17 are the best for scraping data, since they have the fastest DOM-loading and First Meaningful Content scores, with pretty decent Wait-for-selector scores.
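Since launch and close are the parts getting slower, the takeaway about reusing one browser instance is worth a quick sketch. This is an illustration under stated assumptions rather than code from our tool: `screenshotAll` is a hypothetical name, and the `launch` argument is assumed to be any function that resolves to a Puppeteer browser, such as `() => puppeteer.launch()`.

```javascript
// Illustrative sketch: pay the launch/close cost once, however many URLs there are.
// `launch` is an assumed parameter: a function resolving to a Puppeteer browser.
async function screenshotAll(launch, urls) {
  const browser = await launch(); // single launch for the whole batch
  const results = [];
  try {
    for (const url of urls) {
      const page = await browser.newPage(); // new page, same browser
      try {
        await page.goto(url, { waitUntil: 'load' });
        results.push({ url, image: await page.screenshot() });
      } finally {
        await page.close(); // release the page even if navigation fails
      }
    }
  } finally {
    await browser.close(); // single close at the end
  }
  return results;
}

// With Puppeteer installed, usage might look like:
//   const puppeteer = require('puppeteer');
//   screenshotAll(() => puppeteer.launch(), ['https://example.com'])
//     .then((shots) => console.log(shots.length));
```

Pages are opened one at a time here to keep the sketch simple. Whether to run them in parallel depends on your hardware and on the sites you are hitting.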
Going forward:
In terms of how this tool will evolve, we are planning to run these tests every time a new version of Puppeteer is released. We are also considering adding Playwright at some point in the future. If you’d like to contribute or use the tool yourself, here is the repo:
https://github.com/browserless/puppeteer-benchmark
If there is a specific parameter that you’d want to see here going forward, please reach out or submit a PR. We love any/all feedback!
Thanks!
Joel