You can’t conduct browser automation tasks without good logging capabilities. Too often, developers run into issues like:
- Bot detection systems that block automation tools
- Failure to execute complex browser tasks (like multifactor authentication)
- Infinite timeouts when pages don’t load properly
Unless you log these errors, you won't know what's wrong, increasing the time to find the root cause of the issue. That's why you should add logging capabilities to your Puppeteer or Playwright scripts.
This may sound obvious, but we’ve helped lots of Browserless users running automations with minimal or non-existent logging.
If you're wondering how to set them up and use them properly, in this article, we'll explore:
- How to effectively log and debug in Puppeteer and Playwright
- Common mistakes developers make when logging, and how to avoid them
- Advanced tools and best practices for managing logs and troubleshooting errors
Why do logging and debugging matter for browser automation tasks?
When you're scraping data from websites or automating browser tasks, many things can go wrong: pages fail to load, redirects happen, or a site's structure changes. Without logs, identifying what caused these issues becomes significantly more complicated.
It’s tempting to rely on error messages, but they often don’t give enough context to successfully debug a problem.
For example, an error might tell you that a web element was not found, but it won't explain if the page was slow to load or if a redirect caused the problem. That’s where using logging and a debugger comes in.
Think of them as two sides of the same coin:
- Logs let you capture specific details such as error codes and load times
- Debugging tools let you watch a script run, including network requests and page interactions
So they’re both critical if you’re automating browser tasks.
Four browser automation issues that logging uncovers
But what issues can you expect during task automation that logging can help you debug? Let’s look at the most common errors:
1. Page redirects and bot blocks
Your automation might get redirected, whether because of website issues or a bot detector such as Cloudflare. These mechanisms can prevent your script from reaching where it’s meant to go.
Without logging the final URL, it’s hard to discover that the automation failed because of a redirect, such as being sent to a CAPTCHA page.
We’d recommend logging the final URL and response status code, with a simple log like this:
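A minimal sketch, assuming a Puppeteer or Playwright `page` object; `gotoWithLogging` and the log format are hypothetical:

```javascript
// Format a one-line navigation log with the final URL and status code.
const formatNavLog = (finalUrl, status) =>
  `Navigation finished: ${finalUrl} (status ${status})`;

// Navigate and log where the browser actually landed. page.url() reflects
// the final URL after any redirects, which goto's argument does not.
async function gotoWithLogging(page, url) {
  const response = await page.goto(url);
  console.log(formatNavLog(page.url(), response ? response.status() : 'none'));
  return response;
}
```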
This log helps you identify whether your script was redirected to a bot challenge or if the site returned an unexpected status code.
2. Slow or failing page loads
Pages may load slower than expected or fail entirely, especially if external resources like images, scripts, or style sheets take longer to load. By logging request failures and tracking page load times, you can better understand what might be causing the slowdown.
Below, you can see a log which tracks the time taken for the page to load and if critical resources were successfully fetched:
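Here is one way to sketch that with Puppeteer’s event API; `timedGoto` is a hypothetical helper name:

```javascript
// Time the navigation and record any requests that fail along the way.
async function timedGoto(page, url) {
  const failures = [];
  page.on('requestfailed', (request) => {
    failures.push(`${request.url()} (${request.failure()?.errorText})`);
  });

  const start = Date.now();
  await page.goto(url, { waitUntil: 'load' });
  const elapsedMs = Date.now() - start;

  console.log(`Loaded ${url} in ${elapsedMs} ms`);
  failures.forEach((f) => console.log(`Request failed: ${f}`));
  return { elapsedMs, failures };
}
```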
If suitable, you could then filter out unnecessary targets like iframes or service workers with Puppeteer’s `targetFilter` option.
3. Network request failures
While a missing favicon or an unused script might not affect your automation, specific network requests do. Logging each request and response tells you which part of the network stack is failing.
It’s particularly useful when you’re dealing with sites that could momentarily fail to load resources due to server-side problems or bot restrictions.
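A sketch using Puppeteer’s page event names; `attachNetworkLogging` is a hypothetical helper:

```javascript
// Log every request, its response status, and any outright failures,
// so you can see which part of the network stack broke.
function attachNetworkLogging(page) {
  page.on('request', (request) =>
    console.log(`-> ${request.method()} ${request.url()}`));
  page.on('response', (response) =>
    console.log(`<- ${response.status()} ${response.url()}`));
  page.on('requestfailed', (request) =>
    console.log(`!! ${request.url()}: ${request.failure()?.errorText}`));
}
```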
4. Handling captchas
One of the more difficult challenges in web automation is dealing with CAPTCHA systems. Capturing screenshots or logging errors when a CAPTCHA is presented helps you identify when your automation hit this roadblock.
In these cases, logging the page’s response and even taking a screenshot at the moment of failure helps:
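A sketch of that idea; the iframe selector and file name are placeholder assumptions, since real challenge pages vary by provider:

```javascript
// Detect a likely CAPTCHA, log it, and save a screenshot for inspection.
async function checkForCaptcha(page) {
  const captcha = await page.$('iframe[src*="captcha"]'); // hypothetical selector
  if (captcha) {
    console.log(`CAPTCHA detected at ${page.url()}`);
    await page.screenshot({ path: `captcha-${Date.now()}.png` });
    return true;
  }
  return false;
}
```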
If you don’t want to handle screenshotting yourself, you can use our /screenshot API.
How to set up logging in Playwright and Puppeteer
Here's how you can start logging tasks in both of these automation tools:
Step 1: Choose a logging library
Both browser JavaScript and Node.js have built-in console logging (console.log()), which you can use to log events and responses directly.
Apart from that, Playwright has verbose logging and tracing capabilities, while Puppeteer has network logging and verbose logging capabilities. You can choose the best method depending on the level of detail you need.
You can also use libraries like winston or log4js that allow for different logging severity levels. These tools give you better control over your logs, such as specifying when a message should be classified as an error versus a warning.
Step 2: Log responses and requests
One of the most common mistakes in web automation scripts is ignoring the responses and requests made by the browser.
For example, a simple goto function call in Playwright or Puppeteer may be successful. However, critical errors like page redirects could go unnoticed if the response status code isn't captured and logged.
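A sketch of keeping and checking the response object that goto returns rather than discarding it; `gotoAndCheck` is a hypothetical helper, and the API shape is the same in both Playwright and Puppeteer:

```javascript
// Capture goto's response and warn when the navigation didn't land
// where it was pointed.
async function gotoAndCheck(page, url) {
  const response = await page.goto(url);
  const status = response ? response.status() : 'no response';
  console.log(`goto ${url} -> ${status}, landed on ${page.url()}`);
  if (page.url() !== url) {
    console.warn(`Redirected away from ${url}`);
  }
  return response;
}
```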
Step 3: Handle thresholds in logging
As your script grows, separate your log messages by importance using thresholds. That way, you won't be overwhelmed by excessive logs that make locating critical errors challenging. For instance, use:
- `debug` for general information during development
- `info` for standard runtime details
- `warn` for issues that may not immediately break functionality
- `error` for critical failures
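A minimal, dependency-free sketch of a leveled logger; with winston or log4js you would get this behaviour via the library’s `level` option instead:

```javascript
// Map each severity to a rank so levels can be compared numerically.
const LEVELS = { debug: 0, info: 1, warn: 2, error: 3 };

// Return a logger that drops any message below the configured threshold.
function createLogger(threshold = 'info') {
  return (level, message) => {
    if (LEVELS[level] < LEVELS[threshold]) return false; // below threshold: dropped
    console.log(`[${level.toUpperCase()}] ${message}`);
    return true; // emitted
  };
}

const log = createLogger('warn');
log('debug', 'selector matched 3 elements');   // dropped in this configuration
log('error', 'element not found after retry'); // emitted
```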
How do you debug issues once you have basic logging set up?
Here are a few tips to help you debug issues after you’ve set up logs on Playwright or Puppeteer:
1. Capture metadata for better debugging
It's tempting to skip capturing metadata for your logs. Maybe it's too much detail, or you're in a hurry to wrap things up. But if you don't, you miss out on critical details that could tell you what's wrong with your failed tasks.
Always capture additional metadata like:
- HTTP headers
- Response times
- Browser behaviours
While logging the response status and URLs is a good start, advanced metadata can help pinpoint issues that are less obvious. For example, logging request headers or tracking cookies can reveal authentication or session management issues.
Adding this much detail to your log could make things more complex, but you’ll thank your past self if you’re debugging complex issues like failed API calls or authentication issues.
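A sketch of logging request headers (assuming a Puppeteer request object), with sensitive fields redacted before they reach the logs; `logRequestHeaders` is a hypothetical helper:

```javascript
// Log a request's method, URL, and headers for auth/session debugging,
// masking fields that could leak credentials.
function logRequestHeaders(request) {
  const headers = { ...request.headers() };
  for (const key of ['cookie', 'authorization']) {
    if (headers[key]) headers[key] = '[redacted]';
  }
  console.log(`${request.method()} ${request.url()}`, headers);
  return headers;
}

// Usage: page.on('request', logRequestHeaders);
```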
2. Use headless mode and capture screenshots of failures
Headless mode is a powerful feature in both Puppeteer and Playwright, but since the browser runs without rendering its UI, troubleshooting can be difficult: you can't see what's happening on the screen.
To debug this, you need to create logs and capture screenshots. Here’s an example log which captures a screenshot if a failure occurs:
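A sketch of that pattern; `withScreenshotOnFailure` and the file name are hypothetical:

```javascript
// Run a step inside try/catch and save a screenshot when it throws.
async function withScreenshotOnFailure(page, step) {
  try {
    return await step();
  } catch (err) {
    const path = `failure-${Date.now()}.png`;
    console.error(`Step failed: ${err.message}; saving screenshot to ${path}`);
    await page.screenshot({ path, fullPage: true });
    throw err; // rethrow so the failure still surfaces to the caller
  }
}
```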
Capturing screenshots when a failure occurs allows you to visually see what went wrong, even if you’re running the script in headless mode.
3. Store logs and screenshots in the cloud
Over time, it's natural to scale your automation tasks. If they're 10 a day today, they could be 500 a day tomorrow. From that perspective, it's practically impossible to debug every error manually without storing a detailed log of each interaction.
When you store logs locally, it's harder to track down the right one when you need it, and you could even accidentally delete them. That's why you should store them in the cloud, such as in AWS S3 or a similar service.
4. Use hosted debugging sessions
For real-time debugging, hosted debugging sessions offer a powerful way to interact with your web automation scripts. Tools like Playwright’s Inspector or Puppeteer’s DevTools allow you to pause execution, inspect elements, and view network requests as they happen.
It's great for cases where intermittent failures are hard to replicate. Also, these debuggers tend to surface data that doesn't show up in your logs—like JavaScript errors on the page or failed network requests that don't trigger a status code change.
Here’s what Browserless’s debugger looks like:
Best practices for logging and debugging
Here are a few things to keep in mind when you’re storing logs and debugging errors:
1. Be cautious about logging sensitive information
Don’t accidentally log sensitive information such as user credentials, private URLs, or personal data. Logging this type of data can create serious security risks, making your application vulnerable to attacks.
A general rule is to avoid logging sensitive fields such as passwords, API keys, or full URLs that include query parameters containing user information.
You can sanitise logs by stripping these fields out or masking them like the example below, where only the origin of the URL is stored and not the entire query string:
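A sketch using Node’s built-in URL parser; `maskUrl` is a hypothetical helper:

```javascript
// Keep only the origin so query-string parameters (tokens, emails,
// session IDs) never reach the logs.
function maskUrl(rawUrl) {
  return new URL(rawUrl).origin;
}

console.log(maskUrl('https://example.com/login?token=secret123'));
// -> https://example.com
```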
2. Use log thresholds to manage volume
Overloading your logs with too much information can make debugging harder. This is where logging thresholds come in. By setting different severity levels for your logs—like debug, info, warn, and error—you can manage what information is recorded at each stage of development and execution.
For instance, during development, `debug` logs can capture detailed information, while in production, you might only log `warn` and `error` messages to avoid performance issues and unnecessary data storage:
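A dependency-free sketch of environment-based thresholds; with winston you would set the logger’s `level` option instead, and the helper names here are hypothetical:

```javascript
// Severities in ascending order of importance.
const ORDER = ['debug', 'info', 'warn', 'error'];

// Development gets everything; production only warnings and errors.
function levelForEnv(env) {
  return env === 'production' ? 'warn' : 'debug';
}

function makeLogger(env) {
  const min = ORDER.indexOf(levelForEnv(env));
  return (level, message) => {
    if (ORDER.indexOf(level) < min) return false; // suppressed in this environment
    console.log(`[${level}] ${message}`);
    return true;
  };
}

const log = makeLogger(process.env.NODE_ENV || 'development');
log('debug', 'page DOM snapshot captured'); // visible in development only
```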
3. Write logs with your future self in mind
Ideally, your logs should be clear, concise, and offer enough context to quickly understand what was happening at the time of the error. For example, rather than logging generic messages like "Error occurred", provide data like location and status code.
Here’s an example of a well-written log:
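A sketch of a structured entry with location, status, and a timestamp instead of a bare "Error occurred"; `formatLogEntry`, the URL, and the selector are placeholders:

```javascript
// Build a JSON log line from a level, message, and contextual fields.
function formatLogEntry(level, message, context) {
  return JSON.stringify({
    level,
    message,
    ...context,
    timestamp: new Date().toISOString(),
  });
}

console.error(
  formatLogEntry('error', 'Element "#checkout-button" not found', {
    url: 'https://example.com/cart', // placeholder URL
    status: 200,
  })
);
```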
4. Use try-catch blocks for better error handling
If you’re using JavaScript, try-catch blocks are a must-have for handling exceptions. If you don’t use them, your errors could go unreported or lack enough data to troubleshoot.
So, always make sure you wrap code in try-catch blocks to capture exceptions and log them appropriately.
Here’s what that looks like:
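A sketch, assuming a Puppeteer or Playwright `page` object; `safeNavigate` is a hypothetical name:

```javascript
// Wrap navigation in try/catch so failures are logged with context and
// rethrown rather than silently lost.
async function safeNavigate(page, url) {
  try {
    const response = await page.goto(url);
    console.log(`Navigated to ${url} (status ${response ? response.status() : 'none'})`);
    return response;
  } catch (err) {
    console.error(`Navigation to ${url} failed: ${err.message}`);
    throw err; // let the caller decide how to recover
  }
}
```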
5. Add logging during initial stages of development
The best way to improve your debugging workflow is to start logging your tasks early in development.
You need to treat it as an important part of your workflow—just as you do with writing tests—because it will save you a lot of time and frustration down the road. You'll not only catch potential issues early but also set up a strong foundation for troubleshooting in the future.
Final takeaways
Logging and debugging go hand in hand, so it's time to start including these in your development workflows. Start by reviewing your current scripts. Ask yourself questions like:
- Have you at least used Playwright/Puppeteer’s built-in logger?
- Are you logging enough different events?
- Do those logs have enough detail to be helpful?
- Are you using the right logging thresholds to avoid being overwhelmed by data?
If any of those are a no, it’s a good place to start.
You may also be using a platform to host your automations. For example, Browserless lets users scale their Playwright or Puppeteer deployment. In those cases, use the extra debugging capabilities of those tools.
Hopefully this helps you speed up your debugging, but our support team is always available to help users solve issues.
{{banner}}
Avoid having to troubleshoot infrastructure issues by hosting your automations with Browserless
Automations can fail for many reasons, but don't let infrastructure be one of them.
Browserless offers a pool of hosted browsers, ideal for running Playwright or Puppeteer scripts. You won't need to troubleshoot memory leaks, breakages due to version updates, or any other infra issue.
You can safely scale your automations from tens to thousands of concurrencies, while using our APIs and session management features. Try it yourself with a free trial.