How to Debug Playwright & Puppeteer Automations With Effective Logging

August 30, 2024


You can’t conduct browser automation tasks without good logging capabilities. Too often, developers run into issues like:

Bot detection tools that block automation tools
Failure to execute complex browser tasks (like multifactor authentication)
Infinite timeouts when pages don’t load properly

Unless you log these errors, you won't know what went wrong, and finding the root cause takes longer. That's why you should add logging capabilities to your Puppeteer or Playwright scripts.

This may sound obvious, but we’ve helped lots of Browserless users who were running automations with minimal or nonexistent logging.

If you're wondering how to set up logging and use it properly, this article explores:

How to effectively log and debug in Puppeteer and Playwright
Common mistakes developers make when logging, and how to avoid them
Advanced tools and best practices for managing logs and troubleshooting errors

Why do logging and debugging matter for browser automation tasks?

When you're scraping data from websites or automating browser tasks, many things can go wrong: pages fail to load, redirects happen, or a site's structure changes. Without logs, identifying what caused these issues becomes significantly more complicated.

It’s tempting to rely on error messages, but they often don’t give enough context to successfully debug a problem.

For example, an error might tell you that a web element was not found, but it won't explain if the page was slow to load or if a redirect caused the problem. That’s where using logging and a debugger comes in.

Think of them as two sides of the same coin:

Logs let you capture specific details, such as error codes and load times.
Debugging tools let you watch a script as it runs, including network requests and page interactions.

So they’re both critical if you’re automating browser tasks.

Four browser automation issues that logging uncovers

But what issues can you expect during task automation that logging can help you debug? Let’s look at the most common errors:

1. Page redirects and bot blocks

Your automation might get redirected, either because of website issues or by a bot detector such as Cloudflare. These mechanisms can prevent your script from reaching the page it’s meant to reach.

Without logging the final URL, it's hard to discover that the automation failed because of a redirect, such as being sent to a CAPTCHA page.

We recommend logging the final URL and response status code, with a simple log like this:


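// page.goto() resolves to the response of the last redirect in the chain (or null in rare cases)
const response = await page.goto('https://ecommerce-site.com');
console.log(`Final URL: ${response?.url()}, Status Code: ${response?.status()}`);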

This log helps you identify whether your script was redirected to a bot challenge or if the site returned an unexpected status code.
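
To make redirects easier to spot in the logs, you can compare the URL you requested with the URL the navigation ended on. The sketch below is one possible way to do that, not an official API: `describeNavigation` and the `challengeHints` list are hypothetical names, and the hint strings and status codes are assumptions you would tune to the sites you automate. It only relies on `response.url()` and `response.status()`, which both Puppeteer and Playwright provide.

```javascript
// Hypothetical helper: summarise where a navigation ended up.
// `response` is whatever page.goto() resolved to (it can be null).
function describeNavigation(requestedUrl, response, challengeHints = ['captcha', 'challenge']) {
  if (!response) {
    return { requestedUrl, finalUrl: null, status: null, redirected: false, suspectedChallenge: false };
  }
  const requested = new URL(requestedUrl);
  const final = new URL(response.url());
  const status = response.status();
  // Compare origin + path so a trailing-slash difference is not reported as a redirect.
  const normalise = (u) => u.origin + u.pathname.replace(/\/$/, '');
  const redirected = normalise(requested) !== normalise(final);
  // Assumption: challenge pages often contain words like these in the URL,
  // or answer with 403/429. Adjust for the sites you target.
  const lowerUrl = final.href.toLowerCase();
  const suspectedChallenge =
    status === 403 || status === 429 || challengeHints.some((hint) => lowerUrl.includes(hint));
  return { requestedUrl, finalUrl: final.href, status, redirected, suspectedChallenge };
}

// Usage inside a script:
// const response = await page.goto(url);
// console.log(JSON.stringify(describeNavigation(url, response)));
```

Logging the result as a single JSON line keeps it easy to search for `"redirected":true` later.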

2. Slow or failing page loads

Pages may load slower than expected or fail entirely, especially if external resources like images, scripts, or style sheets take longer to load. By logging request failures and tracking page load times, you can better understand what might be causing the slowdown.

The snippet below tracks the time taken for the page to load:


const startTime = Date.now();
await page.goto('https://example.com');
const loadTime = Date.now() - startTime;
console.log(`Page load time: ${loadTime}ms`);
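
A load-time log is most useful when it is also written for navigations that fail, since a timeout is the slowest load of all. The sketch below wraps `page.goto()` in a hypothetical `timedGoto` helper. The helper name, the options object, and the 5000 ms threshold are assumptions for illustration.

```javascript
// Hypothetical helper: time a navigation and flag slow loads.
// `page` is a Puppeteer or Playwright page; `log` defaults to the console.
async function timedGoto(page, url, { slowMs = 5000, log = console } = {}) {
  const startTime = Date.now();
  try {
    const response = await page.goto(url);
    const loadTime = Date.now() - startTime;
    if (loadTime > slowMs) {
      log.warn(`Slow page load: ${url} took ${loadTime}ms (threshold ${slowMs}ms)`);
    } else {
      log.info(`Page load time: ${url} took ${loadTime}ms`);
    }
    return { response, loadTime };
  } catch (error) {
    // Record how long the script waited before the navigation gave up.
    log.error(`Navigation to ${url} failed after ${Date.now() - startTime}ms: ${error.message}`);
    throw error;
  }
}
```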

If suitable, you can then exclude targets your script doesn't need, such as iframes or service workers, with Puppeteer's targetFilter option.

3. Network request failures

While a missing favicon or an unused script might not affect your automation, specific network requests do. Logging each request and response tells you which part of the network stack is failing.

It’s particularly useful when you’re dealing with sites that could momentarily fail to load resources due to server-side problems or bot restrictions.
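
As a rough sketch of what this can look like, the helper below listens for the `requestfailed` and `response` events, which both Puppeteer and Playwright emit on a page. The `attachNetworkLogging` name is hypothetical, and logging only responses with a status of 400 or above is an assumption made to keep the output manageable.

```javascript
// Hypothetical helper: attach network listeners to a Puppeteer or Playwright page.
function attachNetworkLogging(page, log = console) {
  // Requests that never received a response (DNS errors, aborted or blocked requests).
  page.on('requestfailed', (request) => {
    const failure = request.failure(); // { errorText } or null
    const reason = failure ? failure.errorText : 'unknown error';
    log.error(`Request failed: ${request.method()} ${request.url()} (${reason})`);
  });
  // Responses that arrived but carry an HTTP error status.
  page.on('response', (response) => {
    const status = response.status();
    if (status >= 400) {
      log.warn(`HTTP ${status}: ${response.request().method()} ${response.url()}`);
    }
  });
}

// Usage: call attachNetworkLogging(page) before page.goto(), so early requests are captured.
```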

4. Handling CAPTCHAs

One of the more difficult challenges in web automation is dealing with CAPTCHA systems. Capturing screenshots or logging errors when a CAPTCHA is presented helps you identify when your automation hit this roadblock.

In these cases, logging the page’s response and even taking a screenshot at the moment of failure helps:


await page.screenshot({ path: 'captcha_error.png' });
console.log(`Screenshot saved: captcha_error.png`);

If you don’t want to handle screenshotting yourself, you can use our /screenshot API.
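
If you do capture screenshots yourself, it helps to take them only when the page actually looks like a CAPTCHA, and to give each file a unique name. The sketch below shows one way to do that. The marker strings are illustrative guesses rather than an official or exhaustive list, and `looksLikeCaptcha`, `captchaScreenshotPath`, and `captureIfCaptcha` are hypothetical helper names.

```javascript
// Assumption: these markers are illustrative, not an exhaustive or official list.
const CAPTCHA_MARKERS = ['captcha', 'are you a robot', 'verify you are human'];

// Hypothetical helper: guess whether a page is showing a CAPTCHA or bot challenge.
function looksLikeCaptcha(url, title) {
  const haystack = `${url} ${title}`.toLowerCase();
  return CAPTCHA_MARKERS.some((marker) => haystack.includes(marker));
}

// Build a unique, sortable file name so repeated failures don't overwrite each other.
function captchaScreenshotPath(date = new Date()) {
  return `captcha_${date.toISOString().replace(/[:.]/g, '-')}.png`;
}

// If the current page looks like a CAPTCHA, save a screenshot and log it.
async function captureIfCaptcha(page, log = console) {
  const url = page.url();
  const title = await page.title();
  if (!looksLikeCaptcha(url, title)) return null;
  const path = captchaScreenshotPath();
  await page.screenshot({ path });
  log.warn(`CAPTCHA suspected at ${url} ("${title}"). Screenshot saved: ${path}`);
  return path;
}
```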

How to set up logging in Playwright and Puppeteer

Here's how you can start logging tasks in both of these automation tools:

Step 1: Choose a logging library

Both the browser and Node.js provide the built-in console API (console.log()), which can be used to log events and responses directly.

Beyond that, Playwright offers verbose logging (for example, by setting the DEBUG=pw:api environment variable) and tracing, while Puppeteer offers network logging and verbose logging (for example, DEBUG="puppeteer:*"). Choose the method that matches the level of detail you need.

You can also use libraries like winston or log4js that allow for different logging severity levels. These tools give you better control over your logs, such as specifying when a message should be classified as an error versus a warning.

Step 2: Log responses and requests

One of the most common mistakes in web automation scripts is ignoring the responses and requests made by the browser.

For example, a simple goto call in Playwright or Puppeteer may resolve without throwing an error. However, problems like redirects or error status codes go unnoticed unless you capture and log the response status code and final URL.


const response = await page.goto('https://example.com');
console.log(`Response status: ${response.status()}`);
console.log(`Final URL: ${response.url()}`);

Step 3: Handle thresholds in logging

As your script grows, separate your log messages by importance using thresholds. That way, you won't be overwhelmed by excessive logs that make critical errors hard to locate. For instance, use:

debug for general information during development
info for standard runtime details
warn for issues that may not immediately break functionality
error for critical failures
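
The four levels above can be sketched with a few lines of plain JavaScript. This is a minimal stand-in to show how a threshold works, not a replacement for a library like winston or log4js. The `createLogger` factory, the numeric weights, and the `LOG_LEVEL` environment variable mentioned in the usage comment are all assumptions.

```javascript
// Severity levels in increasing order of importance.
const LEVELS = { debug: 10, info: 20, warn: 30, error: 40 };

// Hypothetical factory: returns a logger that drops messages below `threshold`.
// `sink` receives each formatted line; it defaults to console.log.
function createLogger(threshold = 'info', sink = console.log) {
  if (typeof LEVELS[threshold] !== 'number') {
    throw new Error(`Unknown log level: ${threshold}`);
  }
  const logger = {};
  for (const level of Object.keys(LEVELS)) {
    logger[level] = (message) => {
      if (LEVELS[level] >= LEVELS[threshold]) {
        sink(`${new Date().toISOString()} [${level.toUpperCase()}] ${message}`);
      }
    };
  }
  return logger;
}

// Usage (assumes the threshold comes from a LOG_LEVEL environment variable):
// const logger = createLogger(process.env.LOG_LEVEL || 'info');
// logger.debug('Selector lookup details'); // dropped at the "info" threshold
// logger.warn('Retrying navigation');      // written
```

Raising the threshold to `warn` in production then silences `debug` and `info` messages without touching the call sites.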

How do you debug issues once you have basic logging set up?

Here are a few tips to help you debug issues after you’ve set up logs on Playwright or Puppeteer:

1. Capture metadata for better debugging

It's tempting to skip capturing metadata for your logs. Maybe it feels like too much detail, or you're in a hurry to wrap things up. But if you skip it, you miss out on critical details that could tell you what's wrong with your failed tasks.

Always capture additional metadata like:

HTTP headers
Response times
Browser behaviours

While logging the response status and URLs is a good start, advanced metadata can help pinpoint issues that are less obvious. For example, logging request headers or tracking cookies can reveal authentication or session management issues.

Adding this much detail to your log could make things more complex, but you’ll thank your past self if you’re debugging complex issues like failed API calls or authentication issues.
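
One way to keep that extra detail manageable is to write it as a single structured entry per navigation. The sketch below builds such an entry from a response object, using `response.request()`, `headers()`, `status()`, and `url()`, which are available in both Puppeteer and Playwright. The `buildLogEntry` and `redactHeaders` names, the entry's field names, and the list of redacted headers are assumptions.

```javascript
// Headers that should never be written to logs in clear text (assumed list).
const REDACTED_HEADERS = new Set(['authorization', 'cookie', 'set-cookie']);

// Replace the value of any sensitive header with a placeholder.
function redactHeaders(headers) {
  return Object.fromEntries(
    Object.entries(headers).map(([name, value]) =>
      [name, REDACTED_HEADERS.has(name.toLowerCase()) ? '[redacted]' : value])
  );
}

// Hypothetical helper: build one structured log entry from a navigation response.
function buildLogEntry(response, loadTimeMs) {
  const request = response.request();
  return {
    timestamp: new Date().toISOString(),
    method: request.method(),
    url: response.url(),
    status: response.status(),
    loadTimeMs,
    requestHeaders: redactHeaders(request.headers()),
    responseHeaders: redactHeaders(response.headers()),
  };
}

// Usage: console.log(JSON.stringify(buildLogEntry(response, loadTime)));
```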

2. Use headless mode and capture screenshots of failures

Headless mode is a powerful feature in both Puppeteer and Playwright, but it has a cost: since the browser runs without rendering its UI, you can't see what's happening on the screen, which makes troubleshooting difficult.

To compensate, create logs and capture screenshots. Here’s an example that captures a screenshot if a failure occurs:


// Treat any non-2xx response as a failure worth capturing
if (!response.ok()) {
    await page.screenshot({ path: 'error_screenshot.png' });
    console.log(`Error screenshot captured (status ${response.status()}).`);
}

Capturing screenshots when a failure occurs allows you to visually see what went wrong, even if you’re running the script in headless mode.
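
Many failures show up as thrown errors (a missing selector, a timeout) rather than as a bad status code. A possible way to cover those is to wrap each step of the script so that any exception triggers a screenshot before it is rethrown. `withFailureScreenshot` below is a hypothetical helper, and the file naming scheme is an assumption.

```javascript
// Hypothetical helper: run one automation step and capture a screenshot if it throws.
// `stepName` is used in the file name, so keep it to simple characters.
async function withFailureScreenshot(page, stepName, step, log = console) {
  try {
    return await step();
  } catch (error) {
    const path = `error_${stepName}_${Date.now()}.png`;
    try {
      await page.screenshot({ path, fullPage: true });
      log.error(`Step "${stepName}" failed: ${error.message}. Screenshot saved: ${path}`);
    } catch (screenshotError) {
      // The page may already be closed or crashed; don't hide the original error.
      log.error(`Step "${stepName}" failed: ${error.message}. Screenshot also failed: ${screenshotError.message}`);
    }
    throw error;
  }
}

// Usage:
// await withFailureScreenshot(page, 'login', async () => {
//   await page.click('#submit');
// });
```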

3. Store logs and screenshots in the cloud

Automation workloads tend to grow: 10 runs a day today could be 500 a day tomorrow. At that volume, it's practically impossible to debug every error manually without a detailed, stored log of each interaction.

If you keep logs only on a local machine, it's harder to track down the right one when you need it, and you could even delete it by accident. That's why you should store logs and screenshots in the cloud, in AWS S3 or a similar service.
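
Whichever storage service you pick, a predictable naming scheme is what makes the right log easy to find later. The sketch below only builds the storage key and leaves the upload itself to your provider's SDK. The `buildArtifactKey` helper and the `runs/<date>/<job>/<run>/<file>` layout are assumptions, not a Browserless or AWS convention.

```javascript
// Hypothetical helper: build a storage key such as
// "runs/2024-08-30/checkout/run-42/error.png" for a log or screenshot.
function buildArtifactKey({ job, runId, fileName, date = new Date() }) {
  const day = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  // Keep keys predictable by replacing characters outside a conservative set.
  const safe = (part) => String(part).replace(/[^a-zA-Z0-9._-]/g, '_');
  return ['runs', day, safe(job), safe(runId), safe(fileName)].join('/');
}
```

Grouping by date first makes it simple to expire old artifacts, while the job and run segments let you jump straight to one failed execution.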

4. Use hosted debugging sessions

For real-time debugging, hosted debugging sessions offer a powerful way to interact with your web automation scripts. Tools like the Playwright Inspector, or Chrome DevTools when using Puppeteer, allow you to pause execution, inspect elements, and view network requests as they happen.

They're great for intermittent failures that are hard to replicate. These debuggers also tend to surface data that doesn't show up in your logs, like JavaScript errors on the page or failed network requests that don't trigger a status code change.
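
Some of that data can also be pulled into your own logs. Both Puppeteer and Playwright emit `console` and `pageerror` events on a page, so a small listener like the sketch below can record errors raised by the page's own JavaScript. The `attachPageLogging` name is hypothetical, and because the label for warnings can differ by library and version, the code accepts both `warning` and `warn`.

```javascript
// Hypothetical helper: forward in-page console output and uncaught errors to your logs.
function attachPageLogging(page, log = console) {
  // Messages the page itself writes with console.warn/console.error.
  page.on('console', (message) => {
    const type = message.type();
    if (type === 'error' || type === 'warning' || type === 'warn') {
      log.warn(`[page console ${type}] ${message.text()}`);
    }
  });
  // Uncaught exceptions thrown by the page's own JavaScript.
  page.on('pageerror', (error) => {
    const detail = error && error.message ? error.message : String(error);
    log.error(`[page error] ${detail}`);
  });
}
```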

Here’s what Browserless’s debugger looks like:

The Browserless debugger, used to troubleshoot automations running in our browser pool.

Best practices for logging and debugging

Here are a few things to keep in mind when you’re storing logs and debugging errors:

1. Be cautious about logging sensitive information

Don’t accidentally log sensitive information such as user credentials, private URLs, or personal data. Logging this type of data can create serious security risks, making your application vulnerable to attacks.

A general rule is to avoid logging sensitive fields such as passwords, API keys, or full URLs that include query parameters containing user information.

You can sanitise logs by stripping these fields out or masking them, as in the example below, where only the origin of the URL is logged and not the path or query string:


console.log(`User attempted login at ${new URL(page.url()).origin}`);
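
When you need more of the URL than its origin, masking is an alternative to stripping. The sketch below keeps the path and harmless parameters but replaces the values of sensitive ones. The parameter names in `SENSITIVE_PARAMS` are assumptions to adapt to your own application, and `maskUrl` is a hypothetical helper built on the standard `URL` class.

```javascript
// Assumed list of query parameters that may carry secrets or personal data.
const SENSITIVE_PARAMS = ['token', 'password', 'api_key', 'email'];

// Hypothetical helper: mask sensitive query parameters and drop embedded credentials.
function maskUrl(rawUrl) {
  const url = new URL(rawUrl);
  url.username = '';
  url.password = '';
  // Copy the keys first, because the parameters are modified while looping.
  for (const name of [...url.searchParams.keys()]) {
    if (SENSITIVE_PARAMS.includes(name.toLowerCase())) {
      url.searchParams.set(name, 'REDACTED');
    }
  }
  return url.toString();
}

// Usage: console.log(`User attempted login at ${maskUrl(page.url())}`);
```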

2. Use log thresholds to manage volume

Overloading your logs with too much information can make debugging harder. This is where logging thresholds come in. By setting different severity levels for your logs (debug, info, warn, and error), you can manage what information is recorded at each stage of development and execution.

For instance, during development, debug logs can capture detailed information, while in production, you might only log warn and error messages to avoid performance issues and unnecessary data storage:


if (process.env.NODE_ENV === 'production') {
    console.error('Error occurred!');
} else {
    console.debug('Detailed debug info here.');
}

3. Write logs with your future self in mind

Ideally, your logs should be clear, concise, and offer enough context to quickly understand what was happening at the time of the error. For example, rather than logging generic messages like "Error occurred", provide data like location and status code.

Here’s an example of a well-written log:


console.error(`Failed to load page at ${response.url()} with status code: ${response.status()}`);

4. Use try-catch blocks for better error handling

If you’re using JavaScript, try-catch blocks are a must-have for handling exceptions. If you don’t use them, your errors could go unreported or lack enough data to troubleshoot.

So, always make sure you wrap code in try-catch blocks to capture exceptions and log them appropriately.

Here’s what that looks like:


try {
    await page.goto('https://example.com');
} catch (error) {
    console.error(`Failed to navigate: ${error.message}`);
}
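
A catch block is also a good place to record what kind of failure occurred, not only the message. The sketch below returns a structured result instead of throwing, which can suit scripts that process many URLs in a loop. `safeGoto` is a hypothetical helper, and checking `error.name === 'TimeoutError'` rests on the assumption that your version of Puppeteer or Playwright names its timeout error that way.

```javascript
// Hypothetical helper: navigate and return a structured result instead of throwing.
async function safeGoto(page, url, log = console) {
  try {
    const response = await page.goto(url);
    return { ok: true, status: response ? response.status() : null };
  } catch (error) {
    // Assumption: the library's timeout error is named "TimeoutError".
    const kind = error.name === 'TimeoutError' ? 'timeout' : 'navigation_error';
    log.error(`Failed to navigate to ${url} [${kind}]: ${error.message}`);
    return { ok: false, kind, message: error.message };
  }
}

// Usage:
// const result = await safeGoto(page, 'https://example.com');
// if (!result.ok) { /* skip this URL, retry it, or stop the run */ }
```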

5. Add logging during initial stages of development

The best way to improve your debugging workflow is to start logging your tasks early in development.

Treat it as an important part of your workflow, just as you do with writing tests, because it will save you a lot of time and frustration down the road. You'll not only catch potential issues early but also set up a strong foundation for troubleshooting in the future.

Final takeaways

Logging and debugging go hand in hand, so it's time to start including these in your development workflows. Start by reviewing your current scripts. Ask yourself questions like:

Have you at least used Playwright/Puppeteer’s built-in logger?
Are you logging enough different events?
Do those logs have enough detail to be helpful?
Are you using the right logging thresholds to avoid being overwhelmed by data?

If the answer to any of those is no, that’s a good place to start.

You may also be using a platform to host your automations. For example, Browserless lets users scale their Playwright or Puppeteer deployment. In those cases, use the extra debugging capabilities of those tools.

Hopefully this helps you speed up your debugging, but our support team is always available to help users solve issues.

Avoid having to troubleshoot infrastructure issues by hosting your automations with Browserless

Automations can fail for many reasons, but don't let infrastructure be one of them.

Browserless offers a pool of hosted browsers, ideal for running Playwright or Puppeteer scripts. You won't need to troubleshoot memory leaks, breakages due to version updates, or any other infra issue.

You can safely scale your automations from tens to thousands of concurrent sessions, while using our APIs and session management features. Try it yourself with a free trial.

