Deploying Playwright on a Google Cloud Compute Engine VM is a powerful option for browser automation, but it comes with challenges.
In this guide, we’ll go through the steps for setting up Playwright on a GCP VM, including selecting the ideal VM configuration, installing necessary dependencies, configuring the environment, and running a sample script to capture screenshots.
Choosing the Right VM
When deploying Playwright on Google Cloud, selecting an appropriate VM size is key to stable performance. An e2-medium (4 GB of RAM) or e2-standard-2 (8 GB of RAM) instance generally provides enough resources to run Playwright comfortably.
For storage, allocating around 10 GB is recommended to handle browser binaries and temporary files generated during automation tasks. We will use Ubuntu as the operating system, since it's officially supported by Playwright.
Launch and Connect to VM
To get started, create a VM instance on Google Cloud with sufficient storage and connect to it over SSH. You’ll install Node.js and Playwright using the commands in the following section. Playwright manages its own browser binaries and, during installation, offers to download them along with the system dependencies they need. If everything installs correctly, no further dependency installation is usually needed, but the commands are included below in case you need them.
Installing packages and libraries
First, install Node.js.
Then install Playwright and its dependencies.
This command will prompt you to install browsers and dependencies. If you run into any issues, you can install them separately using the following command:
Configuring a GCP bucket to store a screenshot
The code in the next section will store screenshots in Google Cloud Storage. First, set up a Cloud Storage bucket and configure the necessary authentication for the VM to save screenshots. Your VM instance also requires the Google Cloud Storage library, which we’ve already installed in the previous section.
- Setting up the Cloud Storage Bucket: In Google Cloud Storage, create a bucket to store screenshots.
- Enable the Cloud Storage API: In the Google Cloud Console, go to APIs & Services > Library and enable the Cloud Storage API.
- Assign Permissions to the VM: The simplest way to manage permissions is to use the default service account attached to your VM instance. Ensure this service account has the "Storage Object Creator" role. You can grant it project-wide in IAM & Admin > IAM, or for a single bucket from that bucket's Permissions tab. Also check the VM's Cloud API access scopes: the default scopes give Cloud Storage read-only access, so uploads are rejected until the scope allows read/write or full access.
Granting the VM's service account these permissions allows it to upload screenshots directly to Google Cloud Storage.
Writing code
Once the environment is set up, write JavaScript code that captures a screenshot of a webpage given as input and uploads it to the GCP bucket. Storing the screenshots in a bucket keeps them easily accessible for testing and validation.
Create a new file named screenshot.js and paste in the following code. Replace the bucket name with the name of your own bucket.
Run the script with Node.js, passing the page URL as an argument. The URL must include the scheme, for example: node screenshot.js https://example.com
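The scheme matters because Playwright's page.goto expects a full URL, and a bare hostname such as example.com is not one. A script can check its input before launching a browser, which gives a clearer error than a failed navigation. The helper below is a hypothetical sketch using Node's built-in URL class; the name parseTargetUrl is not from the original code.

```javascript
// Validates a command-line URL argument and returns its normalized form.
// Throws a descriptive error when the scheme is missing or unsupported.
function parseTargetUrl(input) {
  let url;
  try {
    url = new URL(input);
  } catch {
    throw new Error(`Invalid URL "${input}": include the scheme, e.g. https://example.com`);
  }
  // Inputs like "localhost:3000" parse successfully but with "localhost:" as
  // the protocol, so the protocol is checked explicitly.
  if (url.protocol !== 'http:' && url.protocol !== 'https:') {
    throw new Error(`Unsupported protocol "${url.protocol}" in "${input}"`);
  }
  return url.href;
}

module.exports = { parseTargetUrl };
```

Calling parseTargetUrl(process.argv[2]) at the top of a script turns a missing https:// into an immediate, readable error.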
Finally, check that the screenshot appears in your GCP bucket.
Maintenance tips and challenges
Playwright and VMs are a great combination, but they require careful maintenance.
One of the most important aspects is dependency management. Playwright, along with its browser binaries, frequently releases updates to stay in sync with modern web standards.
These updates often bring new features, optimizations, and security fixes, which makes it essential to regularly update your Playwright version to avoid potential compatibility issues or vulnerabilities.
You will also need to keep an eye out for memory leaks. Issues such as zombie processes and browsers that don't close properly can gradually increase the resources needed to keep the automations running smoothly.
Run Playwright with Browserless to Keep Things Simple
To take the hassle out of scaling your scraping, screenshotting or other automations, try Browserless.
It takes a quick connection change to use our thousands of concurrent Chrome browsers. Try it today with a free trial.
The Easy Option: Connect Playwright to Our Browser Pool
Hosting Playwright is easy; it's the browsers that cause the issues. To simplify your setup, use our pool of thousands of concurrent browsers with just a change in endpoint.
You can either host just playwright-core without the browsers, or use our REST APIs. There are residential proxies, stealth options, HTML exports, and other commonly needed features.