Running Puppeteer On An AWS Elastic Beanstalk Instance Using NodeJS

AWS makes it a nightmare when you try to use almost any product or service made by Google. In my case, I wanted to develop a dashboard in which one could simply click a button and render the current page as a PDF report.

Given that I have used Puppeteer in the past for unit tests and web scraping, I thought it would be perfect for this. Getting it to run on a local machine is easy; today, however, we are going to solve the nightmare that is running it on AWS without a Docker image.

Assuming we have a working NodeJS app that uses Puppeteer for whatever reason is necessary, let us begin. To deploy this application we will upload a zip of it rather than deploying from the command line. This is important to know, since zipping can become a roadblock later on for people who are developing in a Mac environment.

To keep compatibility issues close to zero when deploying, I usually don't push my application with the node_modules folder, since every so often some node packages aren't compatible with the server you deploy on. So in my case the npm packages are not part of what I deploy.

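const browser = await puppeteer.launch({
  headless: true,
  // Path to the Chromium binary that Puppeteer downloaded; this one is specific to a local machine.
  executablePath: '/home/sagar/workplace/scraping-demo/node_modules/puppeteer/.local-chromium/linux-599821/chrome-linux/chrome',
  // Chromium's sandbox is disabled here, which is commonly required when the process runs as root.
  args: ['--no-sandbox', '--disable-setuid-sandbox'],
});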
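A hard-coded local path like the one above will not exist on the server. As a minimal sketch, assuming you want Puppeteer to fall back to the Chromium it downloaded during npm install, the options can be built so that executablePath is only set when you explicitly ask for it. The helper name buildLaunchOptions and the CHROME_PATH environment variable are made up for illustration.

```javascript
// Hypothetical helper: builds the options object handed to puppeteer.launch().
// CHROME_PATH is an assumed environment variable name chosen for this sketch.
function buildLaunchOptions(env = process.env) {
  const options = {
    headless: true,
    // Same flags as in the snippet above.
    args: ['--no-sandbox', '--disable-setuid-sandbox'],
  };
  // Only override the binary when a path is provided; otherwise Puppeteer
  // uses the Chromium it downloaded into node_modules.
  if (env.CHROME_PATH) {
    options.executablePath = env.CHROME_PATH;
  }
  return options;
}

// Usage inside an async function, assuming puppeteer is installed:
//   const puppeteer = require('puppeteer');
//   const browser = await puppeteer.launch(buildLaunchOptions());
```

Keeping the path out of the code means the same build can run locally and on Elastic Beanstalk without edits.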

First we need to configure a .npmrc file to allow Puppeteer to be installed along with some of its dependencies. It isn't great practice to let it run with these permissions, but if your code base is solid and secure nothing terrible should happen. This file should be in your application's root directory.

We simply need to add the following to our .npmrc file:
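A minimal sketch of such a file, assuming the setting meant here is npm's unsafe-perm option (the remark about permissions points that way). On npm 6 and earlier it stops npm from dropping root privileges when it runs package install scripts, which is what lets Puppeteer's Chromium download run during deployment.

```ini
# .npmrc in the application's root directory
# Assumption: this is the setting the article refers to.
unsafe-perm=true
```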

This concoction of errors will be thrown at you without the file:

Moving on, let us get to the most annoying error that EVERYONE has been running into. Below are some logs I requested from my Elastic Beanstalk instance. This error happens when we are missing dependencies.

in this bitch
(node:2519) UnhandledPromiseRejectionWarning: Error: Failed to launch chrome!
/var/app/current/node_modules/puppeteer/.local-chromium/linux-674921/chrome-linux/chrome: error while loading shared libraries: libXcursor.so.1: cannot open shared object file: No such file or directory
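The log shows the launch failure surfacing as an unhandled promise rejection. A small sketch of catching it and pulling the missing library name out of the message, assuming the message keeps the "error while loading shared libraries" wording seen above. Both function names are made up for illustration, and puppeteer is passed in as a parameter so the snippet stands alone.

```javascript
// Extracts the shared library name from a Chrome launch error message,
// or returns null when the message has a different shape.
function missingLibraryFromError(message) {
  const match = /error while loading shared libraries: (\S+?):/.exec(message);
  return match ? match[1] : null;
}

// Hypothetical wrapper around puppeteer.launch() that logs a readable hint
// before rethrowing, so the failure is no longer an unhandled rejection
// as long as the caller handles the rethrown error.
async function launchOrExplain(puppeteer, options) {
  try {
    return await puppeteer.launch(options);
  } catch (err) {
    const lib = missingLibraryFromError(String(err.message));
    if (lib) {
      console.error(`Chrome could not start, missing shared library: ${lib}`);
    }
    throw err;
  }
}
```

Chrome only reports the first library it fails to load, so expect to see a different name each time you fix one.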

Here is a list of dependencies that are generally needed. These are the Debian/Ubuntu package names; the yum packages on Amazon Linux are named differently.

gconf-service
libasound2
libatk1.0-0
libatk-bridge2.0-0
libc6
libcairo2
libcups2
libdbus-1-3
libexpat1
libfontconfig1
libgcc1
libgconf-2-4
libgdk-pixbuf2.0-0
libglib2.0-0
libgtk-3-0
libnspr4
libpango-1.0-0
libpangocairo-1.0-0
libstdc++6
libx11-6
libx11-xcb1
libxcb1
libxcomposite1
libxcursor1
libxdamage1
libxext6
libxfixes3
libxi6
libxrandr2
libxrender1
libxss1
libxtst6
ca-certificates
fonts-liberation
libappindicator1
libnss3
lsb-release
xdg-utils
wget

To check what dependencies you are missing simply type the following into your command line. Grep is your best friend!
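Puppeteer's troubleshooting guide suggests running ldd against the chrome binary and filtering for "not", along the lines of `ldd chrome | grep not`. The same check can be done from Node, for example to log the result when the app starts. This is a sketch with made-up function names, and it assumes ldd is available on the instance and prints missing libraries as "name => not found".

```javascript
const { execFileSync } = require('child_process');

// Returns the library names that ldd reports as "not found".
function parseMissing(lddOutput) {
  return lddOutput
    .split('\n')
    .filter((line) => line.includes('not found'))
    .map((line) => line.trim().split(/\s+/)[0]);
}

// Runs ldd on the given binary and returns the missing libraries.
// chromePath is whatever path your Puppeteer install uses.
function missingLibraries(chromePath) {
  const output = execFileSync('ldd', [chromePath], { encoding: 'utf8' });
  return parseMissing(output);
}

// Usage, with the path taken from your own error log:
//   console.log(missingLibraries('/path/to/chrome-linux/chrome'));
```

Unlike the launch error, this lists every missing library in one go.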

Let's solve the error now

Just like Ash Ketchum, I searched far and wide for a solution to this problem. There wasn't much out there, especially for our specific deployment. Most guidance was written for a Docker image, and some of it just advised that I move the app to Google Cloud to make my life easier. I will not let the cloud make me its bitch.

In the root folder of your application, add an .ebextensions folder (the folder Elastic Beanstalk reads configuration from) with a config file for Chromium:

The contents of the file are as follows:

https://gist.github.com/ErikkJs/40d4e9b24e42c812a5a7b17ac90716c0
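For orientation only, here is a minimal sketch of the general shape of an .ebextensions config that installs packages with yum. The file name and the package selection are assumptions (yum counterparts of some of the libraries listed earlier), and availability varies between Amazon Linux versions.

```yaml
# .ebextensions/chromium.config  (file name is an assumption)
# Each key under "yum" is a package name; an empty list means "any version".
packages:
  yum:
    libXcursor: []
    libXdamage: []
    libXcomposite: []
    libXi: []
    libXtst: []
    libXrandr: []
    libXScrnSaver: []
    cups-libs: []
    alsa-lib: []
    atk: []
    pango: []
```

The linked gist remains the reference for the full configuration.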

This will install the necessary packages (99% of the time). Now deploy the app! Since I will be zipping the application, we need to zip its contents without the Mac-specific folders. AWS will throw the following errors if it is not compressed the correct way.

Compress your application with the following from your terminal (Mac only):
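A typical command for this, run from inside the application folder, looks like `zip -r ../app.zip . -x "*.DS_Store" "__MACOSX/*" "node_modules/*"`. Zipping from inside the folder keeps package.json at the root of the archive, and the -x patterns leave out the Mac metadata and node_modules. The archive name and the exact exclusion list are assumptions. The same thing can be scripted from Node, sketched here with made-up function names:

```javascript
const { execFileSync } = require('child_process');

// Builds the argument list for the zip CLI: recurse from the current
// directory and exclude Mac metadata plus node_modules.
function zipArgs(outputFile) {
  return ['-r', outputFile, '.', '-x', '*.DS_Store', '__MACOSX/*', 'node_modules/*'];
}

// Runs zip with appDir as the working directory so files land at the
// archive root. outputFile is resolved relative to appDir.
function createBundle(appDir, outputFile) {
  execFileSync('zip', zipArgs(outputFile), { cwd: appDir, stdio: 'inherit' });
}

// Usage, assuming the zip CLI is installed:
//   createBundle('/path/to/my-app', '../app.zip');
```

Writing the archive to the parent folder keeps it out of the directory being zipped.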