Want to know how you can capture a web page and save it as a PDF document or an image using the terminal? Luckily, Linux has a plethora of utilities that you can use to automate the task of converting HTML documents to PDF files and images.
This article introduces wkhtmltopdf and wkhtmltoimage, two utilities that make this work easier.
How to Convert HTML to PDF
If you’re looking to capture web pages and convert them into a PDF file, the wkhtmltopdf utility will help you out. Wkhtmltopdf is an open-source command-line tool used to render web pages into PDF documents.
Since the tool works headlessly inside the Linux terminal, you won’t require any web driver or a browser automation framework like Selenium.
Install wkhtmltopdf on Linux
Wkhtmltopdf is not one of the standard packages that come pre-installed on Linux. You’ll have to manually install it using your system’s package manager.
To install wkhtmltopdf on Ubuntu and Debian-based distributions:
sudo apt install wkhtmltopdf
On Arch-based distros like Manjaro Linux:
sudo pacman -S wkhtmltopdf
On Fedora and RHEL-based distros like CentOS:
sudo dnf install wkhtmltopdf
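If you set up several machines, the choice between the three commands above can be scripted. The following is a minimal sketch, assuming the package is called wkhtmltopdf in each distribution's repositories, as in the commands above. The function names install_command_for and detect_package_manager are made up for this example.

```shell
# Print the install command for a given package manager.
# The function names are illustrative, not part of wkhtmltopdf.
install_command_for() {
    case "$1" in
        apt)    echo "sudo apt install wkhtmltopdf" ;;
        pacman) echo "sudo pacman -S wkhtmltopdf" ;;
        dnf)    echo "sudo dnf install wkhtmltopdf" ;;
        *)      echo "unsupported package manager: $1" >&2; return 1 ;;
    esac
}

# Print the first supported package manager found on this system.
detect_package_manager() {
    for pm in apt pacman dnf; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    return 1
}

# Only show the command, so nothing gets installed by surprise.
if pm=$(detect_package_manager); then
    echo "Run: $(install_command_for "$pm")"
else
    echo "No supported package manager found; install wkhtmltopdf manually."
fi
```

The script prints the command rather than running it, which leaves the decision to install with you.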
The basic syntax of the command is:
wkhtmltopdf webpage filename
…where webpage is the URL of the web page that you want to convert and filename is the name of the output PDF file.
To convert the Google homepage into a PDF document:
wkhtmltopdf https://google.com google.pdf
When you open the PDF file, you will see that wkhtmltopdf has rendered the web page as a document.
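The same syntax extends to several pages at once. The sketch below loops over a list of URLs and derives an output name from each one. The helper names pdf_name_for and convert_all are hypothetical, and the naming scheme (host name plus .pdf) is just one possible choice.

```shell
# Derive an output file name from a URL,
# e.g. https://google.com becomes google.com.pdf
pdf_name_for() {
    url=$1
    host=${url#*://}     # drop the scheme (https://)
    host=${host%%/*}     # drop any path after the host name
    echo "${host}.pdf"
}

# Convert every URL passed as an argument into its own PDF file.
convert_all() {
    for url in "$@"; do
        out=$(pdf_name_for "$url")
        if wkhtmltopdf "$url" "$out"; then
            echo "saved $out"
        else
            echo "failed: $url" >&2
        fi
    done
}

# Example call (needs network access and wkhtmltopdf installed):
# convert_all https://google.com https://example.com
```

Two URLs on the same host would map to the same file name here, so a real script would likely need a more careful naming rule.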
Print Multiple Copies of the Web Page
The --copies flag is a lifesaver if you want your output file to have multiple copies of the webpage. Note that when printing multiple copies, wkhtmltopdf won’t generate multiple PDF files, but will add additional pages to a single document instead.
To create three copies of the Google homepage:
wkhtmltopdf --copies 3 https://google.com google.pdf
The output PDF file will contain three pages, as specified in the command.
Add a Grayscale Filter to the Output
To add a grayscale filter to the PDF file, use the -g or --grayscale flag with the command:
wkhtmltopdf -g https://google.com google.pdf
wkhtmltopdf --grayscale https://google.com google.pdf
Change the Orientation of the PDF
By default, wkhtmltopdf generates the PDF file in a vertical layout, i.e. portrait. To change this default behavior and capture web pages in landscape instead, use the --orientation flag with the command:
wkhtmltopdf --orientation landscape https://google.com google.pdf
Note that the landscape version of the document has a larger whitespace area as compared to the portrait one.
Don’t Include Images While Converting
While generating the output, if you don’t want wkhtmltopdf to render images present in a web page, use the --no-images flag:
wkhtmltopdf --no-images https://google.com google.pdf
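The flags covered so far can be combined in a single command. As a rough sketch, the wrapper below turns a few shell variables into the corresponding options. The function name make_pdf and the variable names GRAYSCALE, LANDSCAPE, NO_IMAGES, and COPIES are invented for this example.

```shell
# Build a wkhtmltopdf command from optional settings and run it.
# Usage: make_pdf URL OUTPUT
make_pdf() {
    url=$1
    out=$2
    set --    # start with an empty option list
    [ "${GRAYSCALE:-0}" = 1 ] && set -- "$@" --grayscale
    [ "${LANDSCAPE:-0}" = 1 ] && set -- "$@" --orientation landscape
    [ "${NO_IMAGES:-0}" = 1 ] && set -- "$@" --no-images
    [ -n "${COPIES:-}" ]      && set -- "$@" --copies "$COPIES"
    # Options go first, then the web page and the output file.
    wkhtmltopdf "$@" "$url" "$out"
}

# Example call (needs network access and wkhtmltopdf installed):
# GRAYSCALE=1 LANDSCAPE=1 make_pdf https://google.com google.pdf
```

With GRAYSCALE and LANDSCAPE set to 1, the wrapper runs wkhtmltopdf --grayscale --orientation landscape followed by the URL and the output name.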
How to Convert a Web Page to Images
The wkhtmltoimage utility is a part of the wkhtmltopdf package. If you’re working on a report and want to include images of a website, then this tool will work in your favor. The Linux terminal not only makes it easier for you to capture the images but also gives you a range of options that allow you to customize your output.
Wkhtmltoimage has a syntax similar to wkhtmltopdf:
wkhtmltoimage webpage filename
…where webpage is the URL of a website and filename is the name of the output image.
Convert a Web Page to an Image
Continuing with the earlier example, let’s convert the Google homepage into an image.
wkhtmltoimage https://google.com google.png
You can also choose the file format of the output image. Wkhtmltoimage supports several formats, including PNG and JPG.
For example, if you want to generate a JPG image, simply replace the file extension with jpg in the command:
wkhtmltoimage https://google.com google.jpg
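Because the format follows the file extension in the examples above, one page can be captured in several formats with a short loop. This is a sketch only: capture_as is a hypothetical helper name, and the loop assumes every extension you pass is one that your wkhtmltoimage build supports.

```shell
# Capture one page in several image formats.
# Usage: capture_as URL BASENAME EXT...
capture_as() {
    url=$1
    base=$2
    shift 2    # the remaining arguments are the file extensions
    for ext in "$@"; do
        wkhtmltoimage "$url" "${base}.${ext}" || echo "failed: ${base}.${ext}" >&2
    done
}

# Example call (needs network access and wkhtmltoimage installed):
# capture_as https://google.com google png jpg
```

The example call would write google.png and google.jpg to the current directory.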
Capturing Web Pages Using the Linux Terminal
You need a PDF viewer installed on your Linux system to view the PDF files generated by wkhtmltopdf. While most Linux distributions come with a PDF viewer preinstalled, you can also choose and install one that suits your needs.