PDF, PS and DjVu

This article covers software to view, edit and convert PDF, PostScript (PS), DjVu (déjà vu) and XPS files.

Engines

  • Poppler PDF rendering library based on Xpdf.
https://poppler.freedesktop.org/ || poppler
  • MuPDF Lightweight PDF, XPS, and EPUB viewer, consisting of a software library, command line tools, and viewers.
https://mupdf.com/ || libmupdf
  • libspectre Small library for rendering PostScript documents.
https://www.freedesktop.org/wiki/Software/libspectre || libspectre
  • Ghostscript Interpreter for PostScript and PDF. Provides the gs(1) command-line interface (see also /usr/share/doc/ghostscript/*/Use.htm), along with many wrapper scripts like ps2pdf and pdf2ps.
https://ghostscript.com/ || ghostscript
  • DjVuLibre Suite to create, manipulate and view DjVu documents.
https://djvu.sourceforge.net/ || djvulibre
  • libgxps GObject based library for handling and rendering XPS documents.
https://wiki.gnome.org/Projects/libgxps || libgxps

Viewers

Framebuffer

  • fbgs Poor man's PostScript/PDF viewer for the Linux framebuffer console.
https://www.kraxel.org/blog/linux/fbida/ || fbida
  • fbpdf Small framebuffer PDF and DjVu viewer based on MuPDF, with Vim keybindings and written in C.
https://repo.or.cz/w/fbpdf.git || fbpdf-gitAUR
  • jfbview Framebuffer PDF and image viewer. Features include Vim-like controls, zoom-to-fit, a TOC (outline) view and fast multi-threaded rendering.
https://github.com/jichu4n/jfbview || jfbviewAUR

Graphical

Note: Some web browsers can display PDF files, for example with PDF.js.
  • DjView Viewer for DjVu documents.
https://djvu.sourceforge.net/djview4.html || djview
  • ePDFView Lightweight PDF document viewer using the Poppler and GTK libraries. Development stopped.
http://freecode.com/projects/epdfview || epdfview
  • qpdfview Tabbed document viewer. It uses Poppler for PDF support, libspectre for PS support, DjVuLibre for DjVu support, CUPS for printing support and the Qt toolkit for its interface.
https://launchpad.net/qpdfview || qpdfviewAUR
  • Sioyek Lightweight PDF viewer based on MuPDF with features designed for viewing research papers and technical books.
https://sioyek.info/ || sioyekAUR

Comparison

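Name | PDF | PostScript | DjVu | XPS | PDF forms | PDF annotation | Non-rectangle selection | License
Adobe Reader | Custom | | | | | | |
apvlv | Poppler | | DjVuLibre (not by default, at least) | | | | |
Atril | Poppler | libspectre | DjVuLibre | libgxps | | | |
DjView | | | DjVuLibre | | | | |
Emacs | Ghostscript* | Ghostscript* | DjVuLibre* | | | | | GPLv3
Emacs pdf-tools | Poppler | | | | | | | GPLv3
ePDFView | Poppler | | | | | | |
Evince | Poppler | libspectre | DjVuLibre | libgxps | | | |
Foxit Reader | Custom | | | | | | |
gv | Ghostscript | Ghostscript | | | | | | GPLv3
llpp | libmupdf | | | libmupdf | | | | GPLv3
MuPDF | Custom | | | Custom | (mupdf-gl) | (mupdf-gl) | (mupdf-gl) |
Okular | Poppler | libspectre | DjVuLibre | Custom | | | |
pdfpc | Poppler | | | | | | |
qpdfview | Poppler | libspectre* | DjVuLibre* | | | | |
Xpdf | Custom | | | | | | | GPLv3
Xreader | Poppler | libspectre* | DjVuLibre* | libgxps* | | | |
Zathura | Poppler* / libmupdf* | libspectre* | DjVuLibre* | libmupdf* | | | |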

* Optional dependency needs to be installed

PDF forms

The PDF forms column in the above table refers to AcroForms support. If you do not need your input to be directly extractable from the PDF, you can also use the applications in #Annotation or #Graphical PDF editing to put text on top of a PDF. PDF forms can be created with LibreOffice Writer (View > Toolbars > Form Controls) and the advanced PDF editors.

The proprietary and deprecated XFA format for forms is not fully supported by Poppler; full support is only available in Adobe Reader and Master PDF Editor.

Alternatively, web browsers such as Firefox or Chromium feature a built-in PDF viewer capable of filling out forms.
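
AcroForm fields can also be inspected and filled from the command line. The following is a minimal sketch using the dump_data_fields, generate_fdf and fill_form operations of PDFtk. The function names and the file names form.pdf, data.fdf and filled.pdf are placeholders chosen for illustration.

```shell
# List the form fields (name, type, current value) of the PDF in $1.
list_fields() {
    pdftk "$1" dump_data_fields
}

# Write an FDF template for the form in $1 to $2; edit it by hand afterwards.
make_fdf() {
    pdftk "$1" generate_fdf output "$2"
}

# Fill the form in $1 with the values from the FDF file $2 and write $3.
# Appending "flatten" to the pdftk call would make the values non-editable.
fill_form() {
    pdftk "$1" fill_form "$2" output "$3"
}

# Example (placeholder file names):
#   list_fields form.pdf
#   make_fdf form.pdf data.fdf
#   fill_form form.pdf data.fdf filled.pdf
```

Because the values end up in the form fields themselves, they remain extractable from the resulting PDF, unlike text drawn on top of the page.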

Annotation

    See also List of applications/Documents#Stylus note-taking.

    Graphical PDF editing

    • Scribus can import and export PDF; text is imported as polygons.
    • LibreOffice Draw can import and export PDF; text is imported as text; embedded fonts are substituted.
    • Inkscape can import a single page from a PDF and export to PDF; text is imported as cloned glyphs or text; with the latter embedded fonts are substituted.
    • Graphics editors like GIMP can also import and export PDFs at the cost of rasterization.

    Basic editors

    • jPDF Tweak Java Swing application that can combine, split, rotate, reorder, watermark, encrypt, sign, and otherwise tweak PDF files.
    https://jpdftweak.sourceforge.net/ || jpdftweakAUR
    • PDF Arranger Helps merge or split PDF documents and rotate, crop and rearrange pages. It is a maintained fork of PDF-Shuffler.
    https://github.com/jeromerobert/pdfarranger || pdfarranger

    Cropping tools

    • pdfCropMargins Automatically crops the margins of PDF files.
    https://github.com/abarker/pdfCropMargins || pdfcropmarginsAUR

      Advanced editors

      PDF tools

      See also Ghostscript.

      • Coherent PDF Command line tools to manipulate PDF files including merge, encrypt, decrypt, scale, crop, rotate, bookmarks, stamp, logos, page numbers.
      https://community.coherentpdf.com/ || cpdfAUR
      • PDFtk Simple tool for doing everyday things with PDF documents.
      https://gitlab.com/pdftk-java/pdftk || pdftk

      Create a PDF from images

      With GraphicsMagick:

      $ gm convert 1.jpg 2.jpg 3.jpg out.pdf

      Concatenate PDFs

      With Ghostscript:

      $ gs -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=out.pdf -dBATCH 1.pdf 2.pdf 3.pdf

      With PDFtk:

      $ pdftk 1.pdf 2.pdf 3.pdf cat output out.pdf

      With Poppler:

      $ pdfunite 1.pdf 2.pdf 3.pdf out.pdf

      With QPDF:

      $ qpdf --empty --pages 1.pdf 2.pdf 3.pdf -- out.pdf

      Convert a PDF to text

      With Poppler and maintaining the layout:

      $ pdftotext -layout in.pdf out.txt


      Decrypt a PDF

      This section lists commands to decrypt a PDF to an unencrypted file. Note that most PDF viewers also support encrypted PDFs.

      With PDFtk:

      $ pdftk in.pdf input_pw password output out.pdf

      With Poppler to PostScript:

      $ pdftops -upw password in.pdf out.ps

      With QPDF:

      $ qpdf --decrypt --password=password in.pdf out.pdf

      Tip: Forgotten passwords might be recovered with pdfcrack, see pdfcrack(1).

      Encrypt a PDF

      The user password is used for encryption; the owner password restricts operations once the document is decrypted. For more information, see Wikipedia:PDF#Encryption and signatures.

      With PDFtk:

      $ pdftk in.pdf output out.pdf user_pw password

      With PoDoFo:

      $ podofoencrypt -u user_password -o owner_password in.pdf out.pdf

      With QPDF:

      $ qpdf --encrypt user_password owner_password key_length -- in.pdf out.pdf

      where key_length can be 40, 128 or 256.
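
To illustrate the difference between the two passwords, the following sketch uses QPDF to set both and to forbid printing and modification for anyone who opens the file with only the user password. The function name and argument order are hypothetical; the --print and --modify restriction options are assumed to be available for 128-bit and 256-bit keys, so check qpdf --help for the installed version.

```shell
# Encrypt a PDF with a 256-bit key.
#   $1 = user password   (needed to open the document)
#   $2 = owner password  (lifts the restrictions set below)
#   $3 = input file, $4 = output file
encrypt_restricted() {
    qpdf --encrypt "$1" "$2" 256 --print=none --modify=none -- "$3" "$4"
}

# Example (placeholder values):
#   encrypt_restricted user_password owner_password in.pdf out.pdf
```

Note that such restrictions rely on the viewer honoring them.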

      Extract images from a PDF

      With Poppler to JPEG:

      $ pdfimages -j infile.pdf outfileroot

      Extract page range from PDF, split multipage PDF document

      With Ghostscript as a single file:

      $ gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER -dFirstPage=first -dLastPage=last -sOutputFile=outfile.pdf infile.pdf

      With PDFtk as a single file:

      $ pdftk infile.pdf cat first-last output outfile.pdf

      With Poppler as separate files:

      $ pdfseparate -f first -l last infile.pdf outfileroot-%d.pdf

      With QPDF as a single file:

      $ qpdf --empty --pages infile.pdf first-last -- outfile.pdf

      With mutool as a single file:

      $ mutool clean -g infile.pdf outfile.pdf first-last

      Imposing a PDF

      PDF imposition (e.g. combining multiple pages onto one page) can be done with pdfjam. For example, pdfnup can reduce paper waste, and pdfbook can arrange PDFs into a format suitable for book binding.
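
As a rough sketch of both use cases with pdfjam itself: the function names are made up for illustration, and --booklet is assumed to be passed through to the pdfpages LaTeX package that pdfjam is built on, so verify it against pdfjam --help.

```shell
# Put two pages side by side on each landscape sheet.
#   $1 = input file, $2 = output file
two_up() {
    pdfjam --nup 2x1 --landscape --outfile "$2" "$1"
}

# Reorder and pair pages so that the printed, folded stack reads as a booklet.
#   $1 = input file, $2 = output file
booklet() {
    pdfjam --booklet true --landscape --outfile "$2" "$1"
}

# Example (placeholder file names):
#   two_up in.pdf in-2up.pdf
#   booklet in.pdf in-book.pdf
```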

      Inspecting metadata

      With ExifTool:

      $ exiftool file.pdf

      With Poppler:

      $ pdfinfo file.pdf
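
For scripting, a single field can be picked out of the pdfinfo output. This is a small sketch that assumes the usual "Name:   value" line layout printed by pdfinfo; the helper name pdf_field is hypothetical.

```shell
# Print the value of one pdfinfo field.
#   $1 = PDF file, $2 = field name as printed by pdfinfo (e.g. Pages, Title)
pdf_field() {
    pdfinfo "$1" | sed -n "s/^$2:[[:space:]]*//p"
}

# Example (placeholder file name):
#   pdf_field file.pdf Pages
```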

      Optimize, reduce size of a PDF

      With Ghostscript, use one of:

      $ ps2pdf -dPDFSETTINGS=/screen in.pdf out.pdf
      $ gs -dNOPAUSE -dBATCH -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/printer -sOutputFile=out.pdf in.pdf

      For different settings see the documentation.

      Wrapper scripts around gs also exist for this purpose.
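
Besides /screen and /printer, -dPDFSETTINGS also accepts /ebook and /prepress. Which preset gives the best trade-off depends on the document, so one approach is to try them all and compare sizes. The function below is a sketch; its name and the output naming scheme are arbitrary choices.

```shell
# Write one copy of the PDF in $1 per preset and print the resulting sizes.
compare_presets() {
    src=$1
    for preset in screen ebook printer prepress; do
        out="${src%.pdf}-$preset.pdf"
        gs -q -dNOPAUSE -dBATCH -sDEVICE=pdfwrite \
           -dPDFSETTINGS="/$preset" -sOutputFile="$out" "$src"
        # wc -c reports the file size in bytes
        printf '%s\t%s bytes\n' "$preset" "$(wc -c < "$out")"
    done
}

# Example (placeholder file name):
#   compare_presets in.pdf
```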

      Rasterize a PDF

      With GraphicsMagick to convert a specific page:

      $ gm convert -density dpi infile.pdf[page] outfile.jpg

      With Poppler to convert all pages:

      $ pdftoppm -jpeg -r dpi infile.pdf outfileroot

      With Poppler to convert a specific page:

      $ pdftoppm -jpeg -r dpi -f page -singlefile infile.pdf outfileroot

      Splitting PDF pages

      With mupdf-tools to split every page vertically into two pages:

      $ mutool poster -y 2 in.pdf out.pdf

      This can be used to undo simple imposition.

      Add signature.png or image to one of the pages in the PDF

      An image can be added at any location in a PDF. Details on the available solutions can be found on StackExchange.

      Add digital signature to PDF

      PDF files can be digitally signed with X.509 certificates, using either GUI or CLI tools.

      Readers such as Okular and MuPDF can sign PDFs with digital signatures. This requires a PFX certificate, which can be created with the following OpenSSL commands:

      $ openssl req -x509 -days 365 -newkey rsa:2048 -keyout cert.pem -out cert.pem
      $ openssl pkcs12 -export -in cert.pem -out cert.pfx

      MuPDF users can then sign PDFs with cert.pfx using the graphical interface or the mutool sign command.

      Okular users must import cert.pfx into a certificate store such as the one in the default Firefox profile. With Firefox this is done through Settings > Privacy & Security > View Certificates > Your Certificates > Import and selecting cert.pfx. Afterwards Okular will offer this certificate to be used when signing PDFs.

      LibreOffice can also sign PDFs.

      Removing annotations from a PDF

      With rewritepdf.pl:

      $ rewritepdf.pl -C in.pdf out.pdf

      See https://superuser.com/a/1051543 for more information.

      DjVu tools

      • DjVuLibre provides many command-line tools, for example ddjvu and djvmcvt.

      Convert DjVu to images

      Break DjVu into separate pages:

      $ djvmcvt -i input.djvu /path/to/out/dir output-index.djvu

      Convert DjVu pages into images:

      $ ddjvu --format=tiff page.djvu page.tiff

      Convert DjVu pages into PDF:

      $ ddjvu --format=pdf inputfile.djvu outputfile.pdf

      You can also use --page to export specific pages:

      $ ddjvu --format=tiff --page=1-10 input.djvu output.tiff

      This converts pages 1 to 10 into a single TIFF file.
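
If one image per page is preferred, the page count can be queried with djvused and each page exported on its own. This is a sketch under the assumption that djvused -e n prints the number of pages; the function name and the page-NNNN.tiff naming scheme are arbitrary.

```shell
# Export every page of a DjVu document as a separate TIFF file.
#   $1 = DjVu file, $2 = output directory (created if missing)
djvu_to_pages() {
    src=$1
    outdir=$2
    n=$(djvused -e n "$src")    # total number of pages
    mkdir -p "$outdir"
    i=1
    while [ "$i" -le "$n" ]; do
        ddjvu --format=tiff --page="$i" "$src" \
            "$(printf '%s/page-%04d.tiff' "$outdir" "$i")"
        i=$((i + 1))
    done
}

# Example (placeholder names):
#   djvu_to_pages input.djvu ./pages
```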

      Processing images

      Image processing tools can be used to:

      • fix orientation
      • split pages
      • deskew
      • crop
      • adjust margins

      Make DjVu from images

      There is a useful script img2djvu-gitAUR.

      $ img2djvu -c1 -d600 -v1 ./out

      This creates a 600 DPI DjVu file from all files in the given directory.

      Alternatively, you can try didjvuAUR, which seems to create smaller files, especially for images with a well-defined background.

      PostScript tools

      ps2pdf

      ps2pdf is a wrapper around Ghostscript to convert PostScript to PDF:

      $ ps2pdf -sPAPERSIZE=a4 -dOptimize=true -dEmbedAllFonts=true YourPSFile.ps

      Explanation:

      • -sPAPERSIZE defines the paper size. For valid PAPERSIZE values, see the Ghostscript documentation.
      • -dOptimize=true optimises the created PDF for loading.
      • -dEmbedAllFonts=true embeds all fonts, so that they always render as intended.
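
To convert a whole directory, the same call can be wrapped in a loop. A minimal sketch, with a hypothetical function name, that writes each PDF next to its PostScript source:

```shell
# Convert every .ps file in the directory $1 to a PDF with the same base name.
convert_all_ps() {
    for ps in "$1"/*.ps; do
        [ -e "$ps" ] || continue    # skip the literal pattern when nothing matches
        ps2pdf -sPAPERSIZE=a4 -dEmbedAllFonts=true "$ps" "${ps%.ps}.pdf"
    done
}

# Example (placeholder directory):
#   convert_all_ps ./documents
```

Passing the output name explicitly keeps the result beside the input instead of in the current directory.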

      Libraries

      • libharu C library for generating PDF documents.
      https://github.com/libharu/libharu || libharu, Lua binding: lua-hpdfAUR

        Python

        • PyPDF3 A pure-Python library built as a PDF toolkit.
        https://github.com/sfneal/PyPDF3 || python-pypdf3AUR

        See also

        This article is issued from Archlinux. The text is licensed under Creative Commons - Attribution - Sharealike. Additional terms may apply for the media files.