PDF Newspaper 2.5

We’ve just released a new version of PDF Newspaper, our tool to create printable versions of web articles and feeds. Here’s what’s new.

HTML output

The biggest change in this release is the ability to request HTML output instead of PDF. It may seem odd to offer HTML output in an application called PDF Newspaper, but browser support for recent CSS specifications means that browsers are now quite adept at producing print layouts very similar to what we were generating before as PDF files.

The HTML view we generate contains a print stylesheet to produce a somewhat similar result to our PDF output when printing. There are a number of differences though:

  • You can edit the text before printing.
  • Unlike our PDF output, columns are balanced (roughly the same height) in this view.
  • You can convert to PDF using your own PDF creator (e.g. Acrobat) provided it has a PDF print driver. [1]
  • Our print stylesheet will use 3 columns if you print A4 landscape or larger, 2 columns for A4 portrait, and 1 column for A5 or smaller. [2]
  • Right-to-left languages can now be displayed. [3]
  • If you know HTML/CSS, you can make changes to the template. [4]
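
The column rule in the list above can be sketched as a small function. This is only an illustration in Python, not the stylesheet itself (which does this in CSS). The function name and the idea of keying on page width alone are assumptions, and how sizes between the three named cases are handled is a guess.

```python
# Hypothetical helper: the name and the width thresholds are assumptions,
# not taken from PDF Newspaper's actual print stylesheet.
A4_PORTRAIT_WIDTH_MM = 210
A4_LANDSCAPE_WIDTH_MM = 297


def columns_for_page_width(width_mm: float) -> int:
    """Return the number of text columns for a page of the given width (mm)."""
    if width_mm >= A4_LANDSCAPE_WIDTH_MM:
        return 3  # A4 landscape or larger
    if width_mm >= A4_PORTRAIT_WIDTH_MM:
        return 2  # A4 portrait
    return 1      # A5 portrait or smaller


for name, width in [("A4 landscape", 297), ("A4 portrait", 210), ("A5 portrait", 148)]:
    print(name, columns_for_page_width(width))
```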

Video

Here’s a short video demonstrating some of the features of HTML output – mainly the multi-column CSS rules being applied in print view, based on paper size and orientation.

Live Examples

To load the feeds we used in the video with PDF Newspaper, use the links below:

PDF screenshots (A4)

[Screenshots: multi-story mode, single-story mode]

PDF changes

We’ve made changes to our PDF output to address problems with PDF.js rendering (used by recent versions of Firefox to render PDFs in the browser) and iOS rendering. If you used a previous version of PDF Newspaper and had trouble seeing your PDFs on Firefox or iOS devices, this new version should work better.

We also now offer a Letter template in addition to A4 and A5.

Combining stories from different sources

If you’d like to generate a newspaper by selecting articles from different sources, you’ll have to first create a feed from those articles. The easiest way to do that is to use our Feed Creator. Here you can paste URLs, one per line, and create a static feed containing only the URLs you’ve entered.

Let’s say we want to create a newspaper from the following articles:

We would paste the URLs to these articles in the Feed Creator field provided and click Create Simple RSS. This will generate a feed containing the 3 items above:

Now if we give this feed URL to PDF Newspaper, here’s what it can produce:

Try it!

PDF Newspaper 2.5 is live now at FiveFilters.org.

Request parameters

Using the form we provide is the easiest way to get started, but if you want to call PDF Newspaper programmatically, the following table of request parameters will tell you what PDF Newspaper can accept. These parameters should be used in an HTTP GET request to makepdf.php.

| Parameter | Value | Description |
|---|---|---|
| url | string | URL of a feed or a single web article. |
| mode | multi-story (default), single-story | multi-story: use for feeds. You can customise the newspaper title. single-story: use to process a single article (even with feeds). Produces a more compact layout, omitting the newspaper title. |
| template | A4 (default), Letter, A5 | Sets PDF paper size. The A4 and Letter templates produce a larger, two-column PDF. The A5 template produces a smaller, single-column PDF. |
| output | pdf (default), pdf-download, html | pdf: your browser decides what to do with the generated PDF. It will either load the PDF within the browser, download it, or prompt you to choose an action. pdf-download: tells the browser that you want to download the PDF rather than view it inside the browser. It will either download automatically or prompt you to choose an action. html: outputs the generated HTML without producing a PDF. Produces a result faster than the pdf options, but uses a print stylesheet to achieve a somewhat similar result when you print. Note: the template parameter currently has no effect with HTML output. If printing, you set the paper size in the print dialog that appears. For best results printing or creating a PDF from this view, please use Firefox. |
| dir | auto (default), ltr, rtl | Sets text direction: auto = browser decides, ltr = left-to-right, rtl = right-to-left. This parameter currently only works for HTML output. |
| images | 1 or 0 (default) | Include images. Pass 1 to enable. |
| date | 1 or 0 (default) | Include date and time for each article (if available). Pass 1 to enable. |
| sub | string | This can be a tagline, slogan, or the main title if using single-story mode. If omitted, the default one set in the config file will be used. |
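
As a rough illustration of calling makepdf.php with these parameters, here is a minimal Python sketch that builds the GET URL. The base URL, the feed URL and the helper name are placeholders, not part of PDF Newspaper. Only the parameter names and values come from the table above.

```python
from urllib.parse import urlencode

# Hypothetical install location; replace with the address of your own makepdf.php.
BASE_URL = "http://example.org/pdf-newspaper/makepdf.php"


def build_request_url(url, **params):
    """Build a GET URL for makepdf.php from the documented request parameters."""
    query = {"url": url}   # url is the only required parameter
    query.update(params)   # e.g. mode, template, output, dir, images, date, sub
    return BASE_URL + "?" + urlencode(query)


request_url = build_request_url(
    "http://example.org/feed.xml",  # placeholder feed URL
    mode="multi-story",
    template="Letter",
    output="pdf-download",
    images=1,
)
print(request_url)
```

urlencode percent-encodes the feed URL, so it can be passed safely as the value of the url parameter.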

Multi-story parameters

These parameters only apply when multi-story mode is enabled (see mode parameter above).

| Parameter | Value | Description |
|---|---|---|
| title | string | Newspaper title. If you’d like to use the default title image instead, delete the title. |
| order | desc (default) or asc | Determines how stories will be ordered by date. Pass asc for chronological ordering (oldest story in the feed appears first). Pass desc to have the latest stories shown first. |
| date_start | string | If the feed contains dates for feed items, you can restrict items returned by specifying a start date. Any items with a publish date earlier than date_start will be omitted from the output. Date formats: you can pass an absolute date using the YYYY-MM-DD format, e.g. 2014-01-24, or a relative one, e.g. last week or yesterday. |
| date_end | string | If the feed contains dates for feed items, you can restrict items returned by specifying an end date. Any items with a publish date later than date_end will be omitted from the output. See the note on date formats under date_start. |
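
To make the effect of order, date_start and date_end concrete, here is a small Python sketch of the filtering and ordering described above. It illustrates the documented behaviour and is not PDF Newspaper’s own code. The function name and the sample items are made up, and it handles absolute dates only (relative dates such as last week are not covered).

```python
from datetime import date


def select_items(items, date_start=None, date_end=None, order="desc"):
    """Filter (published, title) pairs by date range, then sort them by date."""
    kept = [
        (published, title)
        for published, title in items
        # Items earlier than date_start or later than date_end are omitted.
        if (date_start is None or published >= date_start)
        and (date_end is None or published <= date_end)
    ]
    # desc = latest stories first, asc = oldest story first.
    return sorted(kept, key=lambda item: item[0], reverse=(order == "desc"))


items = [
    (date(2014, 1, 20), "Older story"),
    (date(2014, 1, 24), "Boundary story"),
    (date(2014, 1, 28), "Newer story"),
]
selected = select_items(items, date_start=date(2014, 1, 24), order="asc")
print([title for _, title in selected])
```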

Full-Text RSS integration

When you give PDF Newspaper a URL to a web article, Full-Text RSS is automatically used to extract its content. For a partial feed, you will have to tell PDF Newspaper if you’d like it passed to Full-Text RSS for full content. (Full-Text RSS integration can be configured or disabled within the config file.)

| Parameter | Value | Description |
|---|---|---|
| fulltext | 1 or 0 (default) | Use for partial feeds. Runs feed through Full-Text RSS before processing result. Pass 1 to enable. |
| use_extracted_title | 1 or 0 (default) | Normally feed titles take precedence over extracted titles. Pass 1 to tell Full-Text RSS to replace feed titles with those it extracts. (Requires Full-Text RSS version 3.2 or greater. Has no effect without the fulltext parameter.) |

API keys

If you want to restrict access to PDF Newspaper you can specify API keys in the config file. URLs produced by PDF Newspaper can be used publicly, e.g. linked from a website, so the API key should not appear in the final URL.

| Parameter | Value | Description |
|---|---|---|
| api_key | string | A key that you’ve entered in the config. If you’re calling PDF Newspaper programmatically, it’s better to use the key and hash parameters (see below) to hide the actual key in the HTTP request. If this parameter is used, PDF Newspaper will produce the key and hash values automatically and redirect to a new URL to hide the API key. If you’d like to link to a PDF publicly while protecting your API key, make sure you copy and paste the URL that results after the redirect. If you’ve configured PDF Newspaper to require a key, an invalid key will result in an error message. |
| key | integer | The index number which identifies an API key without revealing it. It must be passed along with the hash parameter. See the config file. |
| hash | string | A SHA-1 hash value of the API key (actual key, not index number) and requested URL, concatenated. It must be passed along with the key parameter. In PHP, for example: $hash = sha1($api_key.$url); |
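
For callers not using PHP, the same hash can be computed in other languages. Below is a minimal Python sketch that mirrors sha1($api_key.$url) and builds a request using key and hash instead of api_key. The base URL, the sample key, the key index and the helper name are placeholders, and it assumes the “requested URL” is the value passed in the url parameter.

```python
import hashlib
from urllib.parse import urlencode

# Hypothetical install location; replace with the address of your own makepdf.php.
BASE_URL = "http://example.org/pdf-newspaper/makepdf.php"


def signed_request_url(url, api_key, key_index):
    """Build a makepdf.php URL that uses key and hash rather than api_key."""
    # Equivalent of PHP's sha1($api_key.$url): hex digest of key + URL.
    digest = hashlib.sha1((api_key + url).encode("utf-8")).hexdigest()
    query = {"url": url, "key": key_index, "hash": digest}
    return BASE_URL + "?" + urlencode(query)


# Placeholder values: use a key and index from your own config file.
signed = signed_request_url("http://example.org/feed.xml", "my-secret-key", 1)
print(signed)
```

The resulting URL carries only the key index and the hash, so it can be shared without exposing the key itself.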

Required parameters: url must be supplied.

Changelog

  • New: HTML output with editable content and print stylesheet (Firefox recommended for printing)
  • New: Output parameter to choose between PDF, HTML, and PDF for download
  • New: Text direction parameter (only for HTML output for the time being)
  • New: PDF Letter template for US users
  • New: Form field to specify start date (if feed items include dates)
  • New: Config option to set PDF filename – see $options->filename
  • New: Config option to enable/disable output caching – see $options->caching
  • New: Config option for whitelisting/blacklisting hosts – see $options->allowed_hosts and $options->blocked_urls
  • Font subsetting disabled in PDF output to improve iOS and PDF.js rendering
  • PDF is no longer generated if there are no items to include (e.g. no articles published after start date)
  • Table showing available request parameters now shown in index.php
  • Full-Text RSS updated to version 3.1
  • HTML Purifier updated to version 4.6.0
  • SimplePie updated to version 1.3.1
  • PHP Typography updated
  • Humble HTTP Agent updated
  • TCPDF fonts updated
  • TCPDF minor update (latest version not compatible with our modifications)
  • Plus other minor fixes/improvements

  1. If you have a PDF print driver, you’ll see it in your list of printers when you go to print. If you have Acrobat, you’ll probably see ‘Adobe PDF’ in the list. If you get an error creating a PDF using Adobe PDF, go into Properties and in Adobe PDF Settings, uncheck ‘Rely on system fonts only; do not use document fonts’.
  2. Not all browsers currently support multi-column printing. Notably, Chrome, which supports multi-column layouts in its regular view, does not support it in its print view. In our tests, Firefox (tested with version 26) and IE (version 11) had the best support for multi-column printing.
  3. By default we rely on the browser to decide (based on the content) whether it should show the result as right-to-left (we use the dir="auto" attribute). You can override this, however, by passing &dir=rtl in the querystring to makepdf.php.
  4. To make changes to the template used in the HTML view, save a copy of html_template.html as custom_html_template.html and edit it. The custom file will be used if it exists. This only applies to users of our self-hosted package.