PDF FILES TRANSLATION - José Henrique Lamensdorf - translation - tradução

The current way of distributing the most varied publications is electronically, via PDF files. Books, magazines, manuals, catalogs, brochures, everything that was previously printed in hard copy is, nowadays, published as PDF files, either exclusively, or as a complement to paper copies.

The greatest advantage of PDF files is that they may be seen and read with almost any device, not only computers running under any operating system, but also smartphones, tablets, smart TVs, and others. Were that not enough, it is possible to search for words and phrases in them, and they may contain audio, video, and animations.

Of course, creating such complex and versatile files requires not only technology and software, but also considerable visual arts talent to give them an appealing look. However, when the mission is to translate a PDF file into a different language, all the artwork has already been done, and all that's required is to preserve it. On the other hand, the complexity of the file may itself hamper translation to some extent.

What follows is an explanation of the concept of translating PDF files. If you came to this page to see how it's actually done, please click here.

PDF (Portable Document Format), developed by Adobe in 1993, is a very handy format, which indeed became a standard in the marketplace. The major features that led to this were:
    • Files are very compact, making their distribution easier, either on physical media (disks) or over the Internet.
    • The same PDF file may be viewed or printed regardless of the computer or operating system used.
    • There are programs to view/print PDF files on any operating system, and free ones are available for all of them.
    • Any computer program capable of printing to a PostScript (also developed by Adobe) printer may generate a PDF file.
    • If the file has been properly generated, the user’s computer doesn’t need to have the type fonts originally used to create the document, to display it exactly as the original.

As a result, an overwhelming number of companies began publishing their formerly hard copy catalogs, manuals, folders, and other publications in PDF format. On top of saving an unimaginable quantity of paper, which benefits the environment, this made updating much easier and faster.

However, it isn’t that simple. Such publications are usually developed using DTP (= DeskTop Publishing) software, typically PageMaker, InDesign, FrameMaker, QuarkXPress, and others. With the exception of PageMaker and InDesign, which are “father & son”, mastering each of them requires a fresh learning approach; i.e. each one is a novelty for the user of another DTP package.
Were that not enough, each one saves its files in its own proprietary format, incompatible with the others. None of the few converters available between them works well. What they all have in common is the ability to generate PDF files.

It is worth noting that, from the translation standpoint (and others as well), there are two types of PDF files. One is the “software-generated”, “distilled” (as Adobe names it), or “live” type: these files are editable, and therefore translatable. The other is the scanned or “dead” type, where a printed page has been converted into a graphic.

In order to make it crystal clear, the letter “O” in a generated PDF is a letter “O” with certain features (font, size, bold, italic, underlined, etc.). In a scanned PDF, the letter O is simply a circle or an oval somewhere in a drawing that takes up the entire page.
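
The difference can be illustrated with a minimal sketch in Python. It does not read real PDF files; the names TextRun, PageImage, searchable_text, and is_translatable are hypothetical, chosen only to model the idea that a live page carries characters with features, while a scanned page carries nothing but pixels.

```python
from dataclasses import dataclass


@dataclass
class TextRun:
    """Live text: characters plus features such as font, size, bold, italic."""
    text: str
    font: str
    size_pt: float
    bold: bool = False
    italic: bool = False


@dataclass
class PageImage:
    """Scanned page: just a grid of pixels, with no characters in it."""
    width_px: int
    height_px: int


def searchable_text(page_items):
    """Join the text of every TextRun; images contribute nothing."""
    return "".join(item.text for item in page_items if isinstance(item, TextRun))


def is_translatable(page_items):
    """Treat a page as 'live' if it holds any non-blank text."""
    return bool(searchable_text(page_items).strip())


# A generated page: the letter "O" is a character with features.
live_page = [TextRun("O", font="Helvetica", size_pt=12.0, bold=True)]

# A scanned page: the "O" is only a shape somewhere inside one big picture.
scanned_page = [PageImage(width_px=2480, height_px=3508)]
```

In this model, a search for “O” can succeed on live_page only; on scanned_page there is no text to find, edit, or translate.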

The conventional process for translating such publications is complex. Let’s call the Desktop Publisher a DTPer, for short. I am both a translator and a DTPer; in the second activity, however, I’m limited to using PageMaker.

Generally, the traditional process comprises the following steps:
    • DTPer extracts the text from the original file and sends it to the Translator as a table, so that it is clear which original segment corresponds to which piece of the translation.
    • Translator translates in a second column, preserving the original table format, as well as the formatting of certain words, i.e. which of them should be in italics, bold, or underscored, and sends the table back to the DTPer.
    • DTPer carefully copies and pastes, one by one, each block of text into its right place in the original file. Then, if the text has changed in size, DTPer makes the necessary adjustments. Next, DTPer distills a PDF file, and sends it to the Translator.
    • Translator carefully reviews the publication, looking for missing, surplus, or misplaced text, wrong diacritics due to incompatible fonts, hyphenation mistakes, and others. Translator prepares a list of corrections, either on a separate file, or by means of annotations on the PDF itself.
    • A ping-pong game begins between Translator (or Reviewer) and the DTPer, which will only end when the corrections list is reduced to nothing.
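
The round trip above can be sketched in Python, as a toy model only. The Segment class, the function names, and the sample strings are all hypothetical; in real life the translation is done by a person, and the pasting is done by hand in the DTP program.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    seg_id: int        # position of the block in the original layout
    source: str        # original text
    target: str = ""   # translation, empty until the Translator fills it in


def extract_table(blocks):
    """DTPer: turn the text blocks into a numbered two-column table."""
    return [Segment(seg_id=i, source=text) for i, text in enumerate(blocks, start=1)]


def translate_table(table, glossary):
    """Translator: fill the second column (here, a toy dictionary lookup)."""
    for seg in table:
        seg.target = glossary.get(seg.source, "")
    return table


def paste_back(table):
    """DTPer: put each translated block back in its original position."""
    return [seg.target for seg in sorted(table, key=lambda s: s.seg_id)]


def review(table):
    """Translator: list the ids of segments whose translation is missing."""
    return [seg.seg_id for seg in table if not seg.target.strip()]


blocks = ["User Manual", "Safety instructions", "Warranty"]
glossary = {
    "User Manual": "Manual do Usuário",
    "Safety instructions": "Instruções de segurança",
}

table = translate_table(extract_table(blocks), glossary)
corrections = review(table)        # the ping-pong goes on while this is non-empty

glossary["Warranty"] = "Garantia"  # the missing segment gets supplied
table = translate_table(table, glossary)
```

The model checks only for missing text. The other items on the review list (misplaced text, wrong diacritics, hyphenation mistakes) are not covered by it and still depend on a careful human reading.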

The reason for both translators and translation agencies eschewing PDF files is obvious. Generally, as good DTPers like to create new publications, they take such translation jobs only to fill in their available time. Nevertheless, it’s a nuisance to all involved.

Recent technology brought us a new way to translate PDF files. One day I came across a program named Infix, which allowed someone with practice in DTP to edit PDF files with relative ease. At least it was a lot better than going back to the origins and finding someone capable of dealing with that specific DTP program’s proprietary file format. I don’t know whether I was the only one to do it, but I wrote to the Infix developers suggesting that they adapt the software for the PDF translation market. That’s how Infix became a PDF translation tool.
