WORKING FROM AUDIO RECORDINGS... revisited!
by José Henrique Lamensdorf
This article was originally published in the Translation Journal, and later on Proz, as well as in several other places and in different languages, most of which I don't understand. Nevertheless, advances in IT have made the original version somewhat obsolete, so at least on my own web site I have a personal commitment to update it now and then.
Sometimes it happens; not too often, but it does: a translator is asked to work from an audio recording. The client assumes that translators do this frequently, so doing it in any of the translator's working languages should be a piece of cake.
In view of the concerns expressed by first-timers about such an endeavor in translator newsgroups on the web, some years ago I voiced a few ideas. As several colleagues around the world wrote to me about them, and considering that many changes have taken place in the computer world in the meantime, I decided to update the article.
1. The job
The request might be for transcribing, translating, or both, from an audio recording.
It can never be overemphasized that one of the translator's duties is to educate their clients. Some clients assume that translators who do dubbing/subtitling/captioning work from printed text. Though frequent flaws seen on TV might support such reasoning, it is definitely wrong, at least for dubbing.
So, if the client asks you for both (transcribing and translating), hold your horses! Ask what they need the transcript for. If it's just a means to get the translation, talk them out of the transcript! It is not necessary. You can translate directly from audio.
A transcript is useful as a written record of what was said in the recording, for developing printed material from its contents, or for rebuilding a script for a possible re-enactment, but it is definitely not required for dubbing or subtitling.
Therefore, it is important to know what the client intends to do with the transcript or translation. If it's video for dubbing or subtitling, this kind of work requires special skills; make sure you have them before accepting. Furthermore, translation for dubbing and translation for subtitling are quite different jobs. Make sure to start on the right track!
There is, however, one case in which transcribing first and translating afterwards is a sensible option: when the recording is to be translated in written form (not for dubbing or subtitling) into several different languages. It is easier to find translators who work from text, and this kind of work is cheaper, if only because it takes less time to do.
2. The media
A recording might be on various kinds of media, and some of them present specific difficulties. Digital audio has removed most of them, but analog audio is still around.
If it is on wax cylinders, long-play records, 16 mm film, video tape, or any other strictly professional or "vintage" media, you are under no obligation to have the equipment to play it. Unless you have the resources to convert from analog into digital media, ask the client to get it done for you.
Once it is on digital media, it will be workable using your computer. Formerly there were numerous issues about ways to convert one file type into another. However, my favorite software for this kind of work, Express Scribe, has evolved sufficiently to handle most, if not all, of the common audio and video file types. The good news is that its free version has most of the features you'll ever need.
One tip on Express Scribe, though: if you can't make it play your video files properly, chances are you are missing the proper codec for them.
3. The recording
Considering the act of recording in itself, never take anything for granted. If the client says it's a professional recording job, this might mean no more than someone having actually been paid to do it, and/or that the equipment used looked impressive.
You might get a recording where the volume was set too low or too high. If it was set too low, of course you can amplify the sound, but when you increase the volume, the background noise (or any other unwanted sound) becomes louder by the same amount. If it was set too high, you get distortion, and some words may sound mumbled.
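To make the volume trade-off concrete, here is a minimal Python sketch. It assumes 16-bit samples held in plain lists; the sample values and the function names are made up for illustration and do not come from any audio program. It shows that a gain raises speech and hiss by the same factor, and that too much gain clips the loudest samples, which is what distortion looks like in the data.

```python
import math

FULL_SCALE = 32767  # largest magnitude a 16-bit sample can hold


def rms(samples):
    """Root-mean-square level of a list of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def amplify(samples, gain):
    """Multiply every sample by gain, clipping at 16-bit full scale."""
    return [max(-FULL_SCALE, min(FULL_SCALE, round(s * gain))) for s in samples]


def max_clean_gain(samples):
    """Largest gain that keeps the loudest sample within full scale."""
    peak = max(abs(s) for s in samples)
    return FULL_SCALE / peak


# Hypothetical quiet recording: speech peaks near 2000, hiss near 100.
speech = [2000, -1800, 1500, -2000]
hiss = [100, -90, 110, -100]

gain = 8.0
ratio_before = rms(speech) / rms(hiss)
ratio_after = rms(amplify(speech, gain)) / rms(amplify(hiss, gain))

# Pushing the gain past max_clean_gain(speech) flattens the peaks.
clipped = amplify(speech, 20.0)
```

Here `ratio_before` and `ratio_after` come out the same, so the speech is no easier to pick out from the hiss after amplification; only a separate noise-reduction step changes that ratio.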
Even if it is a professionally made movie, there is the risk of music and sound FX covering the words or phrases you'll need to hear.
Another problem is the speaker. A weird accent is enough to create a need for listening to each phrase more than once, sometimes to the extent of requiring some creativity for it to make sense.
Speed is also an issue. Even a good, clear speaker, if talking too fast, might make the job more difficult than it should be. Ifthewordssoundgluedtoeachother, this means that the speaker's mouth is faster than your ears, and you might need a mini-rewind at each and every stop.
Almost every one of these problems can be solved, though each one separately. Digital audio makes it easier, without requiring hi-tech audio equipment. To solve them, I have been using a program named Acoustica for some years. I found WavePad to be very similar, so it's a matter of personal preference. A more expensive solution I began to use more recently is Sony SoundForge. For those who prefer freeware, Audacity is a good choice. So there are many options available.
The audio editing program will usually show you the recording in graphic format, so you can adjust the volume up or down as needed. You can even do different things to specific parts of the recording. One such case is when the person holding the microphone is close to the speaker, but turns it around to get questions or comments from members of an audience farther away.
You can also remove noise rather easily. Noise may be a hiss from the tape itself, a hum from poorly grounded equipment, or just wind from a fan or air conditioner hitting the microphone directly. Simply select a part of the recording (the longer the better) that should be silent, ask the audio program to perform a "Noise analysis", then select the whole recording, and ask it to do a "Noise reduction" from that analysis. This should cover 90% of the cases. To play it safe, work on a backup copy of your audio file. If you botch it anyway, there is usually the "Undo" feature, and the chance of trying again with a somewhat "lighter" noise reduction.
Nevertheless, take care and listen to the "silence" you selected before using the "Noise reduction" function. Once I selected a piece of (what I thought was) silence, but it included the speaker clearing his throat. Everything he said vanished with a click! (Of course, the "Undo" button saved the situation.)
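Listening is the real check, but the same idea can be sketched in a few lines of Python: split the selected "silence" into short windows and flag any window that is much louder than the typical one. This is only an illustration under stated assumptions. The sample values, the window size, the threshold `factor`, and the function names are all hypothetical, and no audio editor is claimed to work this way.

```python
import math
from statistics import median


def window_levels(samples, window):
    """RMS level of each consecutive window of `window` samples."""
    levels = []
    for start in range(0, len(samples) - window + 1, window):
        chunk = samples[start:start + window]
        levels.append(math.sqrt(sum(s * s for s in chunk) / window))
    return levels


def suspicious_windows(samples, window=4, factor=4.0):
    """Indices of windows much louder than the typical (median) window.

    `factor` is an arbitrary threshold chosen for this sketch.
    """
    levels = window_levels(samples, window)
    typical = median(levels)
    return [i for i, level in enumerate(levels) if level > factor * typical]


# Hypothetical "silent" selection: steady hiss with a throat-clearing burst.
selection = [3, -2, 3, -3] * 3 + [900, -1200, 1100, -800] + [2, -3, 3, -2] * 3
flagged = suspicious_windows(selection)
```

In this made-up selection the fourth window (index 3) is flagged, which is the cue to pick a different stretch of silence before running the noise analysis.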
Beware of removing music and sound effects with this feature; there is a high risk of removing parts of the speech along with them. It is better to try to live with this kind of noise.
4. The process
Your working method will depend on your personal memory buffer, not the one in your computer. Some people (probably all those who do simultaneous translation) will be able to keep long phrases in mind for short recall. Others (like me) will only store short pieces.
This will determine how long you can listen before you have to stop and type. I know people who can listen to 20-30 seconds of a recording, stop, and then go hammering the keyboard without missing a thing. I stop at every phrase or, if it is long, at every punctuation mark. Test yourself, and find the best way for you to work. It's intuitive; remember that you are finding your own way, not learning anything new, though you might improve with practice.
Using Express Scribe is quite simple. You may buy or build your own foot control for it, but I found the programmable [F#] keys extremely practical for the play, stop, rewind, and other controls. Express Scribe also offers variable playing speed for slowing down fast talkers. FYI, it made me retire two bulky open-reel tape recorders I had been using for over a decade.
And while we are discussing software, if you ever need an excellent, versatile, player for many types of AV multimedia files, try the free VLC Media Player.
5. The price
Finally, the big issue... how much should you charge for such a job? It is not so easy to find market standards, though some inconclusive ones exist.
There are two totally incompatible ways of measuring this kind of job. Each one places the risk on a different side of the deal.
One is to charge per recording time (per minute, per block of 10 minutes, or per hour). The risk is on your side: before listening to the recording, it's impossible to say how fast those people speak, and hence how much text each minute will contain.
The other option is to charge per word. The risk will be on the client's side, as they won't know the size of their bill until the job is finished.
Whatever you and the client agree to should prevail, as long as both parties know what they are getting into.
One way of finding out the basic amount you should charge is by testing. Get an "average" or "typical" 10-min recording and do it! Use a stopwatch to time how long it took you to do the job, and count the words. Knowing how much you think you should make per hour, you can easily calculate the rates you should charge per word or per minute of playing time.
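The arithmetic behind that test can be sketched in Python as follows. All figures are hypothetical (a 10-minute recording, one hour of work, 1,500 words, a target of 40 currency units per hour), and the function name is made up for illustration.

```python
def base_rates(recording_minutes, work_minutes, word_count, target_per_hour):
    """Derive per-minute and per-word rates from one timed trial job."""
    # What the trial job should earn at the target hourly income.
    job_value = target_per_hour * work_minutes / 60
    return {
        "per_recorded_minute": job_value / recording_minutes,
        "per_word": job_value / word_count,
    }


# Hypothetical trial: a 10-minute recording took 60 minutes of work and
# produced 1,500 words; the target is 40 currency units per working hour.
rates = base_rates(recording_minutes=10, work_minutes=60,
                   word_count=1500, target_per_hour=40)
```

With these made-up numbers the trial job is worth 40 units, which works out to 4 units per recorded minute or roughly 0.027 units per word. A faster speaker would pack more words into the same 10 minutes, which is exactly why the two measurements shift the risk to different sides.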
But this is not all of it. You have to check if it will be a transcription or a translation. Consider the additional time in research or whatever it will take you to do the latter, and add it on, possibly as a fixed percentage. Round this percentage up if you feel that there is a risk of spending extra time to decipher some outlandish accent.
Also take into account whatever audio witchcraft you might have to perform, and add an adequate amount to cover it as a risk, not as an ever-present cost. Don't forget to include a marginal return on your investment in software and hardware, if applicable.
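One possible way to put these adjustments together is sketched below in Python. It assumes the markups are simply added up as fractions of the base rate, and it prices the audio clean-up as a risk by weighting its cost by how often you expect to need it. The percentages, the base rate of 4.0 per recorded minute, and the function name are all hypothetical.

```python
def adjusted_rate(base_rate, translation_markup=0.0,
                  audio_fix_probability=0.0, audio_fix_markup=0.0,
                  overhead_markup=0.0):
    """Apply additive markups to a base rate.

    All arguments after base_rate are fractions (0.25 means 25%).
    The audio clean-up markup is weighted by how often it is expected,
    so it is priced as a risk and not as an ever-present cost.
    """
    expected_audio = audio_fix_probability * audio_fix_markup
    total_markup = translation_markup + expected_audio + overhead_markup
    return base_rate * (1 + total_markup)


# Hypothetical figures: 50% extra for translation, a 25% clean-up cost
# expected on one job in five, and 5% to recover software and hardware.
price = adjusted_rate(4.0, translation_markup=0.50,
                      audio_fix_probability=0.2, audio_fix_markup=0.25,
                      overhead_markup=0.05)
```

With these made-up figures the rate goes from 4.0 to about 6.4 per recorded minute. The result is only a starting point for the reality check on the final price.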
Then perform a reality check on your final price, and adjust it. There is, of course, some risk that you will lose money (or get grossly overpaid and lose a client) in your first such job. But practice makes perfect, and while improving your skill you'll quickly discover what a fair price - both competitive and profitable - should be.