Communicating across borders
Machine translation - dare we think the thought?
All you wanted to know but were afraid to ask about machine translation
OK, bit of a naff heading, but with the latest claims from Google of a breakthrough in neural machine translation, perhaps it’s time to write another GlobalDenmark blog with a little information about the current state of play.
What’s the difference between machine translation and computer-assisted translation?
First, it’s important not to confuse machine translation (MT) with computer-assisted translation (CAT).
With MT, the computer does all the work: input a French text, push a button, and out comes an Italian text. The text then usually needs post-editing.
With CAT, the computer searches through terminology databases and databases of previously translated sentences, and matches the input (source) text with terminology and sentences (or parts of sentences) in the output (target) language. The computer then suggests a best match, which the translator can use or discard.
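If you like to see ideas as code, here is a minimal Python sketch of that matching step. It is only an illustration and not how any particular CAT tool works: the two-sentence memory, the `best_match` function and the 0.7 threshold are all made up for the example, and real tools use far more sophisticated matching.

```python
from difflib import SequenceMatcher

# Hypothetical translation memory: previously translated English -> Danish sentences.
TRANSLATION_MEMORY = {
    "Press the green button to start the machine.": "Tryk på den grønne knap for at starte maskinen.",
    "Switch off the machine before cleaning.": "Sluk maskinen før rengøring.",
}


def best_match(source, memory, threshold=0.7):
    """Return (score, stored_source, stored_target) for the closest stored
    sentence, or None if nothing is similar enough to be worth suggesting."""
    best = None
    for stored_source, stored_target in memory.items():
        # Character-level similarity between 0.0 (nothing shared) and 1.0 (identical).
        score = SequenceMatcher(None, source.lower(), stored_source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, stored_source, stored_target)
    if best is not None and best[0] >= threshold:
        return best
    return None


# A new sentence that differs from a stored one by a single word.
suggestion = best_match("Press the red button to start the machine.", TRANSLATION_MEMORY)
if suggestion is not None:
    score, stored_source, stored_target = suggestion
    print(f"Closest stored sentence: {stored_source}")
    print(f"Suggested translation to edit: {stored_target}")
```

The suggestion still says ‘green’ in Danish, which is exactly why the translator gets the final say on whether to use, adapt or discard it.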
All professional translators have been using some form of CAT for many years. As I recall, at GlobalDenmark we purchased our first CAT tool in around 2000. This post is about MT, not CAT.
Are there different types of machine translation?
Now, traditionally there have basically been three types of MT: Rule-based MT (RbMT), Statistical MT (SMT) and hybrid systems combining the two. You can almost guess the difference.
With RbMT, masses of linguistic rules and dictionaries about the source and target languages are entered into the computer. The computer then applies these to do the translation. As you can imagine, putting all these rules and other input together is extremely demanding, and as languages develop the rules have to be changed too, so there’s a good deal of maintenance.
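To give a feel for what ‘rules and dictionaries’ means in practice, here is a toy Python sketch. Everything in it is an assumption made for illustration: a three-word English-French dictionary and a single hand-written rule (French adjectives usually follow the noun). A real RbMT system has thousands of such rules, which is where the maintenance burden comes from.

```python
# Hand-entered dictionary: English word -> (French word, part of speech).
# The article "la" is hard-coded as feminine because the only noun here is feminine.
DICTIONARY = {
    "the": ("la", "DET"),
    "red": ("rouge", "ADJ"),
    "car": ("voiture", "NOUN"),
}


def translate_rule_based(sentence):
    """Translate word by word, then apply one reordering rule."""
    words = sentence.lower().split()
    # Unknown words are passed through unchanged and tagged "UNK".
    tagged = [DICTIONARY.get(word, (word, "UNK")) for word in words]

    # Rule: an adjective directly before a noun moves to after the noun.
    i = 0
    while i < len(tagged) - 1:
        if tagged[i][1] == "ADJ" and tagged[i + 1][1] == "NOUN":
            tagged[i], tagged[i + 1] = tagged[i + 1], tagged[i]
            i += 2  # skip past the pair just swapped
        else:
            i += 1
    return " ".join(target_word for target_word, _ in tagged)


print(translate_rule_based("the red car"))
```

Add a masculine noun and the sketch already breaks, because nobody has written the rule for choosing between ‘le’ and ‘la’ yet.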
With SMT, the computer itself analyses monolingual and bilingual texts to build its own models. The computer learns from the input texts, so the larger the input, the better the model and the better the translation. The computer will also ‘learn’ and develop with the language.
Of course, the input must be appropriate and of high quality for the computer to learn, and this can be a problem for SMT. However, if the input texts are all very similar, for example instructions, manuals, or terms and conditions, the computer can put together a very good model.
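The idea of the computer building its own model from bilingual texts can be sketched in a few lines of Python. This is a deliberately simplified stand-in, not a real SMT system: the four-sentence English-French corpus and the `learn_word_table` function are invented for the example, and the scoring used here (a Dice coefficient on how often two words appear in the same sentence pair) is a much cruder technique than the alignment and language models real systems rely on.

```python
from collections import Counter
from itertools import product

# Hypothetical parallel corpus: (English sentence, French sentence) pairs.
PARALLEL_CORPUS = [
    ("the house", "la maison"),
    ("the car", "la voiture"),
    ("a house", "une maison"),
    ("a car", "une voiture"),
]


def learn_word_table(corpus):
    """For each source word, pick the target word it co-occurs with most
    consistently, scored with the Dice coefficient (1.0 = always together)."""
    source_counts, target_counts, pair_counts = Counter(), Counter(), Counter()
    for source_sentence, target_sentence in corpus:
        source_words = set(source_sentence.split())
        target_words = set(target_sentence.split())
        source_counts.update(source_words)
        target_counts.update(target_words)
        pair_counts.update(product(source_words, target_words))

    table = {}
    for (source_word, target_word), together in pair_counts.items():
        score = 2 * together / (source_counts[source_word] + target_counts[target_word])
        if source_word not in table or score > table[source_word][1]:
            table[source_word] = (target_word, score)
    return table


table = learn_word_table(PARALLEL_CORPUS)
for source_word, (target_word, score) in sorted(table.items()):
    print(f"{source_word} -> {target_word} (score {score:.2f})")
```

Nobody told the program that ‘house’ means ‘maison’; it worked that out from the examples. With only one or two sentence pairs it would have nothing to go on, which is the point about needing plenty of good, consistent input.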
Finally, as I said, there are systems that combine the two approaches.
What’s Google doing?
There’s a newer, fourth type of MT, and this is what Google is getting excited about, as I mentioned in the introduction. It’s called neural machine translation (you guessed it – NMT).
To be honest it’s pretty heavy stuff and having read a lot of articles and Wikipedia I still don’t feel much the wiser. However, it seems we’re in the artificial intelligence world, with an artificial neural network in the computer based on the biological neural networks in the human brain. One advantage with NMT is that it circumvents the need for the vast memory capacity required by SMT.
…and what’s Google’s breakthrough?
In a paper published in September 2016, Google claims that “in some cases human and Google NMT translations are nearly indistinguishable” and that the “quality of the resulting (neural) translation system gets closer to that of average human translators”.
Of course, others have been quick to question these claims, and have been especially critical of the fact that the translations in the study were based on a sample of “well-crafted simple sentences”.
What do you think?
I think it’s interesting that Google mentions average human translators in the quote above. We should be careful about comparing all ‘human’ translators with all MT. Similarly, you can’t say all translation jobs are the same. Of course, MT without the right rules, statistical data or ‘neurons’ will never be good, and humans need the right linguistic skills, insights into specialist areas, etc. too. There’s a need for MT and there’s a need for human translation. I believe they’re complementary rather than competitive.
At GlobalDenmark we’re looking at MT in all its forms, and I’m looking forward to getting involved, for our own benefit and that of our customers.
At GlobalDenmark, reading aloud is more than a bedtime story for the kids
Walk around the GlobalDenmark offices and you’ll hear people reading aloud contracts, legislation, reports and of course speeches. It’s often not quite as exciting as the latest Harry Potter, but it’s how we check our translations.
The idea came from checking financial statements
Simon Palmer, Head of Translations at GlobalDenmark, started his career as an accountant.
Back in the 1980s, it was normal practice to write financial reports by hand before having them typed up. The procedure then was to read the typed draft aloud, including figures, while a colleague followed in the hand-written version. Corrections for errors or omissions were written in by hand, then the typed draft was returned to the typist for correction.
We’ve come a long way since then, but at GlobalDenmark we still believe that reading a translation aloud while a second linguist follows in the source text is a vital part of peer review, and we are convinced that it is the reason for continued praise from our customers for the quality of our translations.
You can get your computer to read your text for you
Did you know that the Microsoft Office package has a function that will speak any text or email you write on your computer? Logically enough it’s called ‘Speak’ (Tale in Danish), and as far as we know it’s available in all versions of Word, PowerPoint and Outlook.
In Word, you can find it under the arrow on the extreme right of the Quick Access toolbar in the top left-hand corner of your screen. Select ‘More Commands’ (Flere kommandoer…). Next, in the drop-down menu that shows ‘Popular commands’ (Oftest anvendte kommandoer), choose ‘All Commands’ (Alle kommandoer). Then select ‘Speak’ (Tale), click ‘Add’ (Tilføj) and close the dialog. A speech icon will appear on your Quick Access toolbar and you can hear your text read aloud.