Human translators need extraordinary concentration to listen to speech and translate it at the same time. There are currently a few thousand expert simultaneous translators, and the job is so demanding that they swap places with each other every 20 to 30 minutes. In long conversations, the likelihood of error rises exponentially.
Machines could outdo humans at this job, since they don’t get tired and have superior memory. But according to researchers at Oregon State University and Baidu, the task is not easy for machines either.
The results have been published in a paper titled “Simultaneous Translation with Integrated Anticipation and Controllable Latency”. The researchers developed a neural network model that can translate Mandarin Chinese into English almost instantly, with the English translation lagging behind the speaker by as few as five words.
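The idea of a translation that trails the speaker by a fixed number of words can be sketched in a few lines of Python. This is a minimal illustration, not the researchers' code: the function name `fixed_lag_actions` and the parameter `k` are hypothetical, and the sketch assumes the system first reads `k` source words and then alternates between writing one translated word and reading one more source word.

```python
def fixed_lag_actions(source_len, target_len, k=5):
    """Return the READ/WRITE sequence for a translator that stays
    k source words behind the speaker (hypothetical helper)."""
    actions = []
    read = 0
    for t in range(target_len):
        # Before writing target word t, make sure k + t source words
        # have been read (or all of them, if the speaker has finished).
        needed = min(k + t, source_len)
        while read < needed:
            actions.append("READ")
            read += 1
        actions.append("WRITE")
    return actions


if __name__ == "__main__":
    # An 8-word sentence translated into 8 words with a lag of 5.
    print(" ".join(fixed_lag_actions(8, 8, k=5)))
```

A smaller `k` means lower latency but forces the system to commit to words with less context, which is where anticipation comes in.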
In the AI translation system, an encoder transforms the source-language words into vector form. A decoder predicts the probability of the next translated word based on the preceding words of the sentence. The decoder works after the encoder and keeps producing translated words until the complete text or speech has been processed.
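The encoder and decoder roles described above can be illustrated with a toy Python sketch. Everything here is an assumption made for illustration: the three-word Spanish sentence, the hand-written one-hot "embeddings", and the `toy_score` function stand in for what a real system learns with a neural network. Only the overall shape (encode to vectors, turn scores into next-word probabilities, pick a word, repeat until an end marker) reflects the description.

```python
import math

TARGET_VOCAB = ["the", "cat", "sleeps", "</s>"]  # "</s>" marks end of sentence

# Hypothetical hand-made vectors; a real encoder learns these.
EMBEDDINGS = {
    "el": [1, 0, 0, 0],
    "gato": [0, 1, 0, 0],
    "duerme": [0, 0, 1, 0],
}


def encode(source_words):
    """Encoder stand-in: turn each source word into a vector."""
    return [EMBEDDINGS[w] for w in source_words]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores.values())
    exps = {w: math.exp(s - top) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}


def toy_score(source_vectors, output_so_far):
    """Toy scorer: favour the word aligned with the next source
    position, then the end marker once the source is used up."""
    pos = len(output_so_far)
    vec = source_vectors[pos] if pos < len(source_vectors) else [0, 0, 0, 1]
    return {w: 5.0 * v for w, v in zip(TARGET_VOCAB, vec)}


def greedy_decode(source_vectors, max_len=20):
    """Decoder stand-in: repeatedly pick the most probable next word."""
    output = []
    while len(output) < max_len:
        probs = softmax(toy_score(source_vectors, output))
        word = max(probs, key=probs.get)
        if word == "</s>":
            break
        output.append(word)
    return output


if __name__ == "__main__":
    print(greedy_decode(encode(["el", "gato", "duerme"])))
```

Note that this toy decoder only starts once the whole sentence has been encoded, which is exactly the delay a simultaneous system has to avoid.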
“In one of the examples, the Chinese sentence ‘Bush President in Moscow…’ would suggest the next English word after ‘President Bush’ is likely ‘meets’,” Liang Huang, principal scientist at Baidu Research, explained.
He added:
“This is possible because in the training data, we have a lot of ‘Bush meeting someone, like Putin in Moscow’ so the system learned that if ‘Bush in Moscow’, he is likely ‘meeting’ someone.”
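The anticipation Huang describes can be mimicked with simple counting. The Python sketch below is a stand-in, not the actual method: the real system learns such patterns implicitly inside a neural network, while this version just tallies which verb followed a given context in a handful of invented training examples. The names `train_anticipator` and `anticipate` and the example data are hypothetical.

```python
from collections import Counter, defaultdict


def train_anticipator(examples):
    """Count which verb followed each context in the training data."""
    counts = defaultdict(Counter)
    for context, verb in examples:
        counts[frozenset(context)][verb] += 1
    return counts


def anticipate(counts, context):
    """Return the most frequent verb for this context and its share
    of the observations, or None if the context was never seen."""
    seen = counts.get(frozenset(context))
    if not seen:
        return None
    verb, n = seen.most_common(1)[0]
    return verb, n / sum(seen.values())


if __name__ == "__main__":
    # Invented training examples: (context words, verb that followed).
    data = [
        (("Bush", "Moscow"), "meets"),
        (("Bush", "Moscow"), "meets"),
        (("Bush", "Moscow"), "visits"),
        (("Bush", "Texas"), "speaks"),
    ]
    model = train_anticipator(data)
    print(anticipate(model, ("Bush", "Moscow")))
```

With this invented data, "meets" wins for the context "Bush" and "Moscow" because it followed that context in two of three examples.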
According to Huang, the difficulty depends on which languages are involved. Closely related languages such as Spanish and French have similar structures, so their word order lines up more closely.
German and Japanese sentences are arranged in subject-object-verb (SOV) order, while English and Chinese sentences are arranged in subject-verb-object (SVO) order. Translating from Japanese or German into Chinese or English is therefore more challenging, because the verb that the translation needs early arrives late in the source sentence.
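The effect of word order on waiting time can be worked through with a small Python sketch. It is a simplification under stated assumptions: each sentence is reduced to three constituents labelled S, V and O, the translator is assumed never to guess, and the function name `words_needed` is hypothetical.

```python
def words_needed(source_order, target_order):
    """For each constituent in target order, return how many source
    constituents must have been heard before it can be produced
    without guessing."""
    needed = []
    heard = 0
    for constituent in target_order:
        # The translator cannot "unhear" anything, so the count
        # never goes down from one target constituent to the next.
        heard = max(heard, source_order.index(constituent) + 1)
        needed.append(heard)
    return needed


if __name__ == "__main__":
    print("SVO -> SVO:", words_needed("SVO", "SVO"))
    print("SOV -> SVO:", words_needed("SOV", "SVO"))
```

When the orders match, each constituent can be translated as soon as it is heard. Going from SOV to SVO, the verb cannot be produced until the whole source clause has been heard, unless the system anticipates it.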
“There is a well-known joke in the UN that a German-to-English interpreter often has to pause and ‘wait for the German verb’. Standard Arabic and Welsh are verb-subject-object, which is even more different from SVO,” he said.
The code for the newly developed algorithm can, after a bit of tweaking, be used with any neural machine translation model. It has already been deployed in Baidu’s in-house speech-to-text translation and will be shown at the “Baidu World Tech” conference taking place on November 1st in Beijing.
“We don’t have an exact timeline for when this product will be available for the general public; this is certainly something Baidu is working on,” Huang said. “We envision our technology making simultaneous translation much more accessible and affordable, as there is an increasing demand. We also envision the technology [will reduce] the burden on human translators.”