In Part 1 of this article, I briefly described the advent of simultaneous interpretation and the techniques that enable interpreters like me to perform this task. Next, I would like to answer an obvious question confronting our industry today: Is AI already good enough to replace human interpreters? If not, will it ever be?

To frame this discussion, let us first seek to understand how artificial intelligence may be able to perform the task of a simultaneous interpreter. There are three steps:

Now, to answer the question of how good AI already is, we can give technology a report card at each step, and multiply the three numbers to get an overall result.
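The multiplication itself can be sketched in a few lines of Python. This is a minimal illustration only: the function name `overall_score` and the three score values are hypothetical placeholders chosen to show the arithmetic, not measured grades for any real system.

```python
from math import prod


def overall_score(step_scores):
    """Multiply per-step scores (each between 0 and 1) into one overall score."""
    for score in step_scores:
        if not 0.0 <= score <= 1.0:
            raise ValueError(f"score out of range: {score}")
    return prod(step_scores)


# Hypothetical placeholder scores for the three steps (assumed for illustration).
hypothetical_scores = [0.95, 0.90, 0.95]

print(f"{overall_score(hypothetical_scores):.0%}")  # prints 81%
```

Because the scores are multiplied, errors compound: whenever every step scores below 100%, the overall result is lower than the score of any single step.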

So, if we do the math by multiplying these scores, we will now get somewhere around 81%. But what exactly does this number mean? Is the job of human interpreters 81% doomed, or are we still safe? I would like to discuss two additional considerations.

First, machines and human interpreters are good at different, and usually complementary, parts of the job. For instance, human interpreters often feel stressed by the numbers, figures, and proper nouns that come up in the speeches they interpret. If the following sentence comes up in a speech out of the blue, most human interpreters will have a hard time keeping up in real time, especially if they are unfamiliar with the subject matter.

MAHMOUD MOHIELDIN, Senior Vice-President of the 2030 Development Agenda, United Nations Relations and Partnerships, World Bank Group, describing the messages that emerged from the international financial institution’s spring meetings, said global growth has lost momentum, dropping from 3.3 per cent in the first quarter of 2018 to below 2.7 per cent in the fourth quarter.

Machines, however, are amazingly accurate and fast when it comes to transcribing and translating proper nouns and numbers. If the transcribed and translated message were displayed on a screen in front of the human interpreter in the “booth” (the soundproof space in which interpreters work), it would greatly enhance the interpreter’s confidence and the overall quality of the interpreter’s output.

What about AI’s weak spot? The answer is that AI is, to this day, still very weak at thinking and analyzing. Human interpreters take the social context into account when they interpret. Machines are utterly unable to do so. For instance, when the American investment guru Ray Dalio was invited to give a talk in China last year, the event organizer in Beijing was daring enough to use a service provided by a company named Sogou, which combined Step 1 (speech recognition) and Step 2 (text-to-text translation) and displayed both the transcription and the translation on a big screen.

When the host of the event, a Chinese professor and friend of Mr. Dalio, introduced his guest, he said “Ray是一个做梦的人” (Ray shi yi ge zuo meng de ren). The correct English translation would have been something like “Ray is a dreamer” or “Ray is a man with dreams.” To everyone’s bewilderment, what came out on the big screen was “瑞士一个,做梦的人。” (Rui shi yi ge zuo meng de ren), accompanied by the English translation “One in Switzerland. A dreamer.” Phonetically, that was exactly what the speaker said, but the machine apparently mistook the syllables Ray and shi (the Chinese character for “is”) for Rui-shi (瑞士 / Switzerland).

Picture Credit: Jonathan Rechtman, https://www.linkedin.com/pulse/ray-dalio-speaks-china-machine-translation-fails-jonathan-rechtman/

This is a telling case, because the root cause of this blunder is not a lack of data or computing power, but the inability to think and analyze. Had a human interpreter heard the same combination of sounds (Ray-shi), he or she would very likely have understood it as “Ray is” rather than “Switzerland”, especially given that Mr. Dalio is American.

Second, it is also worth pointing out that technical maturity is one thing, but user adoption is quite another. Just because a technology is “pretty much there” doesn’t mean it can take over a market any time soon. One major barrier to overcome is the risk perceived by stakeholders and their lack of trust. After all, the real decision makers about using simultaneous interpretation services are not people who browse a Wikipedia page and casually click the “translate” button in Chrome, but big institutional clients (think of the United Nations and the Government of Canada) who tend to be conservative and slow in adopting new technology. At the slightest risk that things might go wrong, these large institutions usually hold back and stick to the safe option that they’ve been using for decades.

This tension between innovation and risk was best illustrated by the public’s reaction to the first fatal accident caused by an Uber self-driving car. After this accident, which we knew was going to happen sooner or later, Uber immediately suspended its tests. In a way, this is unfair to the Ubers of our world: statistically speaking, human drivers kill hundreds of pedestrians every day, but very few of those accidents make headlines.

I suspect that the same story will play out for AI-enabled simultaneous interpretation. As soon as news of the first “major accident” breaks (hopefully, it won’t cost anyone’s life), people will blame it on the imperfect technology and hold back from using it for a while. This suggests that breakthrough technologies will not be widely adopted any time soon for international meetings, where the stakes are usually high. Will it happen one day? Maybe, but much later than a technologist might expect.

To conclude, I would like to offer three predictions about where this “race” between the human brain and artificial intelligence is headed.

1) In the next 5 to 10 years, technology will continue to reshape conference interpreting, improve its user experience, and lower its cost.

2) In 10 to 20 years, most interpretation work will involve AI-powered tools as assistants.

3) The human interpreter’s job will be re-defined, but will never be replaced.

In short, AI should, and will, have a role to play in shaping the future of simultaneous interpretation, just as it will for many other professions. However, AI should be an assistant to human interpreters, not a replacement for them. If used properly, it will greatly augment our ability to translate spoken language simultaneously, accurately, and elegantly.


Rony Gao is a member of Mensa Canada, a practicing conference interpreter, and a cross-cultural consultant based in Toronto. As a Chinese-English interpreter, Rony has worked for a wide array of political and business leaders.

2 Responses

  1. Very interesting approach of AI as assistant for Interpreters, as some are for translators.. This would be the case, for example, if DeepL could talk…

  2. It’s only a matter of time.
    Computers will beat you at chess and translate what you say at the same time. 🙂
