Why I do not believe stories about AI

We all know that AI is meant to be taking over the world and doing amazing things, things you would never dream of: finding facts, discovering our political weak spots, and (most important) doing all those things that we used to think only people could do, like playing chess, answering telephones, or – today’s subject – translating.

Ah, machine translation: the golden prize, so often promised us by the AI people, so often not delivered.  Yes, Willard Quine might have proved that it was impossible back in the 1950s, and no, no serious linguistics scholar believes that it is achievable, but that doesn’t stop the AI people claiming that they have already done it, and their cheer-leaders in the not-entirely-educated classes from supporting their claims.

So, we have a number of major AI translation engines now, all of which are claimed to be as good as any human.  We have all had enough fun at the expense of Google Translate, and I don’t want to upset Microsoft any more than I have to, so today’s target is Amazon Translate.

Experiment 1: syntactical structure

I gave Amazon Translate the following sentence, and asked it to translate it from English into German (which should be pretty easy, given how close the languages are):

It had quickly become clear to him that, in the time it had taken for him to slide his socks down his legs and over his ankles, first the one and then the other, in a sinuously erotic susurration of silk on skin, the robot army had not just assembled, but had gone to town on the citizens, who were now in the way of becoming disseminated, some here, some there, and some quite literally all over the place.

It replied with the following:

Es war ihm schnell klar geworden, dass in der Zeit, als er seine Socken in die Beine und über seine Knöchel geschoben hatte, zuerst der eine und dann der andere, in einer sündhaft erotischen Anfälligkeit von Seide auf der Haut, die Roboterarmee nicht nur versammelt war, sondern in die Stadt über die Bürger gegangen war, die jetzt im Weg der Verbreitung waren, einige hier, einige dort, und einige ziemlich buchstäblich überall.

To see what’s wrong with this, let’s take my initial sentence apart.  Its structure is as follows:

Main clause: It had quickly become clear to him that [1]
Embedded clause [1]: [2] the robot army had not just assembled, but had gone to town on the citizens [3]
Embedded clause [2]: in the time it had taken for him to slide his socks down his legs and over his ankles, first the one and then the other, in a sinuously erotic susurration of silk on skin
Embedded clause [3]: who were now in the way of becoming disseminated, some here, some there, and some quite literally all over the place

So only the first phrase is part of the main sentence. The remainder is a long embedded clause ([1]) which itself has two embedded sub-clauses ([2] and [3]).

Now, it is a fundamental, and inflexible, rule of German that the verb of an embedded clause must always be its final element.  So we would expect the German translation of the embedded clause [1] to look something like (here I am using the algorithm’s choice of vocabulary, not because it is correct, for it is not, but because my concern is structural):

die Roboterarmee … die Bürger, die jetzt … waren, in der Zeit … hatte, gegangen war

Inserting this into the main clause, and noting that the main verb is past perfect, and so the participle must go to the end of the sentence, we would get:

Es war … dass die Roboterarmee … die Bürger, die jetzt … waren, in der Zeit … hatte, gegangen war, geworden.

Comparing with Amazon’s ‘translation’, it is clear that it has simply followed the English word order.  The result then may be a number of relevant German words, but it is not a German sentence.

Experiment 2: iterative translation

I next applied the famous test invented by Umberto Eco, in which one takes a sentence in (say) English, then asks a machine to translate it into German, and then asks it to translate it back into English, and so on and so forth until the result stabilises.

So, I started from my English sentence above.  After one round-trip via German, I got:

“It had quickly become clear to him that in the time when he had pushed his socks into the legs and over his ankles, first the one and then the other, in a sinfully erotic vulnerability of silk on the skin, the robot army was not only gathered, but had gone to the city about the citizens who were now in the way of spreading, some here, some there, and some quite literally everywhere.”

After another round-trip it stabilised at:

“It had quickly become clear to him that in the time when he had pushed his socks in the legs and over his ankles, first the one and then the other, in a sinfully erotic vulnerability of silk on the skin, the robot army was not only collected, but had gone to the city over the citizens who were now in the way of spread, some here, some there, and some quite literally everywhere.”

This is reasonably close to the original in terms of the sequence of words, but in terms of meaning it is worryingly distant.
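For anyone who wants to repeat the experiment, here is a minimal Python sketch of the round-trip procedure and of one rough way to put a number on "close in terms of the sequence of words". It is an illustration under stated assumptions, not the set-up I used: the `translate` callable is a hypothetical stand-in for whichever engine is under test (nothing here calls Amazon's actual API), and the function names and the word-level similarity measure are my own choices.

```python
from difflib import SequenceMatcher
from typing import Callable, List

# Hypothetical interface: translate(text, source_lang, target_lang) -> text.
# Plug in whatever engine you want to test; none is assumed here.
Translator = Callable[[str, str, str], str]


def round_trip_until_stable(text: str, translate: Translator,
                            source: str = "en", pivot: str = "de",
                            max_rounds: int = 10) -> List[str]:
    """Return the source-language text after each round trip.

    The first element is the original text. The loop stops as soon as a
    round trip returns its input unchanged (so the last two elements are
    then equal), or after max_rounds round trips.
    """
    history = [text]
    for _ in range(max_rounds):
        there = translate(history[-1], source, pivot)   # e.g. English -> German
        back = translate(there, pivot, source)          # e.g. German -> English
        history.append(back)
        if back == history[-2]:                         # fixed point reached
            break
    return history


def word_sequence_similarity(a: str, b: str) -> float:
    """Ratio between 0 and 1 of how closely two word sequences match.

    This only compares the order and identity of words; it says nothing
    about whether the meaning has survived.
    """
    return SequenceMatcher(None, a.split(), b.split()).ratio()
```

A high `word_sequence_similarity` between the original and the stabilised text, alongside an obvious loss of meaning, is exactly the gap described above: the measure rewards matching word order and is blind to sense.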

Conclusion

The conclusion is that Amazon Translate, like every other translation engine put together by arrogant, ignorant technologists who don’t consider knowing something about language a prerequisite for designing a translation engine, is not fit for purpose.

This should not surprise anyone, given that it works (like all the other failed translation engines) by using programs trained on existing statements in a language; so it looks for something reminiscent of something it already knows, and assumes that is what it should do next.  Such an approach, in which no effort is made to explain basic syntactical rules, in other words the structure that makes language language, and not a structureless flow of sounds, cannot do anything but fail on anything but the simplest of sentences.

In this era, when the so-called President of the United States of America insists that experts are always wrong, and when a whole generation appears to prefer ‘what the Internet says’ to well-verified fact, it is not surprising that the bogus claims made by the proponents of machine translation, and by extension, on behalf of AI, should gain an overly credulous hearing.  However, they are charlatans, and their product is snake oil.  We listen to them at our peril.
