Machine Translation (MT) software comes in many forms and in two broad categories: commercial and free of charge. At the top end of the commercial offerings are sophisticated and expensive software tools used by professional freelance translators and translation companies to ease and speed up their laborious tasks. TRADOS is one of the most widely used names, offering packages costing from 600 to 2,500 Euros. At the lower commercial level there are many products costing between $60 and $120 for help with translating between the major European languages, or at least between English and those languages. Among free Internet translation services, the current leader is Google Translate, closely followed by its recent challenger, Microsoft’s BING Translator. Both produce fast but basic translations of all sorts of Internet material, in a very wide range of languages and language pairs. For a number of years, the earlier Internet leader was Altavista’s Babelfish. Under the Yahoo label, this free programme is still available and widely used, but with the two younger competitors making fast progress with their more effective MT formulas, it is showing its age.
As a preliminary sample of MT, take the following absurdly easy test used by blog researchers comparing and rating ten budget software translation packages. From this site, the references lead to this basic test, from Spanish to English.
“Abuela, ¿por qué tienes los ojos tan grandes?” Caperucita Roja preguntó. “Para que yo pueda ver mejor,” Dijo la abuela. “¡Oh, abuelita, ¿por qué tienes la boca tan grande?” “Para poder comerte mejor!” Entonces, la abuela salta de la cama.
They offer the following as a “Correct Translation” against which to compare the ten commercial contenders:
“Grandma, why do you have such big eyes?” Little Red Riding Hood asked. “So that I can see better.” the grandma said. “Oh, Grandma, why do you have such a big mouth?” “So I can eat better!” Then, the grandma jumps out of the bed.
For a description of the three major free Internet MT systems listed above and a judgement on their relative qualities, see John Yunker’s articles on the work of Ethan Shen, starting with this one and following the links. (Shen pronounces Google Translate the overall winner.)
Another strong recommendation of Google’s quality and breadth of coverage, as well as a clear explanation of the Google method, is to be found in Chapter 23 of David Bellos’s recent wide-ranging book on translation, Is That a Fish in Your Ear? (by now a runaway bestseller).
The chapter offers a potted history of MT and expresses Bellos’s very positive view of the advances in MT achieved by Google, emphasising its novel approach to the task of MT. In a recent article, Bellos offers an edited version of pages 263-266 of that chapter (‘The Adventure of Automated Language Translation Machines’) in which, in characteristic manner, he succinctly explains the complex Google system to us:
“Using software originally developed in the 1980s by researchers at IBM, Google has created an automatic translation tool that is unlike all others. It is not based on the intellectual presuppositions of early machine translation efforts – it isn’t an algorithm designed only to extract the meaning of an expression from its syntax and vocabulary.
“In fact, at bottom, it doesn’t deal with meaning at all. Instead of taking a linguistic expression as something that requires decoding, Google Translate (GT) takes it as something that has probably been said before.
“It uses vast computing power to scour the internet in the blink of an eye, looking for the expression in some text that exists alongside its paired translation.
“The corpus it can scan includes all the paper put out since 1957 by the EU in two dozen languages, everything the UN and its agencies have ever done in writing in six official languages, and huge amounts of other material, from the records of international tribunals to company reports and all the articles and books in bilingual form that have been put up on the web by individuals, libraries, booksellers, authors and academic departments.
“Drawing on the already established patterns of matches between these millions of paired documents, Google Translate uses statistical methods to pick out the most probable acceptable version of what’s been submitted to it.”
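The idea in that last paragraph can be illustrated with a toy sketch. It is emphatically not Google's actual system: the six phrase pairs are invented by me for illustration, the function name most_probable_translation is hypothetical, and "statistical methods" is reduced here to simply counting which rendering has been seen most often alongside a given phrase.

```python
from collections import Counter

# Hypothetical, hand-made "parallel corpus": each entry pairs a Spanish
# phrase with an English rendering found alongside it. A real system mines
# millions of paired documents; these six pairs are invented for illustration.
PAIRED_PHRASES = [
    ("la boca tan grande", "such a big mouth"),
    ("la boca tan grande", "such a big mouth"),
    ("la boca tan grande", "the mouth so big"),
    ("los ojos tan grandes", "such big eyes"),
    ("los ojos tan grandes", "such big eyes"),
    ("los ojos tan grandes", "the eyes so large"),
]


def most_probable_translation(phrase, pairs):
    """Return (rendering, relative frequency) for the rendering most often
    paired with `phrase`, or None if the phrase has never been seen."""
    counts = Counter(target for source, target in pairs if source == phrase)
    if not counts:
        return None
    best, hits = counts.most_common(1)[0]
    return best, hits / sum(counts.values())


result = most_probable_translation("la boca tan grande", PAIRED_PHRASES)
print(result)
```

The point to notice is that nothing in the sketch knows what a mouth is. It only knows what has been said before, and how often, which is also why a phrase that has never been seen comes back empty-handed.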
Although he admits that Google Translate results are not always satisfactory, Bellos forecasts a rosy future for MT and for Google in particular as it improves and adds to its fabulous corpora in 58 languages.
To give an idea of the standard of translation achieved by Google, and to give a glimpse of what Professor Bellos’s enthusiasm is founded on, I propose to offer and examine samples of translations into English from four languages. The additional factor is that BING (which offers 2-way translations to and from 37 languages as compared with the 58 Google pairs cited by Professor Bellos) will be subjected to the same tests, as evidence of this battle of the Free to Ether Translation Titans. (Results from Yahoo’s Babelfish are offered at the end of the piece.)
Firstly (in the current article) I present and compare translations from French and Spanish into English. In a later blog article I hope to offer similar material from Russian and Hindi (probably transliterated to fit in the WordPress system). From these disparate examples, we may be able to discern the strengths of the two software programmes and some of the problems which still remain to be overcome in the search for workable and useful translations into and out of all printed languages.
By way of Prologue to the proposed comparisons, if we try the ‘Little Red Riding Hood’ test sample on Google and BING, we get the following results.
Google Translate:
“Grandma, why are your eyes so big?” Little Red Riding Hood said. “So that I can see better,” said the grandmother. “Oh, Grandma, why your mouth is so big?” “To eat better!” Then the grandmother jumps out of bed.
There are two unsatisfactory translations here:
“why your mouth is so big?” and “To eat better!”
BING Translator:
“Grandmother, why you have such large eyes?” Little Red Riding Hood asked. “So that I can see better,” said the grandmother. “Oh, grandmother, why have the big mouth?!” “To be able to eat better!” Then Grandma jumps out of bed.
Again, two unsatisfactory translations, and for the same segment:
“why have the big mouth?!” and “To be able to eat better!”
Both Google and BING completely miss the enclitic Spanish pronoun in “comer” + “-te” (“to eat YOU better”), but, IMHO, Google is marginally in front of BING in the second listed infelicity.
Now let us move on to a more challenging test of MT ability. For this I have chosen short segments from French and Spanish Wikipedia on a topic of recent interest.
1. French Wikipedia: ‘Crise financière mondiale débutant en 2007’
“La crise financière mondiale qui a commencé en 2007 est une crise financière marquée par une crise de liquidité et parfois par des crise [sic: = crises] de solvabilité tant au niveau des banques que des Etats, et une raréfaction du crédit au niveau des entreprises. Amorcée en juillet 2007, elle trouve son origine dans le dégonflement de bulles de prix (dont la bulle immobilière américaine des années 2000) et les pertes importantes des établissements financiers provoquées par la crise des subprimes. C’est la crise la plus grave de l’histoire des bourses de valeurs, après celle de 1873, découlant de la crise bancaire de mai 1873.
“La crise financière de l’automne 2008 amplifie le mouvement et provoque une chute des cours des marchés boursiers et la faillite de plusieurs établissements financiers. Pour éviter une crise systémique, les Etats doivent intervenir et sauver de nombreuses banques ce qui provoquera une crise de la dette publique en Islande d’abord puis en Irlande. Par ailleurs, elle provoque une récession touchant l’ensemble de la planète. Les finances publiques ont été lourdement sollicitées pour résoudre cette crise. Le déficit public s’est creusé dans de nombreux pays, après un recul du produit intérieur brut mondial de 2,2% en 2009.”
(200 words)
(Points for attention are italicised.)
Google:
“The global financial crisis that began in 2007 is a financial crisis marked by a liquidity crisis and sometimes by solvency crisis at both banks of the States, and a credit crunch at the company level. Began in July 2007, it has its origins in the bursting of price bubbles (including the U.S. housing bubble of the 2000s) and losses of financial institutions caused by the subprime crisis. This is the worst crisis in the history of stock exchanges, after that of 1873, arising from the banking crisis of May 1873.
The financial crisis of autumn 2008 amplifies the movement and causes a fall in equity markets and the collapse of several financial institutions. To avoid a systemic crisis, governments must step in and save many banks which will cause a crisis of public debt in Iceland first and then in Ireland. Moreover, it causes a recession in the entire world. Public finances were heavily used to solve this crisis. The deficit has widened in many countries, after a decline in global GDP by 2.2% in 2009.” (177 words)
Microsoft:
“The global financial crisis that began in 2007 is a financial crisis marked by a crisis of liquidity and solvency crisis sometimes both at the level of the banks of the States, and a scarcity of credit at the level of enterprises. Started in July 2007, it finds its origin in the bursting of bubbles of awards (including the the 2000 US housing bubble) and losses of financial institutions caused by the subprime crisis. It is the most serious crisis in the history of stock exchanges, after 1873, arising from the banking crisis of May 1873.
The financial crisis of autumn 2008 amplifies the movement and causes a collapse in stock market prices and the bankruptcy of several financial institutions. To prevent a systemic crisis, States should intervene and save many banks which will cause a crisis of public debt in Iceland first and then in Ireland. In addition, it causes a recession affecting the entire planet. Public finances were heavily sought to resolve this crisis. The public deficit widened in many countries, after a decline of 2.2% in 2009 world gross domestic product.” (169 words)
These are worthy attempts, useful to the general reader looking for a gist, and produced, on demand, in a few seconds. All that is needed to make them more reliable is shown below (in bold type).
Google, improved:
“The global financial crisis that began in 2007 is a financial crisis marked by a liquidity crisis and sometimes by solvency crises for both banks and States, and a credit crunch at the company level. Beginning in July 2007, it has its origins in the bursting of price bubbles (including the U.S. housing bubble of the 2000s) and serious losses by financial institutions caused by the subprime crisis. This is the worst crisis in the history of stock exchanges, after that of 1873, arising from the banking crisis of May 1873.
The financial crisis of autumn 2008 amplifies the movement and causes a fall in equity markets and the collapse of several financial institutions. To avoid a systemic crisis, governments had to step in and save many banks, which was to cause a crisis of public debt first in Iceland and then in Ireland. Moreover, it caused a recession in the entire world. Public finances were heavily used to solve this crisis. The deficit has widened in many countries, after a decline in global GDP of 2.2% in 2009.” (177 words)
BING, improved:
“The global financial crisis that began in 2007 is a financial crisis marked by a crisis of liquidity and sometimes by solvency crises both at the level of the banks and of the States, and by a scarcity of credit at the company level. Commencing in July 2007, it has its origin in the bursting of price bubbles (including the 2000 US housing bubble) and the serious losses of financial institutions caused by the subprime crisis. It is the most serious crisis in the history of stock exchanges, after the 1873 crisis, arising from the banking crisis of May 1873.
The financial crisis of autumn 2008 amplifies the movement and causes a collapse in stock market prices and the bankruptcy of several financial institutions. To prevent a systemic crisis, States had to intervene and save many banks, which was to cause a crisis of public debt first in Iceland and then in Ireland. In addition, it caused a recession affecting the entire planet. Public finances were heavily drawn on to resolve this crisis. The public deficit widened in many countries, after a decline of 2.2% in world gross domestic product in 2009.” (169 words)
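The corrections in the improved versions can also be picked out mechanically by comparing the raw and improved texts word by word. Here is a minimal sketch, assuming Python's standard difflib module and using one clause from the Google text above; the function name changed_words is my own.

```python
import difflib

# One clause from the raw Google output and its improved counterpart.
raw = "To avoid a systemic crisis, governments must step in"
improved = "To avoid a systemic crisis, governments had to step in"


def changed_words(before, after):
    """List (old, new) word runs that differ between two versions of a text."""
    a, b = before.split(), after.split()
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # An insertion has an empty "old" side; a deletion an empty "new" side.
            changes.append((" ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


edits = changed_words(raw, improved)
print(edits)
```

Run over a whole passage, a comparison of this kind gives a rough count of how much human repair each engine's output needed, although it says nothing about how serious each individual error was.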
2. Spanish Wikipedia: ‘Crisis económica de 2008-2011’
“Por crisis económica de 2008 a 2011 se conoce a la crisis económica mundial que comenzó ese año, originada en los Estados Unidos. Entre los principales factores causantes de la crisis estarían los altos precios de las materias primas, la sobrevalorización del producto, una crisis alimentaria mundial y energética, una elevada inflación planetaria y la amenaza de una recesión en todo el mundo, así como una crisis crediticia, hipotecaria y de confianza en los mercados. La causa raíz de toda crisis según la Teoría austríaca del ciclo económico es una expansión artificial del crédito. En palabras de Jesús Huerta de Soto «esta crisis surge de la expansión crediticia ficticia orquestada por los bancos centrales, y que ha motivado que los empresarios invirtieran donde no debían”.
“La crisis iniciada en el 2008 ha sido señalada por muchos especialistas internacionales como la “crisis de los países desarrollados”, ya que sus consecuencias se observan fundamentalmente en los países más ricos del mundo.” (159 words)
(Points for attention are italicised.)
Google:
In economic crisis from 2008 to 2011 is known to the world economic crisis that began that year, which originated in the United States. Among the main factors causing the crisis would be the high prices of raw materials, the overvaluation of the product, a global food and energy crisis, high inflation and the threat of global recession around the world and a credit crisis trust and mortgage markets. The root cause of all crises as the Austrian theory of business cycle is an artificial expansion of credit. In the words of Jesus Huerta de Soto “this crisis arises from the fictitious credit expansion orchestrated by central banks, and has motivated entrepreneurs to invest where there were”.
The crisis that began in 2008 has been noted by many international experts as the “crisis of the developed countries”, since its effects are observed mainly in the richer countries of the world. (150 words)
Microsoft:
The global economic crisis that began that year, originating in the United States is known by economic crisis of 2008 to 2011. Among the major causative factors of the crisis would be high prices of raw materials, the sobrevalorización of the product, energy and global food crisis, high global inflation and the threat of a recession around the world, as well as a loan, mortgage crisis and confidence in the markets. Caused by following every crisis according to the Austrian business cycle theory is an artificial expansion of credit. In the words of Jesus Huerta de Soto “this crisis stems from the fictional credit expansion orchestrated by central banks, and that has motivated entrepreneurs to invest where wrong”.
The crisis which began in 2008 has been brought by many international experts as the ‘crisis of developed countries’, already that its consequences are observed mainly in countries richest in the world. (150 words)
Google, improved:
The economic crisis of 2008 to 2011 is the title given to the world economic crisis that began that year and originated in the United States. Among the main factors causing the crisis would be the high prices of raw materials, the overvaluation of the product, a global food and energy crisis, high inflation and the threat of global recession around the world and a crisis in credit, mortgages and market confidence. The root cause of all crises according to the Austrian theory of the business cycle is an artificial expansion of credit. In the words of Jesus Huerta de Soto “this crisis arises from the fictitious credit expansion orchestrated by central banks, and has motivated entrepreneurs to invest where they should not have done”.
The crisis that began in 2008 has been noted by many international experts as the “crisis of the developed countries”, since its effects are observed mainly in the richer countries of the world. (158 words)
BING, improved:
The global economic crisis that began in 2008, originating in the United States, is known as the economic crisis of 2008 to 2011. Among the major causative factors of the crisis would be high prices of raw materials, the overvaluation of the product, a global food and energy crisis, high global inflation and the threat of a recession around the world, as well as a loan crisis, a mortgage crisis and loss of confidence in the markets. The root cause of every crisis, according to the Austrian business cycle theory, is an artificial expansion of credit. In the words of Jesus Huerta de Soto “this crisis stems from the fictional credit expansion orchestrated by central banks, and that has motivated entrepreneurs to invest where they should not have done.”
The crisis which began in 2008 has been labelled by many international experts as the ‘crisis of developed countries’, since its consequences are observed mainly in the richest countries in the world. (161 words)
So, on the above evidence, both of these translation tools, Google and BING, offer a very useful BASIC – and lightning fast – FREE service for French and Spanish to English to millions of Internet users. (The situation of English INTO French and Spanish needs separate attention and may be dealt with in a future blog.)
For comparison, here are the results obtained from Yahoo’s Babelfish with its updated but ageing technology. Note the higher number of italicised items, and their nature.
1. French to English:
“The world financial crisis which started in 2007 is a financial crisis marked by a crisis of liquidity and sometimes by crisis of solvency as well on the level of the banks as of the States, and a rarefaction of the credit on the level of the companies. Started in July 2007, it finds its origin in the deflation of bubbles of price (of which the American real estate bubble of the years 2000) and important losses of the financial institutions caused by the crisis of the subprimes. C’ is the most serious crisis of l’ history of the purses of values, after that of 1873, rising from the banking crisis of May 1873. The financial crisis of l’ autumn 2008 amplifies the movement and causes a fall of the courses of the stockmarkets and the bankruptcy of several financial institutions. To avoid a systemic crisis, the States must intervene and save many banks what will cause a crisis of the national debt in Iceland d’ access then in Ireland.
In addition, it causes a recession concerning l’ together of planet. Public finances were heavily requested to solve this crisis. The public deficit s’ is dug in many countries, after a retreat of the world gross domestic product of 2,2% in 2009.”
2. Spanish to English:
“By economic crisis from 2008 to 2011 it is known world-wide the economic crisis that began that year, originated in the United States. Between the main factors causes of the crisis they would be the high prices of the raw materials, the sobrevaluation of the product, world-wide an nourishing crisis and energetics, a high planetary inflation and the threat of a recession anywhere in the world, as well as a credit, hypothecating crisis and of confidence in the markets. The root cause of all crisis according to the Austrian Theory of the economic cycle is an artificial expansion of the credit. In words of Jesus Kitchen garden of Grove “this crisis arises from the fictitious credit expansion orchestrated by the central banks, and that have motivated that the industralists invested where they did not have”. The crisis initiated in the 2008 has been indicated by many international specialists like the “crisis of the developed countries”, since their consequences are observed essentially in the richest countries of the world.”
*
In a later blog, passages will be selected from two languages which are “more foreign” to English speakers, and for which less raw material has been available to the colossal Internet data banks on which Google Translate and Microsoft Translator rely for their lightning fast searches. The samples will be taken from Russian and Hindi, languages whose structures differ more fundamentally from English than those of its familiar French and Spanish cousins.
Da svidanya. Phir milenge
(For a lighter and enlightening finish to this long essay, Google’s own explanation of its system is to be found here.)