MIT researchers have developed a novel "unsupervised" language translation model -- meaning it runs without the need for human annotations and guidance -- that could lead to faster, more efficient computer-based translations of far more languages.

Translation systems from Google, Facebook, and Amazon require training models to look for patterns in millions of documents -- such as legal and political documents, or news articles -- that have been translated into various languages by humans. Given new words in one language, they can then find the matching words and phrases in the other language.

But gathering this translation data is time-consuming and difficult, and the data simply may not exist for many of the 7,000 languages spoken worldwide. Recently, researchers have been developing "monolingual" models that translate between texts in two languages without any direct translation data linking the two.

In a paper being presented this week at the Conference on Empirical Methods in Natural Language Processing, researchers from MIT's Computer Science and Artificial Intelligence Laboratory (CSAIL) describe a model that runs faster and more efficiently than these monolingual models.

The model leverages a metric from statistics, called Gromov-Wasserstein distance, that essentially measures distances between points in one computational space and matches them to similarly distanced points in another space. The researchers apply that technique to the "word embeddings" of two languages -- words represented as vectors, basically arrays of numbers, with words of similar meaning clustered closer together. In doing so, the model quickly aligns the words, or vectors, in both embeddings that are most closely correlated by relative distances, meaning they're likely to be direct translations.
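To make the idea concrete, here is a minimal sketch of that kind of alignment using the open-source POT (Python Optimal Transport) library, which is not mentioned in the article; the toy vectors and word labels are invented for illustration, and the researchers' actual implementation may differ.

```python
# Minimal sketch (assumptions: POT library, toy 3-dimensional "embeddings").
import numpy as np
import ot  # pip install pot

# Four toy word vectors per language. Gromov-Wasserstein only ever looks at
# distances *within* each set, never at the raw coordinates themselves.
emb_en = np.array([[0.9, 0.1, 0.0],   # "father"
                   [0.8, 0.2, 0.1],   # "mother"
                   [0.1, 0.9, 0.3],   # "house"
                   [0.2, 0.8, 0.4]])  # "home"
emb_fr = np.array([[0.2, 0.7, 0.5],   # "maison"
                   [0.1, 0.8, 0.4],   # "foyer"
                   [0.9, 0.2, 0.0],   # "pere"
                   [0.8, 0.3, 0.1]])  # "mere"

# Pairwise distance matrices within each language -- the only input GW needs.
C_en = ot.dist(emb_en, emb_en)
C_fr = ot.dist(emb_fr, emb_fr)

# Uniform weight over the words in each language.
p = ot.unif(len(emb_en))
q = ot.unif(len(emb_fr))

# The coupling matrix: entry [i, j] is the "mass" matching English word i to
# French word j, found by pairing up similarly distanced points across spaces.
coupling = ot.gromov.gromov_wasserstein(C_en, C_fr, p, q, 'square_loss')
print(coupling)
```

On data this small the result only illustrates the mechanism, but the printed coupling concentrates its mass on the pairs whose relative distances line up, which is exactly the matching behavior described above.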

In experiments, the researchers' model performed as accurately as state-of-the-art monolingual models -- and sometimes more accurately -- but much more quickly and using only a fraction of the computational power.

"The model sees the words in the two languages as sets of vectors, and maps [those vectors] from one set to the other by essentially preserving relationships," says the paper's co-author Tommi Jaakkola, a CSAIL researcher and the Thomas Siebel Professor in the Department of Electrical Engineering and Computer Science and the Institute for Data, Systems, and Society. "The approach could help translate low-resource languages or dialects, so long as they come with enough monolingual content."

The model represents a step toward one of the major goals of machine translation, which is fully unsupervised word alignment, says first author David Alvarez-Melis, a CSAIL PhD student: "If you don't have any data that matches two languages ... you can map two languages and, using these distance measurements, align them."

Relationships matter most

Aligning word embeddings for unsupervised machine translation isn't a new concept. Recent work trains neural networks to directly match vectors in the word embeddings, or matrices, of two languages. But these methods require a lot of tweaking during training to get the alignments exactly right, which is inefficient and time-consuming.

Measuring and matching vectors based on relational distances, on the other hand, is a far more efficient method that doesn't require much fine-tuning. No matter where word vectors fall in a given matrix, the relationship between the words, meaning their distances, will remain the same. For instance, the vector for "father" may fall in completely different areas in two matrices. But vectors for "father" and "mother" will most likely always be close together.

"Those distances are invariant," Alvarez-Melis says. "By looking at distance, and not the absolute positions of vectors, then you can skip the alignment and go directly to matching the correspondences between vectors."

That's where Gromov-Wasserstein comes in handy. The technique has been used in computer science for, say, helping align image pixels in graphic design. But the metric seemed "tailor made" for word alignment, Alvarez-Melis says: "If there are points, or words, that are close together in one space, Gromov-Wasserstein is automatically going to try to find the corresponding cluster of points in the other space."

For training and testing, the researchers used a dataset of publicly available word embeddings, called fastText, with 110 language pairs. In these embeddings, and others, words that frequently appear in similar contexts have closely matching vectors. "Mother" and "father" will usually be close together but both farther away from, say, "house."
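That pattern is straightforward to inspect in the publicly released vectors. The sketch below is hypothetical: it assumes the gensim library and uses a placeholder file name ('wiki.en.vec') standing in for an English fastText vector file downloaded separately.

```python
# Hypothetical sketch: checking the "mother"/"father" vs. "house" pattern in
# pre-trained fastText vectors with gensim. 'wiki.en.vec' is a placeholder for
# a locally downloaded English fastText file in plain-text word2vec format.
from gensim.models import KeyedVectors

vectors = KeyedVectors.load_word2vec_format('wiki.en.vec')

print(vectors.similarity('mother', 'father'))  # high: used in similar contexts
print(vectors.similarity('mother', 'house'))   # noticeably lower
```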

Providing a "soft translation"

The model notes vectors that are closely related yet different from the others, and assigns a probability that similarly distanced vectors in the other embedding will correspond. It's kind of like a "soft translation," Alvarez-Melis says, "because instead of just returning a single word translation, it tells you 'this vector, or word, has a strong correspondence with this word, or words, in the other language.'"
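In code, that "soft translation" is just each row of the coupling matrix read as a probability distribution. The snippet below is illustrative only; the small coupling matrix is invented, but it has the shape of what a Gromov-Wasserstein solver returns.

```python
# Illustrative only: a toy coupling matrix of the kind Gromov-Wasserstein
# produces (rows: English words, columns: French words; values are matching mass).
import numpy as np

en_words = ['father', 'mother', 'house']
fr_words = ['pere', 'mere', 'maison']
coupling = np.array([[0.30, 0.03, 0.00],
                     [0.03, 0.30, 0.00],
                     [0.00, 0.00, 0.34]])

# Each row, normalized, is a "soft translation": a distribution over candidate
# words in the other language rather than a single hard choice.
for i, word in enumerate(en_words):
    probs = coupling[i] / coupling[i].sum()
    print(word, '->', dict(zip(fr_words, np.round(probs, 2))))
```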

An example is the months of the year, which appear close together in many languages. The model will see a cluster of 12 vectors in one embedding and a remarkably similar cluster in the other embedding. "The model doesn't know these are months," Alvarez-Melis says. "It just knows there is a cluster of 12 points that aligns with a cluster of 12 points in the other language, but they're different to the rest of the words, so they probably go together well. By finding these correspondences for each word, it then aligns the whole space simultaneously."

The researchers hope the work serves as a "feasibility check," Jaakkola says, for applying the Gromov-Wasserstein method to machine-translation systems so that they run faster, more efficiently, and gain access to many more languages.

Additionally, a possible perk of the model is that it automatically produces a numerical value that can be interpreted as quantifying the similarity between languages. This may be useful for linguistic studies, the researchers say. The model calculates how distant all vectors are from one another in the two embeddings, which depends on sentence structure and other factors. If the vectors are all really close, the score will be closer to 0, and the farther apart they are, the higher the score. Similar Romance languages such as French and Italian, for instance, score close to 1, while Classical Chinese scores between 6 and 9 against other major languages.
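A hedged sketch of how such a single score could be computed: the POT library's gromov_wasserstein2 function returns the Gromov-Wasserstein distance itself rather than the coupling. Real usage would feed in two languages' fastText embeddings; random vectors stand in here, and the resulting numbers are not on the researchers' reported scale.

```python
# Sketch (assumptions: POT library, random stand-in embeddings, uniform weights).
import numpy as np
import ot

rng = np.random.default_rng(0)
emb_a = rng.normal(size=(100, 50))   # stand-in for language A's word vectors
emb_b = rng.normal(size=(100, 50))   # stand-in for language B's word vectors

C_a = ot.dist(emb_a, emb_a)
C_b = ot.dist(emb_b, emb_b)
p, q = ot.unif(100), ot.unif(100)

score = ot.gromov.gromov_wasserstein2(C_a, C_b, p, q, 'square_loss')
print(score)  # lower means the two spaces' distance structures line up better
```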

"This gives you a nice, simple number for how similar languages are ... and can be used to draw insights about the relationships between languages," Alvarez-Melis says.

Story Source:

Materials provided by Massachusetts Institute of Technology. Original written by Rob Matheson. Note: Content may be edited for style and length.

Cite This Page:


Massachusetts Institute of Technology. "Model paves way for faster, more efficient translations of more languages: New system may open up the world's roughly 7,000 spoken languages to computer-based translation." ScienceDaily, 1 November 2018. <www.sciencedaily.com/releases/2018/11/181101092056.htm>.
