Advances in communication technology have had a major impact on all sorts of industries, but perhaps none bigger than education. Anyone in the world can now listen live to a Nobel laureate's lecture or earn credits from the most reputable universities with nothing more than internet access. However, the information to be gained from watching and listening online is lost if the audience cannot understand the lecturer's language. To solve this problem, scientists at the Nara Institute of Science and Technology (NAIST), Japan, presented a new machine learning solution at the 240th meeting of the Special Interest Group on Natural Language Processing of the Information Processing Society of Japan (IPSJ SIG-NL).
Machine translation systems have made it remarkably simple for someone to ask for directions to their hotel in a language they have never heard or seen before. The systems sometimes make amusing and innocent errors, but overall they achieve coherent communication, at least for short exchanges of only a sentence or two. For a presentation that can extend past an hour, such as an academic lecture, they are far less robust.
“NAIST has 20% foreign students and, while the number of English classes is expanding, the options these students have are limited by their Japanese ability,” explains NAIST Professor Satoshi Nakamura, who led the study.
Nakamura’s research group acquired 46.5 hours of archived lecture videos from NAIST, along with their transcriptions and English translations, and developed a deep learning-based system to transcribe Japanese lecture speech and translate it into English. While watching the videos, users see subtitles in Japanese and English that match the lecturer’s speech.
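The announcement does not include code, but the general idea of such a cascade, where speech is first transcribed with timestamps and each segment is then translated and paired as bilingual subtitles, can be sketched roughly as follows. This is a minimal illustration only: the Hugging Face transformers library and the Whisper and OPUS-MT models named below are assumptions for demonstration, not the models or tools used in the NAIST system.

```python
# Illustrative cascade only: the library and models below are assumptions
# for demonstration, not the system described in the study.
from transformers import pipeline

# Step 1: speech recognition -- transcribe Japanese lecture audio to text,
# keeping timestamps so each segment can later be shown as a subtitle.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",        # hypothetical choice of ASR model
    return_timestamps=True,
)

# Step 2: machine translation -- Japanese to English, applied segment by segment.
translator = pipeline(
    "translation",
    model="Helsinki-NLP/opus-mt-ja-en",  # hypothetical choice of MT model
)

def make_bilingual_subtitles(audio_path: str):
    """Return (start, end, Japanese text, English text) tuples for subtitling."""
    result = asr(audio_path)
    subtitles = []
    for chunk in result["chunks"]:
        ja_text = chunk["text"].strip()
        if not ja_text:
            continue
        en_text = translator(ja_text)[0]["translation_text"]
        start, end = chunk["timestamp"]
        subtitles.append((start, end, ja_text, en_text))
    return subtitles

# Example: generate bilingual subtitle lines for one archived lecture recording.
for start, end, ja, en in make_bilingual_subtitles("lecture.wav"):
    print(f"[{start}-{end}] {ja} / {en}")
```

Because the archived videos are processed offline, a pipeline like this can afford slower but more accurate models, which is the trade-off the researchers describe below.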
One might expect the ideal output to be simultaneous translation that could accompany live presentations. However, live translation limits the processing time and thus the accuracy.
“Because we are putting videos with subtitles in the archives, we found better translations by creating subtitles with a longer processing time,” he says.
The archived footage used for the evaluation consisted of lectures on robotics, speech processing and software engineering. Interestingly, the word error rate in speech recognition correlated with disfluency in the lecturers’ speech. Another factor contributing to the different error rates was the length of time a lecturer spoke without pausing. The corpus used for training was still insufficient and needs to be developed further for additional improvements.
“Japan wants to increase its international students and NAIST has a great opportunity to be a leader in this endeavor. Our project will not only improve machine translation, it will also bring bright minds to the country,” he continued.
###
Media Contact
Takahito Shikano
[email protected]