A great deal of research is now going into helping us speak more than one language. Microsoft has developed software that can recognise speech, whether from an audio recording or from a person speaking live, and translate it into a second language in a synthetic voice that sounds like the original speaker. This could be fantastic news for anyone who needs a translation while giving a presentation at a conference.
The Microsoft researchers’ software learns the sound of your voice and uses it to speak a language that you cannot speak yourself. This new translation tool could be useful for tutoring in a second language or for building language translation tools for travellers. It also offers website users an easy way to hear an audio description of a product in their own language. That is good for business and cost effective for web design, because the product description only needs to be recorded in one voice.
The new software was recently demonstrated in Redmond, Washington, where Microsoft is based. Frank Soong, who helped to develop the software, showed how it could “speak” a Spanish text in the voice of Rick Rashid, the leader of the research team. He also demonstrated how the software could speak Mandarin in the voice of Craig Mundie, Microsoft’s chief research and strategy officer. The demonstration played words spoken by Rashid in his own language, followed by the same words translated into Spanish, English, Italian and Mandarin.
For Mundie, a synthetic version of his voice first delivered a welcome speech in English to the audience at an open day hosted by Microsoft. The software then delivered the same words to the listeners in Mandarin. The Mandarin voice could clearly be heard to be Mundie’s.
This software is great news for the monolingual traveller, who can read out a text, have the speech recognised, and hear it translated into the desired language.
It is also expected to be useful in navigation apps for phones. Imagine a synthetic English voice reading out the text on Chinese road signs while giving you directions for a route through Beijing!
The system needs about 60 minutes of training to create a model that can read out any text in an individual’s own voice. That model is then converted into one that can read text aloud in a different language by comparing it with a sample text-to-speech model for the target language. The individual sounds that the first model uses to build up words in the person’s native language are carefully adjusted so that the new text-to-speech model can sound out phrases in the second language. A translation agency in Australia will soon be experimenting with this software.