We tend to imagine that robots speak in a robot-like way, with a dull look in their eyes and a slow, metallic intonation. That is not the case with Cortana, who sounds nothing like a robot and much more like a normal person. How does she achieve that?
People speak with a tone.
First, it needs to know how humans pronounce words. Every word contains a number of roots, which we all picked up while learning to talk, and these roots combine to form the complicated language we speak every day.
Cortana speaks a more complicated language than we do.
This is how robots talk: they parse and extract the roots from language samples, then recombine them to generate new speech in the target language.
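The extract-and-recombine idea can be illustrated with a small sketch. This is a toy model under stated assumptions, not Cortana's actual implementation: the root labels (`HH`, `AY`, and so on), the sample numbers, and the function names `extract_roots` and `synthesize` are all made up for illustration, and a root's "recording" is just a short list of numbers.

```python
# Toy sketch of extract-and-recombine synthesis.
# All labels, numbers, and function names here are hypothetical.

def extract_roots(samples):
    """Build an inventory with one recorded chunk per root.

    `samples` is a list of (root_labels, chunks) pairs, where the chunks
    line up one-to-one with the labels.
    """
    inventory = {}
    for roots, chunks in samples:
        for root, chunk in zip(roots, chunks):
            # Keep the first recording seen for each root.
            inventory.setdefault(root, chunk)
    return inventory


def synthesize(target_roots, inventory):
    """Concatenate stored chunks in the order the new utterance needs."""
    waveform = []
    for root in target_roots:
        waveform.extend(inventory[root])
    return waveform


samples = [
    (["HH", "AY"], [[0.1, 0.2], [0.3, 0.4]]),  # a recorded "hi"
    (["G", "OW"], [[0.5], [0.6, 0.7]]),        # a recorded "go"
]
inventory = extract_roots(samples)

# "HH" + "OW" was never recorded as a whole, but its parts were.
new_utterance = synthesize(["HH", "OW"], inventory)
```

The point of the sketch is that the new utterance is built only from pieces that already exist in the samples. A real system would also have to smooth the joins between pieces, which is not shown here.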
Because Cortana’s first language is English, speaking English is relatively easy for it. After receiving a piece of text, Cortana first analyzes it with techniques such as computational semantics to work out its meaning. It then uses composite signal processing to parse the voice roots. In this way, it can hold simple English conversations.
It’s harder for Cortana to speak Chinese.
Here comes the question: what if Cortana receives another language? Why is Cortana also able to speak Chinese? The reason is similar: Cortana can translate the English semantics into Chinese and then analyze and construct the speech from Chinese roots.
However, if Cortana runs into a new problem, such as a missing basic dialect, artificial intelligence comes into play. Using the match searching of Microsoft Azure, Cortana can search for a similar basic dialect to replace it. Of course, this cannot always solve the problem, and in that case dubbing by real people is used.
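The search-for-a-similar-replacement step, with its fallback to human dubbing, might look roughly like the sketch below. It is only an illustration under assumptions: it does not use any Microsoft Azure API, the two-number feature vectors are invented, and `find_substitute` and `max_distance` are hypothetical names. A plain nearest-neighbor search stands in for whatever matching the real service performs.

```python
import math

# Hypothetical nearest-match search; not an Azure API.

def find_substitute(missing_features, inventory, max_distance=1.0):
    """Return the name of the closest stored unit.

    Returns None when even the closest unit is farther than
    `max_distance`, which signals that a human recording is needed.
    """
    best_name, best_dist = None, math.inf
    for name, features in inventory.items():
        dist = math.dist(missing_features, features)  # Euclidean distance
        if dist < best_dist:
            best_name, best_dist = name, dist
    return best_name if best_dist <= max_distance else None


# Made-up feature vectors for three stored units.
inventory = {
    "unit_a": (1.0, 0.0),
    "unit_b": (0.0, 1.0),
    "unit_c": (5.0, 5.0),
}

close = find_substitute((0.9, 0.1), inventory)    # a near match exists
far = find_substitute((20.0, 20.0), inventory)    # nothing is close enough

if far is None:
    print("No similar unit found; fall back to human dubbing.")
```

The threshold is what separates the two outcomes described above: a good enough match is substituted automatically, and anything else is handed to a human voice.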
Microsoft Azure has injected new vigor into Cortana.
Cloud technology such as cloud storage and large-scale computing plays a critical role in this process. However, this still hasn’t answered the question: since a lot of robots apply the same principle to speak, why is Cortana the one that doesn’t sound like a robot?
Cortana acts rather intelligently on this point. It analyzes the emotions of the current conversation scenario, which again applies the principle of the almighty machine learning. Cortana then controls its tone and voice according to the scenario, and so generates a fitting tone.
Cortana also has moods.
Users must have experienced this while using it. For example, Cortana selects a humble tone when making an apology and a firm tone when answering a question. These tones make the cold words sound closer to human speech, and this is also why Cortana differs from other robots.
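The mapping from conversation scenario to tone can be sketched in a few lines. This is a minimal illustration, not Cortana's real logic: the `Tone` fields, their values, and the names `classify_scenario` and `choose_tone` are hypothetical, and a simple keyword rule stands in for the machine-learned emotion analysis the article describes.

```python
from dataclasses import dataclass

# Hypothetical tone settings; the scales and values are made up.

@dataclass(frozen=True)
class Tone:
    name: str
    pitch_shift: float  # relative pitch change
    rate: float         # speaking-rate multiplier


TONES = {
    "apology": Tone(name="humble", pitch_shift=-0.2, rate=0.9),
    "answer": Tone(name="firm", pitch_shift=0.0, rate=1.0),
}


def classify_scenario(reply):
    """Keyword rule standing in for a learned scenario classifier."""
    text = reply.lower()
    if "sorry" in text or "apolog" in text:
        return "apology"
    return "answer"


def choose_tone(reply):
    """Pick the tone settings that match the scenario of the reply."""
    return TONES[classify_scenario(reply)]


print(choose_tone("Sorry, I could not find that.").name)
print(choose_tone("The meeting is at 3 pm.").name)
```

Keeping the classifier separate from the tone table means the keyword rule could be swapped for a trained model without touching the tone settings.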
Of course, Cortana’s competitors Siri and Google Now also apply this technology. However, according to many users, Cortana performs better at mimicking humans. The reason for this is not Cortana itself, but Microsoft Research behind it.
Cortana speaks more like a human than Siri does.
You would be shocked to learn how powerful Microsoft Research is. As one of the world’s most powerful technology enterprises, Microsoft Research is Cortana’s biggest source of innovation support. Unlike other innovation centers, Microsoft Research is completely academic.
When it comes to publishing papers, what comes to mind is probably all kinds of universities. However, Microsoft Research ranks at the top in publishing papers, and unlike other enterprises, whose employees are measured by their performance, Microsoft Research evaluates its employees based on the number of papers they publish. This might well explain why Cortana is better than other products.