The VoXy phone software will use a telephone coupling module connected to the computer's sound card. It will recognize the components of a noisy incoming speech signal and convert them into a visually displayed string of words in French or English. The speech-to-text conversion will be done in real time and directly, i.e., without any intermediary, whether a human operator or an application on a multi-user server. The speech-recognition process will be based on a best-fit analysis of the output of many models' algorithms. A self-contained application for Android and iOS phones will also be developed.
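As a rough illustration of what a best-fit analysis over several models could look like, the sketch below picks the transcript with the highest summed confidence across recognizers. This is an assumption made for illustration, not a description of VoXy's actual method: the `Hypothesis` type, the model names, and the confidence-weighted vote are all hypothetical.

```python
from collections import defaultdict
from typing import NamedTuple


class Hypothesis(NamedTuple):
    """One recognizer's guess for an utterance (hypothetical structure)."""
    model: str         # illustrative model name
    text: str          # candidate transcript
    confidence: float  # score in [0, 1] reported by that model


def best_fit(hypotheses):
    """Return the transcript with the highest summed confidence.

    Transcripts are compared after trimming and lower-casing, so models
    that agree on the words reinforce each other.
    """
    if not hypotheses:
        raise ValueError("need at least one hypothesis")
    scores = defaultdict(float)
    for h in hypotheses:
        scores[h.text.strip().lower()] += h.confidence
    return max(scores, key=scores.get)


# Two models agree on "noon"; a third is more confident but stands alone.
candidates = [
    Hypothesis("model_a", "call me at noon", 0.62),
    Hypothesis("model_b", "call me at noon", 0.55),
    Hypothesis("model_c", "call me at moon", 0.80),
]
chosen = best_fit(candidates)
```

A real system would more likely combine hypotheses word by word rather than whole sentences, but the principle of letting agreeing models outweigh a single confident outlier is the same.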
We currently have a functioning prototype that relies on Google Chrome's WebSpeech API. This temporary solution is not ideal because it depends on:
- an internet connection;
- voice transmission across the network;
- voice recognition through an external server;
- decoded text transmission across the network from the server back to the web application that displays it on screen.
Another drawback of this indirect speech-to-text conversion is that the voice-template models used by the WebSpeech API were sampled at a much higher frequency (i.e., a higher temporal resolution of the signal) than that actually used on digital phone lines, which increases the number of transcription errors. Our system will use an adaptive multi-frequency model that optimizes transcription quality at the receiver's end. It will also detect touch-tone (DTMF) signals so that digits can be interpreted unambiguously; numbers are often very difficult to understand over the phone, even for listeners with normal hearing. Our product will target deaf as well as hard-of-hearing individuals. The number of hard-of-hearing people is rising steeply as the population ages, so they represent a large proportion of the market for VoXy phone.
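To make the touch-tone idea concrete: each key on a telephone keypad is transmitted as a pair of tones, one from a low-frequency group and one from a high-frequency group. The sketch below shows one common way to identify the key, using the Goertzel algorithm to measure the power at each of the eight standard frequencies. It is a minimal illustration, not VoXy's implementation; the 8000 Hz sample rate, the 205-sample frame, the 0.25 energy threshold, and the function names are all assumptions.

```python
import math

# Standard DTMF frequency groups (Hz) and the keypad they address.
LOW_FREQS = (697, 770, 852, 941)
HIGH_FREQS = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def goertzel_power(samples, sample_rate, freq):
    """Squared magnitude of the signal's spectrum at a single frequency."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq / sample_rate)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def detect_dtmf(samples, sample_rate=8000):
    """Return the key for one frame of audio, or None if no tone pair dominates."""
    n = len(samples)
    energy = sum(x * x for x in samples)
    if n == 0 or energy == 0.0:
        return None
    low = {f: goertzel_power(samples, sample_rate, f) for f in LOW_FREQS}
    high = {f: goertzel_power(samples, sample_rate, f) for f in HIGH_FREQS}
    best_low = max(low, key=low.get)
    best_high = max(high, key=high.get)
    # A clean tone pair puts roughly half of n * energy into its two bins;
    # the 0.25 cut-off is an assumed margin for rejecting speech and noise.
    if low[best_low] + high[best_high] < 0.25 * n * energy:
        return None
    return KEYPAD[LOW_FREQS.index(best_low)][HIGH_FREQS.index(best_high)]


def dtmf_tone(key, n=205, sample_rate=8000):
    """Synthesize n samples of the tone pair for a key (test helper)."""
    row = next(i for i, keys in enumerate(KEYPAD) if key in keys)
    col = KEYPAD[row].index(key)
    f_low, f_high = LOW_FREQS[row], HIGH_FREQS[col]
    return [
        0.5 * math.sin(2.0 * math.pi * f_low * t / sample_rate)
        + 0.5 * math.sin(2.0 * math.pi * f_high * t / sample_rate)
        for t in range(n)
    ]


key = detect_dtmf(dtmf_tone("5"))
```

Because the digit is read from the tones rather than from speech, a detected key does not depend on how clearly the number was spoken. A production detector would also check tone duration and the balance between the two groups before accepting a key.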
For the French version, please visit: http://www.voxyphone.com/?lang=fr
Here is our prototype using a telephone coupling device to decode, in real time, a voicemail message received over the phone.