llbbl
03-24-2004, 10:30 AM
The day I predicted has arrived... although I didn't tell anyone, so it doesn't really matter whether I did or not =) I was happy to see the beginnings of voice interaction with your handheld device. It is something new and they are calling it multimodal. Look for more devices to support this if they do not already.
XHTML+Voice, or X+V, is a standard that has been developed by IBM and is supported by the Opera browser. Here is the specification (http://www.voicexml.org/specs/multimodal/x+v/12/spec.html) if you are interested in the details of how it works. Basically, it adds voice tags and event attributes to ordinary XHTML so that a form field will accept either speech or typed input from the user.
You will need to install a copy of WebSphere Studio Site Developer v5.0 or WebSphere Studio Application Developer v5.0.
Here is an example of how the code might look.
<input type="text"
id="pizzaQuantity"
ev:event="focus"
ev:handler="#voice_quantity"/>
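For context, here is a rough sketch of how that input field might sit inside a complete X+V page. This is my own reconstruction, not copied from IBM: the namespaces are the standard XHTML, VoiceXML and XML Events ones, but the prompt text, the grammar file name quantity.jsgf, its MIME type and the assign expression are all assumptions for illustration, and I left out the DOCTYPE.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml"
      xmlns:vxml="http://www.w3.org/2001/vxml"
      xmlns:ev="http://www.w3.org/2001/xml-events">
  <head>
    <title>Pizza order (sketch)</title>
    <!-- Voice dialog; its id is what ev:handler="#voice_quantity" points at -->
    <vxml:form id="voice_quantity">
      <vxml:field name="quantity">
        <vxml:prompt>How many pizzas would you like?</vxml:prompt>
        <!-- Hypothetical grammar file; file name and type are assumptions -->
        <vxml:grammar src="quantity.jsgf" type="application/x-jsgf"/>
        <vxml:filled>
          <!-- Copy the recognized value into the visual text box -->
          <vxml:assign name="document.getElementById('pizzaQuantity').value"
                       expr="quantity"/>
        </vxml:filled>
      </vxml:field>
    </vxml:form>
  </head>
  <body>
    <form action="">
      <!-- Giving this field focus starts the voice dialog above -->
      <input type="text"
             id="pizzaQuantity"
             ev:event="focus"
             ev:handler="#voice_quantity"/>
    </form>
  </body>
</html>
```

The idea is that the visual form and the voice form stay separate, and the ev: attributes are the glue: when the text box gets focus, the browser runs the voice form with the matching id.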
You might ask how you are supposed to train the software so it recognizes how badly you pronounce (http://www.merriam-webster.com/help/faq/pronounce.htm) things. Well, you use a Java Speech Grammar Format (JSGF) grammar file that is generated using this Multimodal Toolkit (http://www14.software.ibm.com/webapp/download/search.jsp?go=y&rs=multimodal).
The main IBM site can be found here:
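As far as I can tell, the grammar does not really train anything; it just lists the words and phrases the recognizer should listen for. Here is a minimal sketch of what a JSGF grammar for the pizza quantity field might look like when written inline in the page. The grammar name, the rule name and the word list are made up for illustration, and the fragment assumes the vxml prefix is declared on the enclosing page.

```xml
<!-- Fragment: belongs inside a vxml:form in the page head -->
<vxml:field name="quantity">
  <vxml:prompt>How many pizzas would you like?</vxml:prompt>
  <vxml:grammar>
    <![CDATA[
      #JSGF V1.0;
      grammar pizza_quantity;
      // Hypothetical rule: the only words the recognizer will accept here
      public <quantity> = one | two | three | four | five;
    ]]>
  </vxml:grammar>
</vxml:field>
```

Keeping the word list short like this should make recognition easier, since the software only has to pick between a handful of choices instead of the whole dictionary.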
http://www-306.ibm.com/software/pervasive/multimodal/
What do you guys think about this? How useful will it become? It certainly seems easier to talk to the thing than to try to scribble out what you want to say. I could certainly see a lot of handheld applications being written that support the X+V tags. Maybe Palm will include this as standard in their next OS version. What would that be, 6.0?