US20060100881A1 - Multi-modal web interaction over wireless network - Google Patents
- Publication number
- US20060100881A1; application US10/534,661
- Authority
- US
- United States
- Prior art keywords
- server
- speech
- display element
- client
- mml
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/487—Arrangements for providing information services, e.g. recorded voice services or time announcements
- H04M3/493—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals
- H04M3/4938—Interactive information services, e.g. directory enquiries ; Arrangements therefor, e.g. interactive voice response [IVR] systems or voice portals comprising a voice browser which renders and interprets, e.g. VoiceXML
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04W—WIRELESS COMMUNICATION NETWORKS
- H04W80/00—Wireless network protocols or protocol adaptations to wireless operation
Definitions
- This invention relates to web interaction over a wireless network between wireless communication devices and an Internet application.
- the present invention relates to multi-modal web interaction over wireless network, which enables users to interact with an Internet application in a variety of ways.
- Wireless communication devices are becoming increasingly prevalent for personal communication needs. These devices include, for example, cellular telephones, alphanumeric pagers, “palmtop” computers, personal information managers (PIMS), and other small, primarily handheld communication and computing devices. Wireless communication devices have matured considerably in their features and now support not only basic point-to-point communication functions like telephone calling, but more advanced communications functions, such as electronic mail, facsimile receipt and transmission, Internet access and browsing of the World Wide Web, and the like.
- MMI Man-Machine-Interface
- U.S. Pat. No. 6,317,781 discloses a markup language based man-machine interface.
- the man-machine interface provides a user interface for the various telecommunications functionality of the wireless communication device, including dialing telephone numbers, answering telephone calls, creating messages, sending messages, receiving messages, and establishing configuration settings, which are defined in a well-known markup language, such as HTML, and accessed through a browser program executed by the wireless communication device.
- This feature enables direct access to Internet and World Wide web content, such as web pages, to be directly integrated with telecommunication functions of the device, and allows web content to be seamlessly integrated with other types of data, because all data presented to the user via the user interface is presented via markup language-based pages.
- Such a markup language based man-machine interface enables users directly to interact with an Internet application.
- wireless communication devices have a very limited input capability.
- Desktop or notebook computers have cursor based pointing devices, such as computer mouse, trackballs, joysticks, and the like, and full keyboards. This enables navigation of Web content by clicking and dragging of scroll bars, clicking of hypertext links, and keyboard tabbing between fields of forms, such as HTML forms.
- wireless communication devices have a very limited input capability, typically up and down keys, and one to three soft keys.
- users of wireless communication devices are unable to interact with an Internet application using conventional technology.
- Although some forms of speech recognition exist in the prior art, no prior art system realizes multi-modal web interaction, which would enable users to perform web interaction over a wireless network in a variety of ways.
- FIG. 1 is an illustration of the network environment in which an embodiment of the present invention may be applied.
- FIG. 2 is an illustration of the system 100 for web interaction over a wireless network according to one embodiment of the present invention.
- FIG. 3 and FIG. 4 show focus on a group of hyperlinks or a form.
- FIGS. 5-6 present the MML event mechanisms.
- FIG. 7 presents the fundamental flow chart of system messages & MML events.
- FIG. 8 shows the details of MML element blocks used in the system of one embodiment of the present invention.
- Various embodiments of the present invention overcome the limitation of the conventional Man-Machine Interface for wireless communication by providing a system and method for multi-modal web interaction over a wireless network.
- the multi-modal web interaction of the present invention will enable users to interact with an Internet application in a variety of ways, including, for example:
- the invention uses a multi-modal markup language (MML).
- the present invention provides an approach for web interaction over wireless network.
- a client system receives user inputs, interprets the user inputs to determine at least one of several web interaction modes, produces a corresponding client request and transmits the client request.
- the server receives and interprets the client request to perform specific retrieving jobs, and transmits the result to the client system.
- the invention is implemented using a multi-modal markup language (MML) with DSR (Distributed Speech Recognition) mechanism, focus mechanism, synchronization mechanism and control mechanism, wherein the focus mechanism is used for determining which active display is to be focused and the ID of the focused display element.
- the synchronization mechanism is used for retrieving the synchronization relation between a speech element and a display element to build the grammar of corresponding speech element to deal with user's speech input.
- the control mechanism controls the interaction between client and server. According to such an implementation, the multi-modal web interaction flow is shown by way of example as follows:
- The client receives and interprets the user input.
- For traditional input, the client transmits a request to the server for a new page or submits the form.
- For speech input, the client determines which active display element is to be focused and the identifier (ID) of the focused display element, captures speech, extracts speech features, and transmits the speech features, the ID of the focused display element, and other information such as the URL of the current page to the server.
- The client waits for the corresponding server response.
- The server receives and interprets the request from the client.
- For a traditional request, the server retrieves the new page from cache or the web server and sends the page to the client.
- For a speech request, the server uses the received ID of the focused display element to build the correct grammar based on the synchronization of display elements and speech elements. Speech recognition is then performed. According to the recognition result, the server does specific jobs and sends events or a new page to the client. The server then waits for new requests from the client.
- The client loads the new page or handles events.
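The client side of the flow above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the `ClientRequest` type, the `build_request` function, and the dictionary keys (`kind`, `focused_id`, `features`, `target_url`) are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClientRequest:
    """Hypothetical container for what the client sends to the server."""
    mode: str                          # "traditional" or "speech"
    url: str                           # requested page, or URL of the current page
    focused_id: Optional[str] = None   # ID of the focused display element (speech only)
    speech_features: List[float] = field(default_factory=list)


def build_request(user_input: dict, current_url: str) -> ClientRequest:
    """Interpret one user input and produce the corresponding client request."""
    if user_input["kind"] == "speech":
        # Speech input: send the extracted features together with the ID of
        # the focused display element and the URL of the current page.
        return ClientRequest(
            mode="speech",
            url=current_url,
            focused_id=user_input["focused_id"],
            speech_features=list(user_input["features"]),
        )
    # Traditional input (keypad, stylus, ...): request a new page or submit a form.
    return ClientRequest(mode="traditional", url=user_input["target_url"])
```

In this sketch the speech branch never sends raw audio, only features, which mirrors the distributed speech recognition split described in the text.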
- the present invention will enable the speech recognition technology to be feasibly used to retrieve information on the web, improve the precision of speech recognition, reduce the computing resources necessary for speech recognition, and realize real-time speech recognition.
- the approach of the present invention can be shared across communities.
- the approach can be used to help Internet Service Providers (ISP) to easily build server platforms for multi-modal web interaction.
- the approach can be used to help Internet Content Providers (ICP) to easily create applications with the feature of multi-modal web interaction.
- client 10 can access documents from Web server 12 via the Internet 5 , particularly via the World-Wide Web (“the Web”).
- the Web is a collection of formatted hypertext pages located on numerous computers around the world that are logically connected by the Internet.
- the client 10 may be personal computers or various mobile computing devices 14 , such as personal digital assistants or wireless telephones.
- Personal digital assistants, or PDAs, are commonly known hand-held computers that can be used to store various personal information including, but not limited to, contact information, calendar information, etc.
- Such information can be downloaded from other computer systems, or can be input by way of a stylus and pressure-sensitive screen of the PDA.
- Examples of PDAs are the Palm™ computer of 3Com Corporation and Microsoft CE™ computers, which are each available from a variety of vendors.
- a user operating a mobile computing device such as a cordless handset, dual-mode cordless handset, PDA or operating portable laptop computer generates control commands to access the Internet.
- the control commands may consist of digitally encoded data, DTMF or voice commands.
- These control commands are often transmitted to a gateway 18 .
- the gateway 18 processes the control commands (including performing speech recognition) from the mobile computing device 14 and transmits requests to the Web server 12 .
- the Web server 12 sends documents to the gateway 18 .
- the gateway 18 consolidates display contents from the document and sends the display contents to the client 14 .
- the client 14 interprets the user inputs to determine a web interaction mode, produces and transmits the client 14 request based on the interaction mode determination result; and multi-modal markup language (MML) server (gateway) 18 interprets the client 14 request to perform specific retrieving jobs.
- The Web interaction mode can be traditional input/output (for example, keyboard, keypad, mouse, and stylus input; plain text, graphics, and motion video output) or speech input/audio (synthesized speech) output.
- This embodiment enables users to browse the World Wide Web in a variety of ways. Specifically, users can interact with an Internet application via traditional input/output and speech input/output independently or concurrently.
- MML extends XHTML Basic by adding speech features to enhance XHTML modules.
- the motivation for XHTML Basic is to provide an XHTML document type that can be shared across communities.
- an XHTML Basic document can be presented on the maximum number of Web clients, such as mobile phones, PDAs and smart phones. That is the reason to implement MML based on XHTML Basic.
- Style Module: supports inline style sheets.
- the client 110 comprises: web interaction mode interpreter 111 , speech input/output processor 112 , focus mechanism 113 , traditional input/output processor 114 , data wrap 115 and control mechanism 116 .
- the MML server 120 comprises: web interaction mode interpreter 121 , speech recognition processor 122 , synchronization mechanism 123 , dynamic grammar builder 124 , HTTP processor 125 , data wrap 126 and control mechanism 127 .
- web interaction mode interpreter 111 receives and interprets user inputs to determine the web interaction mode.
- the web interaction mode interpreter 111 also assists content interpretation in the client 110 .
- traditional input/output processor 114 processes user input, then data wrap 115 transmits a request to the server 120 for a new page or form submittal.
- speech input/output processor 112 captures and extracts speech features, focus mechanism 113 determines which active display element is to be focused upon and the ID of the focused display element. Then data wrap 115 transmits the extracted speech features, the ID of the focused display element and other information such as URL of current page to the MML server.
- web interaction mode interpreter 121 receives and interprets the request from the client 110 to determine the web interaction mode.
- The web interaction mode interpreter 121 also assists content interpretation on the server 120 .
- HTTP processor 125 retrieves the new page or form from cache or web server 130 .
- Synchronization mechanism 123 retrieves the synchronization relation between a speech element and a display element based on the received ID.
- Dynamic grammar builder 124 builds the correct grammar based on the synchronization relation between the speech element and the display element.
- Speech recognition processor 122 performs speech recognition based on the correct grammar built by dynamic grammar builder 124 . According to the recognition result, HTTP processor 125 retrieves the new page from cache or web server 130 . Then, data wrap 126 transmits a response to the client 110 based on the retrieved result.
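The server-side processing just described can be sketched as a single dispatch function. This is an illustration under assumptions: the request dictionary layout, the `pages` store standing in for the cache or web server, the shape of `sync_relation`, and the pluggable `recognize` callable are all hypothetical.

```python
def handle_request(request, pages, sync_relation, recognize):
    """Dispatch one client request by interaction mode.

    pages:         maps a URL to page content (stands in for cache / web server 130)
    sync_relation: maps a focused display element ID to {phrase: target URL}
    recognize:     callable(features, vocabulary) -> recognized phrase or None
    """
    if request["mode"] == "traditional":
        # HTTP processor 125: fetch the requested page.
        return {"page": pages[request["url"]]}

    # Synchronization mechanism 123: look up the speech entries bound to
    # the focused display element.
    entries = sync_relation[request["focused_id"]]
    # Dynamic grammar builder 124: the grammar is limited to those entries.
    vocabulary = sorted(entries)
    # Speech recognition processor 122: recognize against the small grammar.
    phrase = recognize(request["features"], vocabulary)
    if phrase is None:
        return {"event": "nomatch"}
    return {"event": "match", "page": pages[entries[phrase]]}
```

Because the vocabulary is restricted to the entries bound to the focused element, the recognizer only has to choose among a handful of phrases rather than perform large-vocabulary dictation.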
- the control mechanisms 116 and 127 are used to control the interaction between the client and the server.
- speech input can become a new input source.
- In speech interaction, speech is detected and features are extracted at the client, and speech recognition is performed at the server.
- the user will typically do input using the following types of conventional display element(s):
- a focus mechanism is provided to focus the user's attention on the active display element(s) on which the user will perform speech input.
- a display element is focused by highlighting or otherwise rendering distinctive the display element upon which the user's speech input will be applied.
- the server can perform speech recognition based on the corresponding relationship between the display element and the speech element. Therefore, instead of conventional dictation with a very large vocabulary, the vocabulary database of one embodiment is based on the hyperlinks, electronic forms, and other display elements on which users will perform speech input.
- the correct grammar can be built dynamically based on the synchronization of display elements and speech elements. Therefore, the precision of speech recognition will be improved, the computing load of speech recognition will be reduced, and real-time speech recognition will actually be realized.
- FIG. 3 shows the focus on a group of hyperlinks
- FIG. 4 shows the focus on a form.
- an utterance from the user is received and scored or matched against available input selections associated with the focused display element. If the scored utterance is close enough to a particular input selection, the “match” event is produced and new card or page is displayed.
- the new card or page corresponds to the matched input selection. If the scored utterance cannot be matched to a particular input selection, a “no match” event is produced, audio or text prompt is displayed and the display element is still focused.
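The match / no-match decision can be sketched as follows. The sketch makes two loud assumptions: the utterance has already been turned into text, and text similarity from Python's `difflib.SequenceMatcher` stands in for the acoustic scoring a real recognizer would perform. The function name `score_utterance` and the 0.8 threshold are hypothetical.

```python
from difflib import SequenceMatcher


def score_utterance(utterance, selections, threshold=0.8):
    """Score an utterance against the input selections of the focused element.

    Returns ("match", selection) when the best score reaches the threshold,
    otherwise ("nomatch", None), in which case the element stays focused.
    """
    best, best_score = None, 0.0
    for selection in selections:
        # Text similarity in [0, 1]; a stand-in for real acoustic scoring.
        score = SequenceMatcher(None, utterance.lower(), selection.lower()).ratio()
        if score > best_score:
            best, best_score = selection, score
    if best is not None and best_score >= threshold:
        return ("match", best)
    return ("nomatch", None)
```

A "match" result would lead to the card or page for the matched selection being displayed, while a "nomatch" result would lead to a prompt.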
- the user may also use traditional ways of causing a particular display element to be focused, such as pointing at an input area, such as a box in a form.
- the currently focused display element changes into un-focused as a different display element is selected.
- The user may also point to a hypertext link, which causes a new card or page to be displayed. If the user points to another “Talk To Me Button”, the previous display element becomes un-focused and the display element to which the last activation belongs becomes focused.
- the focused display element may change into un-focused.
- the grammar of the corresponding speech elements should be loaded at the server to deal with the user's speech input. So, the synchronization or configuration scheme for the speech element and the display element is necessary. Following are two embodiments which accomplish this result.
- One fundamental speech element has one grammar that includes all entrance words for one time speech interaction on the Web.
- One of the fundamental speech elements must have one and only one corresponding display element(s) as follows:
- One identified single display element or group of display elements.
- A “bind” attribute is defined in <mml:link>, <mml:sform> and <mml:input>. It contains the information for one pair of display elements and the corresponding speech element.
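The “bind” relation can be illustrated by collecting the pairs from a small document. This is a sketch under assumptions: the namespace URI `urn:example:mml`, the sample markup, and the nesting of <mml:link> inside <mml:group> and <mml:recog> are hypothetical, and only the standard library `xml.etree.ElementTree` is used.

```python
import xml.etree.ElementTree as ET

MML_NS = "urn:example:mml"  # hypothetical namespace URI for the mml: prefix

# Hypothetical card: two XHTML hyperlinks and the speech links bound to them.
DOC = """
<mml:card xmlns:mml="urn:example:mml" id="c1">
  <a id="news" href="news.mml">News</a>
  <a id="mail" href="mail.mml">Mail</a>
  <mml:speech>
    <mml:recog>
      <mml:group id="g1">
        <mml:link value="headlines" bind="news"/>
        <mml:link value="inbox" bind="mail"/>
      </mml:group>
    </mml:recog>
  </mml:speech>
</mml:card>
"""


def display_to_speech(doc):
    """Map each bound display element ID to the grammar part named by value."""
    root = ET.fromstring(doc.strip())
    link_tag = "{%s}link" % MML_NS
    return {link.get("bind"): link.get("value") for link in root.iter(link_tag)}
```

A server could use such a mapping to decide which grammar entries to load when a given display element is focused.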
- the system messages produced at the client or the server or other events produced at the client or the server should be well defined.
- a Client-Server Control Mechanism is designed to provide a mechanism for the definition of the system messages and MML events which are needed to control the interaction between the client and server.
- Table 1 includes a representative set of system messages and MML events.
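- Table 1: Control Information
- Error (Server): system message; communicated between client and server
- Transmission (Server): system message; communicated between client and server
- Transmission (Client): system message; communicated between client and server
- Ready (Server): system message; communicated between client and server
- Session (Client): system message; communicated between client and server
- Exit (Client): system message; communicated between client and server
- OnFocus* (Client): system message; communicated between client and server
- UnFocus* (Client): system message
- Match (Server): event; communicated between client and server
- Nomatch (Server): event; communicated between client and server
- Onload (Client): event
- Unload (Client): event
- System Messages: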
- the System Messages are for client and server to exchange system information. Some types of system Messages are triggered by the client and sent to the server. Others are triggered by the server and sent to the client.
- the System messages triggered at the client include the following:
- the Session message is sent when the client initializes the connection to the server.
- a Ready message or an Error Message is expected to be received from the server after the Session Message is sent.
- the Transmission Message (Client) is sent after the client establishes the session with the server.
- a Transmission Message (Server) or an Error Message is expected to be received from the server after the Transmission Message (Client) is sent.
- <message type="transmission">
    <session></session> <!-- the ID of this session -->
    <crc></crc> <!-- the CRC information that the client requests -->
    <QoS></QoS> <!-- the QoS information that the client requests -->
    <bandwidth></bandwidth> <!-- the bandwidth that the client requests -->
  </message>
- <3> OnFocus Message
- OnFocus and UnFocus messages are special client side System Messages.
- OnFocus occurs when the user points at, presses, or otherwise activates the “Talk Button” (here “Talk Button” means either the “Hardware Programmable Button” or the “Talk to Me Button”).
- the client will perform the following tasks:
- The OnFocus Message should be transmitted together with the speech features rather than alone. This reduces unnecessary communication and server load in these cases:
- the Exit Message is sent when the client quits the session.
- <message type="exit">
    <session></session> <!-- the ID of this session -->
  </message>
- System Messages Triggered at the Server
- <1> Ready Message
- the Ready Message is sent by the server when the client sends the Session Message first and the server is ready to work.
- <message type="ready">
    <session></session> <!-- the ID of this session, which is created by the server after the Session Message is received -->
    <ip></ip> <!-- the IP address of the responding server -->
    <voice>
      <support>T</support> <!-- T or F: whether the server supports the voice character that the client requested in the Session Message -->
      <server></server> <!-- the voice character that the server is using now -->
    </voice>
    <language>
      <support>T</support> <!-- T or F: whether the server supports the language that the client requested in the Session Message -->
      <server></server
- the Transmission message is sent by the server when the client sends a transmission message first or the network status has changed. This message is used to notify the client of the transmission parameters the client should use.
- <message type="transmission">
    <session></session> <!-- the ID of this session -->
    <crc></crc> <!-- the CRC information -->
    <QoS></QoS> <!-- the QoS information -->
    <bandwidth></bandwidth> <!-- the bandwidth that the client should use -->
  </message>
- <3> Error Message
- the Error message is sent by the server. If the server generates some error while processing the client request, the server will send an Error Message to the client.
- <message type="error">
    <session></session> <!-- the ID of this session -->
    <errorcode>500</errorcode> <!-- the code number of the error -->
    <errorinfo></errorinfo> <!-- the text information of the error -->
  </message>
- MML Events
- MML events can be categorized as client-produced events and server-produced events according to the event source. And the events might need to be communicated between the client and the server.
- The element of the event processing instruction is <mml:onevent>.
- The “Onload” event occurs when certain display elements are loaded. This event type is valid only when the trigger attribute is set to “client”.
- The “Unload” event occurs when certain display elements are unloaded. This event type is valid only when the trigger attribute is set to “client”.
- the MML Events Mechanism of one embodiment is an extension of the conventional XML Event Mechanism. As shown in FIG. 5 , there are two phases in the conventional event handling: “capture” and “bubbling” (See XML Event Conformance).
- FIG. 6 shows the MML Simple Events Mechanism of one embodiment. In the Simple Event Mechanism, neither a “capture” nor a “bubbling” phase is needed. In the MML Event Mechanism of one embodiment, the observer node must be the parent of the event handler <mml:onevent>. The event triggered by one node is handled only by the child <mml:onevent> event handler node. Other <mml:onevent> nodes will not intercept the event. Further, the phase attribute of <mml:onevent> is ignored.
- FIGS. 5 and 6 illustrate the two event mechanisms.
- the dotted line parent node (node 510 in FIG. 5 and node 610 in FIG. 6 ) will intercept the event.
- The child node <mml:onevent> (node 520 in FIG. 5 and node 620 in FIG. 6 ) of the dotted line node will have the chance to handle the specific events.
- MML Events in the embodiment have the unified event interface with the host language (XHTML) but are independent from traditional events of the host language.
- Page developers can write events in the MML web page by adding an <mml:onevent> tag as the child node of an observer node or a target node.
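The simple event mechanism, in which only a direct child <mml:onevent> of the observer node handles an event, can be sketched as below. The namespace URI, the sample markup, and the function name `handlers_for` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

MML_NS = "urn:example:mml"  # hypothetical namespace URI for the mml: prefix

# Hypothetical fragment: the sform has its own handler (h1); the nested
# input has a separate handler (h2) that must not see the sform's events.
FRAGMENT = """
<mml:sform xmlns:mml="urn:example:mml" id="sf1" bind="f1">
  <mml:onevent id="h1" type="match" trigger="server"/>
  <mml:input id="in1" bind="city">
    <mml:onevent id="h2" type="match" trigger="server"/>
  </mml:input>
</mml:sform>
"""


def handlers_for(observer, event_type):
    """Return the direct <mml:onevent> children of observer for event_type.

    Iterating an Element yields direct children only, so there is no
    capture or bubbling: handlers on other nodes never intercept the event.
    """
    tag = "{%s}onevent" % MML_NS
    return [child for child in observer
            if child.tag == tag and child.get("type") == event_type]


sform = ET.fromstring(FRAGMENT.strip())
```

Here a “match” event observed on the sform reaches only handler h1, even though h2 is registered for the same event type deeper in the tree.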
- FIG. 7 illustrates the fundamental flow chart of the processing of system messages & MML events used in one embodiment of the present invention. This processing can be partitioned into the following segments with the listed steps performed for each segment as shown in FIG. 7 .
- Step 1 Session message is sent from client to server
- Step 2 Ready message is sent from server to client
- Step 3 Transmission message (transmission parameters) is sent from client to server
- Step 4 Transmission message (transmission parameters) is sent from server to client
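The four connection steps can be sketched as a small client-side state machine. This is an illustration under assumptions: the `Handshake` class, its state names, and the `make_message` helper are hypothetical, and the layout of the Session Message (an empty message of type "session") is assumed here rather than taken from the text.

```python
import xml.etree.ElementTree as ET


def make_message(msg_type, **fields):
    """Build a <message type="..."> element with one child per field."""
    msg = ET.Element("message", {"type": msg_type})
    for name, value in fields.items():
        ET.SubElement(msg, name).text = str(value)
    return ET.tostring(msg, encoding="unicode")


class Handshake:
    """Tracks Steps 1-4 of connection setup from the client side."""

    def __init__(self):
        self.state = "idle"
        self.session_id = None
        self.params = {}

    def start(self):
        # Step 1: the client sends the Session message.
        self.state = "awaiting_ready"
        return make_message("session")

    def on_server_message(self, xml_text):
        msg = ET.fromstring(xml_text)
        kind = msg.get("type")
        if kind == "error":
            self.state = "failed"
            return None
        if self.state == "awaiting_ready" and kind == "ready":
            # Step 2 received; Step 3: send the client Transmission message.
            self.session_id = msg.findtext("session")
            self.state = "awaiting_transmission"
            return make_message("transmission", session=self.session_id)
        if self.state == "awaiting_transmission" and kind == "transmission":
            # Step 4: adopt the transmission parameters chosen by the server.
            self.params = {c.tag: c.text for c in msg if c.tag != "session"}
            self.state = "connected"
            return None
        raise ValueError("unexpected %r message in state %r" % (kind, self.state))
```

An Error Message at any point moves the sketch to a failed state, matching the text's statement that a Ready or Error message is expected after the Session Message.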
- Step 1 Feature Flow with OnFocus message is sent from the client to the server
- Step 2 Several cases will happen:
- the event will be sent to client.
- the nomatch event with event handling information will be sent to client.
- If the implementation does not include optional event handling in the MML web page, the nomatch event with empty information will be sent to the client.
- FIG. 8 shows the details of the MML element blocks used in one embodiment of the invention.
- MML elements use the namespace identified by the “mml:” prefix.
- the function is used to divide the whole document into some cards or pages (segments).
- The client device will display one card at a time. This is optimized for small display devices and wireless transmission. Multiple card elements may appear in a single document. Each card element represents an individual presentation or interaction with the user.
- The <mml:card> element is the one and only element of MML that relates to content presentation and document structure.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- the optional title attribute specifies the string that would be displayed on the title bar of the user agent when the associated card is loaded and displayed.
- the optional style attribute specifies the XHTML inline style.
- the effect scope of the style is the whole card. But this may be overridden by some child XHTML elements, which could define their own inline style.
- The <mml:speech> element is the container of all speech-relevant elements.
- The child elements of <mml:speech> can be <mml:recog> and/or <mml:prompt>.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- The <mml:recog> element is the container of speech recognition elements.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- The <group> Element (table columns: Elements, Attributes, Minimal Content Model): mml:group; id(ID), mode(speech
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- the optional mode attribute specifies the speech recognition modes. Two modes are supported:
- the optional accuracy attribute specifies the lowest accuracy of speech recognition that the page developers will accept. Following styles are supported:
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- The required value attribute specifies which part of the grammar the <mml:link> element corresponds to.
- The required bind attribute specifies which XHTML hyperlink (such as <a>) the element is bound to.
- The <mml:sform> element functions as the speech input form. It should be bound to the XHTML <form> element.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- the optional mode attribute specifies the speech recognition modes. Two modes are supported:
- the optional accuracy attribute specifies the lowest accuracy of speech recognition that the page developers will accept. Following styles are supported:
- The <mml:input> element functions as the speech input data placeholder. It should be bound to an XHTML <input>.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- The optional value attribute specifies which part of the speech recognition result should be assigned to the bound XHTML <input> tag. If this attribute is not set, the whole speech recognition result will be assigned to the bound XHTML <input> tag.
- The required bind attribute specifies which XHTML <input> in the <form> the element is bound to.
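The value and bind attributes of <mml:input> can be illustrated with a short sketch. It assumes a hypothetical representation in which the recognition result is available both as a whole string and as named parts; the function name `fill_inputs` and the sample field names are also assumptions.

```python
def fill_inputs(whole_result, parts, mml_inputs):
    """Assign recognition output to the XHTML inputs named by bind.

    whole_result: the complete recognition result as text
    parts:        named parts of the result, e.g. {"origin": "Boston"}
    mml_inputs:   attribute dicts of <mml:input> elements; "value" is optional
    """
    values = {}
    for mml_input in mml_inputs:
        part_name = mml_input.get("value")
        # With value set, assign only that part; otherwise the whole result.
        values[mml_input["bind"]] = parts[part_name] if part_name else whole_result
    return values
```

For a flight-style form, two inputs could each take one named part of the utterance while a third, without a value attribute, receives the full recognized text.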
- The <mml:grammar> element specifies the grammar for speech recognition.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- The optional src attribute specifies the URL of the grammar document. If this attribute is not set, the grammar content should be in the content of <mml:grammar>.
- The <prompt> Element (table columns: Elements, Attributes, Minimal Content Model): mml:prompt; id(ID), type(text (PCDATA
- The <mml:prompt> element specifies the prompt message.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- the optional type attribute specifies the prompt type. Three types are supported now in one embodiment:
- The client-side user agent may override this “type” attribute from “tts” to “text”.
- The optional src attribute specifies the URL of the prompt output document. If this attribute is not set, the prompt content should be in the content of <mml:prompt>.
- The optional loop attribute specifies how many times the speech output should be activated. Two modes are supported in one embodiment:
- the optional interval attribute specifies the spacing time between two rounds of the speech output. It needs to be set only when the loop attribute is set to “loop”.
- The <onevent> Element (table columns: Elements, Attributes, Minimal Content Model): mml:onevent; id(ID), type(match
- The <mml:onevent> element is used to intercept certain events.
- The type attribute indicates the name of the event.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- the required type attribute specifies the event type that would be handled. Following event types are supported in one embodiment:
- The required trigger attribute specifies whether the event is expected to occur at the client side or the server side.
- the event should occur at the client.
- the event is desired to occur at server.
- the optional phase attribute specifies when the ⁇ mml:onevent> will be activated by the desired event. If user agent (including client and server) supports MML Simple Content Events Conformance, this attribute should be ignored.
- the optional propagate attribute specifies whether the intercepted event should continue propagating (XML Events Conformance). If the user agent (including client and server) supports MML Simple Content Events Conformance, this attribute should be ignored.
- the intercepted event will continue propagating.
- the intercepted event will stop propagating.
- the optional defaultaction attribute specifies whether the default action for the event (if any) should be performed after the event has been handled by <mml:onevent>.
- the default action of a “match” event on an <mml:sform> is to submit the form.
- the default action of a “nomatch” event on an <mml:sform> is to reset the corresponding <form> and give a “nomatch” message.
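The interplay of the propagate and defaultaction attributes can be sketched in Python as below. This is a simplified illustration, not the patent's implementation: the handler dictionaries, the boolean encoding of the two attributes, their defaults, and the `dispatch` function are all assumptions, and the phase attribute is ignored. The handler list is assumed to be ordered along the propagation path.

```python
# Default actions named in the text for events on an <mml:sform>.
DEFAULT_ACTIONS = {
    "match": "submit the form",
    "nomatch": "reset the form and give a nomatch message",
}


def dispatch(event_type, handlers):
    """Run the matching handlers in order and return a log of what happened.

    Each handler is a dict with the keys "id" and "type" and, optionally,
    the booleans "propagate" and "defaultaction" (hypothetical encoding,
    both assumed to default to True).
    """
    log = []
    run_default = True
    for handler in handlers:
        if handler["type"] != event_type:
            continue
        log.append("handled by " + handler["id"])
        if not handler.get("defaultaction", True):
            run_default = False  # suppress the default action afterwards
        if not handler.get("propagate", True):
            break  # the intercepted event stops propagating
    if run_default and event_type in DEFAULT_ACTIONS:
        log.append("default action: " + DEFAULT_ACTIONS[event_type])
    return log
```

With a first handler that sets both attributes to false, a “match” event is handled once, never reaches later handlers, and the form is not submitted.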
- the <mml:do> element is always a child element of an <mml:onevent> element.
- when the <mml:onevent> element intercepts a desired event, it invokes the behavior specified by the contained <mml:do> element.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- the optional target attribute specifies the id of the target element that will be invoked.
- the optional href attribute specifies the URL or script of the associated behavior. If the target attribute is set, this attribute will be ignored.
- the optional action attribute specifies the action type that will be invoked on the target or URL.
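The precedence between target and href on <mml:do> can be expressed as a small Python sketch. The `resolve_do` function and the tuple it returns are illustrative assumptions; only the rule that href is ignored when target is set comes from the text.

```python
def resolve_do(attrs):
    """Decide what an <mml:do> element invokes.

    attrs is a dict of the element's attributes. Returns a tuple
    (kind, reference, action), where kind is "target", "href" or None.
    """
    action = attrs.get("action")
    if "target" in attrs:
        # target wins: href is ignored whenever target is set.
        return ("target", attrs["target"], action)
    if "href" in attrs:
        return ("href", attrs["href"], action)
    return (None, None, action)
```

For example, an element carrying both attributes resolves to its target, and the href value is never consulted.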
- the <mml:getvalue> element is a child element of <mml:prompt>. It is used to get the content from a <form> or <sform> data placeholder.
- the optional id attribute specifies the unique identifier of the element in the scope of the whole document.
- the required from attribute specifies the identifier of the data placeholder.
- the required at attribute specifies whether the value to be retrieved resides at the client or at the server:
- the <mml:getvalue> gets the client-side element value.
- the from attribute should be set to a data placeholder of a <form>.
- the <mml:getvalue> gets the server-side element value.
- the from attribute should be set to a data placeholder of an <sform>.
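A minimal Python sketch of the <mml:getvalue> lookup follows. It assumes, for illustration only, that the at attribute takes the literal values "client" and "server" and that the placeholders are held in plain dictionaries; the `get_value` function is hypothetical.

```python
def get_value(from_id, at, form_data, sform_data):
    """Look up the value that an <mml:getvalue> element refers to.

    form_data  -- client-side <form> data placeholders, keyed by id
    sform_data -- server-side <sform> data placeholders, keyed by id
    """
    if at == "client":
        source = form_data   # from must name a <form> placeholder
    elif at == "server":
        source = sform_data  # from must name an <sform> placeholder
    else:
        raise ValueError("at must be 'client' or 'server', got %r" % (at,))
    if from_id not in source:
        raise KeyError("no data placeholder %r at the %s" % (from_id, at))
    return source[from_id]
```

A prompt such as "You selected <mml:getvalue .../>" would substitute the returned value into its text before being displayed or spoken.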
- the system of the present invention supports multi-modal web interaction. Because the main speech recognition processing job is handled by the server, the multi-modal web page will be interpreted at both the client and the server side. The following is an example of the simple flow of client and server interaction using an embodiment of the present invention.
- <User> The user selects a hyperlink or submits a form (traditional web interaction), or presses the “Talk Button” and speaks an utterance (speech interaction).
- <Client> In the case of traditional web interaction, the client transmits a request to the server for a new page or submits the form. In the case of speech interaction, the client determines which active display element is to be focused and the ID of the focused display element, captures speech, extracts speech features, and transmits the ID of the focused display element, the extracted speech features, and other information such as the URL of the current page to the server. Then, the client waits for a response.
- <Server> In the case of traditional web interaction, the server retrieves the new page from a cache or web server and sends it to the client. In the case of speech interaction, the server receives the ID of the focused display element and builds the correct grammar. Then, speech recognition is performed. According to the result of speech recognition, the server does specific jobs and sends events or new pages to the client. Then, the server waits for a new request from the client.
- <Client> The client loads the new page or handles the events.
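The speech branch of this flow can be sketched in Python as below. Everything here is a simplified assumption made for illustration: the request dictionary layout, the toy `extract_features` function (mean absolute amplitude per frame), and the `recognize` callback stand in for the real feature extraction and recognition engine, which the text does not specify at this level.

```python
def extract_features(samples, frame_size=2):
    """Toy client-side feature extractor: mean absolute amplitude per frame."""
    frames = [samples[i:i + frame_size] for i in range(0, len(samples), frame_size)]
    return [sum(abs(s) for s in frame) / len(frame) for frame in frames]


def client_speech_request(focused_id, samples, page_url):
    """Client: bundle the focused element id, speech features and page URL."""
    return {
        "element_id": focused_id,
        "features": extract_features(samples),
        "url": page_url,
    }


def server_handle(request, grammars, recognize):
    """Server: pick the grammar for the focused element and run recognition.

    grammars  -- hypothetical mapping from display element id to its grammar
    recognize -- hypothetical callback (features, grammar) -> result or None
    """
    grammar = grammars[request["element_id"]]
    result = recognize(request["features"], grammar)
    if result is None:
        return {"event": "nomatch"}
    return {"event": "match", "value": result}
```

Sending only the features and the focused element id, rather than raw audio and the whole page, keeps the wireless payload small and lets the server restrict recognition to the grammar of one display element.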
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Information Transfer Between Computers (AREA)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/976,320 US8566103B2 (en) | 2002-11-13 | 2010-12-22 | Multi-modal web interaction over wireless network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2002/000807 WO2004045154A1 (en) | 2002-11-13 | 2002-11-13 | Multi-modal web interaction over wireless network |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/976,320 Continuation US8566103B2 (en) | 2002-11-13 | 2010-12-22 | Multi-modal web interaction over wireless network |
Publications (1)
Publication Number | Publication Date |
---|---|
US20060100881A1 (en) | 2006-05-11 |
Family
ID=32304059
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/534,661 Abandoned US20060100881A1 (en) | 2002-11-13 | 2002-11-13 | Multi-modal web interaction over wireless network |
US12/976,320 Expired - Fee Related US8566103B2 (en) | 2002-11-13 | 2010-12-22 | Multi-modal web interaction over wireless network |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US12/976,320 Expired - Fee Related US8566103B2 (en) | 2002-11-13 | 2010-12-22 | Multi-modal web interaction over wireless network |
Country Status (5)
Country | Link |
---|---|
US (2) | US20060100881A1 (de) |
EP (1) | EP1576769A4 (de) |
CN (1) | CN100477627C (de) |
AU (1) | AU2002347129A1 (de) |
WO (1) | WO2004045154A1 (de) |
2002
Date | Office | Application | Publication | Status |
---|---|---|---|---|
2002-11-13 | US | US10/534,661 | US20060100881A1 (en) | Not active: Abandoned |
2002-11-13 | EP | EP02782634A | EP1576769A4 (de) | Not active: Withdrawn |
2002-11-13 | WO | PCT/CN2002/000807 | WO2004045154A1 (en) | Not active: Application Discontinuation |
2002-11-13 | CN | CNB028298853A | CN100477627C (zh) | Not active: Expired - Fee Related |
2002-11-13 | AU | AU2002347129A | AU2002347129A1 (en) | Not active: Abandoned |
2010
Date | Office | Application | Publication | Status |
---|---|---|---|---|
2010-12-22 | US | US12/976,320 | US8566103B2 (en) | Not active: Expired - Fee Related |
Also Published As
Publication number | Publication date |
---|---|
AU2002347129A1 (en) | 2004-06-03 |
US8566103B2 (en) | 2013-10-22 |
EP1576769A4 (de) | 2011-08-31 |
CN100477627C (zh) | 2009-04-08 |
EP1576769A1 (de) | 2005-09-21 |
AU2002347129A8 (en) | 2004-06-03 |
US20110202342A1 (en) | 2011-08-18 |
CN1701568A (zh) | 2005-11-23 |
WO2004045154A1 (en) | 2004-05-27 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | AS | Assignment | Owner name: INTEL CORPORATION, CALIFORNIA. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: HE, LIANG; REEL/FRAME: 017604/0254. Effective date: 20051107 |
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |