WO2020108385A1 - Voice interaction method and user equipment - Google Patents

Voice interaction method and user equipment

Info

Publication number
WO2020108385A1
Authority
WO
WIPO (PCT)
Prior art keywords
application
user
execution parameter
execution
voice
Application number
PCT/CN2019/120046
Other languages
English (en)
French (fr)
Inventor
贺真
郑勇
陶俊
黄茂胜
王涛
钟鼎
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司
Publication of WO2020108385A1

Classifications

    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/0484 Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
    • G06F3/04847 Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
    • G06F9/451 Execution arrangements for user interfaces
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • This application relates to the field of electronics and communication technology, especially to the field of human-computer interaction technology.
  • voice has gradually become an important way for users to interact with user equipment.
  • Such user equipment is generally a wearable device, a vehicle-mounted terminal, or a smartphone. Users want these devices to complete operations, such as playing a song or searching a map, by issuing voice commands. At present, however, when a user device receives a voice command, it often needs to confirm with the user multiple times what operation is desired; the process is generally complicated and time-consuming.
  • the embodiments of the present application provide a voice interaction method and a user equipment, which to a certain extent solve the problem that user equipment responds slowly when receiving a voice command.
  • the present application provides a voice interaction method, including the following steps:
  • S101: An input component detects that a first application is selected;
  • S102: Receive an execution parameter of the first application through a voice receiving circuit, where the execution parameter of the first application is input in the form of voice;
  • S103: A processor determines, according to the execution parameter and among the operations supported by the first application, an operation associated with the execution parameter.
  • Because the voice input by the user is only the execution parameter of the first application, the input is relatively concise, which shortens the time for inputting and recognizing the voice and increases the response speed of the user device.
  • Even though the voice input by the user is very concise, the operations supported by an application are limited, so an operation matching the execution parameter can still be found. The user's intention can thus be judged in a relatively short time, and the operation the user desires can be performed.
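A minimal sketch of steps S101-S103, assuming a simple keyword match between the spoken execution parameter and the operations an application supports; all names and data here are illustrative assumptions, not part of the patent:

```python
# Illustrative sketch of steps S101-S103 (names and data are assumptions).

def choose_operation(supported_ops, execution_parameter):
    """S103: among the operations supported by the selected application,
    pick the one associated with the spoken execution parameter."""
    for op_name, keywords in supported_ops.items():
        if execution_parameter in keywords:
            return op_name
    return None

# S101: the input component reports that a music application was selected.
selected_app = "music"

# Operations the application supports, keyed by the parameters they accept.
supported = {
    "play_artist": {"Wang Fei"},   # e.g. play songs by an artist
    "play_mood":   {"sad song"},   # e.g. play a category of songs
}

# S102: the voice receiving/recognition circuits yield only a short parameter.
execution_parameter = "Wang Fei"

operation = choose_operation(supported, execution_parameter)
print(operation)  # -> play_artist
```

Because the set of supported operations is small, even a one- or two-word parameter is usually enough to disambiguate the intended operation.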
  • In step S101, the user equipment has multiple applications, and the first application is any one of them.
  • the step S101 may specifically include: the input component detects that the user selects the first application among various applications.
  • the execution parameter of the first application is a parameter used by the first application during execution.
  • Step S101 may be performed first and then step S102; or step S102 first and then step S101; or steps S101 and S102 may be performed simultaneously.
  • The voice interaction method further includes starting the first application: the first application may be started immediately after the instruction selecting it is detected, or started while step S102 or S103 is performed.
  • Starting the first application takes a certain time, so starting it immediately after the selection instruction is detected, or while step S102 or S103 is performed, allows the operation associated with the content in the voice to be executed immediately once it is determined, which improves the response speed of the user equipment.
  • the selection of the first application means that it is desired to start the first application.
  • the first application is a tool included in the user equipment, so that the user equipment has a purpose or function.
  • the input component includes one or more sensors.
  • the voice receiving circuit may be a microphone or a biosensor that detects voiceprints.
  • the operation is to perform a task.
  • the operation is to provide a single function.
  • the execution parameter may be a parameter during execution of various operations supported by the application.
  • the execution parameter is used to indicate the object involved in the operation or indicate the type of the operation.
  • In the case where the execution parameter is used to indicate an object involved in the operation, the execution parameter is a time, a place name, a person's name, a group name, a phone number the user wants to dial, a web page the user wants to open, or a message the user wants to send; alternatively, the execution parameter is a word indicating a certain kind of thing.
  • the number of execution parameters is less than or equal to three.
  • the time for the user to input voice can be reduced, and the time for the user equipment to determine which operation to perform according to the execution parameters can also be reduced.
  • In step S103, determining, according to the execution parameter, the operation associated with the execution parameter among the operations supported by the first application includes:
  • using a user intention recognition engine to determine, among the operations associated with the execution parameter, the one most commonly used by the user, and configuring the execution parameter to that operation.
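One way to realize such an engine, sketched here under the assumption that usage history is simply a list of past operation names, is to pick the candidate operation the user has invoked most often:

```python
# Hedged sketch of a "most commonly used operation" selector. The history
# layout (a flat list of operation names) is an assumption for illustration.
from collections import Counter

def most_common_operation(candidate_ops, usage_history):
    """Among the operations associated with the execution parameter,
    return the one the user has used most often."""
    counts = Counter(op for op in usage_history if op in candidate_ops)
    if not counts:
        # No history yet: fall back to the first candidate, if any.
        return candidate_ops[0] if candidate_ops else None
    return counts.most_common(1)[0][0]

history = ["play_artist", "play_mood", "play_artist", "navigate"]
print(most_common_operation(["play_artist", "play_mood"], history))  # -> play_artist
```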
  • the present application also provides a user equipment, which includes:
  • a voice receiving circuit, configured to receive voice input;
  • a voice recognition circuit, configured to recognize the execution parameter of the first application from the received voice;
  • a processor, configured to determine, according to the execution parameter, the operation associated with the execution parameter among the operations supported by the first application;
  • a component associated with the first application, configured to perform the operation associated with the execution parameter.
  • the voice recognition circuit may be integrated with the processor.
  • FIG. 1 is a schematic structural diagram of user equipment in an embodiment of the present application
  • FIG. 2 is a schematic flowchart of a voice interaction method in an embodiment of this application.
  • FIG. 3 is a diagram showing the content displayed on the display.
  • FIG. 4 is a diagram showing the content displayed on the display.
  • FIG. 5 is a diagram showing the content displayed on the display.
  • FIG. 6 is a diagram showing the content displayed on the display.
  • FIG. 7 is a diagram showing the content displayed on the display.
  • FIG. 8 is a diagram showing the content displayed on the display.
  • FIG. 9 is a diagram showing the content displayed on the display.
  • the embodiments of the present application can be applied in various scenarios of voice interaction between a person and a user device.
  • the user device may be a wearable device, a vehicle-mounted terminal, a personal mobile terminal, a personal computer, a multimedia player, an electronic reader, a smart home device, or a robot, etc.
  • the personal mobile terminal may also be a smart phone, a tablet computer or the like.
  • the wearable device may also be a smart bracelet, or a smart medical device, or a head-mounted terminal.
  • the head-mounted terminal device may be a virtual reality terminal or an augmented reality terminal, such as Google Glass.
  • the intelligent medical device may be an intelligent blood pressure measuring device or an intelligent blood glucose measuring device.
  • the smart home device may be a smart access control system or the like.
  • the robot may be various other electronic devices that provide services to humans according to human instructions.
  • FIG. 1 shows a user equipment, which can receive a user's voice instruction and complete a corresponding operation according to that instruction.
  • the components shown in FIG. 1 are not all necessary for the user equipment and can be adjusted according to the functions the user equipment supports: if the user equipment needs to support more functions, more components are installed; if it supports few functions and some components shown in FIG. 1 are unrelated to those functions, those components may be omitted.
  • some components in FIG. 1 may be combined, for example, some modules in the communication module 1020 may be combined with the processor 1010 into one component. Some components in FIG. 1 can be provided separately.
  • the hologram device 1064 in the display 1060 can be provided independently of the display 1060.
  • the user equipment 1001 shown in FIG. 1 includes a communication module 1020, a user identification module 1024, a memory 1030, a sensor module 1040, an input device 1050, a display 1060, an interface 1070, an audio module 1080, a camera module 1091, a power management module 1095, a battery 1096, an indicator 1097, a motor 1098, and a processor 1010.
  • the functions of the processor 1010 are generally divided into three aspects.
  • the first aspect is running an operating system;
  • the second aspect is processing various data, for example, processing various data received from the communication module 1020 or the input device 1050, sending the processed data through the communication module 1020, or displaying it on the display.
  • the third aspect is to run application programs and control multiple hardware connected to the processor 1010 to complete corresponding functions. For example, by controlling the camera module 1091, the user is provided with a photographing function.
  • the processor 1010 may have one or more functions among the above three aspects, and may be split into one or more processors according to different functions, for example, a graphics processing unit (GPU), an image signal processor (ISP), a central processing unit (CPU), an application processor (AP), or a communication processor (CP), etc.
  • the split processor with independent functions may be set on other associated modules, for example, a communication processor (CP) may be set together with the cellular module 1021.
  • the processor 1010 may be composed of one or more IC chips.
  • the processor may be an integrated circuit working according to a non-solidified instruction or an integrated circuit working according to a solidified instruction.
  • a processor working according to non-solidified instructions realizes the functions carried on the processor by reading and executing instructions in the internal memory 1032.
  • the processor working according to solidified instructions implements the functions it carries by running its own hardware logic circuit; when running that circuit it often needs to access the internal memory 1032 to read data or to output operation results to the internal memory 1032.
  • the memory 1030 includes the internal memory 1032, and may further include an external memory 1034.
  • the internal memory 1032 may include one or more of the following: volatile memory (for example, dynamic random access memory (DRAM), static random access memory (SRAM), or synchronous dynamic random access memory (SDRAM)), or non-volatile memory (for example, one-time programmable read-only memory (OTPROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), mask read-only memory, flash read-only memory, flash memory (for example, NAND flash or NOR flash), a hard disk drive, or a solid state drive (SSD)).
  • the external memory 1034 may include a flash drive such as: a Compact Flash (CF) card, a Secure Digital (SD) card, a Micro SD card, a Mini SD card, an extreme digital (xD) card, a MultiMediaCard (MMC), or a memory stick, etc.
  • the communication module 1020 may include a cellular module 1021, a wireless fidelity (Wi-Fi) module 1023, a Bluetooth (BT) module 1025, a Global Positioning System (GPS) module 1027, a near field communication (NFC) module 1028, and a radio frequency (RF) module 1029.
  • the cellular module 1021 may provide, for example, a voice call service, a video call service, a text message service, or an Internet service through a communication network.
  • the radio frequency module 1029 is used to send/receive communication signals (for example, RF signals).
  • the radio frequency module 1029 may include a transceiver, a power amplifier module (PAM), a frequency filter, a low noise amplifier (LNA), or an antenna, etc.
  • the user identification module 1024 is used to store unique identification information (for example, an integrated circuit card identifier (ICCID)) or user information (for example, an international mobile subscriber identity (IMSI)).
  • the user identification module 1024 may include an embedded SIM (Subscriber Identity Module) card and the like.
  • the sensor module 1040 is used to detect the state of the user equipment 1001 and/or measure physical quantities.
  • the sensor module 1040 may include one or more of: a gesture sensor 1040A, a gyro sensor 1040B, an atmospheric pressure sensor 1040C, a magnetic sensor 1040D, an acceleration sensor 1040E, a grip sensor 1040F, a proximity sensor 1040G, a color sensor 1040H (for example, a red/green/blue (RGB) sensor), a biosensor 1040I, a temperature/humidity sensor 1040J, an illuminance sensor 1040K, an ultraviolet (UV) sensor 1040M, an olfactory sensor (electronic nose sensor), an electromyography (EMG) sensor, an electroencephalogram (EEG) sensor, an electrocardiogram (ECG) sensor, an infrared (IR) sensor, an iris recognition sensor, and a fingerprint sensor.
  • the input device 1050 may include one or more of a touch panel 1052, a (digital) pen sensor 1054, a key 1056, and an ultrasonic input device 1058.
  • the (digital) pen sensor 1054 may be provided independently, or as part of the touch panel 1052.
  • the key 1056 may include one or more of physical buttons, optical buttons, and a keyboard.
  • the ultrasonic input device 1058 is used to sense, through the microphone 1088, ultrasonic waves generated by an input tool.
  • the display 1060 (or may also be referred to as a screen) is used to present various contents (eg, text, images, videos, icons, symbols, or the like) to the user.
  • the display 1060 may include a panel 1062 or a touch screen, and the panel 1062 may be rigid, flexible, or transparent, or wearable.
  • the display 1060 may further include a hologram device 1064 or a projector 1066, and may be further used to receive touch, gesture, proximity, or hovering indication signals input from an electronic pen or a part of a user's body.
  • the panel 1062 and the touch panel 1052 can be integrated together.
  • the hologram device 1064 is used to display a stereoscopic image in space using the phenomenon of light interference.
  • the projector 1066 is used to project light onto the display 1060 to display an image.
  • the interface 1070 may include a high definition multimedia interface (HDMI) 1072, a universal serial bus (USB) 1074, an optical interface 1076, a D-subminiature (D-sub) interface 1078, a mobile high-definition link (MHL) interface, an SD card/multimedia card (MMC) interface, or an Infrared Data Association (IrDA) interface, etc.
  • the audio module 1080 is used to convert sound into electrical signals or electrical signals into sound.
  • the audio module 1080 may process sound information input or output through the speaker 1082, the receiver 1084, the earphone 1086, or the microphone 1088.
  • the camera module 1091 is used to capture still images or moving images.
  • the power management module 1095 is used to manage power supply of other modules in the user equipment 1001.
  • the indicator 1097 is used to display the state in which the user equipment 1001 is in or the state in which various components in the user equipment 1001 are in, for example, a startup state, a message state, or a charging state.
  • the motor 1098 is used to drive one or more components in the user equipment 1001 to perform mechanical movement.
  • an embodiment of the present application provides a voice interaction method, including the following steps:
  • S101: The input component detects that the first application is selected;
  • S102: Receive a voice input through a voice receiving circuit, and recognize an execution parameter of the first application from the received voice through a voice recognition circuit;
  • S103: The processor determines, according to the execution parameter and among the operations supported by the first application, an operation associated with the execution parameter.
  • Because the voice input by the user is only an execution parameter of the first application and a complete instruction need not be input, the voice input is relatively concise, which shortens the time for inputting and recognizing the voice and improves the response speed of the user device.
  • Even though the voice input by the user is very concise, the operations supported by an application are limited, so an operation matching the execution parameter can still be found. The user's intention can thus be judged in a relatively short time, and the operation the user desires can be performed.
  • the user equipment may have multiple applications, and the first application may be any one of the multiple applications.
  • the step S101 may specifically include: the input component detects that the user selects the first application among various applications.
  • the execution parameter of the first application is not a complete instruction, but a parameter utilized by the first application during execution.
  • For example, the first application is for playing songs, and the input voice is "Wang Fei".
  • the order of step S101 and step S102 may be flexible; for example, step S101 may be executed first and then step S102, or step S102 first and then step S101, or steps S101 and S102 may be executed simultaneously.
  • the voice interaction method may further include starting the first application. Because starting the first application takes a certain time, the first application may be started immediately after the instruction selecting it is detected, or started while step S102 or S103 is performed, so that once the operation associated with the content in the voice is determined it can be performed immediately, which improves the response speed of the user equipment.
  • the selection of the first application means that it is desired to start the first application.
  • the first application may be a tool possessed by the user equipment, so that the user equipment has a purpose or function, such as: WeChat, taking pictures, or playing songs, and so on.
  • the user equipment may have multiple applications, that is, multiple tools, so that the user equipment may have multiple uses.
  • the tool in the user equipment may be in the form of hardware, software, or a combination of software and hardware.
  • the so-called software is a piece of program executed by the processor, and the user equipment can realize a purpose by executing a piece of program (for example, an application (APP)).
  • Some applications of the user equipment require hardware to complete.
  • the tool in the form of hardware refers to any one or more components in the user equipment.
  • the tool in the form of software is executed by a processor in the user equipment.
  • the one or more components in the user equipment execute a section of program instructions, or process some electrical or optical signals, or perform some mechanical actions based on the electrical or optical signals, etc., so as to achieve the purpose of the application.
  • the input component may include one or more sensors, and the sensors may be various sensors in the sensor module 1040 in FIG. 1, for example: a gesture sensor 1040A, a gyro sensor 1040B, an acceleration sensor 1040E, a grip sensor 1040F, a proximity sensor 1040G, an infrared sensor, a structured light device, a camera, or a biosensor 1040I.
  • the biosensor 1040I may be various sensors that detect human biological information, such as a sensor that detects an iris, a sensor that detects a fingerprint, or a sensor that detects a voiceprint.
  • the sensor may also be the microphone 1088 shown in FIG. 1.
  • the sensor may also be a touch sensing unit or a pressure sensing unit.
  • the touch sensing unit may be independently set or disposed in the touch panel 1052, and the pressure sensing unit may be independently set or disposed in the touch panel 1052.
  • the touch sensing unit is used to detect a contact signal of a contact, and the contact may be a user's finger or a gel pen head or other tools.
  • the pressure sensing unit is used to detect the pressure applied by the user through his own finger or other tools.
  • step S101 can be implemented in the following ways:
  • Manner 1: The touch sensing unit in the touch panel 1052 detects that the icon corresponding to the first application displayed on the touch panel 1052 is clicked, or is touched for more than a preset duration.
  • In this manner, the input component includes the touch sensing unit. The preset duration is determined according to user habits, i.e., how long the user clicks or touches the first application's icon when wishing to start it.
  • Manner 2: The pressure sensing unit in the touch panel 1052 detects that the icon of the first application displayed on the touch panel 1052 is pressed with more than a preset pressure.
  • In this manner, the input component includes the pressure sensing unit.
  • The preset pressure is determined according to user habits, i.e., the pressure with which the user presses the first application's icon when wishing to start it.
  • Manner 3: A structured light device, an infrared sensor, a camera, or the biosensor 1040I detects that the icon of the first application on the display 1060 is gazed at for more than a preset duration.
  • the input component includes the structured light device or the infrared sensor or camera or the biosensor 1040I.
  • the input component may include various devices for detecting eyeball gaze, for accurately capturing the eyeball gaze position.
  • the preset duration is determined according to user habits, i.e., how long the user gazes at the first application's icon when wishing to start it.
  • Manner 4: The gesture sensor 1040A detects that the user makes a trigger gesture toward the icon of the first application displayed on the display 1060.
  • the input component includes the gesture sensor 1040A.
  • Manner 5: The microphone 1088 detects that the user utters a voice corresponding to the first application.
  • the input component includes the microphone 1088, and the voice corresponding to the first application may be the voice of the name of the first application or the voice of the code name of the first application, or the like.
  • Manner 6: The gesture sensor 1040A detects that the user makes a gesture corresponding to the first application.
  • the input component includes the gesture sensor 1040A.
  • Manner 7: The touch sensing unit corresponding to the first application detects that it is touched.
  • In this manner, the input component includes the touch sensing unit; the user equipment may provide a plurality of touch sensing units in one-to-one correspondence with multiple applications, and the first application is one of those applications.
  • Manner 8: The biosensor 1040I corresponding to the first application detects that it is gazed at for more than a preset duration.
  • the input component may include the biosensor 1040I, and the biosensor 1040I may be a sensor that detects an iris.
  • the user equipment may set a plurality of the biosensors 1040I to correspond to a plurality of applications, and the first application is one of the plurality of applications.
  • the biosensor 1040I can be replaced with a structured light device, an infrared sensor, a camera, or other types of devices that detect gaze of the eyeball to accurately capture the gaze position of the eyeball.
  • Manner 9: The pressure sensing unit corresponding to the first application detects that it is pressed.
  • In this manner, the input component includes the pressure sensing unit; the user equipment may provide a plurality of pressure sensing units in one-to-one correspondence with multiple applications, and the first application is one of those applications.
  • the pressure sensing unit may be a phone button on the wearable device. When the user presses the phone button, it means that the phone call application is triggered.
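Manners 1 and 2 above can be sketched as simple threshold checks; the event layout and the threshold values below are assumptions for illustration, not values taken from the patent:

```python
# Illustrative sketch of Manners 1 and 2 (fields and thresholds are assumptions).

PRESET_DURATION_S = 0.5   # assumed touch-duration threshold, in seconds
PRESET_PRESSURE = 2.0     # assumed pressure threshold, arbitrary units

def app_selected(event):
    """Return the name of the selected application, or None.

    `event` is a dict such as {"icon": "music", "kind": "touch",
    "duration": 0.8} or {"icon": "music", "kind": "press", "pressure": 2.5}.
    """
    if event["kind"] == "touch" and event.get("duration", 0) > PRESET_DURATION_S:
        return event["icon"]  # Manner 1: touched longer than the preset duration
    if event["kind"] == "press" and event.get("pressure", 0) > PRESET_PRESSURE:
        return event["icon"]  # Manner 2: pressed harder than the preset pressure
    return None

print(app_selected({"icon": "music", "kind": "touch", "duration": 0.8}))  # -> music
```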
  • the touch panel 1052 or the display 1060 can display icons of other applications in addition to the icon of the first application, so that the user can select among the multiple applications.
  • the icon of the first application may be visually displayed on the touch panel 1052 or the display 1060.
  • when selected, the icon of the first application is displayed differently from when it is not selected.
  • the voice receiving circuit may be a microphone or a biosensor that detects voiceprints.
  • the speech recognition circuit may be integrated with the processor.
  • the first application serves as a tool of the user equipment, and it can support one or more operations.
  • the so-called operation can be understood as performing a task.
  • for example, in a messaging application you can send a message to A, send a message to B, or send photos to a circle of friends.
  • "send message to A" can be regarded as an operation;
  • "send message to B" can also be regarded as an operation;
  • "send photo to circle of friends" can also be regarded as an operation.
  • An application in the user equipment provides a relatively systematic function.
  • An operation is a single function of this systematic function.
  • "Call" is an application that performs a comprehensive, systematic function around phone calls, including: making calls, receiving calls, storing incoming call records, and storing missed calls, etc.
  • An operation refers to a single function, for example: "call Zhang San's phone".
  • the applications in the user equipment are generally divided into several categories: applications for user communication, applications for user entertainment, applications for providing life services for users, and applications for providing medical services for users.
  • An application for user communication refers to an application for users to transfer data between the user equipment and other users, operators, or service providers, such as WeChat, phone calls, and QQ.
  • Applications for user entertainment refer to applications that give users a visually or mentally pleasant experience by showing them entertainment material or by interacting with them, such as watching movies, listening to music, playing games, taking photos, and browsing photos.
  • Applications that provide users with life services refer to applications that provide users with some guidance or reminder help to facilitate users' lives, such as navigation, maps, calendars, and logs.
  • Applications that provide users with medical services refer to applications that provide users with body-index detection services, or that apply physical effects such as massage or electromagnetic pulses to the user's body to promote health, such as wearable devices and bands.
  • the execution parameter may be a parameter during execution of various operations supported by the application, and may specifically be used to indicate an object involved in the various operations, or indicate a type of operation.
  • in the case where the execution parameter indicates the object involved, the execution parameter may be a time, a place name, a person's name, a group name, a phone number the user wishes to dial, a web page the user wishes to open, or a message the user wishes to send, etc.; the execution parameter may also specifically be a word indicating a certain class of things.
  • for example, if the execution parameter is "sad songs", the operation associated with the execution parameter is playing sad songs.
  • in the case where the execution parameter indicates the type of operation, the execution parameter may be the name of an operation, for example: "callback", "navigate", or "send message".
  • the other parameters of the operation, for example: whose call to return, where to navigate to, or to whom to send the message, can be obtained from historical data stored in the memory of the user equipment.
  • the historical data may be historical data of the operation type indicated by the execution parameter, or historical data of other applications. For example, the phone number shown in the user's most recent historical record in the "Messages" application is used as the phone number for the "callback" operation.
  • the number of the execution parameters may be one or more. In order to improve the response speed, the execution parameters may not exceed three. When the number of the execution parameters is small, the time for the user to input voice can be reduced, and the time for the user equipment to determine which operation to perform according to the execution parameters can also be reduced.
  • the operation associated with the execution parameter refers to any operation that conforms to the execution parameter. There may be many operations that meet the execution parameters, but only one is selected for execution to achieve the purpose of rapid response.
  • the processor determines an operation according to the pre-configuration, and configures the execution parameter to the operation. Which operation is specifically configured for each application can be shown in Table 1.
  • the configurable operations are: displaying photos, and as to which photos are displayed, it can be determined according to the execution parameters.
  • the operation that can be configured is: navigation, and where to navigate to can be determined according to the execution parameter.
  • the configurable operations are: chat, and who to chat with can be determined according to the execution parameters.
  • the configurable operations are: search and display some videos, and which videos to search and display can be determined according to the execution parameters.
  • the configurable operations are: displaying the Weibo list, and as for whose Weibo list is displayed, it can be determined according to the execution parameters.
  • the configurable operation is: booking a ticket, and the departure time and destination of the train ticket can be determined according to the execution parameter.
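The fixed-configuration approach above can be sketched as a simple lookup table. This is a minimal illustration only: the application keys and operation identifiers below are invented stand-ins mirroring the pairs listed above, not names defined by the patent.

```python
# Hypothetical sketch of the "fixed configuration" approach: each
# application is pre-assigned exactly one default operation, and the
# spoken execution parameter is handed to that operation unchanged.

DEFAULT_OPERATION = {
    "photo_browser": "display_photos",   # which photos  -> execution parameter
    "map":           "navigate",         # destination   -> execution parameter
    "wechat":        "chat",             # contact       -> execution parameter
    "video_player":  "search_videos",    # query         -> execution parameter
    "weibo":         "show_feed",        # whose feed    -> execution parameter
    "train_tickets": "book_ticket",      # time/place    -> execution parameter
}

def resolve_operation(app: str, execution_parameter: str) -> tuple[str, str]:
    """Return (operation, parameter) for the selected app; KeyError if unknown."""
    return DEFAULT_OPERATION[app], execution_parameter
```

Because the mapping is fixed per application, no disambiguation step is needed at run time, which is what gives this variant its fast response.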
  • in the second approach, the processor determines, based on the execution parameter and the type of the application, the operation most commonly used by the user, using a user intent recognition engine, and assigns the execution parameter to that operation.
  • the user intention recognition engine may be a machine learning model built using various decision algorithms. For example, by counting the data related to the user's behavior, habits, and hobbies, and thereby predicting the operation the user wishes to perform based on the execution parameters, this can be achieved by establishing a probability/statistic/random model.
  • the probabilistic/statistical/stochastic model may be a Bayesian (e.g., naive Bayes) classifier, a decision tree (e.g., a fast decision tree), a support vector machine (SVM), a hidden Markov model (HMM), or a Gaussian mixture model (GMM), etc.
  • the probability/statistics/random model can adopt machine learning technology for online and offline learning, thereby making the probability/statistics/random model more accurate.
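As a concrete illustration of the naive Bayes option named above, the sketch below predicts the most likely operation from two coarse features. Everything here is an assumption for illustration: the feature categories ("app type", "parameter type"), the operation labels, and the training rows are invented, and a real engine would use far richer features and data.

```python
from collections import Counter, defaultdict

# Tiny naive Bayes over two features: the application's category and a
# coarse category of the execution parameter. Training rows are invented.

class TinyNaiveBayes:
    def fit(self, rows):  # rows: (app_type, param_type, operation)
        self.op_counts = Counter(op for _, _, op in rows)
        self.feat = defaultdict(Counter)
        for app, param, op in rows:
            self.feat[op][("app", app)] += 1
            self.feat[op][("param", param)] += 1
        self.total = sum(self.op_counts.values())
        return self

    def predict(self, app, param):
        def score(op):
            prior = self.op_counts[op] / self.total
            n = self.op_counts[op]
            # Laplace smoothing, applied to each feature independently.
            p_app = (self.feat[op][("app", app)] + 1) / (n + 2)
            p_param = (self.feat[op][("param", param)] + 1) / (n + 2)
            return prior * p_app * p_param
        return max(self.op_counts, key=score)

history = [
    ("communication", "person", "open_chat"),
    ("communication", "person", "open_chat"),
    ("communication", "person", "call"),
    ("life_service", "place", "navigate"),
    ("life_service", "place", "navigate"),
]
model = TinyNaiveBayes().fit(history)
```

Online learning, as mentioned above, would amount to appending each confirmed (features, operation) pair to the history and refitting the counts.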
  • The first example:
  • the "Phone" application is a tool that provides users with telephone-communication services. Specifically, it can be used to make calls, record dialed phone numbers, record missed calls, record contact information, etc.
  • the voice interaction method includes:
  • the voice service is activated, and visual feedback that the phone icon is selected is presented on the display;
  • the voice receiving circuit receives voice, where the voice is a person's name, for example: He Zhen;
  • the phone number corresponding to the name is found in the phone application, and the phone number is called.
  • if the application selected by the user is the "view contacts" application under the "Phone" application, then after receiving the voice, if the content of the voice is a person's name, a page displaying that person's contact information is opened.
  • if the application selected by the user is the "send text" application under the "Messages" application, then after receiving the voice, if the content of the voice is a person's name, a page for sending a text message to that person is opened.
  • the "Map" application is a tool that provides the user with geographic guidance; for example, it can provide a positioning function, provide maps of specified areas, or provide navigation services for the user, etc.
  • the execution parameter input by the user through voice may be a geographical location name, for example, Shenzhen. According to this execution parameter, the operation performed may be to view a map of the geographic location indicated by the place name or turn on a navigation function to navigate to the geographic location indicated by the place name.
  • the voice receiving circuit receives voice, wherein the voice indicates a place name: Xunliao Bay;
  • The third example:
  • the voice recognition function is activated, and the icon of the "Music" application on the display is lit up to give the user visual feedback.
  • the voice input by the user is recognized as: Chopin, and Chopin's music is played.
  • the voice input by the user may be a song title, a singer's name, the name of a music group, or an album name; it may also be an adjective or emotion word describing a class of songs, such as: sad, nostalgic, festive, or cheerful. It can also be the year the song was released, or a line of the song's lyrics.
  • the operations performed according to the voice input by the user may include:
  • when the input is an adjective or emotion word, search for a playlist or collection related to that adjective or emotion, and open the homepage of the playlist or collection or play a song from it.
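The "Music" example dispatches on the category of the spoken input. A minimal sketch of that dispatch follows; the catalogs (songs, artists, albums, mood words) and the operation strings are invented stand-ins for the app's real library, not part of the described method.

```python
# Illustrative dispatch: the category of the voice input decides which
# playback operation runs; unknown input falls back to a generic search.

SONGS = {"Nocturne Op. 9 No. 2"}
ARTISTS = {"Chopin", "Faye Wong"}
ALBUMS = {"Nocturnes"}
MOODS = {"sad", "nostalgic", "festive", "cheerful"}

def music_operation(voice: str) -> str:
    if voice in SONGS:
        return f"play_song:{voice}"
    if voice in ALBUMS:
        return f"open_album:{voice}"
    if voice in ARTISTS:
        return f"open_artist_page:{voice}"
    if voice.lower() in MOODS:
        return f"search_playlists:{voice.lower()}"
    return f"search:{voice}"  # generic fallback
```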
  • The fourth example:
  • As shown in FIG. 6, in step S101, after detecting that the user's gaze at an instant messaging application (for example: WeChat, QQ, or SMS) has lasted longer than a certain threshold, the instant messaging application is started.
  • the received voice is "He Zhen", and the operation performed is to open the page for communicating with He Zhen.
  • the execution parameter may be not only the name of the contact, but also keywords in the messages exchanged in the instant communication, and so on.
  • the keywords from messages exchanged in the instant messaging may be keywords relating to a person's name, a place, or a time.
  • the operation related to the execution parameter may be: opening a page recording information of the contact, or opening a page for communicating with the contact.
  • Sending a WeChat message: if the input voice is "person name + message", where the person name is the name of a contact stored in the user equipment, the operation can be: opening, in the WeChat application, the information interaction page corresponding to that name, and entering the message into the dialog box with that person.
  • the operation performed may further include: after entering the message into the dialog box with that person, sending the message to the contact.
  • the input voice is: "He Zhen, chat together at 5 pm”
  • the operation is: open the interaction page with "He Zhen" in WeChat, enter "chat together at 5 pm" into the dialog box, and send it to "He Zhen".
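The "person name + message" parse in this example can be sketched as follows. The stored contact list and the comma-delimiter heuristic are assumptions for illustration; the patent does not specify how the name is segmented from the message.

```python
# Illustrative parse of a "person name + message" voice input: the
# leading token is matched against stored contacts, and the remainder
# becomes the draft to type into the dialog box.

CONTACTS = {"He Zhen", "Zhang San"}

def parse_voice(text: str):
    name, sep, message = text.partition(",")
    name = name.strip()
    if sep and name in CONTACTS:
        return {"operation": "open_dialog", "contact": name,
                "draft": message.strip()}
    if text.strip() in CONTACTS:  # bare name: just open the chat page
        return {"operation": "open_dialog", "contact": text.strip(), "draft": ""}
    return None  # not recognized; a real system would fall back further
```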
  • The fifth example:
  • in step S101, it is detected that the user's eyes gaze at the "Schedule" or "Memo" application beyond a certain time threshold; voice recognition is activated, the voice input by the user is recognized as "E1 meeting at 2:30 pm", and a schedule entry "E1 meeting at 2:30 pm" is created.
  • the operation performed may be: establishing a schedule corresponding to the time, place, person name, and event.
  • the operation performed may be: querying the time, or the place, or the person's name, or the schedule of the event.
  • in step S101, it is detected that the eyes gaze at the "Camera" application; after a certain time threshold is exceeded, voice recognition is activated, and when the execution parameter is "yesterday", the operation performed is: displaying yesterday's photos.
  • step S101 it is detected that the user's eyes gaze at the “contact” application, and when a threshold of a certain time is exceeded, voice recognition is activated.
  • the execution parameter is "callback”
  • the operation performed is a callback operation for the phone number in the latest call record.
  • the latest call record is a call with "He Zhen”, then call back to "He Zhen".
  • He Zhen is a person's name.
  • the application icon of "My City Weather” is selected, and the obtained voice input is "Beijing", that is, the execution parameter is "Beijing".
  • the operations to be performed are: obtaining and broadcasting or displaying the weather information of Beijing;
  • first and second are used for descriptive purposes only, and cannot be understood as indicating or implying relative importance or implicitly indicating the number of indicated technical features.
  • the features defined as “first” and “second” may explicitly or implicitly include one or more of the features.
  • the meaning of “plurality” is two or more.
  • At least one refers to one or more, and “multiple” refers to two or more.
  • “And/or” describes the relationship of the related objects, indicating that there can be three relationships, for example, A and/or B, which can mean: A exists alone, A and B exist at the same time, B exists alone, where A, B can be singular or plural.
  • the character “/” generally indicates that the related object is a “or” relationship.
  • “At least one of the following” or a similar expression refers to any combination of these items, including any combination of a single item or a plurality of items.
  • At least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A method and user equipment for launching an application according to a user's concise voice instruction. By detecting that an application icon is touched, pressed, or gazed at, etc., it is determined that the user wishes to launch the application, and the microphone is promptly opened to receive the user's voice. The voice is a person's name, a place name, a song title, or the like; from it, the user equipment predicts which function of the application the user wishes to launch, and launches that function. Although the voice input by the user is very concise, the desired function can be predicted by, for example, collecting statistics on data about the user's hobbies and habits, without requiring the user to input complex instructions or perform complex operations. In this way, the response speed can be improved and the user experience is better.

Description

Voice interaction method and user equipment
Cross-reference to related applications
This application claims priority to the Chinese patent application No. 201811445680.2, entitled "Voice interaction method and user equipment", filed with the China National Intellectual Property Administration on November 29, 2018, the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of electronics and communication technology, and in particular to the field of human-computer interaction technology.
Background
At present, voice is gradually becoming an important way for users to interact with user equipment, which is generally a wearable device, a vehicle-mounted terminal, a smartphone, or the like. Users wish to make such user equipment complete operations, for example playing a song or searching a map, by issuing voice instructions. At present, however, on receiving a voice instruction the user equipment often needs to confirm several times what operation the user actually wants to complete; the process is generally complicated and time-consuming.
Summary
In view of this, embodiments of this application provide a voice interaction method and apparatus to alleviate, to some extent, the slow response of user equipment on receiving voice instructions.
In a first aspect, this application provides a voice interaction method, including the following steps:
S101. An input component detects that a first application is selected;
S102. An execution parameter of the first application is received through a voice receiving circuit, the execution parameter of the first application being input in the form of voice;
S103. A processor determines, according to the execution parameter, the operation associated with the execution parameter among the operations supported by the first application; and
S104. The operation associated with the execution parameter is performed.
In this application, because the voice input by the user is only the execution parameter of the first application, the voice input is quite concise, which shortens the time for inputting and recognizing the voice and improves the response speed of the user equipment. Although the voice input by the user is concise, because the operations supported by one application are limited, an operation matching the execution parameter can still be found. In this way, the user's intention can be determined within a relatively short time, and the operation the user desires can be performed.
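The four steps S101 to S104 can be sketched as a single control flow. The function names and lambda stand-ins below are hypothetical placeholders for the input component, the voice circuits, and the processor described here, not an implementation from the application itself.

```python
# Minimal sketch of the S101-S104 flow.

def voice_interaction(selected_app, recognize, resolve, execute):
    # S101: the input component has already detected that `selected_app`
    # was chosen (by touch, pressure, gaze, gesture, ...).
    # S102: receive speech and extract only the execution parameter.
    parameter = recognize()
    # S103: among the operations the app supports, pick the one
    # associated with this parameter.
    operation = resolve(selected_app, parameter)
    # S104: perform it.
    return execute(operation, parameter)

# Example wiring with stubbed components:
result = voice_interaction(
    "music",
    recognize=lambda: "Chopin",
    resolve=lambda app, p: "play_artist",
    execute=lambda op, p: f"{op}:{p}",
)
```

The point of the structure is that the spoken input carries only the parameter; disambiguation happens inside `resolve`, using the limited operation set of the selected application.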
In a specific implementation of the voice interaction method of the first aspect, in step S101 the user equipment has multiple applications, and the first application is any one of the multiple applications. Step S101 may specifically include: the input component detects that the user has selected the first application from among the multiple applications.
In a specific implementation of the voice interaction method of the first aspect, the execution parameter of the first application is a parameter used by the first application during execution.
In a specific implementation of the voice interaction method of the first aspect, step S101 is performed first and then step S102; or step S102 is performed first and then step S101; or steps S101 and S102 are performed simultaneously.
In a specific implementation of the voice interaction method of the first aspect, the voice interaction method further includes starting the first application; the first application is started immediately after the instruction that the first application is selected is detected, or while step S102 or S103 is being performed.
Because starting the first application takes a certain amount of time, it may be started immediately after the instruction that the first application is selected is detected, or while step S102 or S103 is being performed; in this way, the operation associated with the content of the voice can be performed immediately once it is determined, improving the response speed of the user equipment.
In a specific implementation of the voice interaction method of the first aspect, in step S101, the first application being selected means that launching the first application is desired.
In a specific implementation of the voice interaction method of the first aspect, the first application is a tool of the user equipment, giving the user equipment a use or function.
In a specific implementation of the voice interaction method of the first aspect, the input component includes one or more sensors.
In a specific implementation of the voice interaction method of the first aspect, in step S102, the voice receiving circuit may be a microphone, a biosensor that detects voiceprints, or the like.
In a specific implementation of the voice interaction method of the first aspect, an operation is the performance of a task.
In a specific implementation of the voice interaction method of the first aspect, an operation provides a single function.
In a specific implementation of the voice interaction method of the first aspect, the execution parameter may be a parameter used during execution of the various operations supported by the application.
In a specific implementation of the voice interaction method of the first aspect, the execution parameter is used to indicate the object involved in the operation, or to indicate the type of the operation.
In a specific implementation of the voice interaction method of the first aspect, where the execution parameter indicates the object involved in the operation, the execution parameter is a time, a place name, a person's name, a group name, a phone number the user wishes to dial, a web page the user wishes to open, or a message the user wishes to send; alternatively, the execution parameter is a word indicating a certain class of things.
In a specific implementation of the voice interaction method of the first aspect, the number of execution parameters is less than or equal to three. With a small number of execution parameters, the time for the user to input voice can be reduced, as can the time for the user equipment to determine from the execution parameters which operation to perform.
In a specific implementation of the voice interaction method of the first aspect, in step S103, determining, according to the execution parameter, the operation associated with the execution parameter among the operations supported by the first application specifically includes:
determining, according to a pre-configuration, an operation associated with the execution parameter, and assigning the execution parameter to that operation; or
determining, according to the execution parameter and the type of the application, the operation associated with the execution parameter that the user most commonly uses, using a user intent recognition engine, and assigning the execution parameter to that operation.
In a second aspect, this application further provides user equipment, including:
an input component, configured to detect that a first application is selected;
a voice receiving circuit, configured to receive voice input;
a voice recognition circuit, configured to recognize the execution parameter of the first application from the received voice;
a processor, configured to determine, according to the execution parameter, the operation associated with the execution parameter among the operations supported by the first application; and
a component associated with the first application, configured to perform the operation associated with the execution parameter.
In a specific implementation of the user equipment of the second aspect, the voice recognition circuit may be integrated with the processor.
For specific implementations of the user equipment of the second aspect, reference may be made to the various specific implementations of the voice interaction method of the first aspect.
Brief description of the drawings
FIG. 1 is a schematic structural diagram of the user equipment in an embodiment of this application;
FIG. 2 is a schematic flowchart of the voice interaction method in an embodiment of this application;
FIG. 3 shows, through the content presented on the display, performing the corresponding operation when it is detected that the "Phone" application is pressed, in an embodiment of this application;
FIG. 4 shows, through the content presented on the display, performing the corresponding operation when it is detected that the "Map" application is pressed, in an embodiment of this application;
FIG. 5 shows, through the content presented on the display, performing the corresponding operation when it is detected that the "Music" application is pressed, in an embodiment of this application;
FIG. 6 shows, through the content presented on the display, performing the corresponding operation when it is detected that the "WeChat" application is gazed at by the user, in an embodiment of this application;
FIG. 7 shows, through the content presented on the display, performing the corresponding operation when it is detected that the "Schedule/Memo" application is gazed at by the user, in an embodiment of this application;
FIG. 8 shows, through the content presented on the display, performing the corresponding operation when it is detected that the "Camera" application is gazed at by the user, in an embodiment of this application; and
FIG. 9 shows, through the content presented on the display, performing the corresponding operation when it is detected that the "Contacts" application is gazed at by the user, in an embodiment of this application.
Detailed description of the embodiments
Embodiments of this application can be applied in various scenarios of voice interaction between people and user equipment. The user equipment may be a wearable device, a vehicle-mounted terminal, a personal mobile terminal, a personal computer, a multimedia player, an e-reader, a smart home device, a robot, etc. The personal mobile terminal may also be a smartphone, a tablet, or the like. The wearable device may be a smart band, a smart medical device, a head-mounted terminal, or the like. The head-mounted terminal may be a virtual reality or augmented reality terminal, for example: Google Glass. The smart medical device may be a smart blood-pressure measuring device, a smart blood-glucose measuring device, or the like. The smart home device may be a smart access-control system or the like. The robot may be any of various other electronic devices that provide services to humans according to human instructions.
For ease of understanding, a specific example is given below of what the user equipment may look like. Referring to FIG. 1, FIG. 1 shows a user equipment that can receive a user's voice instructions and complete corresponding operations according to those instructions. It should be noted that the components shown in FIG. 1 are not all required by the user equipment; they can be adjusted according to the functions the user equipment supports. For example, if the user equipment needs to support more functions, more components need to be installed; if it supports few functions, components in FIG. 1 unrelated to the supported functions may be omitted. In addition, some components in FIG. 1 can be merged; for example, some modules in the communication module 1020 can be merged with the processor 1010 into one component. Some components in FIG. 1 can be arranged separately; for example, the holographic device 1064 in the display 1060 can be arranged independently of the display 1060.
The user equipment 1001 shown in FIG. 1 includes a communication module 1020, a subscriber identification module 1024, a memory 1030, a sensor module 1040, an input device 1050, a display 1060, an interface 1070, an audio module 1080, a camera module 1091, a power management module 1095, a battery 1096, an indicator 1097, a motor 1098, and a processor 1010.
The functions of the processor 1010 generally fall into three aspects. The first is running the operating system. The second is processing various data, for example: processing data received from the communication module 1020 or the input device 1050, and sending the processed data out through the communication module 1020 or displaying it on the display. The third is running application programs and controlling the multiple pieces of hardware connected to the processor 1010 to complete the corresponding functions, for example: controlling the camera module 1091 to provide the user with a photographing function.
The processor 1010 may have one or more of the above three aspects of functions, and may be split by function into one or more processors, for example: a graphics processing unit (GPU), an image signal processor (ISP), a central processing unit (CPU), an application processor (AP), or a communication processor (CP). A split-out processor with an independent function may be arranged on another associated module; for example, the communication processor (CP) may be arranged together with the cellular module 1021.
In hardware, the processor 1010 may consist of one or more IC chips.
The processor may be an integrated circuit that operates according to non-hardwired instructions or one that operates according to hardwired instructions. A processor operating according to non-hardwired instructions implements the functions carried on the processor by reading and executing instructions in the internal memory 1032. A processor operating according to hardwired instructions implements the functions carried on the processor by running its own hardware logic circuits; in doing so, it often also needs to read some data from the internal memory 1032 or output operation results to the internal memory 1032.
The memory 1030 includes the internal memory 1032 and may further include an external memory 1034. The internal memory 1032 may include one or more of the following: volatile memory (for example, dynamic random access memory (DRAM), static random access memory (SRAM), or synchronous dynamic random access memory (SDRAM)), non-volatile memory (for example, one-time programmable read-only memory (OTPROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), mask ROM, flash ROM, or flash memory (for example, NAND flash or NOR flash)), a hard disk drive, or a solid state drive (SSD).
The external memory 1034 may include a flash drive, such as: Compact Flash (CF), a Secure Digital card (SD card), a micro-SD (Secure Digital) card, a mini-SD (Secure Digital) card, an Extreme Digital Picture Card (xD card), a MultiMedia Card (MMC), or a memory stick.
The communication module 1020 may include a cellular module 1021, a Wi-Fi (wireless fidelity) module 1023, a Bluetooth (BT) module 1025, a GPS (Global Positioning System) module 1027, an NFC (Near Field Communication) module 1028, and a radio frequency (RF) module 1029. The cellular module 1021 may provide, for example, a voice call service, a video call service, a text message service, or an Internet service through a communication network.
The RF module 1029 is used to transmit/receive communication signals (for example, RF signals), and may include a transceiver, a power amplifier module (PAM), a frequency filter, a low-noise amplifier (LNA), an antenna, or the like.
The subscriber identification module 1024 is used to store unique identification information (for example, an integrated circuit card identity (ICCID)) or subscriber information (for example, an international mobile subscriber identity (IMSI)). The subscriber identification module 1024 may include an embedded SIM (Subscriber Identity Module) card, etc.
The sensor module 1040 is used to detect the state of the user equipment 1001 and/or measure physical quantities. The sensor module 1040 may include one or more of: a gesture sensor 1040A, a gyroscope sensor 1040B, an atmospheric pressure sensor 1040C, a magnetic sensor 1040D, an acceleration sensor 1040E, a grip sensor 1040F, a proximity sensor 1040G, a color sensor 1040H (for example, a red/green/blue (RGB) sensor), a biosensor 1040I, a temperature/humidity sensor 1040J, an illuminance sensor 1040K, an ultraviolet (UV) sensor 1040M, an olfactory (electronic nose) sensor, an electromyography (EMG) sensor, an electroencephalogram (EEG) sensor, an electrocardiogram (ECG) sensor, an infrared (IR) sensor, an iris recognition sensor, and a fingerprint sensor.
The input device 1050 may include one or more of a touch panel 1052, a (digital) pen sensor 1054, keys 1056, and an ultrasonic input device 1058. The (digital) pen sensor 1054 may be arranged independently or as part of the touch panel 1052. The keys 1056 may include one or more of physical buttons, optical buttons, and a keyboard. The ultrasonic input device 1058 is used to sense ultrasonic waves generated through the microphone 1088 or another input tool.
The display 1060 (which may also be called a screen) is used to present various content to the user (for example: text, images, video, icons, symbols, or the like). The display 1060 may include a panel 1062 or a touch screen; the panel 1062 may be rigid, flexible, transparent, or wearable. The display 1060 may further include a holographic device 1064 or a projector 1066, and may further be used to receive indication signals such as touch, gesture, proximity, or hovering input from an electronic pen or from a part of the user's body.
The panel 1062 and the touch panel 1052 may be integrated together. The holographic device 1064 is used to display stereoscopic images in space using the phenomenon of light interference. The projector 1066 is used to project light onto the display 1060 so as to display images.
The interface 1070 may include an HDMI (High Definition Multimedia Interface) 1072, a USB (Universal Serial Bus) 1074, an optical interface 1076, a D-subminiature (D-sub) interface 1078, a Mobile High-Definition Link (MHL) interface, an SD card/MultiMedia Card (MMC) interface, an Infrared Data Association (IrDA) interface, and so on.
The audio module 1080 is used to convert sound into electrical signals or electrical signals into sound.
The audio module 1080 may process sound information input or output through the speaker 1082, the receiver 1084, the earphone 1086, or the microphone 1088.
The camera module 1091 is used to capture still images or moving images.
The power management module 1095 is used to manage the power supply of the other modules in the user equipment 1001. The indicator 1097 is used to display the state of the user equipment 1001 or the state of its components, for example: a startup state, a message state, or a charging state.
The motor 1098 is used to drive one or more components of the user equipment 1001 in mechanical motion.
Referring to FIG. 2, an embodiment of this application provides a voice interaction method including the following steps:
S101. An input component detects that a first application is selected;
S102. Voice input is received through a voice receiving circuit, and the execution parameter of the first application is recognized from the received voice by a voice recognition circuit;
S103. A processor determines, according to the execution parameter, the operation associated with the execution parameter among the operations supported by the first application; and
S104. The operation associated with the execution parameter is performed.
Because the voice input by the user is only the execution parameter of the first application, and a complete instruction does not need to be input, the voice input is quite concise, which shortens the time for inputting and recognizing the voice and improves the response speed of the user equipment. Although the voice input by the user is concise, because the operations supported by one application are limited, an operation matching the execution parameter can still be found. In this way, the user's intention can be determined within a relatively short time, and the operation the user desires can be performed.
In step S101, the user equipment may have multiple applications, and the first application may be any one of the multiple applications. Step S101 may specifically include: the input component detects that the user has selected the first application from among the multiple applications.
In the above embodiments of this application, the execution parameter of the first application is not a complete instruction, but a parameter used by the first application during execution. For example, where the first application plays songs and the input voice is "Faye Wong", it can be predicted that the user wishes to listen to Faye Wong's songs, without the user having to input the voice "I want to listen to Faye Wong's songs".
The order of steps S101 and S102 can be flexible: for example, step S101 may be performed first and then step S102, or step S102 first and then step S101, or steps S101 and S102 may be performed simultaneously.
In addition, the voice interaction method may further include starting the first application. Because starting the first application takes a certain amount of time, the first application may be started immediately after the instruction that the first application is selected is detected, or while step S102 or S103 is being performed. In this way, the operation associated with the content of the voice can be performed immediately once it is determined, improving the response speed of the user equipment.
In step S101, the first application being selected means that launching the first application is desired. The first application may be a tool of the user equipment, giving the user equipment a use or function, for example: WeChat, photographing, or playing songs. The user equipment may have multiple applications, that is, multiple tools, so that the user equipment can serve multiple uses.
Specifically, a tool in the user equipment may take the form of hardware, of software, or of a combination of software and hardware. So-called software is a program executed by the processor; the user equipment can realize a use by executing a program (for example: an application program (APP)). Some applications of the user equipment require hardware to complete. Still other applications require software and hardware to work together; for example: while running an application program, the user equipment needs to issue instructions to other components so that those components complete coordinated actions or processing.
Specifically, a tool in hardware form refers to any one or more components of the user equipment. A tool in software form is executed by the processor of the user equipment. The use of the application is realized by one or more components of the user equipment executing program instructions, processing electrical or optical signals, or performing mechanical actions according to electrical or optical signals, and so on.
As shown in FIG. 1, the input component may include one or more sensors, which may be the various sensors in the sensor module 1040 of FIG. 1, for example: the gesture sensor 1040A, the gyroscope sensor 1040B, the acceleration sensor 1040E, the grip sensor 1040F, the proximity sensor 1040G, an infrared sensor, a structured light device, a camera, or the biosensor 1040I. The biosensor 1040I may be any of various sensors that detect human biological information, for example: a sensor detecting the iris, a sensor detecting fingerprints, or a sensor detecting voiceprints.
The sensor may also be the microphone 1088 shown in FIG. 1.
The sensor may also be a touch sensing unit or a pressure sensing unit. The touch sensing unit may be arranged independently or in the touch panel 1052, and the pressure sensing unit may likewise be arranged independently or in the touch panel 1052. The touch sensing unit is used to detect the contact signal of a contact, which may be the user's finger, a stylus tip, or another tool. The pressure sensing unit is used to detect pressure applied by the user with a finger or another tool.
For example, step S101 may be implemented in the following ways:
Method one: the touch sensing unit in the touch panel 1052 detects that the icon corresponding to the first application displayed on the touch panel 1052 has been clicked or touched for longer than a preset duration. In this case, the input component includes the touch sensing unit, where the preset duration is determined from the user's habits: the length of time the user clicks or touches the first application when wishing to launch it.
Method two: the pressure sensing unit in the touch panel 1052 detects that the icon of the first application displayed on the touch panel 1052 has been pressed with more than a preset pressure. In this case, the input component includes the pressure sensing unit, where the preset pressure is determined from the user's habits: the pressure with which the user presses the first application when wishing to launch it.
Method three: the structured light device, the infrared sensor, the camera, or the biosensor 1040I detects that the icon of the first application on the display 1060 has been gazed at for longer than a preset duration. In this case, the input component includes the structured light device, the infrared sensor, the camera, or the biosensor 1040I. The input component may include any of various eye-gaze detection devices for precisely capturing the gaze position. The preset duration is determined from the user's habits: how long the user gazes at the first application when wishing to launch it.
Method four: the gesture sensor 1040A detects that the user has made a trigger gesture toward the icon of the first application displayed on the display 1060. In this case, the input component includes the gesture sensor 1040A.
Method five: the microphone 1088 detects that the user has uttered the voice corresponding to the first application. In this case, the input component includes the microphone 1088; the voice corresponding to the first application may be the voice of the first application's name, of its code name, or the like.
Method six: the gesture sensor 1040A detects that the user has made the gesture corresponding to the first application. In this case, the input component includes the gesture sensor 1040A.
Method seven: the touch sensing unit corresponding to the first application detects that it has been touched. In this case, the input component includes the touch sensing unit; the user equipment may be provided with multiple touch sensing units in one-to-one correspondence with multiple applications, the first application being one of those applications.
Method eight: the biosensor 1040I corresponding to the first application detects that it has been gazed at for longer than a preset duration. In this case, the input component may include the biosensor 1040I, which may be an iris-detecting sensor. The user equipment may be provided with multiple biosensors 1040I in one-to-one correspondence with multiple applications, the first application being one of those applications. The biosensor 1040I may be replaced by a structured light device, an infrared sensor, a camera, or another eye-gaze detection device for precisely capturing the gaze position.
Method nine: the pressure sensing unit corresponding to the first application detects that pressure has been applied to it. In this case, the input component includes the pressure sensing unit; the user equipment may be provided with multiple pressure sensing units in one-to-one correspondence with multiple applications, the first application being one of those applications. For example, the pressure sensing unit may be a phone button on a wearable device: pressing this phone button means that the phone-call application is triggered.
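The dwell thresholds used by the touch, press, and gaze methods above can be sketched as one small trigger. The sampling interface, the threshold value, and the injectable clock are assumptions made for illustration; the described equipment would read its sensors through whatever driver interface it actually has.

```python
import time

# Hypothetical dwell-time trigger: selection fires only once contact
# (touch, pressure, or gaze) with an icon has lasted longer than a
# user-specific preset threshold.

class DwellTrigger:
    def __init__(self, threshold_s: float, clock=time.monotonic):
        self.threshold_s = threshold_s
        self.clock = clock       # injectable for testing
        self.start = None        # when contact began, or None

    def update(self, on_icon: bool) -> bool:
        """Feed one sensor sample; True means the app counts as selected."""
        now = self.clock()
        if not on_icon:
            self.start = None    # contact broken: reset the timer
            return False
        if self.start is None:
            self.start = now
        return (now - self.start) >= self.threshold_s
```

The same shape works for the pressure variant by replacing the boolean sample with "pressure above the preset level".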
In methods one to four, in addition to the icon of the first application, the touch panel 1052 or display 1060 may also display icons of other applications at the same time, so that the user can choose among the multiple applications.
After it is detected that the icon of the first application is selected, the selection may be shown visually on the touch panel 1052 or display 1060; for example: the icon of the first application looks different when selected than when not selected.
In step S102, the voice receiving circuit may be a microphone, a biosensor that detects voiceprints, or the like. The voice recognition circuit may be integrated with the processor.
In step S103, the first application, as a tool of the user equipment, can support one or more operations, where an operation can be understood as performing a task. For example, within the "WeChat" function, one can send a message to A, send a message to B, or post photos to a circle of friends. Here, "send a message to A" can be regarded as one operation, "send a message to B" as another operation, and "post photos to a circle of friends" as yet another operation.
One application in the user equipment provides a relatively systematic set of functions, while an operation is a single function within that systematic set. For example: "Call" is an application that performs a relatively comprehensive, systematic set of functions related to phone calls, including: making calls, answering calls, storing call records, and storing missed calls, etc. An operation refers to a single function such as: "call Zhang San".
The applications in the user equipment generally fall into several categories: applications for user communication, applications for user entertainment, applications providing life services for users, and applications providing medical services for users.
An application for user communication refers to an application through which the user transfers data between the user equipment and other users, operators, or service providers, for example: WeChat, phone calls, or QQ.
An application for user entertainment refers to an application that gives the user a visually or mentally pleasant experience by showing entertainment material or by interacting with the user, for example: watching movies, listening to music, playing games, taking photos, or browsing photos.
An application providing life services refers to an application that offers the user guidance or reminders so as to make the user's life more convenient, for example: navigation, maps, calendars, or notes.
An application providing medical services refers to an application that provides the user with body-index detection services, or applies physical effects such as massage or electromagnetic pulses to the user's body to promote health, for example: wearable devices or bands.
The execution parameter may be a parameter used during execution of the various operations supported by the application, and may specifically indicate the object involved in those operations or the type of operation. Where the execution parameter indicates the object involved, it may be a time, a place name, a person's name, a group name, a phone number the user wishes to dial, a web page the user wishes to open, a message the user wishes to send, and so on; the execution parameter may also specifically be a word indicating a certain class of things, for example: if the execution parameter is "sad songs", the operation associated with the execution parameter is playing sad songs. Where the execution parameter indicates the type of operation, it may be the name of an operation, for example: "callback", "navigate", or "send message". The operation's other parameters, for example: whose call to return, where to navigate to, or to whom to send a message, can be obtained from historical data stored in the memory of the user equipment. This historical data may be historical data of the operation type indicated by the execution parameter, or historical data of other applications; for example: the phone number shown in the user's most recent historical record in the "Messages" application is used as the phone number for the "callback" operation.
The number of execution parameters may be one or more; to improve the response speed, there may be no more than three. With a small number of execution parameters, the time for the user to input voice can be reduced, as can the time for the user equipment to determine from the execution parameters which operation to perform.
The operation associated with the execution parameter is any operation that conforms to the execution parameter. There may be many operations conforming to the execution parameter, but only one is selected for execution, so as to achieve a rapid response.
As for how to select one of the multiple operations conforming to the execution parameter for execution, the following two ways may be used:
First, by fixed configuration: the processor determines an operation according to a pre-configuration and assigns the execution parameter to that operation. Which operation is configured for each application can be as shown in Table 1. For the "photo browser" application, the configurable operation is: displaying photos, and which photos are displayed is determined according to the execution parameter. For the "Baidu Map" application, the configurable operation is: navigation, and where to navigate to is determined according to the execution parameter. For "WeChat", the configurable operation is: chat, and who to chat with is determined according to the execution parameter. For the "video player", the configurable operation is: searching for and displaying videos, and which videos to search for and display is determined according to the execution parameter. For "Weibo", the configurable operation is: displaying a Weibo feed, and whose feed is displayed is determined according to the execution parameter. For "Tieyou Train Tickets", the configurable operation is: booking a ticket, and the time and destination of the train ticket are determined according to the execution parameter.
Application              Configured operation             Determined by the execution parameter
Photo browser            Display photos                   Which photos to display
Baidu Map                Navigate                         Navigation destination
WeChat                   Chat                             Which contact to chat with
Video player             Search for and display videos    Which videos to search for
Weibo                    Display a Weibo feed             Whose feed to display
Tieyou Train Tickets     Book a ticket                    Departure time and destination
Table 1
Second: the processor determines, according to the execution parameter and the type of the application, the operation the user most commonly uses, using a user intent recognition engine, and assigns the execution parameter to that operation.
The user intent recognition engine may be a machine learning model built with any of various decision algorithms. For example: by collecting statistics on data about the user's behavior, habits, and interests, the operation the user wishes to perform can be predicted from the execution parameter; concretely, this can be realized by building a probabilistic/statistical/stochastic model. The probabilistic/statistical/stochastic model may be a Bayesian (for example, naive Bayes) classifier, a decision tree (for example, a fast decision tree), a support vector machine (SVM), a hidden Markov model (HMM), or a Gaussian mixture model (GMM), etc.
The probabilistic/statistical/stochastic model may use machine learning techniques for online and offline learning, thereby making the model more accurate.
Several specific examples are given below to describe the embodiments of this application in detail:
First example:
As shown in FIG. 3, it is detected that the user presses the "Phone" application. The "Phone" application is a tool that provides the user with telephone-communication services; specifically, it can be used to make calls, record dialed phone numbers, record missed calls, record contact information, and so on.
As shown in FIG. 3, the voice interaction method includes:
detecting that the user's finger presses the phone icon on the display; when the pressure on the phone icon exceeds a certain threshold, activating the voice service and presenting on the display visual feedback that the phone icon is selected;
the voice receiving circuit receives voice, where the voice is a person's name, for example: He Zhen;
according to the received voice, the phone number corresponding to this name is found in the phone application, and this phone number is called.
If the application selected by the user is the "view contacts" application under the "Phone" application, then after the voice is received, if the content of the voice is a person's name, a page displaying that person's contact information is opened.
If the application selected by the user is the "send text" application under the "Messages" application, then after the voice is received, if the content of the voice is a person's name, a page for sending a text message to that person is opened.
Second example:
In step S101, where the first application is a "map" (for example: Amap, Baidu Map, Tencent Map, or Google Maps), the "map" application is a tool that provides the user with geographic guidance; for example: it can provide the user with a positioning function, provide maps of specified areas, or provide navigation services, and so on. The execution parameter input by the user through voice may be the name of a geographic location, for example: Shenzhen. According to this execution parameter, the operation performed may be viewing a map of the location indicated by the place name, or turning on the navigation function to navigate to that location.
A specific example, as shown in FIG. 4:
it is detected that the user's finger presses the "Map" application icon on the display; when the pressure on the "Map" application icon exceeds a certain threshold, the voice service is activated and visual feedback that the "Map" application icon is selected is presented on the display;
the voice receiving circuit receives voice, where the voice indicates a place name: Xunliao Bay;
navigation to Xunliao Bay is started.
Third example:
As shown in FIG. 5, when it is detected that a finger long-presses a "Music" application (for example: the music applications of major Internet companies or phone vendors) for longer than a certain threshold, the voice recognition function is activated, and the icon of the "Music" application on the display is lit up to give the user visual feedback. The voice input by the user is recognized as: Chopin, and Chopin's music is played.
The voice input by the user may be a song title, a singer's name, the name of a music group, or an album name; it may also be an adjective or emotion word describing a class of songs, for example: sad, nostalgic, festive, or cheerful. It may also be the year the song was released, or a line of the song's lyrics.
The operations performed according to the voice input by the user may include:
when the input voice is a song title, playing that song;
when the input voice is an album name, opening the album's homepage or playing songs from the album;
when the input voice is the name of a singer or music group, opening that singer's or group's homepage, or playing songs from that homepage;
when the input voice is an adjective or emotion word, searching for playlists or collections related to that adjective or emotion word, and opening the homepage of the playlist or collection or playing songs from it.
Fourth example:
As shown in FIG. 6, in step S101, after it is detected that the user's gaze at an instant messaging application (for example: WeChat, QQ, or SMS) has lasted longer than a certain threshold, the instant messaging application is started.
The received voice is "He Zhen", and the operation performed is: opening the page for communicating with He Zhen.
The execution parameter may be not only the name of a contact but also a keyword from messages exchanged in the instant messaging, and so on. Such keywords may relate to a person's name, a place, or a time.
When the execution parameter is the name of a contact, the operation related to the execution parameter may be: opening a page recording that contact's information, or opening a page for communicating with that contact.
Sending a WeChat message: if the input voice is "person name + message", where the person name is the name of a contact stored in the user equipment, the operation performed may be: opening, in the WeChat application, the information interaction page corresponding to that name, and entering the message into the dialog box with that person. The operation may further include: after entering the message into the dialog box with that person, sending the message to the contact.
For example: the input voice is: "He Zhen, chat together at 5 pm"; the operation performed is: opening the interaction page with "He Zhen" in WeChat, entering "chat together at 5 pm" into the dialog box, and sending it to "He Zhen".
Fifth example:
As shown in FIG. 7, in step S101, it is detected that the user's eyes gaze at the "Schedule" or "Memo" application beyond a certain time threshold; voice recognition is activated, and the voice input by the user is recognized as: "E1 meeting at 2:30 pm", whereupon a schedule entry "E1 meeting at 2:30 pm" is created.
It should be noted that when the recognized execution parameters are a time, a place, a person's name, and an event, the operation performed may be: creating a schedule entry corresponding to that time, place, person, and event.
When the recognized execution parameter is a time, a place, a person's name, or an event, the operation performed may be: querying the schedule corresponding to that time, place, person, or event.
Sixth example:
As shown in FIG. 8, in step S101, it is detected that the eyes gaze at the "Camera" application; after a certain time threshold is exceeded, voice recognition is activated, and when the execution parameter is "yesterday", the operation performed is: displaying yesterday's photos.
Seventh example:
As shown in FIG. 9, in step S101, it is detected that the user's eyes gaze at the "Contacts" application; when a certain time threshold is exceeded, voice recognition is activated. When the execution parameter is "callback", the operation performed is a callback to the phone number in the most recent call record. If the most recent call record is a call with "He Zhen", then "He Zhen" is called back. "He Zhen" is a person's name.
Other examples:
For example: the "My City Weather" application icon is selected and the obtained voice input is "Beijing", that is, the execution parameter is "Beijing"; then, based on the "My City Weather" application, the execution parameter "Beijing", and other information, the operation to perform is determined to be: obtaining and broadcasting or displaying weather information for Beijing.
The "Tieyou Train Tickets" application is selected and the obtained voice input is again "Beijing", that is, the execution parameter is "Beijing"; then, based on the "Tieyou Train Tickets" application, the execution parameter "Beijing", and other information, the operation to perform is determined to be: opening the "Tieyou Train Tickets" application and entering the high-speed rail booking page, with the destination "Beijing".
In the above description, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of the indicated technical features. Thus, features qualified by "first" or "second" may explicitly or implicitly include one or more of those features. In the description of this application, unless otherwise stated, "multiple" means two or more.
In the description of this specification, specific features, structures, materials, or characteristics may be combined in a suitable manner in any one or more embodiments or examples. The method and the device are based on the same inventive concept; since the principles by which the method and the device solve the problem are similar, the implementations of the device and the method may refer to each other, and repeated parts are not described again.
In this application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes the relationship between associated objects and indicates that three relationships can exist; for example: A and/or B can mean: A exists alone, A and B exist at the same time, or B exists alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following" or a similar expression refers to any combination of these items, including any combination of single items or plural items. For example: at least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where each of a, b, and c may be single or multiple.

Claims (10)

  1. A voice interaction method, characterized in that the method comprises:
    S101. detecting that a first application is selected;
    S102. receiving an execution parameter of the first application, the execution parameter of the first application being input in the form of voice;
    S103. determining, according to the execution parameter, an operation associated with the execution parameter among the operations supported by the first application; and
    S104. performing the operation associated with the execution parameter.
  2. The method according to claim 1, characterized in that the execution parameter of the first application is a parameter used by the first application during execution.
  3. The method according to claim 1 or 2, characterized in that the execution parameter is used to indicate an object involved in the operation, or to indicate the type of the operation.
  4. The method according to any one of claims 1 to 3, characterized in that, in step S103, determining, according to the execution parameter, the operation associated with the execution parameter among the operations supported by the first application specifically comprises:
    determining, according to a pre-configuration, an operation associated with the execution parameter, and assigning the execution parameter to that operation.
  5. The method according to any one of claims 1 to 4, characterized in that, in step S103, determining, according to the execution parameter, the operation associated with the execution parameter among the operations supported by the first application specifically comprises:
    determining, according to the execution parameter and the type of the application, the operation associated with the execution parameter that the user most commonly uses, using a user intent recognition engine, and assigning the execution parameter to that operation.
  6. A user equipment, characterized in that the equipment comprises:
    an input component, configured to detect that a first application is selected;
    a voice receiving circuit, configured to receive voice input;
    a voice recognition circuit, configured to recognize an execution parameter of the first application from the received voice;
    a processor, configured to determine, according to the execution parameter, an operation associated with the execution parameter among the operations supported by the first application; and
    a component associated with the first application, configured to perform the operation associated with the execution parameter.
  7. The user equipment according to claim 6, characterized in that the execution parameter of the first application is a parameter used by the first application during execution.
  8. The user equipment according to claim 6 or 7, characterized in that the execution parameter is used to indicate an object involved in the operation, or to indicate the type of the operation.
  9. The user equipment according to any one of claims 6 to 8, characterized in that the processor is further configured to determine, according to a pre-configuration, an operation associated with the execution parameter, and to assign the execution parameter to that operation.
  10. The user equipment according to any one of claims 6 to 9, characterized in that the processor is further configured to determine, according to the execution parameter and the type of the application, the operation associated with the execution parameter that the user most commonly uses, using a user intent recognition engine, and to assign the execution parameter to that operation.
PCT/CN2019/120046 2018-11-29 2019-11-21 Voice interaction method and user equipment WO2020108385A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811445680.2A CN111240561A (zh) 2018-11-29 2018-11-29 Voice interaction method and user equipment
CN201811445680.2 2018-11-29

Publications (1)

Publication Number Publication Date
WO2020108385A1 true WO2020108385A1 (zh) 2020-06-04

Family

ID=70854479

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120046 WO2020108385A1 (zh) 2018-11-29 2019-11-21 语音交互方法和用户设备

Country Status (2)

Country Link
CN (1) CN111240561A (zh)
WO (1) WO2020108385A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110248822A1 (en) * 2010-04-09 2011-10-13 Jc Ip Llc Systems and apparatuses and methods to adaptively control controllable systems
CN104111728A (zh) * 2014-06-26 2014-10-22 联想(北京)有限公司 基于操作手势的语音指令输入方法及电子设备
US20150234546A1 (en) * 2014-02-18 2015-08-20 Hong-Lin LEE Method for Quickly Displaying a Skype Contacts List and Computer Program Thereof and Portable Electronic Device for Using the Same
CN107491286A (zh) * 2017-07-05 2017-12-19 广东艾檬电子科技有限公司 移动终端的语音输入方法、装置、移动终端及存储介质
CN107949826A (zh) * 2016-08-09 2018-04-20 华为技术有限公司 一种消息显示方法、用户终端及图形用户接口

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108320742B (zh) * 2018-01-31 2021-09-14 Guangdong Midea Refrigeration Equipment Co., Ltd. Voice interaction method, smart device, and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110248822A1 (en) * 2010-04-09 2011-10-13 Jc Ip Llc Systems and apparatuses and methods to adaptively control controllable systems
US20150234546A1 (en) * 2014-02-18 2015-08-20 Hong-Lin LEE Method for Quickly Displaying a Skype Contacts List and Computer Program Thereof and Portable Electronic Device for Using the Same
CN104111728A (zh) * 2014-06-26 2014-10-22 联想(北京)有限公司 基于操作手势的语音指令输入方法及电子设备
CN107949826A (zh) * 2016-08-09 2018-04-20 华为技术有限公司 一种消息显示方法、用户终端及图形用户接口
CN107491286A (zh) * 2017-07-05 2017-12-19 广东艾檬电子科技有限公司 移动终端的语音输入方法、装置、移动终端及存储介质

Also Published As

Publication number Publication date
CN111240561A (zh) 2020-06-05

Similar Documents

Publication Publication Date Title
JP6738445B2 (ja) Far-field extension of digital assistant services
EP3567584B1 (en) Electronic apparatus and method for operating same
EP3321787B1 (en) Method for providing application, and electronic device therefor
JP2024056690A (ja) Intelligent automated assistant for TV user interaction
WO2021213496A1 (zh) Message display method and electronic device
CN109905852B (zh) Apparatus and method for providing additional information by using a caller's phone number
JP6321296B2 (ja) Text input method, apparatus, program, and recording medium
JP2021525430A (ja) Display control method and terminal
KR20160026317A (ko) Voice recording method and apparatus
WO2019114584A1 (zh) Method, apparatus, and mobile terminal for associated application launching
WO2015127825A1 (zh) Emoji input method, apparatus, and electronic device
WO2020207413A1 (zh) Content pushing method, apparatus, and device
KR20150090966A (ko) Electronic device and method for providing search results on the electronic device
EP3190527A1 (en) Multimedia data processing method of electronic device and electronic device thereof
KR20160016532A (ko) Electronic device providing a message service and method by which the electronic device provides content
WO2017012423A1 (zh) Method and terminal for displaying instant messaging messages
TWI554900B (zh) Apparatus and method for providing information
US20180249056A1 (en) Mobile terminal and method for controlling same
WO2019007236A1 (zh) Input method, apparatus, and machine-readable medium
US20210185244A1 (en) Subtitle presentation based on volume control
CN106897937A (zh) Method and apparatus for displaying socially shared information
CN110989847A (zh) Information recommendation method, apparatus, terminal device, and storage medium
CN106471493B (zh) Method and apparatus for managing data
WO2018018912A1 (zh) Search method, apparatus, and electronic device
CN105187597B (zh) Voice-record management method and apparatus, and mobile terminal thereof

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19889518

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19889518

Country of ref document: EP

Kind code of ref document: A1