WO2020181988A1 - Voice control method and electronic device - Google Patents


Info

Publication number
WO2020181988A1
Authority
WO
WIPO (PCT)
Prior art keywords
interface
electronic device
voice
application
control signal
Prior art date
Application number
PCT/CN2020/076689
Other languages
English (en)
French (fr)
Inventor
王守诚
吴思举
周轩
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2020181988A1


Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 - Execution procedure of a spoken command
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04M - TELEPHONIC COMMUNICATION
    • H04M1/00 - Substation equipment, e.g. for use by subscribers
    • H04M1/72 - Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 - User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 - User interfaces with means for local support of applications that increase the functionality
    • H04M1/7243 - User interfaces with interactive means for internal management of messages
    • H04M1/72433 - User interfaces for voice messaging, e.g. dictaphones
    • H04M2201/00 - Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 - Telephone systems using speech recognition
    • H04M2250/00 - Details of telephonic subscriber devices
    • H04M2250/22 - Subscriber devices including a touch pad, a touch sensor or a touch detector
    • H04M2250/74 - Subscriber devices with voice recognition means

Description

  • This application relates to the field of terminal technologies, and in particular, to a voice control method and an electronic device.
  • Generally, a mobile phone is pre-configured, before leaving the factory, with the voice tasks that it can recognize and perform, such as a voice task for querying the weather and a voice task for booking air tickets.
  • For each such task, the task type needs to be configured on the background server corresponding to the voice assistant, and a dialogue flow is then designed according to the task type, so as to obtain the information corresponding to that task type.
  • The voice assistant of the phone collects the voice control signal and sends it to the background server.
  • The back-end server first identifies the task type "book air ticket", then extracts keywords from the voice control signal according to the key information ("departure place", "destination", and "time") required by the pre-configured "air ticket booking" task type, and thereby generates a voice user interface (VUI) task.
  • The background server then converts the VUI task into corresponding control instructions and sends them to the corresponding application.
  • The application responds with pre-customized code and outputs the query result. It can be seen that the prior art requires the background server to pre-configure task types and key information, which entails a large amount of task configuration; moreover, to adapt to voice tasks, developers also need to adaptively develop applications that support voice interaction.
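The prior-art server-side flow described above can be sketched as follows. This is an illustrative sketch only: the task-type names, slot names, and data structures are assumptions for illustration, not the actual server implementation referenced by the patent.

```python
# Prior art: each task type must be pre-configured on the background server
# together with the key information (slots) it requires; the server then
# extracts keywords for those slots from the recognized voice text.

TASK_TYPES = {
    "book air ticket": ["departure place", "destination", "time"],
}

def extract_vui_task(text, slots_found):
    """Match a pre-configured task type and report which slots are filled."""
    for task, required in TASK_TYPES.items():
        if task in text:
            missing = [s for s in required if s not in slots_found]
            return {"task": task, "slots": slots_found, "missing": missing}
    return {"task": None}  # unrecognized: the task type was never configured

task = extract_vui_task(
    "book air ticket",
    {"departure place": "Shenzhen", "destination": "Beijing"},
)
# the pre-designed dialogue flow would then prompt for any missing slot
```

This illustrates why the prior art scales poorly: every new task type needs its own server-side configuration and dialogue flow.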
  • In view of this, the present application provides a voice control method and an electronic device that support voice control in combination with a graphical user interface, improve the user's voice control experience, and require only a small development workload.
  • In a first aspect, an embodiment of the present application provides a voice control method applicable to an electronic device. The method includes: the electronic device displays a first interface of an application, where the first interface includes controls used to update the first interface; the electronic device then collects the user's voice control signal and, when a touch event corresponding to the voice control signal is determined, executes the corresponding touch event in response to the voice control signal; finally, the electronic device displays a second interface of the application.
  • The second interface is the interface obtained after the touch operation is performed on the control in the first interface.
  • That is, the electronic device determines the corresponding input event according to the collected voice control signal, and then reuses the processing flow of that input event in the operating system, so the voice task can be completed without adaptive development of the application.
  • This method makes full use of the operational convenience of voice control, applies voice control when manual operation is inconvenient for the user, and combines it with a graphical user interface to improve the user's voice experience.
  • In one possible design, the electronic device first obtains a configuration file associated with the first interface, where the configuration file includes the correspondence between the control identifiers of the controls in the first interface and touch events. The electronic device may then determine the target control identifier that matches the text information of the voice control signal, and search the configuration file for the touch event corresponding to the target control identifier.
  • the electronic device when running an interface of the application, can determine the touch event corresponding to the voice control signal input by the user according to the configuration file of the interface, and then execute the touch event, so as to realize voice control The function of each control in the application interface.
  • In one possible design, the electronic device may also display an animation effect while the touch operation is performed on the control in the first interface.
  • Displaying the animation effect reminds the user that the device is currently responding to voice control, which improves the user's experience.
  • In one possible design, the electronic device may first start the voice application in the background in response to a wake-up signal input by the user, and then collect the user's voice control signal through the voice application.
  • In this way, the voice application is combined with the input events of the current operating system to determine the input event corresponding to the collected voice control signal; the processing flow of that input event in the operating system is then reused, and the voice task can be completed without adaptive development of the application.
  • the electronic device may also perform the touch operation.
  • an embodiment of the present application provides an electronic device including a processor and a memory.
  • The memory is used to store one or more computer programs; when the one or more computer programs stored in the memory are executed by the processor, the electronic device is enabled to implement any one of the possible designs of any of the foregoing aspects.
  • In another aspect, an embodiment of the present application also provides an apparatus, which includes modules/units that execute any one of the possible designs of any of the foregoing aspects.
  • These modules/units can be realized by hardware, or by hardware executing corresponding software.
  • In another aspect, an embodiment of the present application also provides a computer-readable storage medium that includes a computer program. When the computer program runs on an electronic device, the electronic device executes any one of the possible designs of any of the foregoing aspects.
  • In another aspect, the embodiments of the present application also provide a computer program product. When the computer program product runs on an electronic device, the electronic device executes any one of the possible designs of any of the foregoing aspects.
  • FIG. 1 is a schematic diagram of a voice control system provided by an embodiment of this application.
  • FIG. 2 is a schematic structural diagram of a mobile phone provided by an embodiment of the application.
  • FIG. 3 is a schematic diagram of the architecture of an operating system in an electronic device provided by an embodiment of the application.
  • FIG. 4 is a schematic diagram of an interface provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of a scene of a voice control method provided by an embodiment of this application.
  • FIG. 6 is a schematic diagram of a scene of another voice control method provided by an embodiment of the application.
  • FIG. 7a is a schematic diagram of another interface provided by an embodiment of the application.
  • FIG. 7b to FIG. 7g are schematic diagrams of scenes of another voice control method provided by an embodiment of the application.
  • FIG. 8 is a schematic diagram of another voice control method provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of the interface of the voice assist function switch and the voice wake-up function switch provided by an embodiment of the application.
  • FIG. 10a to FIG. 10b are schematic diagrams of scenarios of another voice control method provided by an embodiment of the application.
  • FIG. 11 is a schematic flowchart of a voice control method provided by an embodiment of this application.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
  • The voice control method provided by the embodiments of this application can be applied to mobile phones, tablet computers, desktop computers, laptop computers, notebook computers, ultra-mobile personal computers (UMPC), handheld computers, netbooks, personal digital assistants (PDA), wearable electronic devices, virtual reality devices, and other electronic devices; the embodiments of the present application do not impose any limitation on this.
  • Fig. 2 shows a schematic structural diagram of the mobile phone.
  • The mobile phone may include a processor 110, an external memory interface 120, an internal memory 121, a USB interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, buttons 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a SIM card interface 195, and so on.
  • The sensor module 180 may include a gyroscope sensor 180A, an acceleration sensor 180B, a proximity light sensor 180G, a fingerprint sensor 180H, a touch sensor 180K, and a hinge sensor 180M. (Of course, the mobile phone 100 may also include other sensors, such as temperature sensors, pressure sensors, distance sensors, magnetic sensors, ambient light sensors, air pressure sensors, bone conduction sensors, etc., which are not shown in the figure.)
  • It can be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the mobile phone 100.
  • the mobile phone 100 may include more or fewer components than shown, or combine certain components, or split certain components, or arrange different components.
  • the illustrated components can be implemented in hardware, software, or a combination of software and hardware.
  • the processor 110 may include one or more processing units.
  • For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU).
  • the different processing units may be independent devices or integrated in one or more processors.
  • The controller may be the nerve center and command center of the mobile phone 100. The controller can generate operation control signals according to instruction operation codes and timing signals, so as to control instruction fetching and instruction execution.
  • a memory may also be provided in the processor 110 to store instructions and data.
  • the memory in the processor 110 is a cache memory.
  • The memory can store instructions or data that the processor 110 has just used or used cyclically. If the processor 110 needs to use the instructions or data again, it can call them directly from the memory. This avoids repeated accesses, reduces the waiting time of the processor 110, and improves system efficiency.
  • the processor 110 can run the voice control method provided by the embodiments of the present application.
  • The method converts a voice control signal into an existing touch event, thereby enabling the existing graphical user interface to support voice interaction, reducing development work and enhancing the voice interaction function of the electronic device.
  • When the processor 110 integrates different devices, such as a CPU and a GPU, the CPU and the GPU may cooperate to execute the voice control method provided in the embodiments of the present application. For example, some algorithms in the method are executed by the CPU and the others by the GPU, so as to obtain faster processing efficiency.
  • the display screen 194 is used to display images, videos, etc.
  • the display screen 194 includes a display panel.
  • The display panel may adopt a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like.
  • the mobile phone 100 may include one or N display screens 194, and N is a positive integer greater than one.
  • In the embodiments of the present application, the display screen can accept the user's touch operations and display the graphical user interface.
  • When a voice control signal is received, the display screen can also display the animation effect of the touch event corresponding to the voice control signal, as well as the interface after the event is executed.
  • the camera 193 (front camera or rear camera) is used to capture still images or videos.
  • The camera 193 may include photosensitive elements such as a lens group and an image sensor, where the lens group includes a plurality of lenses (convex or concave) for collecting the light signals reflected by the object to be photographed and transmitting the collected light signals to the image sensor.
  • the image sensor generates an original image of the object to be photographed according to the light signal.
  • the internal memory 121 may be used to store computer executable program code, where the executable program code includes instructions.
  • the processor 110 executes various functional applications and data processing of the mobile phone 100 by running instructions stored in the internal memory 121.
  • the internal memory 121 may include a storage program area and a storage data area.
  • the storage program area can store operating system, application program (such as camera application, WeChat application, etc.) codes and so on.
  • the data storage area can store data created during the use of the mobile phone 100 (for example, images and videos collected by a camera application).
  • The internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS).
  • the functions of the sensor module 180 are described below.
  • The gyroscope sensor 180A can be used to determine the movement posture of the mobile phone 100, for example, the angular velocity of the mobile phone 100 around three axes (i.e., the x, y, and z axes).
  • the gyroscope sensor 180A can be used to detect the current movement state of the mobile phone 100, such as shaking or static.
  • The acceleration sensor 180B can detect the magnitude of the acceleration of the mobile phone 100 in various directions (generally along three axes). Like the gyroscope sensor 180A, it can be used to detect the current movement state of the mobile phone 100, such as shaking or static.
  • the proximity light sensor 180G may include, for example, a light emitting diode (LED) and a light detector such as a photodiode.
  • the light emitting diode may be an infrared light emitting diode.
  • The mobile phone emits infrared light through the light-emitting diode and uses the photodiode to detect infrared light reflected by nearby objects. When sufficient reflected light is detected, the mobile phone can determine that there is an object near it; when insufficient reflected light is detected, the mobile phone can determine that there is no object near it.
  • the gyroscope sensor 180A (or acceleration sensor 180B) may send the detected motion state information (such as angular velocity) to the processor 110.
  • the processor 110 determines whether it is currently in a hand-held state or a tripod state based on the motion state information (for example, when the angular velocity is not 0, it means that the mobile phone 100 is in the hand-held state).
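The hand-held/tripod decision described above can be sketched as follows. The noise threshold and function name are assumptions for illustration; the patent only states that a non-zero angular velocity indicates a hand-held state.

```python
# Illustrative sketch: classify the phone as hand-held or tripod-mounted
# from the gyroscope's angular-velocity readings around the three axes.

def classify_motion_state(angular_velocity, threshold=0.05):
    """An angular velocity above a small noise threshold on any axis suggests
    the phone is hand-held; near-zero readings suggest a static/tripod state.
    `angular_velocity` is an (x, y, z) tuple in rad/s; `threshold` is assumed."""
    if any(abs(w) > threshold for w in angular_velocity):
        return "hand-held"
    return "tripod"

state = classify_motion_state((0.3, -0.1, 0.02))
```

In practice a real implementation would filter the readings over a time window rather than decide from a single sample.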
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the mobile phone 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access application locks, fingerprint photographs, fingerprint answering calls, and so on.
  • The touch sensor 180K is also called a "touch panel".
  • The touch sensor 180K may be disposed on the display screen 194; the touch sensor 180K and the display screen 194 together form what is also called a "touch screen".
  • the touch sensor 180K is used to detect touch operations acting on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • the visual output related to the touch operation can be provided through the display screen 194.
  • the touch sensor 180K may also be disposed on the surface of the mobile phone 100, which is different from the position of the display screen 194.
  • the display screen 194 of the mobile phone 100 displays a main interface, and the main interface includes icons of multiple applications (such as a camera application, a WeChat application, etc.).
  • the display screen 194 displays an interface of the camera application, such as a viewfinder interface.
  • the wireless communication function of the mobile phone 100 can be realized by the antenna 1, the antenna 2, the mobile communication module 151, the wireless communication module 152, the modem processor, and the baseband processor.
  • the antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in the terminal device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization.
  • antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna can be used in combination with a tuning switch.
  • the mobile communication module 151 can provide a wireless communication solution including 2G/3G/4G/5G and the like applied to the terminal device 100.
  • the mobile communication module 151 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), etc.
  • the mobile communication module 151 can receive electromagnetic waves by the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation.
  • The mobile communication module 151 can also amplify the signal modulated by the modem processor, and convert it into electromagnetic waves for radiation via the antenna 1.
  • at least part of the functional modules of the mobile communication module 151 may be provided in the processor 110.
  • at least part of the functional modules of the mobile communication module 151 and at least part of the modules of the processor 110 may be provided in the same device.
  • the modem processor may include a modulator and a demodulator.
  • the modulator is used to modulate the low frequency baseband signal to be sent into a medium and high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. Then the demodulator transmits the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is processed by the baseband processor and then passed to the application processor.
  • the application processor outputs a sound signal through an audio device (not limited to the speaker 170A, the receiver 170B, etc.), or displays an image or video through the display screen 194.
  • the modem processor may be an independent device.
  • the modem processor may be independent of the processor 110 and be provided in the same device as the mobile communication module 150 or other functional modules.
  • The wireless communication module 152 can provide wireless communication solutions applied to the terminal device 100, including wireless local area network (WLAN) (such as a wireless fidelity (Wi-Fi) network), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), infrared (IR), and other technologies.
  • the wireless communication module 152 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 152 receives electromagnetic waves via the antenna 2, frequency modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110.
  • the wireless communication module 152 can also receive the signal to be sent from the processor 110, perform frequency modulation, amplify it, and convert it into electromagnetic wave radiation through the antenna 2.
  • the mobile phone 100 can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. For example, music playback, recording, etc.
  • the mobile phone 100 can receive the key 190 input, and generate key signal input related to the user settings and function control of the mobile phone 100.
  • the mobile phone 100 can use the motor 191 to generate a vibration notification (such as an incoming call vibration notification).
  • the indicator 192 in the mobile phone 100 can be an indicator light, which can be used to indicate the charging status, power change, and can also be used to indicate messages, missed calls, notifications, and so on.
  • the SIM card interface 195 in the mobile phone 100 is used to connect to the SIM card.
  • the SIM card can be connected to and separated from the mobile phone 100 by inserting into the SIM card interface 195 or pulling out from the SIM card interface 195.
  • It is understandable that the mobile phone 100 may include more or fewer components than those shown in FIG. 2, which is not limited in the embodiments of the present application.
  • the software system of the above electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture.
  • the embodiment of the present application takes a layered Android system as an example to illustrate the software structure of the electronic device 100.
  • FIG. 3 is a block diagram of the software structure of the electronic device 100 according to an embodiment of the present application.
  • The layered architecture divides the software into several layers, each of which has a clear role and division of labor. The layers communicate with each other through software interfaces.
  • the Android system is divided into four layers, from top to bottom, the application layer, the application framework layer, the Android runtime and system library, and the kernel layer.
  • the application layer can include a series of application packages. As shown in Figure 3, the application package can include applications such as camera, gallery, calendar, call, map, navigation, Bluetooth, music, video, short message, etc.
  • the application layer may also include a voice application with a voice recognition function.
  • After the voice application starts, it can collect the voice control signal sent by the user and convert the voice control signal into text for semantic understanding.
  • The voice control signal can then be converted into a touch event of the application program to complete the voice task.
  • the voice application can communicate with the background server to complete the voice task.
  • In some embodiments, a voice application consists of two parts.
  • One part is a voice service running in the background, which is used to collect the voice signals input by the user and to perform signal extraction, text conversion, voice recognition, and the like; the other part is the content displayed on the mobile phone screen, which is used to display the interface of the voice application, such as the content of the dialogue between the user and the voice application.
  • In the embodiments of this application, the mobile phone running a voice application in the background can be understood as the mobile phone running a voice service in the background.
  • the mobile phone can also display information such as the identification of the voice APP in the form of a floating menu or the like, and the embodiment of the present application does not impose any restriction on this.
  • the application framework layer provides application programming interfaces (application programming interface, API) and programming frameworks for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include a window manager, a content provider, a view system, a phone manager, a resource manager, and a notification manager.
  • the window manager is used to manage window programs.
  • the window manager can obtain the size of the display, determine whether there is a status bar, lock the screen, take a screenshot, etc.
  • the content provider is used to store and retrieve data and make these data accessible to applications.
  • the data may include video, image, audio, phone calls made and received, browsing history and bookmarks, phone book, etc.
  • the view system includes visual controls, such as controls that display text and controls that display pictures.
  • the view system can be used to build applications.
  • the display interface can be composed of one or more views.
  • a display interface that includes a short message notification icon may include a view that displays text and a view that displays pictures.
  • the phone manager is used to provide the communication function of the electronic device 100. For example, the management of the call status (including connecting, hanging up, etc.).
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, etc.
  • The notification manager enables an application to display notification information in the status bar; it can be used to convey notification-type messages, and the notification can disappear automatically after a short stay without user interaction.
  • the notification manager is used to notify the download completion, message reminder, etc.
  • The notification manager can also present notifications in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or notifications that appear on the screen in the form of a dialog window. For example, text messages are prompted in the status bar, a prompt sound is played, the electronic device vibrates, or the indicator light flashes.
  • In the embodiments of this application, the application framework layer also includes a VUI (voice user interface) manager.
  • the VUI manager can monitor the running status of voice applications, and can also be used as a bridge between voice applications and other applications, passing the voice tasks recognized by the voice applications to related applications for execution.
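The bridging role of the VUI manager might be sketched as follows. This is a minimal sketch with assumed names, not the actual framework API: it only shows the idea of forwarding a recognized voice task to the related application for execution.

```python
# Illustrative sketch: the VUI manager as a bridge between the voice
# application and other applications, passing recognized voice tasks on.

class VuiManager:
    def __init__(self):
        self.apps = {}  # application name -> task handler callable

    def register(self, name, handler):
        """Applications (or the framework on their behalf) register handlers."""
        self.apps[name] = handler

    def on_voice_task(self, app_name, task):
        """Pass a voice task recognized by the voice application to the
        related application, which executes it via its existing logic."""
        handler = self.apps.get(app_name)
        if handler is None:
            return "no such application"
        return handler(task)

vui = VuiManager()
vui.register("camera", lambda task: "camera executed: " + task)
result = vui.on_voice_task("camera", "take photo")
```

The manager itself performs no recognition; it only routes tasks, which is what lets existing applications stay unmodified.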
  • The Android runtime includes core libraries and a virtual machine, and is responsible for the scheduling and management of the Android system.
  • The core libraries consist of two parts: one part is the functions that the Java language needs to call, and the other part is the core libraries of Android.
  • the application layer and the application framework layer run in a virtual machine.
  • the virtual machine executes the java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
  • the system library can include multiple functional modules. For example: surface manager (surface manager), media library (Media Libraries), three-dimensional graphics processing library (for example: OpenGL ES), 2D graphics engine (for example: SGL), etc.
  • the surface manager is used to manage the display subsystem and provides a combination of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support multiple audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to realize 3D graphics drawing, image rendering, synthesis, and layer processing.
  • the 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer includes at least a display driver, a camera driver, an audio driver, a sensor driver, etc., which are not limited in the embodiment of the present application.
  • This application provides a voice control method that combines a voice application with the input events of the current operating system (such as virtual key input events, key input events, and screen touch events) to determine the input event corresponding to the collected voice control signal, and then reuses the operating procedure for input events of the operating system, so that the voice task can be completed without adaptive development of the application.
  • This method makes full use of the operational convenience of voice control, uses voice control when the user is inconvenient for manual operation, and combines a graphical user interface to improve the user's voice experience.
  • the GUI (graphical user interface) displayed by the mobile phone generally includes one or more controls.
  • the elements presented in the GUI can be called controls, which can provide users with certain operations. There are many types of controls, such as input boxes (EditText) and buttons (Button).
  • Figure 4 is a schematic GUI diagram of creating a new contact in the phone application of a mobile phone. As can be seen from the figure, each input box has a prompt text, such as "Name", "Company", "Phone number", "Email", and "Remarks". The buttons also carry corresponding text, such as "Add another item". When the voice control function of the phone is turned on, the phone starts the voice application in the background.
  • the user can send a voice control signal to the mobile phone through the voice application, and then the mobile phone determines the corresponding control and the type of the control from the current interface according to the voice control signal, and then performs a touch operation corresponding to the control type on the control.
  • while the mobile phone displays the interface of Figure 4, if the user utters the voice control signal "Name Zhang San", the voice application collects the voice control signal input by the user with the microphone and performs extraction, text conversion, or speech recognition on it to obtain the voice content "Name Zhang San". The mobile phone then looks up, according to the voice content "Name", the touch event corresponding to "Name" in the configuration file of the interface, which is to perform an input operation on the control 202: it first puts the focus in the input box and then calls the input method to set the voice content "Zhang San" as the input content into the input box, as shown in Figure 5.
  • the mobile phone can also display the animation effect of entering "Zhang San" as the input content, visually reminding the user that the phone is responding to the user's input of "Zhang San".
  • while the mobile phone displays the interface of Figure 4, if the user utters the voice control signal "Add another item", the voice application collects the voice control signal input by the user with the microphone and performs extraction, text conversion, or speech recognition on it to obtain the voice content "Add another item". The mobile phone then finds in the configuration file that the corresponding touch event is to perform a click operation on the control 203, so it performs the click operation on that button, as shown in FIG. 6.
  • it should be noted that the click operation on "Add another item" and its subsequent implementation can reuse the original implementation of the phone book application; developers do not need to adaptively develop the phone book application. The voice application can use a preset speech recognition algorithm to convert the voice control signal input by the user into text and perform semantic understanding, so as to find the control according to the semantically understood voice content.
  • to avoid blocking the interface the phone is displaying, the phone can start the voice application in the background.
  • the icon 201 of the voice application may be displayed on the interface shown in FIG. 4.
  • the icon 201 is used to indicate that the voice application is running in the background of the mobile phone.
  • the mobile phone can still respond to the user's various touch operations on the interface; for example, the mobile phone responds when the user clicks "Add another item".
  • it can also be set by default that when the voice application is running in the background, the mobile phone does not respond to various touch operations of the user on the interface, which is not limited in the embodiment of the present application.
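The lookup described in scenario one can be sketched as a small model: recognized voice content is matched against a per-interface mapping from a control's prompt text to a control identifier and a touch-event type. The following Python sketch is purely illustrative — the mapping contents, control IDs, and function names are assumptions for exposition, not part of the patent:

```python
# Minimal sketch of the scenario-one flow: recognized voice content is
# matched against a per-interface configuration that maps a control's
# prompt text to a control identifier and a touch-event type.
CONFIG_NEW_CONTACT = {
    "Name": {"control_id": 202, "event": "input"},
    "Add another item": {"control_id": 203, "event": "click"},
}

def dispatch(voice_content: str, config: dict) -> dict:
    """Find the longest prompt text that prefixes the voice content and
    build the corresponding simulated touch event."""
    for prompt in sorted(config, key=len, reverse=True):
        if voice_content.startswith(prompt):
            entry = config[prompt]
            event = {"control_id": entry["control_id"], "event": entry["event"]}
            if entry["event"] == "input":
                # The remainder of the utterance becomes the text to type.
                event["text"] = voice_content[len(prompt):].strip()
            return event
    return {}  # unsupported signal; could be forwarded to a server
```

For the utterance "Name Zhang San", `dispatch` would yield an input event on control 202 with the text "Zhang San", matching the behavior described above.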
  • Figure 7a shows the interface of a ticketing application. If the mobile phone is on the interface shown in Figure 7a and the user utters the voice control signal "Flight tickets", the voice application collects the voice control signal input by the user with the microphone and performs extraction, text conversion, or speech recognition on it to obtain the voice content "Flight tickets". The mobile phone then finds, according to the voice content "Flight tickets", that the corresponding touch event is to perform a click operation on the control 204, so it performs the click operation on that control, as shown in FIG. 7b. The mobile phone then switches from the interface shown in Figure 7b to the interface shown in Figure 7c.
  • if the mobile phone is on the interface shown in Figure 7c and the user utters the voice control signal "Departure Shanghai", the voice application collects the voice control signal input by the user with the microphone and performs extraction, text conversion, or speech recognition on it to obtain the voice content "Departure Shanghai". The mobile phone then finds, according to the voice content "Departure", that the corresponding touch event is to perform an input operation in the input box corresponding to "Departure", so it first puts the focus in that input box and then calls the input method to set the voice content "Shanghai" as the input content into the input box, as shown in Figure 7d.
  • further, if the mobile phone is on the interface shown in Figure 7d and the user utters the voice control signal "Destination Beijing", the voice application collects the voice control signal input by the user with the microphone and performs extraction, text conversion, or speech recognition on it to obtain the voice content "Destination Beijing". The mobile phone then finds, according to the voice content "Destination", that the corresponding touch event is to perform an input operation in the input box corresponding to "Destination", so it first puts the focus in that input box and then calls the input method to set the voice content "Beijing" as the input content into the input box, as shown in Figure 7e.
  • similarly, if the mobile phone is on the interface shown in Figure 7e and the user utters the voice control signal "Time March 6th", the voice application collects the voice control signal input by the user with the microphone and performs extraction, text conversion, or speech recognition on it to obtain the voice content "Time March 6th". The mobile phone then finds, according to the voice content "Time", that the corresponding touch event is to perform an input operation in the input box corresponding to "Time", so it first puts the focus in that input box and then calls the input method to set the voice content "March 6th" as the input content into the input box, as shown in Figure 7f.
  • if the user utters the voice control signal "Search", the voice application collects the voice control signal input by the user with the microphone and performs extraction, text conversion, or speech recognition on it to obtain the voice content "Search". The mobile phone then finds, according to the voice content "Search", that the corresponding touch event is to perform a click operation on the "Search" control, as shown in Figure 7g.
  • the embodiment of the present application combines the voice control function with the graphical user interface to realize that the existing graphical user interface supports voice control, improves the voice experience, and has a smaller development workload.
  • the background server extracts keywords based on the key information "departure”, “destination”, and “time” required by the pre-configured "ticket booking” task, and generates a VUI task.
  • the background server converts the VUI task into a corresponding control instruction and sends it to the corresponding application program.
  • the application program responds with a pre-customized code and displays the interface as shown in Figure 8b.
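The server-side slot filling described for the preconfigured "ticket booking" task can be sketched as follows. This is a hypothetical Python illustration — the slot names, utterance pattern, and function name are assumptions, not taken from the patent:

```python
import re

# Hypothetical sketch of the server-side keyword extraction for the
# preconfigured "ticket booking" task: the key information slots
# "departure", "destination", and "time" are filled from the utterance
# to form a VUI task.
TICKET_SLOTS = ["departure", "destination", "time"]

def build_vui_task(utterance: str) -> dict:
    """Fill the ticket-booking slots from a normalized utterance of the
    assumed form '... from X to Y on Z'."""
    m = re.search(r"from (\w+) to (\w+) on (.+)", utterance)
    if not m:
        return {}  # task type not recognized; no VUI task generated
    return dict(zip(TICKET_SLOTS, m.groups()))
```

For example, the background-style request "book a ticket from Shanghai to Shenzhen on tomorrow morning" would produce a VUI task with all three slots filled.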
  • this embodiment of the application can also provide a voice control method that combines the Talkback function with the voice control function; that is, the user turns on both the Talkback function switch and the voice wake-up function switch, as shown in Figure 9.
  • the voice application collects the voice control signal input by the user with the microphone and performs extraction, text conversion, or speech recognition on it to obtain the voice content "Zhang San", and then calls the input method to set the voice content "Zhang San" as the input content into the input box, as shown in Figure 10b.
  • the mobile phone can also announce "Zhang San input completed" by voice to remind the user that the operation was successful.
  • the above operation method is more convenient and efficient than the traditional voice assistance function, making it easier for blind users and users with low vision to operate the mobile phone, and further improves the user experience.
  • Step 301 The electronic device displays the first interface of the application.
  • the first interface includes one or more controls for updating the first interface.
  • the first interface displayed by the mobile phone is the interface shown in FIG. 4, and multiple controls such as a button "add other items" and an input box are provided in the interface.
  • the user can operate these controls to update the display content of the mobile phone, so that the mobile phone displays the updated second interface.
  • Step 302 The electronic device collects the user's voice control signal.
  • the mobile phone may set the microphone to be always on. Then, while the mobile phone displays an application interface (for example, the first interface), the microphone of the mobile phone also collects voice control signals at a certain working frequency.
  • the user can start the voice application of the mobile phone by issuing a wake-up signal, and the mobile phone then collects the user's voice control signal through the voice application and performs extraction, text conversion, or speech recognition on it. For example, after the user utters the sound signal "Xiaoyi Xiaoyi", the mobile phone can collect the sound signal through the microphone. If that sound signal is the preset wake-up signal, the mobile phone starts the voice application to collect the voice control signal.
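The wake-up gating described above can be modeled minimally: everything heard before the preset wake-up signal is ignored, and only afterward are utterances collected as voice control signals. This Python sketch is illustrative; the wake word and class shape are assumptions:

```python
# Hypothetical sketch of the wake-up gating described above: the microphone
# stream is scanned continuously, and only after the preset wake-up word is
# heard does the device start treating utterances as voice control signals.
WAKE_WORD = "Xiaoyi Xiaoyi"  # preset wake-up signal from the example

class VoiceApp:
    def __init__(self):
        self.active = False      # voice application not yet started
        self.collected = []      # voice control signals gathered after wake-up

    def on_sound(self, utterance: str):
        if not self.active:
            # Before wake-up, everything except the wake word is ignored.
            if utterance == WAKE_WORD:
                self.active = True
        else:
            # After wake-up, utterances are collected as control signals.
            self.collected.append(utterance)
```

A real implementation would run the wake-word check on audio frames rather than transcribed strings; the string comparison here only stands in for that detector.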
  • Step 303 The electronic device determines a touch event corresponding to the voice control signal.
  • a touch event refers to a touch operation performed on a control.
  • the electronic device may pre-store the configuration files of each application, for example, each application corresponds to one or more configuration files.
  • the configuration file records the correspondence between touch events and voice control signals in different interfaces of an application.
  • a configuration file can also only record the correspondence between touch events and voice control signals in one interface of an application.
  • All controls on the Android-based interface are mounted under a DecorView node under the window of the current interface.
  • the Android software system can scan each control identifier from the DecorView and compare it with the text information of the voice control signal spoken by the user, so as to determine the target control identifier that matches the text information of the voice control signal, and then look up the touch event corresponding to the target control identifier in the configuration file.
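The DecorView scan can be pictured as a tree walk that compares each control's identifier with the recognized text. The Python classes below are simplified stand-ins for the Android view hierarchy (the real scan would traverse `View` objects under the window's DecorView); names and fields are assumptions:

```python
# Illustrative model of the DecorView scan: walk the view tree rooted at
# the current window's DecorView, take each control's identifier (here,
# its visible text), and compare it with the recognized voice text to
# find the target control.
class View:
    def __init__(self, text="", children=None):
        self.text = text
        self.children = children or []

def find_target(root: View, voice_text: str):
    """Depth-first scan returning the first control whose identifier
    appears at the start of the recognized voice text."""
    stack = [root]
    while stack:
        view = stack.pop()
        if view.text and voice_text.startswith(view.text):
            return view
        stack.extend(reversed(view.children))
    return None
```

With a DecorView containing a "Name" input box and an "Add another item" button, the utterance "Name Zhang San" resolves to the "Name" control, mirroring the matching step described above.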
  • the developer can set the configuration file 1 of the new contact interface in the installation package of the phone book application.
  • the configuration file 1 records the corresponding relationship between each touch event and voice control signal in the new contact interface.
  • the input event of the "Name” input box corresponds to the control identifier "Name”
  • the control identifier "Name” corresponds to voice control.
  • the text information of the signal corresponds.
  • the click operation of "add other item” corresponds to the control identifier "add other item”
  • the control identifier "add other item” corresponds to the text information "add other item” of the voice control signal.
  • when the electronic device receives the voice control signal, it can find the touch event corresponding to the voice control signal in the configuration file. That is to say, the configuration file 1 records the correspondence between the voice control signal and the touch event of clicking the first control in the first interface.
  • when the mobile phone receives the voice control signal of the user saying "Name", it is equivalent to the mobile phone detecting that the user clicks the "Name" input box, and the focus falls in that input box.
  • the electronic device can directly install the configuration file locally.
  • the configuration file 1 provided in the phonebook application installation package can be stored in the memory of the mobile phone. In this way, the mobile phone can support the voice control function even if it is not connected to the Internet.
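A configuration file of the kind described for the new-contact interface might look like the following. The patent does not specify a file format, so the JSON layout, field names, and lookup helper here are purely illustrative assumptions:

```python
import json

# Purely illustrative content for "configuration file 1" of the
# new-contact interface: each entry ties the text information of a voice
# control signal to a control identifier and the recorded touch event.
CONFIG_FILE_1 = json.dumps({
    "interface": "new_contact",
    "mappings": [
        {"voice_text": "Name", "control_id": "Name", "touch_event": "input"},
        {"voice_text": "Add another item", "control_id": "Add another item",
         "touch_event": "click"},
    ],
})

def lookup_touch_event(config_json: str, voice_text: str):
    """Return the touch event recorded for the given voice text, if any."""
    config = json.loads(config_json)
    for m in config["mappings"]:
        if m["voice_text"] == voice_text:
            return m["touch_event"]
    return None
```

Because such a file ships in the application installation package and is stored locally, the lookup works without any network connection, which is the point made above.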
  • Step 304 In response to the voice control signal, the electronic device executes the touch event and displays a second interface of the application, where the second interface is the interface displayed after the touch operation is performed on the first control in the first interface.
  • corresponding configuration files can be set for each interface in the application, and the configuration file records the voice control signal supported by the corresponding interface and the touch event corresponding to the voice control signal.
  • the electronic device can determine, according to the configuration file of the interface, the touch event corresponding to the voice control signal input by the user, and then execute the touch event, thereby implementing voice control of the function of each control in the application interface.
  • that is, the electronic device can implement the voice control function at the granularity of each operation button within an application interface, thereby improving voice control efficiency and user experience.
  • if the electronic device determines that the voice control signal input by the user is not a voice control signal supported by the configuration file, it can also send the voice control signal to the background server, and the background server determines the task type and extracts key information to generate a VUI task.
  • the background server converts the VUI task into a corresponding control instruction and sends it to the corresponding application program. For a specific example, see scenario three.
  • controls in the interfaces of some applications of the electronic device may not display a name or text prompt; for this type of control, the embodiment of the present application can provide text prompt information in the interface.
  • generally, such controls are configured with "android:contentDescription" information. This type of control can directly reuse the configured text description of android:contentDescription; that is, the text description is displayed in the interface as the text prompt for this type of control.
  • some touch events may be preset in the embodiment of the present application, such as an upward movement operation corresponding to the voice control signal "up", a downward movement operation corresponding to the voice control signal "bottom", a leftward movement operation corresponding to the voice control signal "left", and a rightward movement operation corresponding to the voice control signal "right". These are used to simulate direction-stick operations or the up, down, left, and right key operations of a keyboard or handle, so as to move the focus of the current control.
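The preset directional commands can be sketched as a fixed mapping from voice words to simulated arrow-key operations that move the focus. The `KEYCODE_DPAD_*` names follow Android's key-code naming, but the grid model and function are illustrative assumptions:

```python
# Sketch of the preset directional voice commands described above: each
# voice word maps to a simulated arrow-key operation that moves the
# focus of the current control.
DIRECTION_KEYS = {
    "up": "KEYCODE_DPAD_UP",
    "bottom": "KEYCODE_DPAD_DOWN",
    "left": "KEYCODE_DPAD_LEFT",
    "right": "KEYCODE_DPAD_RIGHT",
}

def move_focus(pos, voice_word):
    """Apply the key operation for a directional voice word to a
    (row, col) focus position on a hypothetical control grid."""
    deltas = {
        "KEYCODE_DPAD_UP": (-1, 0), "KEYCODE_DPAD_DOWN": (1, 0),
        "KEYCODE_DPAD_LEFT": (0, -1), "KEYCODE_DPAD_RIGHT": (0, 1),
    }
    key = DIRECTION_KEYS.get(voice_word)
    if key is None:
        return pos  # not a preset directional command; focus unchanged
    row, col = pos
    dr, dc = deltas[key]
    return (row + dr, col + dc)
```

Because these commands are preset rather than read from a per-interface configuration file, they work uniformly across interfaces.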
  • the core of the voice control method provided by the embodiments of the application is to determine the touch event according to the voice control signal, that is, to find the corresponding control and then simulate the corresponding touch event (such as a click or long press), input method event (such as text input), or key operation (such as up, down, left, and right movement); the GUI of the application does not need to be adaptively developed to implement specific voice control functions.
  • an embodiment of the present application discloses an electronic device, including: a touch screen 1201, where the touch screen 1201 includes a touch-sensitive surface 1206 and a display screen 1207; one or more processors 1202; a memory 1203; a communication module 1208; one or more application programs (not shown); and one or more computer programs 1204. The above devices may be connected through one or more communication buses 1205.
  • the one or more computer programs 1204 are stored in the memory 1203 and configured to be executed by the one or more processors 1202. The one or more computer programs 1204 include instructions that can be used to execute each step in the foregoing embodiments; for example, the instructions can be used to execute each step shown in FIG. 11.
  • the embodiments of the present application also provide a computer-readable storage medium that stores computer instructions. When the computer instructions run on an electronic device, the electronic device executes the above related method steps to implement the method in the above embodiments.
  • the embodiments of the present application also provide a computer program product, which when the computer program product runs on a computer, causes the computer to execute the above-mentioned related steps to implement the method in the above-mentioned embodiment.
  • the embodiments of the present application also provide a device.
  • the device may specifically be a chip, component or module.
  • the device may include a processor and a memory that are connected, where the memory is used to store computer-executable instructions.
  • the processor can execute the computer-executable instructions stored in the memory, so that the chip executes the methods in the foregoing method embodiments.
  • the electronic devices, computer storage media, computer program products, and chips provided in the embodiments of this application are all used to execute the corresponding methods provided above. Therefore, for the beneficial effects they can achieve, reference may be made to the beneficial effects of the corresponding methods provided above, which are not repeated here.
  • the disclosed device and method may be implemented in other ways.
  • the device embodiments described above are merely illustrative.
  • the division of modules or units is only a logical function division; in actual implementation, there may be other division methods. For example, multiple units or components may be combined or integrated into another device, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separate, and the components displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place, or they may be distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • each unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • based on this understanding, the technical solutions of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for causing a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.


Abstract

A voice control method and an electronic device, relating to the field of communications technology, capable of prompting a user to execute a voice task related to an application while the application is running, thereby improving the voice control efficiency and user experience of the electronic device. The method includes: displaying a first interface of an application, where the first interface includes a control for updating the first interface (301); collecting a voice control signal of a user (302); determining a touch event corresponding to the voice control signal, where the touch event is a touch operation performed on the control (303); and in response to the voice control signal, executing, by the electronic device, the touch event and displaying a second interface of the application, where the second interface is the interface displayed after the touch operation is performed on the control in the first interface (304).

Description

Voice control method and electronic device
This application claims priority to the Chinese patent application with application number 201910176543.1, entitled "Voice Control Method and Electronic Device", filed with the China National Intellectual Property Administration on March 8, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of terminal technologies, and in particular, to a voice control method and an electronic device.
Background
The goal of speech recognition technology is to convert the lexical content of human speech into computer-readable input. At present, many mobile phones are equipped with voice assistants for speech recognition (for example, Xiao Ai, Siri, and Xiao E). Generally, a mobile phone presets one or more wake-up signals (for example, a tapping signal or a wake-up word such as "Hello, Xiao E"). When it is detected that the user inputs one of these wake-up signals, it indicates that the user intends to use the speech recognition function at this moment, so the mobile phone is triggered to start a voice application to perform speech recognition and then execute the corresponding voice task.
At present, the voice tasks that a mobile phone can recognize and execute, such as a voice task for querying the weather or a voice task for booking flight tickets, are preconfigured before the phone leaves the factory. To implement these voice tasks, task types need to be configured on the background server corresponding to the voice assistant, and a dialogue flow is then designed according to the task type so as to obtain the information corresponding to that task type. As shown in Figure 1, when the user says "book me a flight from Shanghai to Shenzhen tomorrow morning" to the mobile phone, the voice assistant of the mobile phone collects the voice control signal and sends it to the background server. The background server first reads the task type "book flight tickets", then extracts keywords from the voice control signal according to the key information "departure", "destination", and "time" required by the preconfigured "book flight tickets" task type, and generates a voice user interface (VUI) task. The background server converts the VUI task into a corresponding control instruction and sends it to the corresponding application, which responds through pre-customized code and outputs the query result. It can be seen that the prior art requires the background server to preconfigure task types and key information, which involves a large amount of task configuration; moreover, to adapt to voice tasks, developers also need to adaptively develop the applications that support voice interaction.
Summary
This application provides a voice control method and an electronic device, which support voice control by combining it with a graphical user interface, improving the user's voice control experience with a small development workload.
According to a first aspect, an embodiment of this application provides a voice control method applicable to an electronic device. The method includes: the electronic device displays a first interface of an application, where the first interface includes a control for updating the first interface; the electronic device then collects a voice control signal of a user; when a touch event corresponding to the voice control signal is determined, the electronic device executes the corresponding touch event in response to the voice control signal and finally displays a second interface of the application, where the second interface is the interface displayed after the touch operation is performed on the control in the first interface.
In this embodiment of the application, the electronic device determines the corresponding input event according to the collected voice control signal, and then reuses the operating procedure for input events of the operating system, so that the voice task can be completed without adaptive development of the application. This method makes full use of the operational convenience of voice control, allows voice control to be used when manual operation is inconvenient for the user, and combines it with the graphical user interface to improve the user's voice experience.
In a possible design, the electronic device first obtains a configuration file associated with the first interface, where the configuration file includes the correspondence between the control identifiers of the controls in the first interface and touch events; the electronic device can therefore determine the target control identifier that matches the text information of the voice control signal, and then look up the touch event corresponding to the target control identifier in the configuration file.
In this embodiment of the application, when running an interface of the application, the electronic device can determine the touch event corresponding to the voice control signal input by the user according to the configuration file of that interface, and then execute the touch event, thereby implementing voice control of the function of each control in the application interface.
In a possible design, the electronic device may also display an animation effect while the touch operation is performed on the control in the first interface.
In this embodiment of the application, displaying an animation effect reminds the user that the device is currently responding to the user's voice control, improving the user experience.
In a possible design, the electronic device may first start a voice application in the background in response to a wake-up signal input by the user, and then collect the user's voice control signal through the voice application.
In this embodiment of the application, the voice application is combined with the input events of the current operating system (for example, virtual key input events, key input events, and screen touch events) to determine the input event corresponding to the collected voice control signal, and the operating procedure for input events of the operating system is then reused, so that the voice task can be completed without adaptive development of the application.
In a possible design, when a touch operation by the user on the control in the first interface is detected, the electronic device may also execute the touch operation.
In this embodiment of the application, combining touch operation with voice control makes it convenient for the user to use the voice control function together with the touch function, improving user experience and operating efficiency.
According to a second aspect, an embodiment of this application provides an electronic device, including a processor and a memory. The memory is used to store one or more computer programs; when the one or more computer programs stored in the memory are executed by the processor, the electronic device is enabled to implement the method of any possible design of any of the above aspects.
According to a third aspect, an embodiment of this application further provides an apparatus, which includes modules/units that execute the method of any possible design of any of the above aspects. These modules/units may be implemented by hardware, or by hardware executing corresponding software.
According to a fourth aspect, an embodiment of this application further provides a computer-readable storage medium that includes a computer program. When the computer program runs on an electronic device, the electronic device is caused to execute the method of any possible design of any of the above aspects.
According to a fifth aspect, an embodiment of this application further provides a computer program product. When the computer program product runs on an electronic device, the electronic device is caused to execute the method of any possible design of any of the above aspects.
These and other aspects of this application will be more clearly understood in the description of the following embodiments.
Brief Description of the Drawings
Figure 1 is a schematic diagram of a voice control system provided by an embodiment of this application;
Figure 2 is a schematic structural diagram of a mobile phone provided by an embodiment of this application;
Figure 3 is a schematic architectural diagram of the operating system in an electronic device provided by an embodiment of this application;
Figure 4 is a schematic diagram of an interface provided by an embodiment of this application;
Figure 5 is a schematic scenario diagram of a voice control method provided by an embodiment of this application;
Figure 6 is a schematic scenario diagram of another voice control method provided by an embodiment of this application;
Figure 7a is a schematic diagram of another interface provided by an embodiment of this application;
Figures 7b to 7g are schematic scenario diagrams of another voice control method provided by an embodiment of this application;
Figure 8 is a schematic scenario diagram of another voice control method provided by an embodiment of this application;
Figure 9 is a schematic interface diagram of the voice assistance function switch and the voice wake-up function switch provided by an embodiment of this application;
Figures 10a to 10b are schematic scenario diagrams of another voice control method provided by an embodiment of this application;
Figure 11 is a schematic flowchart of a voice control method provided by an embodiment of this application;
Figure 12 is a schematic structural diagram of an electronic device provided by an embodiment of this application.
Detailed Description of Embodiments
The implementations of the embodiments are described in detail below with reference to the accompanying drawings.
The voice control method provided by the embodiments of this application may be applied to electronic devices such as mobile phones, tablet computers, desktop computers, laptop computers, notebook computers, ultra-mobile personal computers (UMPC), handheld computers, netbooks, personal digital assistants (PDA), wearable electronic devices, and virtual reality devices, which is not limited in any way by the embodiments of this application.
Taking a mobile phone as an example of the electronic device, Figure 2 shows a schematic structural diagram of the mobile phone.
The mobile phone may include a processor 110, an external memory interface 120, an internal memory 121, a USB interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a SIM card interface 195, and so on. The sensor module 180 may include a gyroscope sensor 180A, an acceleration sensor 180B, a proximity light sensor 180G, a fingerprint sensor 180H, a touch sensor 180K, and a hinge sensor 180M (of course, the mobile phone 100 may also include other sensors, such as a temperature sensor, a pressure sensor, a distance sensor, a magnetic sensor, an ambient light sensor, a barometric pressure sensor, and a bone conduction sensor, not shown in the figure).
It can be understood that the structure illustrated in this embodiment of the present invention does not constitute a specific limitation on the mobile phone 100. In other embodiments of this application, the mobile phone 100 may include more or fewer components than shown, or combine certain components, or split certain components, or use a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), and so on. Different processing units may be independent devices or may be integrated into one or more processors. The controller may be the nerve center and command center of the mobile phone 100. The controller can generate operation control signals according to instruction operation codes and timing signals to complete the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory can hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, they can be called directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving the efficiency of the system.
The processor 110 can run the voice control method provided by the embodiments of this application. The method converts a voice control signal into an existing touch event, thereby enabling the existing graphical user interface to support voice interaction, reducing the development workload, and enhancing the voice interaction function of the electronic device. When the processor 110 integrates different devices, for example a CPU and a GPU, the CPU and GPU can cooperate in executing the voice control method provided by the embodiments of this application; for example, some of the algorithms in the method are executed by the CPU and others by the GPU, to obtain faster processing efficiency.
The display screen 194 is used to display images, videos, and so on. The display screen 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, a quantum dot light emitting diode (QLED), and so on. In some embodiments, the mobile phone 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
In this embodiment of the application, the display screen can accept the user's touch operations and display the graphical user interface; in addition, when a voice control signal is received, the display screen can also display the animation effect of executing the touch event corresponding to the voice control signal and the interface after execution.
The camera 193 (front camera or rear camera) is used to capture still images or video. Generally, the camera 193 may include a photosensitive element such as a lens group and an image sensor, where the lens group includes a plurality of lenses (convex or concave) for collecting the light signal reflected by the object to be photographed and passing the collected light signal to the image sensor. The image sensor generates the original image of the object to be photographed according to the light signal.
The internal memory 121 may be used to store computer-executable program code, which includes instructions. By running the instructions stored in the internal memory 121, the processor 110 executes the various functional applications and data processing of the mobile phone 100. The internal memory 121 may include a program storage area and a data storage area. The program storage area may store the operating system and the code of applications (such as a camera application and a WeChat application). The data storage area may store data created during the use of the mobile phone 100 (such as images and videos collected by the camera application).
In addition, the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or a universal flash storage (UFS).
The functions of the sensor module 180 are described below.
The gyroscope sensor 180A can be used to determine the motion attitude of the mobile phone 100. In some embodiments, the angular velocity of the electronic device 100 around three axes (i.e., the x, y, and z axes) can be determined by the gyroscope sensor 180A. That is, the gyroscope sensor 180A can be used to detect the current motion state of the mobile phone 100, for example shaking or stationary.
The acceleration sensor 180B can detect the magnitude of the acceleration of the mobile phone 100 in various directions (generally along three axes). The proximity light sensor 180G may include, for example, a light-emitting diode (LED) and a light detector, such as a photodiode. The light-emitting diode may be an infrared light-emitting diode. The mobile phone emits infrared light outward through the light-emitting diode and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, it can be determined that there is an object near the mobile phone; when insufficient reflected light is detected, the mobile phone can determine that there is no object near it.
The gyroscope sensor 180A (or the acceleration sensor 180B) can send the detected motion state information (such as angular velocity) to the processor 110. The processor 110 determines, based on the motion state information, whether the phone is currently in a handheld state or a tripod state (for example, when the angular velocity is not 0, it indicates that the mobile phone 100 is in a handheld state).
The fingerprint sensor 180H is used to collect fingerprints. The mobile phone 100 can use the characteristics of the collected fingerprint to implement fingerprint unlocking, application lock access, fingerprint photographing, fingerprint answering of incoming calls, and so on.
The touch sensor 180K is also called a "touch panel". The touch sensor 180K may be arranged on the display screen 194, and the touch sensor 180K and the display screen 194 together form a touchscreen, also called a "touch screen". The touch sensor 180K is used to detect touch operations acting on or near it. The touch sensor can pass the detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation can be provided through the display screen 194. In other embodiments, the touch sensor 180K may also be arranged on the surface of the mobile phone 100 at a position different from that of the display screen 194.
Illustratively, the display screen 194 of the mobile phone 100 displays a home screen that includes the icons of multiple applications (such as a camera application and a WeChat application). The user taps the icon of the camera application on the home screen through the touch sensor 180K, which triggers the processor 110 to start the camera application and turn on the camera 193. The display screen 194 then displays the interface of the camera application, for example the viewfinder interface.
The wireless communication function of the mobile phone 100 can be implemented by the antenna 1, the antenna 2, the mobile communication module 151, the wireless communication module 152, the modem processor, the baseband processor, and so on.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the terminal device 100 can be used to cover a single or multiple communication frequency bands. Different antennas can also be reused to improve antenna utilization. For example, the antenna 1 can be reused as a diversity antenna of a wireless local area network. In other embodiments, the antennas can be used in combination with a tuning switch.
The mobile communication module 151 can provide solutions for wireless communication including 2G/3G/4G/5G applied to the terminal device 100. The mobile communication module 151 may include at least one filter, switch, power amplifier, low noise amplifier (LNA), and so on. The mobile communication module 151 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 151 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves through the antenna 1 for radiation. In some embodiments, at least some of the functional modules of the mobile communication module 151 may be provided in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 151 and at least some of the modules of the processor 110 may be provided in the same device.
The modem processor may include a modulator and a demodulator. The modulator is used to modulate the low-frequency baseband signal to be sent into a medium- or high-frequency signal. The demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal, and then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs a sound signal through an audio device (not limited to the speaker 170A and the receiver 170B), or displays images or videos through the display screen 194. In some embodiments, the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and be provided in the same device as the mobile communication module 150 or other functional modules.
The wireless communication module 152 can provide solutions for wireless communication applied to the terminal device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite systems (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology. The wireless communication module 152 may be one or more devices integrating at least one communication processing module. The wireless communication module 152 receives electromagnetic waves via the antenna 2, frequency-modulates and filters the electromagnetic wave signal, and sends the processed signal to the processor 110. The wireless communication module 152 can also receive the signal to be sent from the processor 110, frequency-modulate and amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
In addition, the mobile phone 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, and the application processor. The mobile phone 100 can receive input from the keys 190 and generate key signal input related to user settings and function control of the mobile phone 100. The mobile phone 100 can use the motor 191 to generate vibration alerts (such as incoming call vibration alerts). The indicator 192 in the mobile phone 100 may be an indicator light, which can be used to indicate the charging state and battery level changes, and can also be used to indicate messages, missed calls, notifications, and so on. The SIM card interface 195 in the mobile phone 100 is used to connect a SIM card. The SIM card can be inserted into or pulled out of the SIM card interface 195 to achieve contact with and separation from the mobile phone 100.
It should be understood that in practical applications, the mobile phone 100 may include more or fewer components than those shown in Figure 2, which is not limited in the embodiments of this application.
上述电子设备100的软件***可以采用分层架构,事件驱动架构,微核架构,微服务架构,或云架构。本申请实施例以分层架构的Android***为例,示例性说明电子设备100的软件结构。
图3是本申请实施例的电子设备100的软件结构框图。
分层架构将软件分成若干个层,每一层都有清晰的角色和分工。层与层之间通过软件接口通信。在一些实施例中,将Android***分为四层,从上至下分别为应用程序层,应用程序框架层,安卓运行时(Android runtime)和***库,以及内核层。
应用程序层可以包括一系列应用程序包。如图3所示,应用程序包可以包括相机,图库,日历,通话,地图,导航,蓝牙,音乐,视频,短信息等应用程序。
在本申请实施例中,应用程序层中还可以包括具有语音识别功能的语音应用。例如,语音助手小E、小爱同学以及Siri等。
语音应用开启后可采集用户发出的语音控制信号,并将该语音控制信号转换为文本并进行语义理解。一种情况下,语音应用可被转换成应用程序的触控事件,以完成该语音任务,另一种情况下,语音应用可以与后台服务器进行通信,以完成语音任务。
一般,语音应用包括两部分,一部分是运行在后台的语音服务(service),用于采集用户输入的声音信号、对声音信号进行提取、文本转换或语音识别等,另一部分是指在手机屏幕中的显示内容,用于展示语音应用的界面,例如用户与语音应用的对话内容等。在本申请实施例中,可将手机在后台运行语音应用理解为手机在后台运行语音服务。当然,在后台运行语音服务时,手机也可以以悬浮菜单等形式显示语音APP的标识等信息,本申请实施例对此不做任何限制。
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions.
As shown in Figure 3, the application framework layer may include a window manager, a content provider, a view system, a phone manager, a resource manager, a notification manager, and so on.
The window manager is used to manage window programs. The window manager can obtain the display screen size, determine whether there is a status bar, lock the screen, capture the screen, and so on.
The content provider is used to store and retrieve data and make it accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, the phone book, and so on.
The view system includes visual controls, such as controls that display text and controls that display pictures. The view system can be used to build applications. A display interface can be composed of one or more views. For example, a display interface that includes a short message notification icon may include a view that displays text and a view that displays pictures.
The phone manager is used to provide the communication function of the electronic device 100, for example the management of call status (including connecting, hanging up, and so on).
The resource manager provides various resources for applications, such as localized strings, icons, pictures, layout files, and video files.
The notification manager enables an application to display notification information in the status bar. It can be used to convey notification-type messages that disappear automatically after a short stay without user interaction; for example, the notification manager is used to announce download completion, message reminders, and so on. The notification manager may also present a notification that appears in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification for an application running in the background, or a notification that appears on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt sound is issued, the electronic device vibrates, or the indicator light flashes.
In this embodiment of the application, the application framework layer also includes a VUI (voice user interface) manager. The VUI manager can monitor the running status of the voice application and can also serve as a bridge between the voice application and other applications, passing the voice tasks recognized by the voice application to the relevant application for execution.
The Android runtime includes core libraries and a virtual machine. The Android runtime is responsible for the scheduling and management of the Android system.
The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
The system library can include multiple functional modules, for example: the surface manager, the media libraries, the 3D graphics processing library (for example, OpenGL ES), and the 2D graphics engine (for example, SGL).
The surface manager is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
The media libraries support playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media libraries can support multiple audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
The 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is the layer between hardware and software. The kernel layer contains at least a display driver, a camera driver, an audio driver, and a sensor driver, which are not limited in any way by the embodiments of this application.
This application provides a voice control method that combines the voice application with the input events of the current operating system (for example, virtual-key input events, physical-key input events, and screen touch events): the method determines the input event corresponding to a collected voice control signal and then reuses the operating system's input-event processing flow, so that a voice task can be completed without adaptive development in the application. The method makes full use of the operational convenience of voice control when manual operation is inconvenient for the user, and at the same time combines it with the graphical user interface to improve the user's voice experience.
The voice control method provided by the embodiments of this application is described in detail below with reference to the accompanying drawings and application scenarios.
Scenario 1
The GUI (graphical user interface) displayed by a phone generally includes one or more controls. In general, the elements presented in a GUI are called controls, and they offer the user certain operations. There are many types of controls, such as input boxes (EditText) and buttons (Button). FIG. 4 is a schematic diagram of the new-contact GUI in the phone's dialer application. As can be seen from the figure, each input box has a prompt text, such as "姓名" (name), "工作单位" (employer), "电话号码" (phone number), "电子邮件" (email), and "备注" (notes). The buttons also carry text, such as "添加其它项" (add other fields). When the phone's voice control function is turned on, the phone starts the voice application in the background. The user can issue a voice control signal to the phone through the voice application; the phone then determines, based on the voice control signal, the corresponding control in the current interface and its type, and performs on the control the touch operation corresponding to that control type.
For example, while the phone displays the interface of FIG. 4, if the user issues the voice control signal "姓名张三" ("name: Zhang San"), the voice application uses the microphone to collect the signal and performs extraction, text conversion, or speech recognition on it, obtaining the voice content "姓名张三". Based on the voice content "姓名", the phone looks up in the interface's configuration file the corresponding touch event, which is an input operation on control 202: it first places the focus in that input box, and then invokes the input method to set the voice content "张三" into the box as the input content, as shown in FIG. 5.
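The label-matching step described above can be sketched as follows. This is a hypothetical Python illustration, not the patent's implementation: given the recognized text and the prompt texts of the controls on the current screen, find the longest control label that prefixes the text, and treat the remainder as the content to type.

```python
# Hypothetical sketch: split recognized text like "姓名张三" into a control
# label ("姓名") and the content to type into that control ("张三").

def match_voice_to_control(text, control_labels):
    """Return (label, remainder) for the longest label that prefixes text."""
    best = None
    for label in control_labels:
        if text.startswith(label) and (best is None or len(label) > len(best)):
            best = label
    if best is None:
        return None  # no control on this screen matches the utterance
    return best, text[len(best):]

# Labels taken from the new-contact interface of FIG. 4.
labels = ["姓名", "工作单位", "电话号码", "电子邮件", "备注", "添加其它项"]
print(match_voice_to_control("姓名张三", labels))    # ('姓名', '张三')
print(match_voice_to_control("添加其它项", labels))  # ('添加其它项', '')
```

A button command such as "添加其它项" matches a label exactly, leaving an empty remainder, which distinguishes a click from an input operation.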
In addition, in FIG. 5a, the phone can also display the animation effect of entering "张三" as input, visually indicating to the user that the phone is responding to the input of "张三".
As another example, while the phone displays the interface of FIG. 4, if the user issues the voice control signal "添加其它项", the voice application collects the signal through the microphone and performs extraction, text conversion, or speech recognition on it, obtaining the voice content "添加其它项". Based on this voice content, the phone finds in the configuration file that the corresponding touch event is a click operation on control 203, and therefore performs a click on that button, as shown in FIG. 6.
It should be noted that the click operation on "添加其它项" and its subsequent handling can reuse the contacts application's existing implementation; developers do not need to adapt the contacts application. The voice application can use a preset speech recognition algorithm to convert the user's voice control signal into text and perform semantic understanding on it, and then look up the control based on the semantically understood voice content.
To avoid blocking the interface the phone is displaying, the phone can start the voice application in the background. For example, as shown in FIG. 4, after starting the voice application in the background, the phone can display the voice application's icon 201 in the interface of FIG. 4. The icon 201 indicates that the voice application is running in the background. Although the voice application runs in the background, the phone can still respond to the user's touch operations in the interface; for example, when the user clicks "添加其它项", the phone responds. Of course, it is also possible to set by default that the phone does not respond to the user's touch operations in the interface while the voice application runs in the background; the embodiments of this application place no restriction on this.
Scenario 2
FIG. 7a shows the interface of a ticketing application. If the phone is on the interface of FIG. 7a and the user issues the voice control signal "机票" ("flight tickets"), the voice application collects the signal through the microphone and performs extraction, text conversion, or speech recognition on it, obtaining the voice content "机票". The phone then finds that the corresponding touch event is a click operation on control 204, performs the click as shown in FIG. 7b, and switches from the interface of FIG. 7b to that of FIG. 7c. If the phone is on the interface of FIG. 7c and the user issues the voice control signal "出发地上海" ("origin: Shanghai"), the voice application likewise obtains the voice content "出发地上海". Based on "出发地", the phone finds that the corresponding touch event is an input operation in the input box for "出发地"; it therefore first places the focus in that box and then invokes the input method to set "上海" into it as the input content, as shown in FIG. 7d.
Further, if the phone is on the interface of FIG. 7d and the user issues "目的地北京" ("destination: Beijing"), the voice application obtains the voice content "目的地北京". Based on "目的地", the phone finds that the corresponding touch event is an input operation in the input box for "目的地", places the focus there, and sets "北京" into the box through the input method, as shown in FIG. 7e. Similarly, if the phone is on the interface of FIG. 7e and the user issues "时间3月6号" ("date: March 6"), the voice application obtains the voice content "时间3月6号". Based on "时间", the phone finds that the corresponding touch event is an input operation in the input box for "时间", places the focus there, and sets "3月6号" into the box through the input method, as shown in FIG. 7f.
Finally, if the phone is on the interface of FIG. 7f and the user issues the voice control signal "搜索" ("search"), the voice application obtains the voice content "搜索", and the phone finds that the corresponding touch event is a click operation on the "搜索" control, as shown in FIG. 7g.
It can be seen that, compared with the prior art, the embodiments of this application combine the voice control function with the graphical user interface, so that an existing GUI supports voice control, improving the voice experience with little development workload.
Scenario 3
Referring to FIG. 8, while the phone displays the contacts application's new-contact interface as shown in FIG. 8a, if the user issues the voice control signal "帮我订一张明天早上从上海到深圳的机票" ("book me a flight from Shanghai to Shenzhen tomorrow morning"), the voice application collects the signal through the microphone and performs extraction, text conversion, or speech recognition on it, obtaining the corresponding voice content. When the phone fails to find a matching touch event in the configuration file of the interface shown in FIG. 8a, it forwards the voice control signal to a backend server. The server extracts keywords according to the key information preconfigured for the "订机票" (book a flight) task, namely "出发地" (origin), "目的地" (destination), and "时间" (time), and generates a VUI task. The server converts the VUI task into corresponding control instructions and sends them to the corresponding application, which responds through custom-written code and displays the interface shown in FIG. 8b.
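The server-side slot extraction described in this scenario might look like the following sketch. This is hypothetical: a real backend would use a trained language-understanding model rather than regular expressions.

```python
import re

# Hypothetical sketch of extracting the preconfigured key information for
# the "book a flight" task: origin ("出发地"), destination ("目的地"), and
# time ("时间") from a free-form utterance.
def extract_flight_slots(text):
    route = re.search(r"从(.+?)到(.+?)的机票", text)
    if not route:
        return None  # not a flight-booking utterance
    when = re.search(r"(今天|明天|后天)(早上|上午|下午|晚上)?", text)
    return {
        "出发地": route.group(1),
        "目的地": route.group(2),
        "时间": when.group(0) if when else None,
    }

slots = extract_flight_slots("帮我订一张明天早上从上海到深圳的机票")
print(slots)  # {'出发地': '上海', '目的地': '深圳', '时间': '明天早上'}
```

The filled slots would then be packaged as a VUI task and converted into control instructions for the ticketing application.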
Scenario 4
For blind and low-vision users, the embodiments of this application can provide a voice control method that combines the TalkBack (spoken feedback) function with the voice control function: the user turns on both the TalkBack switch and the voice wake-up switch, as shown in FIG. 9. Suppose the user is blind. When the user touches control 202, as shown in FIG. 10a, the phone announces "请输入姓名" ("please enter the name"), and the focus lands in that input box. The user can therefore, right after hearing the announcement, issue the voice control signal "张三". The voice application collects the signal through the microphone, performs extraction, text conversion, or speech recognition on it to obtain the voice content "张三", and then invokes the input method to set "张三" into the box as the input content, as shown in FIG. 10b. Afterwards, the phone can also announce "张三输入完成" ("Zhang San entered") to notify the user that the operation succeeded.
It can be seen that this way of operating is simpler and more efficient than the traditional spoken-feedback feature, making it easier for blind and low-vision users to operate the phone and further improving the user experience.
Based on the above scenarios, an embodiment of this application provides the flow of a voice control method executed by an electronic device, as shown in FIG. 11.
Step 301: the electronic device displays a first interface of an application, where the first interface includes one or more controls used to update the first interface.
For example, the first interface displayed by the phone is the interface shown in FIG. 4, which contains multiple controls such as the "添加其它项" button and the input boxes. The user can operate these controls to update the phone's display content, causing the phone to display the updated second interface.
Step 302: the electronic device collects the user's voice control signal.
Illustratively, the phone can set the microphone to an always-on state. Then, while the phone displays an application interface (for example, the first interface), the microphone also collects voice control signals at a certain working frequency. In a possible embodiment, the user can start the phone's voice application by issuing a wake-up signal, after which the phone collects the user's voice control signal through the voice application and performs extraction, text conversion, or speech recognition on it. For example, after the user utters the sound signal "小艺小艺", the phone can collect it through the microphone. If the sound signal is the preset wake-up signal, the phone starts the voice application to collect voice control signals.
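The wake-up gating described above can be sketched as a small state machine, shown here as a hypothetical Python illustration: recognized text is ignored until the preset wake phrase is heard, and only afterwards is it treated as a voice control signal.

```python
# Hypothetical sketch of the always-on wake-word gate: utterances before
# the wake phrase are dropped; the wake phrase starts the voice app; later
# utterances are forwarded as voice control signals.

WAKE_PHRASE = "小艺小艺"  # illustrative preset wake-up signal

class WakeGate:
    def __init__(self):
        self.voice_app_running = False
        self.commands = []

    def on_recognized(self, text):
        if not self.voice_app_running:
            if text == WAKE_PHRASE:
                self.voice_app_running = True  # start the voice app
            return  # everything before the wake phrase is ignored
        self.commands.append(text)  # forward as a voice control signal

gate = WakeGate()
for utterance in ["姓名张三", "小艺小艺", "姓名张三"]:
    gate.on_recognized(utterance)
print(gate.commands)  # ['姓名张三']
```

Note that the first "姓名张三" is dropped because it arrived before the wake phrase.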
Step 303: the electronic device determines the touch event corresponding to the voice control signal.
Here, a touch event refers to performing a touch operation on a control. In the embodiments of this application, the electronic device can pre-store configuration files for each application; for example, each application corresponds to one or more configuration files. A configuration file records the correspondence between voice control signals and the touch events in the different interfaces of one application. A configuration file may also record the correspondence for only a single interface of one application.
All controls in an Android-based interface are mounted under a DecorView node of the current interface's window. When the user speaks, the Android software system can scan each control identifier starting from the DecorView and compare it with the text of the user's voice control signal, thereby determining the target control identifier matching the text of the voice control signal, and then looking up in the configuration file the touch event corresponding to that target control identifier. Taking the contacts application of FIG. 4 as an example, developers can place a configuration file 1 for the new-contact interface in the contacts application's installation package. Configuration file 1 records the correspondence between the touch events and the voice control signals in the new-contact interface. For example, the input event of the "姓名" input box corresponds to the control identifier "姓名", and the control identifier "姓名" corresponds to the text of the voice control signal; the click operation on "添加其它项" corresponds to the control identifier "添加其它项", which corresponds to the voice-signal text "添加其它项". In this way, after receiving a voice control signal, the electronic device can look up the corresponding touch event in the configuration file. That is, configuration file 1 records the correspondence between a voice control signal and the touch event of clicking the first control in the first interface. As shown in FIG. 5, for the phone, receiving the user's voice control signal "姓名" is equivalent to detecting the user clicking the "姓名" input box, so the focus lands in that box.
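The two lookups described above can be sketched together as follows. This is a hypothetical Python illustration of the logic, with the view tree standing in for the DecorView hierarchy and a dictionary standing in for configuration file 1; the actual on-device representation is not specified here.

```python
# Hypothetical per-interface configuration file: control identifier -> the
# type of touch event registered for it.
config_file_1 = {
    "姓名": "input",
    "添加其它项": "click",
}

# Toy stand-in for the control tree mounted under the window's DecorView.
view_tree = {"id": "DecorView", "children": [
    {"id": "姓名", "children": []},
    {"id": "添加其它项", "children": []},
]}

def find_control(node, text):
    """Scan the tree for a control whose identifier prefixes the spoken text."""
    if node["id"] != "DecorView" and text.startswith(node["id"]):
        return node["id"]
    for child in node["children"]:
        hit = find_control(child, text)
        if hit:
            return hit
    return None

def touch_event_for(text):
    target = find_control(view_tree, text)
    return (target, config_file_1[target]) if target else None

print(touch_event_for("添加其它项"))  # ('添加其它项', 'click')
print(touch_event_for("姓名张三"))    # ('姓名', 'input')
```

When no control matches, the lookup returns nothing, which is the case where the signal would be forwarded to the backend server (Scenario 3).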
It should be noted that the electronic device can install the configuration files directly on the device; for example, when installing the contacts application, the phone can store configuration file 1 provided in the application's installation package in the phone's memory. In this way, the phone can support the voice control function even without a network connection.
Step 304: in response to the voice control signal, the electronic device executes the touch event and displays a second interface of the application, where the second interface is the interface obtained after the touch operation is performed on the first control in the first interface.
Still taking the new contact of FIG. 5 as an example, if the phone receives the voice control signal "张三", this is equivalent to the phone detecting the user starting an input-method operation in the "姓名" input box; the phone therefore executes the input event shown in FIG. 5a and displays the interface shown in FIG. 5b.
It can be seen that in the embodiments of this application a corresponding configuration file can be set for each interface of an application. The configuration file records the voice control signals supported by the interface and the touch events they correspond to. Thus, while running an interface of the application, the electronic device can determine from the interface's configuration file the touch event corresponding to the user's voice control signal and then execute it, thereby controlling each control in the application's interface by voice. In this way, the electronic device can implement the voice control function for each operation button at the granularity of an application interface, improving voice control efficiency and the user experience.
In addition, in a possible embodiment, if the electronic device determines that the voice control signal input by the user is not one supported by the configuration file, it can also send the signal to a backend server, which determines the task type and extracts the key information, thereby generating a VUI task. The server converts the VUI task into corresponding control instructions and sends them to the corresponding application; see Scenario 3 for a specific example.
It should be noted that some controls in the interfaces of certain applications may not display a name or textual prompt; the embodiments of this application can provide textual prompts for such controls in the interface. Specifically, since Android devices support the spoken-feedback feature used by blind users, controls are configured with "android:contentDescription" information. Such controls can directly reuse the already-configured android:contentDescription text, that is, display these text descriptions in the interface as the textual prompts for such controls.
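The fallback just described can be sketched as a simple label-resolution rule, shown here as a hypothetical Python illustration (the field names mirror the Android attributes but the dictionaries are stand-ins for real view objects):

```python
# Hypothetical sketch: when a control has no visible text, reuse its
# accessibility description (android:contentDescription) as the textual
# prompt used for voice control.

def voice_label(control):
    return control.get("text") or control.get("contentDescription") or ""

icon_button = {"contentDescription": "搜索"}   # icon-only button, no text
labeled_button = {"text": "添加其它项"}        # button with visible text

print(voice_label(icon_button))     # 搜索
print(voice_label(labeled_button))  # 添加其它项
```

A control with neither field would simply not be addressable by a spoken label.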
Furthermore, in a possible embodiment, to improve the operability of the voice control function, some touch events can be preset in the embodiments of this application, for example: a move-up operation corresponding to the voice control signal "上边" (up), a move-down operation corresponding to "下边" (down), a move-left operation corresponding to "左边" (left), and a move-right operation corresponding to "右边" (right). These simulate joystick operations or the keyboard's up, down, left, and right key operations, and handle movement of the current control focus.
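The preset directional commands can be sketched by mapping each spoken direction onto a focus move over a grid of focusable controls. This is a hypothetical Python illustration; the grid model is an assumption made for the example, not part of the patent.

```python
# Hypothetical sketch: spoken direction -> (row delta, column delta),
# emulating the keyboard's up/down/left/right key events for focus movement.
DIRECTIONS = {"上边": (-1, 0), "下边": (1, 0), "左边": (0, -1), "右边": (0, 1)}

def move_focus(focus, command, rows, cols):
    dr, dc = DIRECTIONS[command]
    r = min(max(focus[0] + dr, 0), rows - 1)  # clamp to the screen grid
    c = min(max(focus[1] + dc, 0), cols - 1)
    return (r, c)

focus = (0, 0)
for cmd in ["下边", "右边", "上边"]:
    focus = move_focus(focus, cmd, rows=3, cols=2)
print(focus)  # (0, 1)
```

Clamping at the edges mirrors how key-based focus navigation simply stops at the boundary of the interface.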
In summary, the core of the voice control method provided by the embodiments of this application is to determine the touch event from the voice control signal, that is, to find the corresponding control and then, according to that control's interface capabilities, simulate the corresponding touch events (for example, click and long press), input-method events (for example, text input), and key operations (for example, moving up, down, left, or right). The application's GUI does not need any adaptive development to implement a specific voice control function.
As shown in FIG. 12, an embodiment of this application discloses an electronic device, including: a touchscreen 1201, the touchscreen 1201 including a touch-sensitive surface 1206 and a display 1207; one or more processors 1202; a memory 1203; a communication module 1208; one or more applications (not shown); and one or more computer programs 1204. The above components can be connected through one or more communication buses 1205. The one or more computer programs 1204 are stored in the memory 1203 and configured to be executed by the one or more processors 1202. The one or more computer programs 1204 include instructions that can be used to perform the steps in the above embodiments, for example, the steps shown in FIG. 11.
An embodiment of this application further provides a computer-readable storage medium storing computer instructions that, when run on an electronic device, cause the electronic device to perform the above related method steps to implement the method in the above embodiments.
An embodiment of this application further provides a computer program product that, when run on a computer, causes the computer to perform the above related steps to implement the method in the above embodiments.
In addition, an embodiment of this application further provides an apparatus, which may specifically be a chip, a component, or a module. The apparatus may include a processor and a memory connected to each other, where the memory stores computer-executable instructions; when the apparatus runs, the processor can execute the computer-executable instructions stored in the memory, so that the chip performs the methods in the above method embodiments.
The electronic device, computer storage medium, computer program product, and chip provided in the embodiments of this application are all used to perform the corresponding methods provided above; for their beneficial effects, refer to the beneficial effects of the corresponding methods provided above, which are not repeated here.
From the description of the above implementations, those skilled in the art can understand that, for convenience and brevity of description, only the division into the above functional modules is given as an example. In practical applications, the above functions can be allocated to different functional modules as needed; that is, the internal structure of the apparatus can be divided into different functional modules to complete all or part of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. For example, the division into modules or units is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not executed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may be one physical unit or multiple physical units; that is, they may be located in one place or distributed across multiple places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated units may be implemented in the form of hardware or in the form of software functional units.
If an integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or part of the steps of the methods in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above is only a specific implementation of this application, but the protection scope of this application is not limited thereto. Any change or replacement that can readily be conceived by a person skilled in the art within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (11)

  1. A voice control method, characterized in that the method comprises:
    an electronic device displaying a first interface of an application, the first interface comprising a control used to update the first interface;
    the electronic device collecting a voice control signal of a user;
    the electronic device determining a touch event corresponding to the voice control signal, the touch event being the performance of a touch operation on a control;
    in response to the voice control signal, the electronic device executing the touch event and displaying a second interface of the application, the second interface being the interface obtained after the touch operation is performed on the control in the first interface.
  2. The method according to claim 1, characterized in that, after the electronic device collects the user's voice control signal, the method further comprises:
    the electronic device obtaining a configuration file associated with the first interface, the configuration file comprising the correspondence between the control identifiers of the controls in the first interface and touch events;
    wherein the electronic device determining the touch event corresponding to the voice control signal comprises:
    the electronic device determining a target control identifier that matches the text information of the voice control signal;
    the electronic device looking up, in the configuration file, the touch event corresponding to the target control identifier.
  3. The method according to claim 1 or 2, characterized in that, after the electronic device executes the touch event and before it displays the second interface of the application, the method further comprises:
    the electronic device displaying the animation effect of performing the touch operation on the control in the first interface.
  4. The method according to any one of claims 1 to 3, characterized in that the electronic device collecting the user's voice control signal comprises:
    in response to a wake-up signal input by the user, the electronic device starting a voice application in the background;
    the electronic device collecting the user's voice control signal through the voice application.
  5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
    when a touch operation by the user on the control in the first interface is detected, the electronic device performing the touch operation.
  6. An electronic device, characterized by comprising: a touchscreen, a processor, and a memory;
    the memory being used to store one or more computer programs;
    wherein, when the one or more computer programs stored in the memory are executed by the processor, the electronic device is caused to perform:
    displaying a first interface of an application, the first interface comprising a control used to update the first interface;
    collecting a voice control signal of a user;
    determining a touch event corresponding to the voice control signal, the touch event being the performance of a touch operation on a control;
    in response to the voice control signal, executing the touch event and displaying a second interface of the application, the second interface being the interface obtained after the touch operation is performed on the control in the first interface.
  7. The electronic device according to claim 6, characterized in that, after the electronic device collects the user's voice control signal, the electronic device is further caused to perform:
    obtaining a configuration file associated with the first interface, the configuration file comprising the correspondence between the control identifiers of the controls in the first interface and touch events;
    determining a target control identifier that matches the text information of the voice control signal;
    looking up, in the configuration file, the touch event corresponding to the target control identifier.
  8. The electronic device according to claim 6 or 7, characterized in that, after the electronic device executes the touch event and before it displays the second interface of the application, the electronic device is further caused to perform:
    displaying the animation effect of performing the touch operation on the control in the first interface.
  9. The electronic device according to any one of claims 6 to 8, characterized in that the electronic device collecting the user's voice control signal comprises the electronic device being caused to perform:
    in response to a wake-up signal input by the user, starting a voice application in the background;
    collecting the user's voice control signal through the voice application.
  10. The electronic device according to any one of claims 6 to 9, characterized in that the electronic device is further caused to perform:
    when a touch operation by the user on the control in the first interface is detected, performing the touch operation.
  11. A computer-readable storage medium, characterized in that the computer-readable storage medium comprises a computer program which, when run on an electronic device, causes the electronic device to perform the voice control method according to any one of claims 1 to 5.
PCT/CN2020/076689 2019-03-08 2020-02-26 一种语音控制方法及电子设备 WO2020181988A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910176543.1 2019-03-08
CN201910176543.1A CN110060672A (zh) 2019-03-08 2019-03-08 一种语音控制方法及电子设备

Publications (1)

Publication Number Publication Date
WO2020181988A1 true WO2020181988A1 (zh) 2020-09-17

Family

ID=67316741

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/076689 WO2020181988A1 (zh) 2019-03-08 2020-02-26 一种语音控制方法及电子设备

Country Status (2)

Country Link
CN (1) CN110060672A (zh)
WO (1) WO2020181988A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114826805A (zh) * 2021-01-28 2022-07-29 星络家居云物联科技有限公司 计算机可读存储介质、移动终端、智能家居控制方法
CN115706749A (zh) * 2021-08-12 2023-02-17 华为技术有限公司 一种设置提醒的方法和电子设备

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110060672A (zh) * 2019-03-08 2019-07-26 华为技术有限公司 一种语音控制方法及电子设备
CN110493123B (zh) * 2019-09-16 2022-06-28 腾讯科技(深圳)有限公司 即时通讯方法、装置、设备及存储介质
CN110837334B (zh) * 2019-11-04 2022-03-22 北京字节跳动网络技术有限公司 用于交互控制的方法、装置、终端及存储介质
CN110968362B (zh) * 2019-11-18 2023-09-26 北京小米移动软件有限公司 应用运行方法、装置及存储介质
CN111443850A (zh) * 2020-03-10 2020-07-24 努比亚技术有限公司 一种终端操作方法、终端和存储介质
CN111475241B (zh) * 2020-04-02 2022-03-11 深圳创维-Rgb电子有限公司 一种界面的操作方法、装置、电子设备及可读存储介质
CN111599358A (zh) * 2020-04-09 2020-08-28 华为技术有限公司 语音交互方法及电子设备
CN111475216B (zh) * 2020-04-15 2024-03-08 亿咖通(湖北)技术有限公司 一种app的语音控制方法、计算机存储介质及电子设备
CN114007117B (zh) * 2020-07-28 2023-03-21 华为技术有限公司 一种控件显示方法和设备
CN112083843B (zh) * 2020-09-02 2022-05-27 珠海格力电器股份有限公司 应用图标的控制方法及装置
CN114527920A (zh) * 2020-10-30 2022-05-24 华为终端有限公司 一种人机交互方法及电子设备
CN112581957B (zh) * 2020-12-04 2023-04-11 浪潮电子信息产业股份有限公司 一种计算机语音控制方法、***及相关装置
CN112863514B (zh) * 2021-03-15 2024-03-15 亿咖通(湖北)技术有限公司 一种语音应用的控制方法和电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107358953A (zh) * 2017-06-30 2017-11-17 努比亚技术有限公司 语音控制方法、移动终端及存储介质
CN107967055A (zh) * 2017-11-16 2018-04-27 深圳市金立通信设备有限公司 一种人机交互方法、终端及计算机可读介质
CN108108142A (zh) * 2017-12-14 2018-06-01 广东欧珀移动通信有限公司 语音信息处理方法、装置、终端设备及存储介质
CN108364644A (zh) * 2018-01-17 2018-08-03 深圳市金立通信设备有限公司 一种语音交互方法、终端及计算机可读介质
CN108538291A (zh) * 2018-04-11 2018-09-14 百度在线网络技术(北京)有限公司 语音控制方法、终端设备、云端服务器及***
CN110060672A (zh) * 2019-03-08 2019-07-26 华为技术有限公司 一种语音控制方法及电子设备


Also Published As

Publication number Publication date
CN110060672A (zh) 2019-07-26

Similar Documents

Publication Publication Date Title
WO2020181988A1 (zh) 一种语音控制方法及电子设备
JP7142783B2 (ja) 音声制御方法及び電子装置
WO2021063343A1 (zh) 语音交互方法及装置
WO2021164313A1 (zh) 界面布局方法、装置及***
WO2021057868A1 (zh) 一种界面切换方法及电子设备
CN111316199B (zh) 一种信息处理方法及电子设备
WO2022052776A1 (zh) 一种人机交互的方法、电子设备及***
WO2021037223A1 (zh) 一种触控方法与电子设备
WO2021110133A1 (zh) 一种控件的操作方法及电子设备
CN112130714B (zh) 可进行学习的关键词搜索方法和电子设备
WO2021175272A1 (zh) 一种应用信息的显示方法及相关设备
WO2022100221A1 (zh) 检索处理方法、装置及存储介质
CN114466102A (zh) 显示应用界面的方法、电子设备以及交通信息显示***
WO2021151320A1 (zh) 一种握持姿态检测方法及电子设备
CN112835495B (zh) 开启应用程序的方法、装置及终端设备
CN112740148A (zh) 一种向输入框中输入信息的方法及电子设备
WO2024093103A9 (zh) 笔迹处理方法、终端设备及芯片***
CN116028148B (zh) 一种界面处理方法、装置及电子设备
WO2023116012A9 (zh) 屏幕显示方法和电子设备
WO2022052961A1 (zh) 同时显示多个应用界面时进行生物特征认证的方法
WO2022001261A1 (zh) 提示方法及终端设备
CN111475363B (zh) 卡死识别方法及电子设备
WO2023202444A1 (zh) 一种输入方法及装置
US20240129619A1 (en) Method and Apparatus for Performing Control Operation, Storage Medium, and Control
CN115291995B (zh) 一种消息显示方法及相关电子设备、可读存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20769916

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20769916

Country of ref document: EP

Kind code of ref document: A1