WO2015090137A1 - A voice message search method, device, and system - Google Patents

A voice message search method, device, and system

Info

Publication number
WO2015090137A1
WO2015090137A1 PCT/CN2014/092426 CN2014092426W WO2015090137A1 WO 2015090137 A1 WO2015090137 A1 WO 2015090137A1 CN 2014092426 W CN2014092426 W CN 2014092426W WO 2015090137 A1 WO2015090137 A1 WO 2015090137A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
message
voice message
voice
search
Prior art date
Application number
PCT/CN2014/092426
Other languages
French (fr)
Inventor
Yelu LIU
Original Assignee
Tencent Technology (Shenzhen) Company Limited
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology (Shenzhen) Company Limited
Publication of WO2015090137A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Definitions

  • the present disclosure relates to the field of mobile Internet, and more particularly to a voice message search method, device, and system.
  • a voice instant messaging application is an application that allows two or more parties to communicate with each other instantly by exchanging voice messages. Such applications include Yixin, Line, and Laiwang. Voice instant messaging applications are now among the applications that are the most widely used on mobile terminals, including smartphones, tablet PCs, and eBook readers.
  • the existing voice message search method comprises the following: the user uses a mobile terminal to play all the voice messages one by one, or to play a voice message selected by guesswork; after a voice message is played, the user determines whether the voice message contains the target content; if yes, the user stops the search; if no, the user continues to play the next voice message by using the mobile terminal.
  • the inventor has found that the existing art has at least the following problems. If the number of voice messages is large, searching for the target content by playing voice messages one by one is very inefficient. In addition, the user's judgment also deteriorates due to repeated clicking and the visual fatigue caused by sliding operations. Consequently, the overall efficiency of the searches performed using the above-mentioned voice message search method is low.
  • the embodiments of the present disclosure provide a voice message search method, device, and system.
  • the technical solution is as follows:
  • a voice message search method for use on a client, comprising: obtaining a text search keyword; searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message; and feeding back as the search result the voice message corresponding to the text message that includes the search keyword.
  • in a second aspect, a voice message search device includes: a search acquisition module, a text search module, and a result feedback module.
  • the search acquisition module is configured to obtain a text search keyword.
  • the text search module is configured to search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message.
  • the result feedback module is configured to feed back as the search result the voice message corresponding to the text message that includes the search keyword.
  • a voice message search system comprising a client and a server, with the client and the server interconnected using a wireless network or wired network;
  • the client can be the voice message search device described in the above-mentioned second aspect.
  • the solution obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. This solves the problem of inefficient search using the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
  • Figure 1 shows the flowchart of the voice message search method provided by an embodiment of the present disclosure.
  • Figure 2A shows the flowchart of the voice message search method provided by another embodiment of the present disclosure.
  • Figures 2B to 2E show schematic diagrams of the interfaces involved in the embodiment shown in Figure 2A.
  • Figure 3 shows the structural diagram of the voice message search device provided by an embodiment of the present disclosure.
  • Figure 4 shows the structural diagram of the voice message search device provided by another embodiment of the present disclosure.
  • Figure 5 shows the structural diagram of the voice message search system provided by an embodiment of the present disclosure.
  • the term module may refer to, be part of, or include an Application Specific Integrated Circuit (ASIC); an electronic circuit; a combinational logic circuit; a field programmable gate array (FPGA); a processor (shared, dedicated, or group) that executes code; other suitable hardware components that provide the described functionality; or a combination of some or all of the above, such as in a system-on-chip.
  • the term module or unit may include memory (shared, dedicated, or group) that stores code executed by the processor.
  • the exemplary environment may include a server, a client, and a communication network.
  • the server and the client may be coupled through the communication network for information exchange, such as sending/receiving identification information, sending/receiving data files such as splash screen images, etc.
  • although only one client and one server are shown in the environment, any number of clients or servers may be included, and other devices may also be included.
  • the communication network may include any appropriate type of communication network for providing network connections to the server and client or among multiple servers or clients.
  • communication network may include the Internet or other types of computer networks or telecommunication networks, either wired or wireless.
  • the disclosed methods and apparatus may be implemented, for example, in a wireless network that includes at least one client.
  • the client may refer to any appropriate user terminal with certain computing capabilities, such as a personal computer (PC), a work station computer, a server computer, a hand-held computing device (tablet), a smart phone or mobile phone, or any other user-side computing device.
  • the client may include a network access device.
  • the client may be stationary or mobile.
  • a server may refer to one or more server computers configured to provide certain server functionalities, such as database management and search engines.
  • a server may also include one or more processors to execute computer programs in parallel.
  • a user, as used herein, may refer to one or more persons or things that control a client. The user may control more than one client or other device.
  • a client can be an application client that allows two or more parties to communicate with each other by exchanging voice messages on terminals such as smartphones, tablet PCs, eBook readers, Moving Picture Experts Group Audio Layer III (MP3) players, and Moving Picture Experts Group Audio Layer IV (MP4) players.
  • Figure 1 shows the flowchart of the voice message search method provided by embodiments of the present disclosure.
  • the scenario where the voice message search method is used for a client that allows two or more parties to communicate with each other by exchanging voice messages is described as an example.
  • the above-mentioned method comprises:
  • Step 102 Obtain a text search keyword.
  • the client may obtain a search keyword that the user inputs directly as text.
  • the client may also obtain search voice signals that the user inputs as voice, and then use a speech recognition technology locally or on the server to identify the search keyword in text format from the search voice signals.
  • Step 104 Search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message.
  • Each voice message corresponds to one text message.
  • Each text message is generated based on the speech recognition result of the corresponding voice message.
  • Step 106 Feed back as a search result the voice message corresponding to the text message that includes the search keyword.
  • a voice message search method is provided by the present embodiments.
  • the client obtains a text search keyword and then obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message.
  • the method solves the problem of inefficient search using the voice message search method in the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
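Steps 102 to 106 can be sketched as follows. This is only an illustrative sketch: the `VoiceMessage` record, the `search_voice_messages` function, and the case-insensitive substring match are assumptions made for the example, not details given in the disclosure.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VoiceMessage:
    """Hypothetical record pairing a voice message with its text message."""
    message_id: str   # unique ID shared by the audio and its text
    audio: bytes      # raw voice data, opaque to the search
    text: str         # text generated from the speech recognition result


def search_voice_messages(keyword: str, messages: List[VoiceMessage]) -> List[VoiceMessage]:
    """Return the voice messages whose text message contains the keyword."""
    needle = keyword.strip().casefold()  # case-insensitive match is an assumption
    if not needle:
        return []
    return [m for m in messages if needle in m.text.casefold()]


messages = [
    VoiceMessage("m1", b"<audio>", "Hello, this is John."),
    VoiceMessage("m2", b"<audio>", "Tomorrow let's go to the park."),
]
hits = search_voice_messages("Tomorrow let's go to", messages)
print([m.message_id for m in hits])
```

The search never touches the audio itself; the voice message is only looked up through the text message that shares its ID.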
  • Figure 2A shows the flowchart of the voice message search method provided by embodiments of the present disclosure.
  • the scenario where the voice message search method is used for a client that allows two or more parties to communicate with each other by exchanging voice messages is described as an example.
  • the above-mentioned method comprises:
  • Step 201 Obtain and store the text messages corresponding to each voice message.
  • the client needs to first obtain and store the text messages corresponding to each voice message. For example, the client needs to convert the voice message "Hello, this is John." into the text message "Hello, this is John.", which is stored in association with the voice message.
  • This step may be implemented in any of the following three modes:
  • First, the client performs speech recognition on each voice message to obtain respective speech recognition results and, based on the speech recognition results, generates the text messages respectively corresponding to each voice message.
  • In this mode, the terminal running the client needs to have powerful processing capabilities.
  • The client performs the above-mentioned speech recognition procedure during idle time.
  • Second, the client sends each voice message to the server and receives the text messages returned by the server corresponding to each voice message.
  • The text messages are generated based on the speech recognition results obtained after the server performs speech recognition on the voice messages.
  • The client may send all or some of the local voice messages to the server, each voice message having a unique message ID.
  • After receiving voice messages from the client, the server performs speech recognition on each voice message to obtain respective speech recognition results and generates the corresponding text messages based on the speech recognition results. Then, the server returns each text message to the client, with each text message carrying the message ID of the corresponding voice message. The client receives and stores the text messages corresponding to each voice message.
  • Third, the client receives the voice messages sent by other clients and forwarded by the server, together with the text messages corresponding to the voice messages, each text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message; and/or, after sending a local voice message, the client receives the text message returned by the server corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message.
  • A voice message is generated by the communication between clients and needs to be forwarded by a server during transfer.
  • Before forwarding a voice message, the server performs speech recognition on the voice message to obtain the speech recognition result and then generates the corresponding text message. Then, the server sends the voice message and the text message corresponding to the voice message to the target client.
  • The target client receives and stores the voice messages sent by other clients and the corresponding text messages, which the server forwards together.
  • The server also returns the text message to the source client that sent the voice message. After sending a local voice message, a source client receives and stores the text message returned by the server that corresponds to the voice message.
  • The third mode is the preferred mode of implementing this step.
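A minimal sketch of how a client might store text messages in association with voice messages by message ID. The names `TranscriptStore`, `transcribe_pending`, and `fake_recognizer` are hypothetical; the recognizer is passed in as a plain callable so that the same code covers both the local mode and the send-to-server mode, and no real speech recognition API is implied.

```python
from typing import Callable, Dict, Iterable, List, Optional

# Hypothetical recognizer type: takes raw audio and returns the recognized text.
# It may wrap local recognition or a round trip to the server.
Recognizer = Callable[[bytes], str]


class TranscriptStore:
    """Keeps text messages keyed by the ID of the voice message they came from."""

    def __init__(self) -> None:
        self._texts: Dict[str, str] = {}

    def store(self, message_id: str, text: str) -> None:
        self._texts[message_id] = text

    def get(self, message_id: str) -> Optional[str]:
        return self._texts.get(message_id)

    def missing(self, message_ids: Iterable[str]) -> List[str]:
        """IDs that do not have a stored text message yet."""
        return [mid for mid in message_ids if mid not in self._texts]


def transcribe_pending(voice_messages: Dict[str, bytes],
                       store: TranscriptStore,
                       recognize: Recognizer) -> int:
    """Generate and store text for every voice message that lacks one.

    Returns the number of voice messages processed in this pass.
    """
    pending = store.missing(voice_messages)
    for message_id in pending:
        store.store(message_id, recognize(voice_messages[message_id]))
    return len(pending)


def fake_recognizer(audio: bytes) -> str:
    # Stand-in for real speech recognition: the "audio" here is just UTF-8 text.
    return audio.decode("utf-8")


store = TranscriptStore()
voice = {"m1": b"Hello, this is John."}
processed = transcribe_pending(voice, store, fake_recognizer)
print(processed, store.get("m1"))
```

Because only messages without a stored text are processed, the function can be called repeatedly, for example whenever the client is idle.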
  • Step 202 The client obtains a text search keyword.
  • the client can obtain a text search keyword in one of the following three modes:
  • First, the client obtains a search keyword directly input as text.
  • For example, as shown in Figure 2B, the client may receive the search keyword "Tomorrow let's go to" that user A inputs directly as text in the text search box 22 of a voice instant messaging application on a terminal device.
  • Second, the client obtains the search voice signals that the user inputs as voice, and the client uses a speech recognition technology to identify the search keyword in text format from the search voice signals.
  • For example, upon receiving a signal indicating that user A has pressed the voice search button 24, the client uses the microphone 26 of the terminal to receive the search voice signals that the user inputs as voice. Then, the client uses a speech recognition technology to identify the search keyword "Tomorrow let's go to" from the search voice signals, as shown in Figure 2C.
  • Third, the client obtains the search voice signals that the user inputs as voice, and then the client sends the search voice signals to the server.
  • The client receives the search keyword returned by the server, with the search keyword identified by the server from the search voice signals by using a speech recognition technology.
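The keyword acquisition in Step 202 can be sketched as a single function that accepts either typed text or a voice signal. The function name, its parameters, and the order in which the recognizers are tried are assumptions for illustration; both recognizers are hypothetical callables rather than a real speech recognition API.

```python
from typing import Callable, Optional

# Hypothetical recognizer type: raw search voice signal in, recognized text out.
Recognizer = Callable[[bytes], str]


def obtain_keyword(typed_text: Optional[str] = None,
                   voice_signal: Optional[bytes] = None,
                   local_recognizer: Optional[Recognizer] = None,
                   server_recognizer: Optional[Recognizer] = None) -> str:
    """Return the text search keyword from typed text or from a voice signal."""
    if typed_text is not None:
        # Keyword input directly as text.
        return typed_text.strip()
    if voice_signal is None:
        raise ValueError("either typed_text or voice_signal is required")
    if local_recognizer is not None:
        # Voice input recognized on the client.
        return local_recognizer(voice_signal).strip()
    if server_recognizer is not None:
        # Voice input sent to the server, which returns the keyword.
        return server_recognizer(voice_signal).strip()
    raise ValueError("no recognizer available for voice input")


def fake_recognizer(signal: bytes) -> str:
    # Stand-in for real speech recognition: the "signal" is UTF-8 text.
    return signal.decode("utf-8")


print(obtain_keyword(typed_text=" Tomorrow let's go to "))
print(obtain_keyword(voice_signal=b"Tomorrow let's go to", server_recognizer=fake_recognizer))
```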
  • Step 203 Search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on speech recognition result of the corresponding voice message.
  • this step may include the following sub-steps:
  • according to preset conditions, the client sorts the text messages corresponding to each voice message to be searched, the preset conditions including at least one of the following: the generation time corresponding to each voice message, priorities of the contacts corresponding to each voice message, and data sizes of each text message.
  • the voice messages to be searched refer to the voice messages of the contacts related to the current interface shown on the client.
  • the current interface may be the active user interface shown on the terminal device.
  • the voice messages to be searched are the voice messages generated during the voice chat between contact A and contact B.
  • the voice messages to be searched are the voice messages generated during the voice chat among the contacts belonging to the group.
  • the voice messages to be searched can be all the voice messages globally.
  • the client may select the messages to be searched based on the content of the active user interface.
  • the client can sort the text messages corresponding to each voice message based on the generation time corresponding to each voice message. For example, the client sorts the text messages in ascending or descending order of the generation time of the corresponding voice messages. As another example, if the number of voice messages is very large, the client sorts the text messages corresponding to each voice message to be searched in descending order of the forgetting probability associated with each generation time, based on the human forgetting curve.
  • the user may have navigated the current interface to the voice messages generated within a time segment other than the latest one, for example, the chat messages generated the day before yesterday. In this case, the client may sort the text messages corresponding to the voice messages generated within that time segment to the beginning, followed by the text messages corresponding to the voice messages generated within other time segments.
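The forgetting-curve ordering can be illustrated with a small sketch. The disclosure does not state which curve is used; the exponential form 1 - exp(-t/S), the 24-hour stability value, and the function names below are assumptions chosen only to make the idea concrete.

```python
import math
from typing import List, Tuple

# Each entry is (message_id, generation_time_in_hours, text); a hypothetical layout.
Entry = Tuple[str, float, str]


def forget_probability(age_hours: float, stability_hours: float = 24.0) -> float:
    """Assumed exponential forgetting curve: 1 - exp(-t / S)."""
    return 1.0 - math.exp(-max(age_hours, 0.0) / stability_hours)


def sort_by_forget_probability(entries: List[Entry], now_hours: float) -> List[Entry]:
    """Order text messages so that the ones most likely forgotten come first."""
    return sorted(
        entries,
        key=lambda e: forget_probability(now_hours - e[1]),
        reverse=True,
    )


entries = [
    ("new", 99.0, "See you soon."),
    ("old", 1.0, "Tomorrow let's go to the park."),
    ("mid", 50.0, "Hello, this is John."),
]
ordered = sort_by_forget_probability(entries, now_hours=100.0)
print([e[0] for e in ordered])
```

With a monotonic curve such as this one, the ordering is the same as sorting by age from oldest to newest; a different curve or a per-contact stability value would change that.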
  • the client may, based on the priorities of the contacts, sort the text messages corresponding to each voice message to be searched.
  • the priorities can be preset by the client. For example, if the search result is more likely to be found among the voice messages of other contacts, the client may set the priorities of other contacts higher than that of the contact corresponding to the current client. That is, if the voice messages are the chat messages exchanged between the current contact A and another contact B, the text messages corresponding to the voice messages of contact B are sorted in front of the text messages corresponding to the voice messages of the current contact A. Thus, the search is performed preferentially in the text messages corresponding to contact B.
  • the client can also set different priorities for contacts based on the numbers of history messages of each contact and the levels of friendliness of each contact with the current contact A.
  • the client can, in descending order or ascending order of the data sizes of each text message, sort the text messages corresponding to each voice message to be searched.
  • the client can first perform sorting based on one of the conditions and then, based on another condition, continue to sort the sorting result obtained using the preceding condition. For example, the client can sort each text message based on the priorities of the contacts and then continue to sort the text messages of the same contact in ascending order of the generation time of the corresponding voice messages.
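The two-condition example (contact priority first, then ascending generation time within the same contact) can be sketched with a composite sort key. The `TextMessage` record, the numeric priority values, and the function name are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TextMessage:
    """Hypothetical text message record derived from a voice message."""
    message_id: str
    contact: str        # contact who sent the corresponding voice message
    created_at: float   # generation time of the corresponding voice message
    text: str


def sort_for_search(texts: List[TextMessage],
                    contact_priority: Dict[str, int]) -> List[TextMessage]:
    """Sort by contact priority (higher first), then by ascending generation time."""
    return sorted(
        texts,
        key=lambda t: (-contact_priority.get(t.contact, 0), t.created_at),
    )


texts = [
    TextMessage("t1", "A", 1.0, "Where shall we go?"),
    TextMessage("t2", "B", 3.0, "It's Halloween tomorrow."),
    TextMessage("t3", "B", 2.0, "Tomorrow let's go to the park."),
]
# The other contact B gets a higher priority than the current contact A.
ordered = sort_for_search(texts, {"A": 0, "B": 1})
print([t.message_id for t in ordered])
```

A single composite key gives the same result as sorting by time first and then applying a stable sort by priority, which matches the "sort, then continue to sort" description.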
  • the above-mentioned sorting may be performed before or during step 202.
  • when the client receives a signal indicating that the user has pressed the voice search button 24, the sorting is triggered.
  • the client receives the search voice signals input by the user after or during the sorting.
  • from the sorted text messages, the terminal searches for the text message that includes the search keyword.
  • the client searches for and finds the text message "Tomorrow let's go to the Curious Dinosaur Park. It's Halloween tomorrow. There is a haunted house over there", which includes the search keyword "Tomorrow let's go to."
  • Step 204 Feed back as a search result the voice message corresponding to the text message that includes the search keyword.
  • After finding the text message that includes the search keyword, the terminal displays or plays, as the search result on the current interface, the voice message corresponding to that text message.
  • the client not only can use the found voice messages as the search results but also can use the found text messages as the search results. In addition, the client can also use the found voice messages and the corresponding text messages as the search results.
  • the mode of displaying search results can be set by the user. For example, the user can set the mode in which voice messages are always used as the search results for feedback, as shown in Figure 2B. The mode of displaying search results can also be determined based on the current scenario mode of the terminal.
  • the client uses the found voice messages as the search results for feedback; if the scenario mode of the terminal is currently set to "Mute", the client uses the found text messages as the search results for feedback, or the client uses the found voice messages and the corresponding text messages as the search results for feedback, as shown in Figure 2C.
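The choice of feedback mode can be sketched as below. The source names only the "Mute" scenario mode explicitly, so the `"normal"` default, the rule that a user setting overrides the scenario mode, and all identifiers are assumptions for illustration.

```python
from enum import Enum
from typing import Optional


class FeedbackMode(Enum):
    VOICE = "voice"                    # feed back the found voice messages
    TEXT = "text"                      # feed back the found text messages
    VOICE_AND_TEXT = "voice_and_text"  # feed back both together


def choose_feedback_mode(user_setting: Optional[FeedbackMode] = None,
                         scenario_mode: str = "normal",
                         mute_mode: FeedbackMode = FeedbackMode.TEXT) -> FeedbackMode:
    """Pick how search results are fed back.

    Assumption: an explicit user setting wins; otherwise the terminal's
    scenario mode decides, and only "mute" changes the default of voice.
    """
    if user_setting is not None:
        return user_setting
    if scenario_mode.strip().lower() == "mute":
        return mute_mode
    return FeedbackMode.VOICE


print(choose_feedback_mode(scenario_mode="Mute"))
print(choose_feedback_mode(scenario_mode="Mute", mute_mode=FeedbackMode.VOICE_AND_TEXT))
print(choose_feedback_mode(user_setting=FeedbackMode.VOICE, scenario_mode="Mute"))
```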
  • the voice message search method provided by the present embodiment, by obtaining a text search keyword, obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. This solves the problem of inefficient search using the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
  • text messages are sorted according to preset conditions to accelerate the search.
  • voice messages can be sorted giving another contact a priority higher than that of the current contact, thus significantly accelerating the search.
  • the client can, before the sorting, receive a selection signal indicating that the user has selected the target contact from at least two contacts related to the current interface; then, the client determines the voice messages of the selected target contact as each voice message to be searched.
  • the client may provide the interface 27 for selecting at least two contacts related to the current interface. Then, the user can select all or some of the contacts. Based on the received selection signal, the client determines the voice messages of the target contacts "Jack" and "Ashley", selected from the three contacts in the group, as the voice messages to be searched. Thus, the range of the voice messages to be searched is narrowed down, improving the search efficiency. In a group chat where the voice messages to be searched involve multiple persons, or in a scenario where all the contacts are related to the current interface, this implementation mode can significantly accelerate searches.
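Narrowing the search range by selected contacts amounts to a filter applied before sorting and searching. In this sketch the tuple layout, the function name, and the third contact "Tom" are hypothetical; only "Jack" and "Ashley" come from the example above.

```python
from typing import List, Set, Tuple

# Each entry is (message_id, contact, text); a hypothetical layout.
Entry = Tuple[str, str, str]


def select_by_contacts(entries: List[Entry], selected: Set[str]) -> List[Entry]:
    """Keep only the messages that belong to the selected target contacts."""
    return [e for e in entries if e[1] in selected]


group_chat = [
    ("m1", "Jack", "Tomorrow let's go to the park."),
    ("m2", "Tom", "I can't make it."),
    ("m3", "Ashley", "There is a haunted house over there."),
]
to_search = select_by_contacts(group_chat, {"Jack", "Ashley"})
print([e[0] for e in to_search])
```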
  • the client may receive a selection signal indicating that the user has selected the target time segment from at least two preset candidate time segments; then the client determines the voice messages that belong to the selected target time segment as the voice messages to be searched.
  • the client may provide the interface 28 for selecting at least two time segments. Then, the user can select all or some of the time segments. Based on the received selection signal, the client determines the voice messages generated during the selected time segment "recent week" as the voice messages to be searched. Thus, the range of the voice messages to be searched is narrowed down, improving the search efficiency. In a scenario where the voice messages to be searched include multiple voice messages generated over a very long period of time, this implementation mode can significantly accelerate searches.
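Restricting the search to a selected time segment can be sketched in the same way. Only "recent week" appears in the source; the other candidate segments, their durations, and the function name are assumptions for illustration.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

# Hypothetical preset candidate time segments; only "recent week" is from the source.
CANDIDATE_SEGMENTS: Dict[str, timedelta] = {
    "recent day": timedelta(days=1),
    "recent week": timedelta(weeks=1),
    "recent month": timedelta(days=30),
}

# Each entry is (message_id, generation_time, text); a hypothetical layout.
Entry = Tuple[str, datetime, str]


def select_by_time_segment(entries: List[Entry], segment: str, now: datetime) -> List[Entry]:
    """Keep only the messages generated within the selected time segment."""
    cutoff = now - CANDIDATE_SEGMENTS[segment]
    return [e for e in entries if cutoff <= e[1] <= now]


now = datetime(2014, 11, 28, 12, 0)
entries = [
    ("m1", now - timedelta(days=2), "Tomorrow let's go to the park."),
    ("m2", now - timedelta(days=10), "Hello, this is John."),
]
recent = select_by_time_segment(entries, "recent week", now)
print([e[0] for e in recent])
```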
  • the voice message search device 300 includes a hardware processor 302 and a non-transitory storage medium 304 configured to store the following modules: a search acquisition module 320, a text search module 340, and a result feedback module 360.
  • the search acquisition module 320 is configured to obtain a text search keyword.
  • the text search module 340 is configured to search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message.
  • the result feedback module 360 is configured to feed back as the search result the voice message corresponding to the text message that includes the search keyword.
  • a voice message search device is provided by the embodiments.
  • the voice message search device obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. This solves the problem of inefficient search using the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
  • FIG. 4 shows the structural diagram for the voice message search device 300 provided by another embodiment of the present disclosure.
  • the voice message search device may be implemented as all or part of a client or a terminal.
  • the voice message search device 300 includes a hardware processor 302 and a non-transitory storage medium 304 configured to store the following modules:
  • the search acquisition module 320 configured to obtain a text search keyword;
  • the text search module 340 configured to search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message;
  • the result feedback module 360 configured to feed back as the search result the voice message corresponding to the text message that includes the search keyword.
  • the device may further include the text generation module 310.
  • the text generation module 310 is configured to perform speech recognition on each voice message to obtain respective speech recognition results; based on the speech recognition results, generate the text messages respectively corresponding to each voice message;
  • the text generation module 310 is configured to send each voice message to the server; receive the text messages fed back by the server that respectively correspond to each voice message; the text message is generated based on the speech recognition results obtained after the server performs speech recognition on each voice message;
  • the text generation module 310 is configured to receive the voice message sent by another client and forwarded by the server and the text message corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message; and/or, after sending the local voice message, receive the text message fed back by the server corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message.
  • the text search module 340 may include the message sorting module 342 and the sorting search module 344.
  • the message sorting module 342 is configured to, according to the preset conditions, sort the text messages corresponding to each voice message to be searched, the preset conditions including at least one of the following: the generation times corresponding to each voice message, priorities of the contacts corresponding to each voice message, and data sizes of each text message.
  • the sorting search module 344 is configured to search the sorted text messages for the text message that includes the search keyword.
  • the text search module 340 may further include a contact selection module and a contact determination module (not shown in the figure).
  • the contact selection module is configured to receive a selection signal for selecting the target contact from at least two contacts related to the current interface.
  • the contact determination module is configured to determine the voice messages that belong to the selected target contact as the voice messages to be searched.
  • the text search module 340 may further include a time segment selection module and a time segment determination module (not shown in the figure).
  • the time segment selection module is configured to receive a selection signal for selecting the target time segment from at least two preset candidate time segments.
  • the time segment determination module is configured to determine the voice messages that belong to the selected target time segment as the voice messages to be searched.
  • the search acquisition module 320 is configured to obtain the search keyword directly input as text;
  • the search acquisition module 320 is configured to obtain search voice signals input as voice; use a speech recognition technology to identify the search keyword in text format from the search voice signals;
  • the search acquisition module 320 is configured to obtain search voice signals input as voice; send the search voice signals to the server; receive the search keyword fed back by the server, with the search keyword identified by the server from the search voice signals by using a speech recognition technology.
  • a voice message search device is provided by the embodiments.
  • the voice message search device obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. This solves the problem of inefficient search using the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
  • text messages are sorted according to preset conditions to accelerate the search.
  • voice messages can be sorted giving another contact a priority higher than that of the current contact, thus significantly accelerating the search.
  • the voice message search performed by the voice message search device provided by the above-mentioned embodiment is described by using only the division of the above-mentioned function modules as an example.
  • the above-mentioned functions can be assigned to different function modules for completion as needed. That is, the internal structure of the device can be divided into different function modules to complete all or some of the above-mentioned functions.
  • the voice message search device and voice message search method provided by the above-mentioned embodiments adopt the same concept. For implementation details, see the description of the method provided by the above-mentioned embodiments.
  • FIG. 5 shows the structural diagram for the voice message search system provided by an embodiment of the present disclosure.
  • the voice message search system comprises at least a client 520 and a server 540.
  • the client 520 and server 540 are interconnected using a wireless network or wired network.
  • the client 520 includes the voice message search device provided by the embodiment as shown in Figure 3 or Figure 4.
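The client and server interaction, with the server recognizing each voice message before forwarding it, can be sketched in memory as follows. The `Server` and `Client` classes, their method names, and the stand-in recognizer are assumptions for illustration; networking, authentication, and real speech recognition are deliberately left out.

```python
from typing import Callable, Dict, List


class Server:
    """Hypothetical server: recognizes a voice message, then forwards it."""

    def __init__(self, recognize: Callable[[bytes], str]) -> None:
        self._recognize = recognize
        self._clients: Dict[str, "Client"] = {}

    def register(self, client: "Client") -> None:
        self._clients[client.name] = client

    def forward(self, sender: str, recipient: str, message_id: str, audio: bytes) -> None:
        text = self._recognize(audio)
        # The target client gets the voice message together with its text message.
        self._clients[recipient].receive(message_id, audio, text)
        # The source client gets the text message for the voice message it sent.
        self._clients[sender].receive_text(message_id, text)


class Client:
    """Hypothetical client that stores voice messages and their text messages."""

    def __init__(self, name: str, server: Server) -> None:
        self.name = name
        self.voice: Dict[str, bytes] = {}
        self.texts: Dict[str, str] = {}
        self._server = server
        server.register(self)

    def send_voice(self, recipient: str, message_id: str, audio: bytes) -> None:
        self.voice[message_id] = audio
        self._server.forward(self.name, recipient, message_id, audio)

    def receive(self, message_id: str, audio: bytes, text: str) -> None:
        self.voice[message_id] = audio
        self.texts[message_id] = text

    def receive_text(self, message_id: str, text: str) -> None:
        self.texts[message_id] = text

    def search(self, keyword: str) -> List[str]:
        """Return IDs of voice messages whose text message contains the keyword."""
        return [mid for mid, text in self.texts.items() if keyword in text]


# Stand-in recognizer: the "audio" is UTF-8 text in this sketch.
server = Server(lambda audio: audio.decode("utf-8"))
alice, bob = Client("alice", server), Client("bob", server)
alice.send_voice("bob", "m1", b"There is a haunted house over there")
print(bob.search("haunted"), alice.search("haunted"))
```

Because both the sender and the receiver end up with the text message, either side can search locally without contacting the server again.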
  • the program can be stored on a computer-readable storage medium.
  • the storage medium can be a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
  • all or part of the methods in the embodiments above may be implemented by relevant hardware under the instruction of a computer program, and the program may be stored in a computer-readable storage medium.
  • when the program is executed, it may include the processes of the method embodiments above.
  • the storage medium above may be a diskette, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Information Transfer Between Computers (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present disclosure discloses a voice message search method, device, and system, and relates to the field of mobile Internet. The method comprises: obtaining a text search keyword; searching for a text message that includes the text search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message; and feeding back as the search result the voice message corresponding to the text message that includes the text search keyword. The present disclosure solves the problem of low search efficiency with the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting the text search keyword.

Description

A VOICE MESSAGE SEARCH METHOD, DEVICE, AND SYSTEM
CROSS-REFERENCE TO RELATED APPLICATIONS
This application claims priority to Chinese Patent Application No. 201310695093.X, filed on December 17, 2013, which is hereby incorporated by reference in its entirety.
FIELD
The present disclosure relates to the field of mobile Internet, and more particularly to a voice message search method, device, and system.
BACKGROUND
A voice instant messaging application is an application that allows two or more parties to communicate with each other instantly by exchanging voice messages. Such applications include [Figure PCTCN2014092426-appb-000001], Yixin, Line, and Laiwang. Voice instant messaging applications are now among the most widely used applications on mobile terminals, including smartphones, tablet PCs, and eBook readers.
When using a voice instant messaging application, a user may need to search for target content in the historical voice messages. For example, after user A and user B send dozens of voice messages to each other arranging a meeting, user A may need to find, among the voice messages sent by user B, the one that contains information about the location arranged for the meeting. In this case, the existing voice message search method comprises the following: the user uses a mobile terminal to play all the voice messages one by one or to play a voice message selected based on a guess; after a voice message is played, the user determines whether the voice message contains the target content; if yes, the user stops the search; if no, the user continues to play the next voice message by using the mobile terminal.
During the implementation of the present disclosure, the inventor has found that the existing art has at least the following problems. If the number of voice messages is large, searching for the target content by playing voice messages one by one is very inefficient. In addition, the user's judgment also deteriorates due to repeated clicking and the visual fatigue caused by sliding operations. Consequently, the overall efficiency of the searches performed using the above-mentioned voice message search method is low.
SUMMARY
To solve the problem of inefficient search using the voice message search method provided by the prior art, the embodiments of the present disclosure provide a voice message search method, device, and system. The technical solution is as follows:
In a first aspect, a voice message search method is provided, for use on a client and comprising: obtaining a text search keyword; searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message; and feeding back as the search result the voice message corresponding to the text message that includes the search keyword.
In a second aspect, a voice message search device is provided. The device includes: a search acquisition module, a text search module, and a result feedback module. The search acquisition module is configured to obtain a text search keyword. The text search module is configured to search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message. The result feedback module is configured to feed back as the search result the voice message corresponding to the text message that includes the search keyword.
In a third aspect, a voice message search system is provided, comprising a client and a server, with the client and the server interconnected using a wireless network or wired network. The client can be the voice message search device described in the above-mentioned second aspect.
The technical solution provided by the embodiments of the present disclosure has the following benefits:
By obtaining a text search keyword, the solution obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. This solves the problem of inefficient search using the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
BRIEF DESCRIPTION OF THE DRAWINGS
To more clearly describe the technical solution provided by the embodiments of the present disclosure, the following gives an overview of the drawings needed to describe the embodiments. Obviously, the following drawings show only some of the embodiments of the present disclosure, and those of ordinary skill in the art may obtain other drawings based on these drawings without creative work.
Figure 1 shows the flowchart of the voice message search method provided by an embodiment of the present disclosure.
Figure 2A shows the flowchart of the voice message search method provided by another embodiment of the present disclosure.
Figures 2B to 2E show the schematic diagrams for the implementation interfaces related in the embodiment as shown in Figure 2A.
Figure 3 shows the structural diagram of the voice message search device provided by an embodiment of the present disclosure.
Figure 4 shows the structural diagram of the voice message search device provided by another embodiment of the present disclosure.
Figure 5 shows the structural diagram of the voice message search system provided by an embodiment of the present disclosure.
DETAILED DESCRIPTION OF THE DRAWINGS
Reference throughout this specification to "embodiments," "an embodiment," "example embodiment," or the like in the singular or plural means that one or more particular features, structures, or characteristics described in connection with an embodiment are included in at least one embodiment of the present disclosure. Thus, the appearances of the phrases "in embodiments," "in an embodiment," "in an example embodiment," or the like in the singular or plural in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
The terminology used in the description of the disclosure herein is for the purpose of describing particular examples only and is not intended to be limiting of the disclosure. As used in the description of the disclosure and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Also, as used in the description herein and throughout the claims that follow, the meaning of “in” includes “in” and “on” unless the context clearly dictates otherwise. It will also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. It will be further understood that the terms "may include," "including," "comprises," and/or "comprising," when used in this specification, specify the presence of stated features, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, operations, elements, components, and/or groups thereof.
As used herein, the term “module” or “unit” may refer to, be part of, or include an Application Specific Integrated Circuit (ASIC); an electronic circuit; a combinational logic circuit; a field programmable gate array (FPGA); a processor (shared, dedicated, or group) that executes code; other suitable hardware components that provide the described functionality; or a combination of some or all of the above, such as in a system-on-chip. The term module or unit may include memory (shared, dedicated, or group) that stores code executed by the processor.
The exemplary environment may include a server, a client, and a communication network. The server and the client may be coupled through the communication network for information exchange, such as sending/receiving identification information, sending/receiving data files such as splash screen images, etc. Although only one client and one server are shown in the environment, any number of terminals or servers may be included, and other devices may also be included.
The communication network may include any appropriate type of communication network for providing network connections to the server and client or among multiple servers or clients. For example, communication network may include the Internet or other types of computer networks or telecommunication networks, either wired or wireless. In a certain embodiment, the disclosed methods and apparatus may be implemented, for example, in a wireless network that includes at least one client.
In some cases, the client may refer to any appropriate user terminal with certain computing capabilities, such as a personal computer (PC), a work station computer, a server computer, a hand-held computing device (tablet), a smart phone or mobile phone, or any other user-side computing device. In various embodiments, the client may include a network access device. The client may be stationary or mobile.
A server, as used herein, may refer to one or more server computers configured to provide certain server functionalities, such as database management and search engines. A server may also include one or more processors to execute computer programs in parallel. A user, as used herein, may refer to one or more persons or things that control a client. The user may control more than one client or other devices.
The solutions in the embodiments of the present disclosure are clearly and completely described in combination with the attached drawings in the embodiments of the present disclosure. Obviously, the described embodiments are only a part, but not all, of the embodiments of the present disclosure. Based on the embodiments of the present disclosure, all other embodiments acquired by those of ordinary skill in the art without creative effort shall be covered by the protective scope of the present disclosure.
To make clearer the purpose, technical solution, and benefits of the present disclosure, the following describes in further details the embodiments of the present disclosure based on the drawings.
In the embodiments of the present disclosure, a client can be an application client that allows two or more parties to communicate with each other by exchanging voice messages on terminals such as smartphones, tablet PCs, eBook readers, Moving Picture Experts Group Audio Layer III (MP3) players, and Moving Picture Experts Group Audio Layer IV (MP4) players.
Figure 1 shows the flowchart of the voice message search method provided by embodiments of the present disclosure. In the embodiments, the scenario where the voice message search method is used for a client that allows two or more parties to communicate with each other by exchanging voice messages is described as an example. The above-mentioned method comprises:
Step 102: Obtain a text search keyword.
The client may obtain a search keyword directly input as text. The client may also obtain search voice signals that the user inputs as voice, and then use a speech recognition technology locally or on the server to identify the search keyword in text format from the search voice signals.
Step 104: Search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message.
Each voice message corresponds to one text message. Each text message is generated based on the speech recognition result of the corresponding voice message.
Step 106: Feed back as a search result the voice message corresponding to the text message that includes the search keyword.
In short, a voice message search method is provided by the present embodiments. The client obtains a text search keyword and then obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. The method solves the problem of inefficient search using the voice message search method in the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
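Steps 102 to 106 can be sketched in a few lines of Python. This is a minimal illustration rather than the disclosed implementation: the `VoiceMessage` record, its field names, the audio paths, and the use of case-insensitive substring matching are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class VoiceMessage:
    message_id: str   # hypothetical unique ID
    audio_path: str   # hypothetical location of the stored audio
    transcript: str   # text message generated by speech recognition


def search_voice_messages(messages, keyword):
    """Steps 104 and 106: return the voice messages whose text contains the keyword."""
    needle = keyword.strip().lower()
    if not needle:
        return []  # assumption: an empty keyword matches nothing
    return [m for m in messages if needle in m.transcript.lower()]


history = [
    VoiceMessage("m1", "audio/m1.amr", "Hello, this is John."),
    VoiceMessage("m2", "audio/m2.amr", "Tomorrow let's go to the Curious Dinosaur Park."),
]
hits = search_voice_messages(history, "Tomorrow let's go to")
print([m.message_id for m in hits])
```

The search touches only the stored text, so no audio needs to be played or decoded at query time.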
Figure 2A shows the flowchart of the voice message search method provided by embodiments of the present disclosure. In the embodiments, the scenario where the voice message search method is used for a client that allows two or more parties to communicate with each other by exchanging voice messages is described as an example. The above-mentioned method comprises:
Step 201: Obtain and store the text messages corresponding to each voice message.
As a voice message is stored and transferred in a voice format, the client needs to first obtain and store the text messages corresponding to each voice message. For example, the client needs to convert the voice message "Hello, this is John." into the text message "Hello, this is John.", which is stored in association with the voice message.
This step may be implemented by using any of the following three methods:
In the first implementation mode, the client performs speech recognition on each voice message to obtain respective speech recognition results, and based on the speech recognition results, generates the text messages respectively corresponding to each voice message.
In this implementation mode, the terminal running the client needs to have powerful processing capabilities. Preferably, the client performs the above-mentioned speech recognition procedure during idle time.
In the second implementation mode, the client sends each voice message to the server and receives the text messages returned by the server corresponding to each voice message. The text messages are generated based on the speech recognition results obtained after the server performs speech recognition on the voice messages.
At a preset time interval, during idle time, or when connected to a wireless local area network (LAN), the client may send all or some of the local voice messages to the server, each voice message having a unique message ID. After receiving voice messages from the client, the server performs speech recognition on each voice message to obtain respective speech recognition results and generates the corresponding text messages based on the speech recognition results. Then, the server returns each text message to the client, with each text message having the message ID of the corresponding voice message. The client receives and stores the text messages corresponding to each voice message.
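The exchange in the second implementation mode can be sketched as a batch keyed by message ID. The sketch below is illustrative only: `ClientStore`, `fake_server_transcribe`, and the canned recognition results are hypothetical stand-ins, and a real server would run speech recognition on the audio instead of a lookup.

```python
def fake_server_transcribe(batch):
    """Stand-in for the server: takes {message ID: audio} and returns {message ID: text}.

    The lookup table replaces real speech recognition for the sake of the example.
    """
    canned = {b"audio-1": "Hello, this is John.", b"audio-2": "See you at noon."}
    return {message_id: canned.get(audio, "") for message_id, audio in batch.items()}


class ClientStore:
    """Hypothetical client-side storage of voice messages and their text messages."""

    def __init__(self):
        self.audio = {}        # message ID -> audio bytes
        self.transcripts = {}  # message ID -> text message

    def pending(self):
        """Voice messages that do not yet have a stored text message."""
        return {mid: a for mid, a in self.audio.items() if mid not in self.transcripts}

    def sync(self, transcribe):
        """Send pending voice messages and store the returned text by message ID."""
        batch = self.pending()
        if batch:
            self.transcripts.update(transcribe(batch))
        return len(batch)


store = ClientStore()
store.audio = {"m1": b"audio-1", "m2": b"audio-2"}
sent_first = store.sync(fake_server_transcribe)
sent_second = store.sync(fake_server_transcribe)
```

Because the text messages carry the message ID of their voice message, the client can issue `sync` whenever it is idle or on a wireless LAN and only untranscribed messages are sent again.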
In the third implementation mode, the client receives the voice messages sent by other clients and forwarded by the server and the text messages corresponding to the voice messages, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message; and/or, after sending the local voice message, the client receives the text message returned by the server corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message.
A voice message is generated by the communication between clients and needs to be forwarded by a server during transfer. Before forwarding a voice message, the server performs speech recognition on the voice message to obtain the speech recognition result and then generates the corresponding text message. Then, the server sends the voice message and the text message corresponding to the voice message to the target client. The target client receives and stores the voice messages sent by other clients and forwarded by the server, together with the text messages corresponding to the voice messages. In addition, the server returns the text message to the source client that sent the voice message. After sending a local voice message, a source client receives and stores the text message returned by the server that corresponds to the voice message.
Obviously, if the processing capabilities of the server are powerful, the third mode is the preferred mode of implementing this step.
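The third implementation mode, in which the server recognizes speech once while relaying a message, can be sketched as follows. The `Client` and `Server` classes, the `relay` method, and the fake recognizer that simply decodes bytes are assumptions for illustration, not part of the disclosure.

```python
class Client:
    """Hypothetical client that stores voice messages with their text messages."""

    def __init__(self, name):
        self.name = name
        self.voice = {}  # message ID -> audio bytes
        self.text = {}   # message ID -> text message


class Server:
    """Hypothetical relay server that recognizes speech before forwarding."""

    def __init__(self, recognize):
        self.recognize = recognize  # assumed callable: audio bytes -> text
        self.clients = {}

    def register(self, client):
        self.clients[client.name] = client

    def relay(self, source_name, target_name, message_id, audio):
        text = self.recognize(audio)
        source = self.clients[source_name]
        target = self.clients[target_name]
        # The target client receives the voice message together with its text message.
        target.voice[message_id] = audio
        target.text[message_id] = text
        # The source client keeps its local audio and receives the text message back.
        source.voice[message_id] = audio
        source.text[message_id] = text
        return text


server = Server(recognize=lambda audio: audio.decode("utf-8"))  # fake recognizer
alice, bob = Client("alice"), Client("bob")
server.register(alice)
server.register(bob)
server.relay("alice", "bob", "m1", b"Hello, this is John.")
```

In this arrangement each voice message is recognized exactly once, and both parties end up with the same text message under the same message ID.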
Step 202: The client obtains a text search keyword.
Generally, the client can obtain a text search keyword in one of the following three modes:
In the first mode, the client obtains a search keyword directly input as text.
For example, the client may receive the search keyword "Tomorrow let's go to" that user A directly inputs as text in the text search box 22 of a voice instant messaging application on a terminal device, as shown in Figure 2B.
In the second mode, the client obtains the search voice signals that the user inputs as voice, and the client uses a speech recognition technology to identify the search keyword in text format from the search voice signals.
For example, if the processing capabilities of the terminal running the client are powerful, upon receiving a signal indicating that user A has pressed the voice search button 24, the client uses the microphone 26 of the terminal to receive the search voice signals that the user inputs as voice. Then, the client uses a speech recognition technology to identify the search keyword "Tomorrow let's go to" from the search voice signals, as shown in Figure 2C.
In the third mode, the client obtains the search voice signals that the user inputs as voice, and then the client sends the search voice signals to the server. The client receives the search keyword returned by the server, with the search keyword identified by the server from the search voice signals by using a speech recognition technology.
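The three modes of Step 202 can be expressed as one dispatch function. This is a sketch under stated assumptions: the function name, its parameters, the preference for local recognition over the server, and the lambda recognizers are all hypothetical.

```python
def obtain_keyword(typed_text=None, voice_signal=None,
                   recognize_locally=None, recognize_on_server=None):
    """Return a text search keyword using one of the three modes.

    Mode 1: the keyword is typed directly.
    Mode 2: the client recognizes the search voice signals itself.
    Mode 3: the client sends the search voice signals to the server.
    The recognizers are assumed callables mapping audio bytes to text.
    """
    if typed_text:
        return typed_text.strip()
    if voice_signal is None:
        raise ValueError("neither text nor voice input was provided")
    if recognize_locally is not None:
        return recognize_locally(voice_signal).strip()
    if recognize_on_server is not None:
        return recognize_on_server(voice_signal).strip()
    raise RuntimeError("no speech recognizer is available")


typed = obtain_keyword(typed_text=" Tomorrow let's go to ")
local = obtain_keyword(voice_signal=b"...", recognize_locally=lambda s: "Tomorrow let's go to")
remote = obtain_keyword(voice_signal=b"...", recognize_on_server=lambda s: "Tomorrow let's go to")
```

Whichever mode is used, the rest of the method only ever sees a text keyword.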
Step 203: Search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on speech recognition result of the corresponding voice message.
To improve the search efficiency, this step may include the following sub-steps:
First, according to the preset conditions, sort the text messages corresponding to each voice message to be searched, the preset conditions including at least one of the following: the generation time corresponding to each voice message, priorities of the contacts corresponding to each voice message, and data sizes of each text message.
Generally, the voice messages to be searched refer to the voice messages of the contacts related to the current interface shown on the client. The current interface may be the active user interface shown on the terminal device. For example, if the current active interface is the chat interface between contact A and contact B, the voice messages to be searched are the voice messages generated during the voice chat between contact A and contact B. For another example, if the current interface is the chat interface of a group, the voice messages to be searched are the voice messages generated during the voice chat among the contacts belonging to the group. For yet another example, if the current interface is not a chat interface, the voice messages to be searched can be all the voice messages globally. In summary, the client may select the messages to be searched based on the content of the active user interface.
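The choice of which voice messages to search can be sketched as a filter on the active interface. The `Msg` record, the `chat_id` field, and the convention that `None` means "not a chat interface" are assumptions for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Msg:
    message_id: str
    chat_id: str  # hypothetical ID of a one-to-one chat or a group chat
    sender: str


def messages_to_search(all_messages, active_chat_id: Optional[str]):
    """Pick the search scope from the active interface.

    If a chat (two-party or group) is active, only its messages are searched;
    if the active interface is not a chat (None here), all messages are searched.
    """
    if active_chat_id is None:
        return list(all_messages)
    return [m for m in all_messages if m.chat_id == active_chat_id]


inbox = [
    Msg("m1", "chat-A-B", "A"),
    Msg("m2", "chat-A-B", "B"),
    Msg("m3", "group-1", "C"),
]
in_chat = messages_to_search(inbox, "chat-A-B")
global_scope = messages_to_search(inbox, None)
```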
If the preset conditions include the generation time corresponding to each voice message, the client can sort the text messages corresponding to each voice message based on the generation time corresponding to each voice message. For example, the client sorts the text messages corresponding to each voice message in ascending order or descending order of the generation time corresponding to each voice message. For another example, if the number of voice messages is very large, the client sorts the text messages corresponding to each voice message to be searched in descending order of the probability of forgetting associated with each generation time, based on the human forgetting curve. For yet another example, the user may have scrolled the current interface to the voice messages generated within a time segment other than the latest time segment, for example, the chat messages generated the day before yesterday. In this case, the client may sort the text messages corresponding to the voice messages generated within that time segment to the beginning, followed by the text messages corresponding to the voice messages generated within other time segments.
If the preset conditions include the priorities of the contacts corresponding to each voice message, the client may, based on the priorities of the contacts, sort the text messages corresponding to each voice message to be searched. The priorities can be preset by the client. For example, if it is more likely that the search result is found among the voice messages of other contacts, the client may set the priorities of other contacts higher than that of the contact corresponding to the current client. That is, if the voice messages are the chat messages exchanged between the current contact A and another contact B, the text messages corresponding to the voice messages of contact B are sorted in front of the text messages corresponding to the voice messages of the current contact A. Thus, the search is performed preferentially in the text messages corresponding to contact B. For another example, the client can also set different priorities for contacts based on the number of history messages of each contact and the level of friendliness of each contact with the current contact A.
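The forgetting-based ordering can be illustrated with a simple model. The source does not specify the shape of the curve, so the sketch below assumes an exponential one with a hypothetical `stability_hours` parameter; times are plain hour counts chosen for the example.

```python
import math


def forget_probability(age_hours, stability_hours=24.0):
    """Assumed exponential model: the probability rises from 0 toward 1 with age."""
    return 1.0 - math.exp(-age_hours / stability_hours)


def sort_by_forgetting(messages, now_hours):
    """Sort (message_id, created_at_hours) pairs, most likely forgotten first."""
    return sorted(
        messages,
        key=lambda m: forget_probability(now_hours - m[1]),
        reverse=True,
    )


msgs = [("m1", 90.0), ("m2", 10.0), ("m3", 50.0)]
ordered = sort_by_forgetting(msgs, now_hours=100.0)
print([message_id for message_id, _ in ordered])
```

With any curve that rises monotonically with age, this ordering coincides with sorting by generation time, oldest first; a non-monotonic model would be needed for the two orderings to differ.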
If the preset conditions include the data sizes of each text message, the client can, in descending order or ascending order of the data sizes of each text message, sort the text messages corresponding to each voice message to be searched.
Note that two or three of the above-mentioned preset conditions may be combined for sorting the messages. The client can first perform sorting based on one of the conditions and then, based on another condition, continue to sort the sorting result obtained using the preceding condition. For example, the client can sort each text message based on the priorities of the contacts and then continue to sort the text messages of the same contact in ascending order of the generation time of the corresponding voice messages.
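Combining two preset conditions maps naturally onto a composite sort key. The following sketch assumes dictionary-shaped messages, a hypothetical `contact_priority` table in which a larger number means higher priority, and made-up timestamps.

```python
from datetime import datetime


def sort_for_search(text_messages, contact_priority):
    """Sort by contact priority (highest first), then by generation time (ascending)."""
    return sorted(
        text_messages,
        key=lambda m: (-contact_priority.get(m["contact"], 0), m["created_at"]),
    )


# Illustrative data: contact A is the current contact, contact B is the other party.
messages = [
    {"id": "m1", "contact": "A", "created_at": datetime(2013, 10, 30, 9, 0)},
    {"id": "m2", "contact": "B", "created_at": datetime(2013, 10, 30, 9, 5)},
    {"id": "m3", "contact": "A", "created_at": datetime(2013, 10, 30, 9, 10)},
    {"id": "m4", "contact": "B", "created_at": datetime(2013, 10, 30, 9, 2)},
]
priority = {"A": 0, "B": 1}  # the other contact B outranks the current contact A
ordered = [m["id"] for m in sort_for_search(messages, priority)]
print(ordered)
```

A single tuple key gives the same result as sorting by time first and then stably sorting by priority, while making only one pass through the sort.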
Note that the above-mentioned sorting may be performed before or during step 202. For example, when the client receives a signal indicating that the user has pressed the voice search button 24, the sorting is triggered. Concurrently, the client receives the search voice signals input by the user after or during the sorting.
Second, from the sorted text messages, the terminal searches for the text message that includes the search keyword.
For example, the client searches for and finds the text message "Tomorrow let's go to the Curious Dinosaur Park. It's Halloween tomorrow. There is a haunted house over there", which includes the search keyword "Tomorrow let's go to."
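One way to see why sorting helps is to scan the sorted text messages and stop at the first hit. The sketch below is an assumption about how such a scan might look; the function name, the pair layout, and the message IDs are hypothetical.

```python
def first_match(sorted_texts, keyword):
    """Scan (message_id, text) pairs in sorted order and stop at the first hit.

    Returns (message_id, number_of_texts_checked), or (None, count) if absent.
    """
    checked = 0
    for message_id, text in sorted_texts:
        checked += 1
        if keyword in text:
            return message_id, checked
    return None, checked


texts = [
    ("m7", "Tomorrow let's go to the Curious Dinosaur Park. It's Halloween tomorrow."),
    ("m3", "Hello, this is John."),
]
found_early = first_match(texts, "Tomorrow let's go to")
found_late = first_match(list(reversed(texts)), "Tomorrow let's go to")
```

The better the sort order predicts where the target lies, the fewer text messages are checked before the scan returns.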
Step 204: Feed back as a search result the voice message corresponding to the text message that includes the search keyword.
After finding the text message that includes the search keyword, the terminal displays or plays, as the search result on the current interface, the voice message corresponding to the text message that includes the search keyword.
The client can use the found voice messages, the found text messages, or both the found voice messages and the corresponding text messages as the search results. The mode of displaying search results can be set by the user. For example, the user can set the mode in which voice messages are always used as the search results for feedback, as shown in Figure 2B. The mode of displaying search results can also be determined based on the current scenario mode of the terminal. For example, if the scenario mode of the terminal is currently set to "Outdoor", the client uses the found voice messages as the search results for feedback; if the scenario mode of the terminal is currently set to "Mute", the client uses the found text messages as the search results for feedback, or the client uses the found voice messages and the corresponding text messages as the search results for feedback, as shown in Figure 2C.
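The selection of a feedback mode can be sketched as a small decision function. The mode names `"voice"`, `"text"`, and `"both"`, the precedence of the user setting over the scenario mode, and the fallback to `"both"` are assumptions; the source only states the "Outdoor" and "Mute" behaviors.

```python
def choose_result_mode(scenario_mode, user_setting=None):
    """Decide whether to feed back voice, text, or both.

    A user setting, if present, takes precedence (assumption). Otherwise
    "Outdoor" yields voice and "Mute" yields text; any other scenario mode
    falls back to both (assumed default).
    """
    if user_setting in ("voice", "text", "both"):
        return user_setting
    if scenario_mode == "Outdoor":
        return "voice"
    if scenario_mode == "Mute":
        return "text"
    return "both"


def build_result(voice_id, text, mode):
    """Assemble the search result according to the chosen mode."""
    result = {}
    if mode in ("voice", "both"):
        result["voice_id"] = voice_id
    if mode in ("text", "both"):
        result["text"] = text
    return result


muted = build_result("m7", "Tomorrow let's go to the park.", choose_result_mode("Mute"))
outdoor = build_result("m7", "Tomorrow let's go to the park.", choose_result_mode("Outdoor"))
```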
To sum up, the voice message search method provided by the present embodiment, by obtaining a text search keyword, obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. This solves the problem of inefficient search using the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
In addition, in the present embodiment, text messages are sorted according to preset conditions to accelerate the search. Particularly, in two-party or multi-party chat using voice messages, voice messages can be sorted giving another contact a priority higher than that of the current contact, thus significantly accelerating the search.
Note that, to make searches faster, the client can, before the sorting, receive a selection signal indicating that the user has selected the target contact from at least two contacts related to the current interface; then, the client determines the voice messages of the selected target contact as each voice message to be searched.
As shown in Figure 2D, after a search is triggered, the client may provide the interface 27 for selecting at least two contacts related to the current interface. Then, the user can select all or some of the contacts. Based on the received selection signal, the client determines the voice messages of the target contacts "Jack" and "Ashley," selected from the three contacts in the group, as the voice messages to be searched. Thus, the range of the voice messages to be searched is narrowed down, improving the search efficiency. In group chat where the voice messages to be searched involve multiple persons or in a scenario where all the contacts are related to the current interface, this implementation mode can significantly accelerate searches.
Similarly, before the sorting, the client may receive a selection signal indicating that the user has selected the target time segment from at least two preset candidate time segments; then the client determines the voice messages that belong to the selected target time segment as the voice messages to be searched.
As shown in Figure 2E, after a search is triggered, the client may provide the interface 28 for selecting at least two time segments. Then, the user can select all or some of the time segments. Based on the received selection signal, the client determines the voice messages generated during the selected time segment "recent week" as the voice messages to be searched. Thus, the range of the voice messages to be searched is narrowed down, improving the search efficiency. In a scenario where the voice messages to be searched include multiple voice messages generated over a very long period of time, this implementation mode can significantly accelerate searches.
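Narrowing the search to selected contacts amounts to a filter applied before sorting. In this sketch the message layout and the third group member "Tom" are hypothetical; only "Jack" and "Ashley" come from the example above.

```python
def restrict_to_contacts(messages, selected_contacts):
    """Keep only the voice messages sent by the contacts the user selected."""
    selected = set(selected_contacts)
    return [m for m in messages if m["sender"] in selected]


group_chat = [
    {"id": "m1", "sender": "Jack"},
    {"id": "m2", "sender": "Ashley"},
    {"id": "m3", "sender": "Tom"},  # hypothetical third member of the group
    {"id": "m4", "sender": "Jack"},
]
to_search = restrict_to_contacts(group_chat, ["Jack", "Ashley"])
print([m["id"] for m in to_search])
```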
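The time-segment restriction can be sketched the same way. Only "recent week" appears in the source; the other segment names, the 30-day month, the message layout, and the dates are assumptions, and `now` is passed in explicitly so the example is deterministic.

```python
from datetime import datetime, timedelta

# Candidate time segments; only "recent week" is named in the source.
SEGMENTS = {
    "recent day": timedelta(days=1),
    "recent week": timedelta(weeks=1),
    "recent month": timedelta(days=30),
}


def restrict_to_segment(messages, segment, now):
    """Keep only the voice messages generated within the selected time segment."""
    cutoff = now - SEGMENTS[segment]
    return [m for m in messages if m["created_at"] >= cutoff]


now = datetime(2013, 12, 17, 12, 0)
history = [
    {"id": "m1", "created_at": datetime(2013, 12, 16, 8, 0)},
    {"id": "m2", "created_at": datetime(2013, 12, 1, 8, 0)},
]
recent_week = restrict_to_segment(history, "recent week", now)
recent_month = restrict_to_segment(history, "recent month", now)
```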
The following describes an embodiment of the device provided by the present disclosure. For details not given, see the corresponding method embodiments above.
See Figure 3, which shows the structural diagram for the voice message search device provided by an embodiment of the present disclosure. By using software, hardware, or a combination of software and hardware, the voice message search device can be implemented as all or part of a client or a terminal. The voice message search device 300 includes a hardware processor 302 and a non-transitory storage medium 304 configured to store the following modules: a search acquisition module 320, a text search module 340, and a result feedback module 360. The search acquisition module 320 is configured to obtain a text search keyword. The text search module 340 is configured to search for the text message that includes the search keyword from the text messages respectively corresponding to each voice message, with each text message generated based on the speech recognition result of the corresponding voice message. The result feedback module 360 is configured to feed back as the search result the voice message corresponding to the text message that includes the search keyword.
To sum up, a voice message search device is provided by the embodiments. By obtaining a text search keyword, the voice message search device obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. This solves the problem of inefficient search using the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
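The division into three modules can be mirrored in software as three small classes composed by a device class. This is a sketch only: the class and method names, the dictionary-based stores, and the choice to return an empty result for an empty keyword are assumptions.

```python
class SearchAcquisitionModule:
    """Counterpart of module 320: obtains the text search keyword."""

    def obtain(self, raw_input):
        return raw_input.strip()


class TextSearchModule:
    """Counterpart of module 340: searches the text messages for the keyword."""

    def __init__(self, transcripts):
        self.transcripts = transcripts  # message ID -> text message

    def search(self, keyword):
        if not keyword:
            return []  # assumption: an empty keyword matches nothing
        return [mid for mid, text in self.transcripts.items() if keyword in text]


class ResultFeedbackModule:
    """Counterpart of module 360: feeds back the matching voice messages."""

    def __init__(self, voice_store):
        self.voice_store = voice_store  # message ID -> audio bytes

    def feed_back(self, message_ids):
        return [self.voice_store[mid] for mid in message_ids]


class VoiceMessageSearchDevice:
    def __init__(self, transcripts, voice_store):
        self.acquisition = SearchAcquisitionModule()
        self.text_search = TextSearchModule(transcripts)
        self.feedback = ResultFeedbackModule(voice_store)

    def run(self, raw_input):
        keyword = self.acquisition.obtain(raw_input)
        return self.feedback.feed_back(self.text_search.search(keyword))


device = VoiceMessageSearchDevice(
    transcripts={"m1": "Hello, this is John.", "m2": "Tomorrow let's go to the park."},
    voice_store={"m1": b"audio-1", "m2": b"audio-2"},
)
result = device.run("  Tomorrow let's go to ")
```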
Figure 4 shows the structural diagram for the voice message search device 300 provided by another embodiment of the present disclosure. By using software, hardware, or a combination of software and hardware, the voice message search device may be implemented as all or part of a client or a terminal. The voice message search device 300 includes a hardware processor 302 and a non-transitory storage medium 304 configured to store the following modules:
the search acquisition module 320, configured to obtain a text search keyword;
the text search module 340, configured to search, among the text messages respectively corresponding to each voice message, for the text message that includes the search keyword, with each text message generated based on the speech recognition result of the corresponding voice message;
the result feedback module 360, configured to feed back as the search result the voice message corresponding to the text message that includes the search keyword.
Optionally, the device may further include the text generation module 310.
The text generation module 310 is configured to perform speech recognition on each voice message to obtain respective speech recognition results; based on the speech recognition results, generate the text messages respectively corresponding to each voice message;
or,
the text generation module 310 is configured to send each voice message to the server; receive the text messages fed back by the server that respectively correspond to each voice message; the text message is generated based on the speech recognition results obtained after the server performs speech recognition on each voice message;
or,
the text generation module 310 is configured to receive the voice message sent by another client and forwarded by the server, together with the text message corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message; and/or, after sending a local voice message, receive the text message fed back by the server corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message.
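The alternatives for the text generation module 310 differ mainly in where speech recognition runs. A minimal sketch, assuming the recognizer is passed in as a callable so that the same code covers local recognition and a wrapper around a server request; `Recognizer`, `generate_text_messages`, `store_received`, and `fake_recognizer` are hypothetical names, and `fake_recognizer` is only a stand-in for real speech recognition:

```python
from typing import Callable, Dict

# A recognizer maps raw audio bytes to recognized text. Whether it runs on the
# terminal or wraps a request to a server is an assumption left to the caller.
Recognizer = Callable[[bytes], str]


def generate_text_messages(voice_messages: Dict[str, bytes],
                           recognize: Recognizer) -> Dict[str, str]:
    """Return one text message per voice message id, built from the recognition result."""
    return {message_id: recognize(audio) for message_id, audio in voice_messages.items()}


def store_received(store: Dict[str, str], message_id: str, server_text: str) -> None:
    """Third alternative: the server already supplied the text, so the client only stores it."""
    store[message_id] = server_text


def fake_recognizer(audio: bytes) -> str:
    """Stand-in for real speech recognition: decodes the bytes as UTF-8 text."""
    return audio.decode("utf-8")


texts = generate_text_messages({"v1": b"hello there", "v2": b"see you soon"}, fake_recognizer)
store_received(texts, "v3", "forwarded by the server")
```

Keeping the text keyed by the voice message identifier is what later allows a text hit to be mapped back to the voice message that is fed back as the search result.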
Optionally, the text search module 340 may include the message sorting module 342 and the sorting search module 344. The message sorting module 342 is configured to sort, according to preset conditions, the text messages corresponding to each voice message to be searched, the preset conditions including at least one of the following: the time corresponding to each voice message, the priority of the contact corresponding to each voice message, and the data size of each text message. The sorting search module 344 is configured to search the sorted text messages for the text message that includes the search keyword.
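The sort-then-search behavior of modules 342 and 344 can be sketched as follows. The disclosure lists the three preset conditions but does not fix how they are combined, so the particular ordering used here (contact priority first, then newest first, then shortest text first) is an assumption, as are the names `TextMessage`, `sort_messages`, and `first_match`.

```python
from dataclasses import dataclass


@dataclass
class TextMessage:
    message_id: str
    contact: str
    timestamp: int  # time of the corresponding voice message, e.g. seconds since epoch
    text: str


def sort_messages(messages, contact_priority):
    """Order by contact priority (lower number first), then newest, then shortest text."""
    lowest = len(contact_priority)  # contacts without an entry sort last
    return sorted(
        messages,
        key=lambda m: (contact_priority.get(m.contact, lowest), -m.timestamp, len(m.text)),
    )


def first_match(sorted_messages, keyword):
    """Scan in sorted order and stop at the first text message containing the keyword."""
    for m in sorted_messages:
        if keyword in m.text:
            return m.message_id
    return None


messages = [
    TextMessage("v1", "me", 100, "dinner at eight"),
    TextMessage("v2", "alice", 90, "dinner sounds good"),
    TextMessage("v3", "alice", 120, "running late"),
]
# Assumed priorities: the other contact is ranked ahead of the current user.
priority = {"alice": 0, "me": 1}
ordered = [m.message_id for m in sort_messages(messages, priority)]
hit = first_match(sort_messages(messages, priority), "dinner")
```

Sorting does not reduce the work of an exhaustive scan; it helps when the search can stop early because likely candidates are examined first.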
Optionally, the text search module 340 may further include a contact selection module and a contact determination module (not shown in the figure). The contact selection module is configured to receive a selection signal for selecting the target contact from at least two contacts related to the current interface. The contact determination module is configured to determine the voice messages that belong to the selected target contact as the voice messages to be searched.
Optionally, the text search module 340 may further include a time segment selection module and a time segment determination module (not shown in the figure). The time segment selection module is configured to receive a selection signal for selecting the target time segment from at least two preset candidate time segments. The time segment determination module is configured to determine the voice messages that belong to the selected target time segment as the voice messages to be searched.
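Both optional selections narrow the set of voice messages before any text is searched. A minimal sketch, assuming each message records its contact and a numeric timestamp and that a time segment is a half-open interval `[start, end)`; `select_scope` and its parameters are hypothetical names:

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class VoiceMessage:
    message_id: str
    contact: str
    timestamp: int


def select_scope(messages,
                 target_contact: Optional[str] = None,
                 time_segment: Optional[Tuple[int, int]] = None):
    """Keep messages from the target contact and/or inside the segment [start, end)."""
    selected = []
    for m in messages:
        if target_contact is not None and m.contact != target_contact:
            continue
        if time_segment is not None and not (time_segment[0] <= m.timestamp < time_segment[1]):
            continue
        selected.append(m)
    return selected


messages = [
    VoiceMessage("v1", "alice", 10),
    VoiceMessage("v2", "bob", 20),
    VoiceMessage("v3", "alice", 30),
]
by_contact = [m.message_id for m in select_scope(messages, target_contact="alice")]
by_time = [m.message_id for m in select_scope(messages, time_segment=(15, 35))]
```

With neither selection supplied, the function returns every message, which corresponds to searching all voice messages.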
Optionally, the search acquisition module 320 is configured to obtain the search keyword directly input as text;
or,
the search acquisition module 320 is configured to obtain search voice signals input as voice; use a speech recognition technology to identify the search keyword in text format from the search voice signals;
or,
the search acquisition module 320 is configured to obtain search voice signals input as voice; send the search voice signals to the server; receive the search keyword fed back by the server, with the search keyword identified by the server from the search voice signals by using a speech recognition technology.
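The three ways of obtaining the keyword can be folded into one entry point. This sketch assumes the recognizer is a callable that either runs locally or wraps a round trip to the server; `acquire_keyword`, `Recognizer`, and `fake_recognizer` are hypothetical names, and `fake_recognizer` only stands in for real speech recognition.

```python
from typing import Callable, Optional

# Maps search voice signals (raw bytes) to a keyword in text format.
Recognizer = Callable[[bytes], str]


def acquire_keyword(text_input: Optional[str] = None,
                    voice_input: Optional[bytes] = None,
                    recognize: Optional[Recognizer] = None) -> str:
    """Return a text keyword from typed text, or from voice via the supplied recognizer."""
    if text_input is not None:
        return text_input.strip()
    if voice_input is not None:
        if recognize is None:
            raise ValueError("voice input requires a recognizer")
        return recognize(voice_input).strip()
    raise ValueError("no search input provided")


def fake_recognizer(audio: bytes) -> str:
    """Stand-in for local or server-side speech recognition."""
    return audio.decode("utf-8")


typed = acquire_keyword(text_input="  station ")
spoken = acquire_keyword(voice_input=b"station", recognize=fake_recognizer)
```

Whichever path is taken, the rest of the search sees only a text keyword.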
To sum up, a voice message search device is provided by the embodiments. By obtaining a text search keyword, the voice message search device obtains the search result by searching for the text message that includes the search keyword from the text messages respectively corresponding to each voice message. This solves the problem of inefficient search using the voice message search method provided by the prior art, allowing a user to quickly and conveniently find the target voice message simply by inputting a search keyword.
In addition, in the present embodiment, text messages are sorted according to preset conditions to accelerate the search. In particular, in a two-party or multi-party chat using voice messages, the messages can be sorted with another contact given a higher priority than the current contact, thus significantly accelerating the search.
Note that, the voice message search performed by the voice message search device provided by the above-mentioned embodiment is described by using only the division of the above-mentioned function modules as an example. In actual application, the above-mentioned functions can be assigned to different function modules for completion as needed. That is, the internal structure of the device can be divided into different function modules to complete all or some of the above-mentioned functions. In addition, the voice message search device and voice message search method provided by the above-mentioned embodiments adopt the same concept. For implementation details, see the description of the method provided by the above-mentioned embodiments.
Figure 5 shows the structural diagram of the voice message search system provided by an embodiment of the present disclosure. The voice message search system comprises at least a client 520 and a server 540. The client 520 and server 540 are interconnected using a wireless network or wired network.
The client 520 includes the voice message search device provided by the embodiment as shown in Figure 3 or Figure 4.
The sequence numbers of the above-mentioned embodiments are intended only for description, instead of indicating the priorities of the embodiments.
Those of ordinary skill in the art can understand that all or part of the steps described in the above-mentioned embodiments can be completed by hardware or by related hardware as instructed by a program. The program can be stored on a computer-readable storage medium. The storage medium can be a Read-Only Memory (ROM), a magnetic disk, or an optical disk.
While the present disclosure has been particularly disclosed and described above with reference to preferred embodiments, it should be understood that the description is not intended to limit the present disclosure. Any modifications, equivalent substitutions, and improvements made without departing from the spirit or principle of the present disclosure shall fall within the scope of the present disclosure.
A person skilled in the art will appreciate that all or part of the methods in the above embodiments may be implemented by relevant hardware under the instruction of a computer program, and that the program may be stored in a computer-readable storage medium. When executed, the program may carry out the processes of the method embodiments above. The storage medium may be a diskette, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above are only some of the preferred embodiments of the disclosure, which are described specifically and in detail but are not intended to limit the scope of the disclosure. It should be noted that a person skilled in the art can make various changes and modifications within the scope of the disclosure; therefore, the protection scope of the present disclosure is defined by the claims.

Claims (18)

  1. A method for searching voice message in a terminal, comprising:
    obtaining, by the terminal, a text search keyword;
    searching, by the terminal, for a text message that comprises the text search keyword from text messages respectively corresponding to each voice message, with each text message generated based on a speech recognition result of the corresponding voice message; and
    feeding back, by the terminal, as a search result the voice message corresponding to the text message that comprises the text search keyword.
  2. The method according to claim 1, wherein before searching for the text message that comprises the text search keyword from the text messages respectively corresponding to each voice message, the method further comprises at least one of the following:
    performing speech recognition on each voice message to obtain respective speech recognition results; based on the speech recognition results, generating the text messages respectively corresponding to each voice message;
    sending each voice message to a server; receiving the text messages returned by the server that respectively correspond to each voice message; the text message is generated based on the speech recognition results obtained after the server performs speech recognition on each voice message;
    and
    receiving the voice message sent by another client and forwarded by the server and the text message corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message; and/or, after sending the voice message from the client, receiving the text message returned by the server corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message.
  3. The method according to claim 1, wherein searching for the text message that comprises the text search keyword from the text messages respectively corresponding to each voice message comprises:
    according to preset conditions, sorting the text messages corresponding to each voice message to be searched, the preset conditions comprising at least one of the following: generation time corresponding to each voice message, priorities of contacts corresponding to each voice message, and data sizes of each text message; and
    from the sorted text messages, searching for the text message that comprises the text search keyword.
  4. The method according to claim 3, wherein before sorting, according to the preset conditions, the text messages corresponding to each voice message to be searched, the method further comprises:
    receiving a selection signal for selecting a target contact from at least two contacts related to a current interface; and
    determining the voice messages that belong to the selected target contact as the voice messages to be searched.
  5. The method according to claim 3, wherein before sorting, according to the preset conditions, the text messages corresponding to each voice message to be searched, the method further comprises:
    receiving a selection signal for selecting a target time segment from at least two preset candidate time segments; and
    determining the voice messages that belong to the selected target time segment as the voice messages to be searched.
  6. The method according to any one of claims 1 to 5, wherein obtaining a text search keyword comprises at least one of the following:
    obtaining the text search keyword directly input as text;
    obtaining search voice signals input as voice; using a speech recognition technology to identify the text search keyword in text format from the search voice signals; and
    obtaining search voice signals input as voice; sending the search voice signals to a server; receiving the text search keyword returned by the server, with the text search keyword identified by the server from the search voice signals by using a speech recognition technology.
  7. A device comprising a processor and a non-transitory storage medium configured to store modules comprising:
    a search acquisition module, configured to obtain a text search keyword;
    a text search module, configured to search for a text message that comprises the text search keyword from text messages respectively corresponding to each voice message, with each text message generated based on a speech recognition result of the corresponding voice message; and
    a result feedback module, configured to feed back as a search result the voice message corresponding to the text message that comprises the text search keyword.
  8. The device according to claim 7, further comprising a text generation module configured to perform at least one of the following:
    perform speech recognition on each voice message to obtain respective speech recognition results; based on the speech recognition results, generate the text messages respectively corresponding to each voice message;
    send each voice message to a server; receive the text messages returned by the server that respectively correspond to each voice message; the text message is generated based on the speech recognition results obtained after the server performs speech recognition on each voice message; and
    receive the voice message sent by another client and forwarded by the server and the text message corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message; and/or, after sending the voice message, receive the text message returned by the server corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message.
  9. The device according to claim 7, wherein the text search module further comprises:
    a message sorting module, configured to, according to preset conditions, sort the text messages corresponding to each voice message to be searched, the preset conditions comprising at least one of the following: generation time corresponding to each voice message, priorities of contacts corresponding to each voice message, and data sizes of each text message; and
    a sorting search module, configured to search the sorted text messages for the text message that comprises the text search keyword.
  10. The device according to claim 9, wherein the text search module further comprises a contact selection module and a contact determination module;
    the contact selection module is configured to receive a selection signal for selecting a target contact from at least two contacts related to a current interface; and
    the contact determination module is configured to determine the voice messages that belong to the selected target contact as the voice messages to be searched.
  11. The device according to claim 9, wherein the text search module further comprises a time segment selection module and a time segment determination module;
    the time segment selection module is configured to receive a selection signal for selecting a target time segment from at least two preset candidate time segments; and
    the time segment determination module is configured to determine the voice messages that belong to the selected target time segment as the voice messages to be searched.
  12. The device according to any of claims 7 to 11, wherein the search acquisition module is configured to perform at least one of the following:
    obtain the text search keyword directly input as text;
    obtain search voice signals input as voice; use a speech recognition technology to identify the text search keyword in text format from the search voice signals; and
    obtain search voice signals input as voice; send the search voice signals to a server; receive the text search keyword returned by the server, with the text search keyword identified by the server from the search voice signals by using a speech recognition technology.
  13. A non-transitory storage medium storing a set of instructions for searching voice message in a device having a processor, the set of instructions to direct the processor to perform acts of:
    obtaining a text search keyword;
    searching for a text message that comprises the text search keyword from text messages respectively corresponding to each voice message, with each text message generated based on a speech recognition result of the corresponding voice message; and
    feeding back as a search result the voice message corresponding to the text message that comprises the text search keyword.
  14. The non-transitory storage medium according to claim 13, the set of instructions to direct the processor to perform at least one of the following:
    performing speech recognition on each voice message to obtain respective speech recognition results; based on the speech recognition results, generating the text messages respectively corresponding to each voice message;
    sending each voice message to a server; receiving the text messages returned by the server that respectively correspond to each voice message; the text message is generated based on the speech recognition results obtained after the server performs speech recognition on each voice message;
    and
    receiving the voice message sent by another client and forwarded by the server and the text message corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message; and/or, after sending the voice message from the client, receiving the text message returned by the server corresponding to the voice message, the text message being generated based on the speech recognition result obtained after the server performs speech recognition on the voice message.
  15. The non-transitory storage medium according to claim 13, wherein searching for the text message that comprises the text search keyword from the text messages respectively corresponding to each voice message comprises:
    according to preset conditions, sorting the text messages corresponding to each voice message to be searched, the preset conditions comprising at least one of the following: generation time corresponding to each voice message, priorities of contacts corresponding to each voice message, and data sizes of each text message; and
    from the sorted text messages, searching for the text message that comprises the text search keyword.
  16. The non-transitory storage medium according to claim 15, the set of instructions to direct the processor to perform acts of:
    receiving a selection signal for selecting a target contact from at least two contacts related to a current interface; and
    determining the voice messages that belong to the selected target contact as the voice messages to be searched.
  17. The non-transitory storage medium according to claim 15, the set of instructions to direct the processor to perform acts of:
    receiving a selection signal for selecting a target time segment from at least two preset candidate time segments; and
    determining the voice messages that belong to the selected target time segment as the voice messages to be searched.
  18. The non-transitory storage medium according to any one of claims 13 to 17, wherein obtaining a text search keyword comprises at least one of the following:
    obtaining the text search keyword directly input as text;
    obtaining search voice signals input as voice; using a speech recognition technology to identify the text search keyword in text format from the search voice signals; and
    obtaining search voice signals input as voice; sending the search voice signals to a server; receiving the text search keyword returned by the server, with the text search keyword identified by the server from the search voice signals by using a speech recognition technology.
PCT/CN2014/092426 2013-12-17 2014-11-28 A voice message search method, device, and system WO2015090137A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201310695093.XA CN104714981B (en) 2013-12-17 2013-12-17 Voice message searching method, device and system
CN201310695093.X 2013-12-17

Publications (1)

Publication Number Publication Date
WO2015090137A1 true WO2015090137A1 (en) 2015-06-25

Family

ID=53402086

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/092426 WO2015090137A1 (en) 2013-12-17 2014-11-28 A voice message search method, device, and system

Country Status (2)

Country Link
CN (1) CN104714981B (en)
WO (1) WO2015090137A1 (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106558311B (en) * 2015-09-30 2020-11-27 北京奇虎科技有限公司 Voice content prompting method and device
US9984075B2 (en) 2015-10-06 2018-05-29 Google Llc Media consumption context for personalized instant query suggest
CN107402748A (en) * 2016-07-01 2017-11-28 北京都在哪网讯科技有限公司 Information processing method and device for communications applications
CN107967250B (en) * 2016-10-19 2020-12-29 中兴通讯股份有限公司 Information processing method and device
CN110019923A (en) * 2017-07-18 2019-07-16 北京国双科技有限公司 The lookup method and device of speech message
CN107818786A (en) * 2017-10-25 2018-03-20 维沃移动通信有限公司 A kind of call voice processing method, mobile terminal
CN107798143A (en) * 2017-11-24 2018-03-13 珠海市魅族科技有限公司 A kind of information search method, device, terminal and readable storage medium storing program for executing
CN110099360A (en) * 2018-01-30 2019-08-06 腾讯科技(深圳)有限公司 Voice message processing method and device
CN108446389B (en) * 2018-03-22 2021-12-24 平安科技(深圳)有限公司 Voice message search display method and device, computer equipment and storage medium
CN110399468A (en) * 2018-04-20 2019-11-01 北京搜狗科技发展有限公司 A kind of data processing method, device and the device for data processing
CN108874904B (en) * 2018-05-24 2022-04-29 平安科技(深圳)有限公司 Voice message searching method and device, computer equipment and storage medium
CN109274586A (en) * 2018-11-14 2019-01-25 深圳市云歌人工智能技术有限公司 Storage method, device and the storage medium of chat message
CN112311652B (en) * 2019-07-23 2023-02-07 腾讯科技(深圳)有限公司 Message sending method, device, terminal and storage medium
CN111988479B (en) * 2020-08-20 2021-04-20 浙江企蜂信息技术有限公司 Call information processing method and device, computer equipment and storage medium
CN112287162A (en) * 2020-10-27 2021-01-29 维沃移动通信有限公司 Message searching method and device and electronic equipment
CN113282772A (en) * 2021-04-25 2021-08-20 夏贵军 User searching method and system based on 5G message
CN113836270A (en) * 2021-09-28 2021-12-24 深圳格隆汇信息科技有限公司 Big data processing method and related product

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2308978A1 (en) * 1999-05-26 2000-11-26 Lucent Technologies Inc. Voice message search system and method
CN102750365A (en) * 2012-06-14 2012-10-24 华为软件技术有限公司 Retrieval method and system of instant voice messages, user device and server

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4629560B2 (en) * 2004-12-01 2011-02-09 本田技研工業株式会社 Interactive information system
CN106886587A (en) * 2011-12-23 2017-06-23 优视科技有限公司 Voice search method, apparatus and system, mobile terminal, transfer server
CN103425668A (en) * 2012-05-16 2013-12-04 联想(北京)有限公司 Information search method and electronic equipment

Cited By (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11320982B2 (en) 2016-05-18 2022-05-03 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US11513677B2 (en) 2016-05-18 2022-11-29 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US11112963B2 (en) 2016-05-18 2021-09-07 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US11954323B2 (en) 2016-05-18 2024-04-09 Apple Inc. Devices, methods, and graphical user interfaces for initiating a payment action in a messaging session
US11625165B2 (en) 2016-05-18 2023-04-11 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US10592098B2 (en) 2016-05-18 2020-03-17 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US10254956B2 (en) 2016-05-18 2019-04-09 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US10852935B2 (en) 2016-05-18 2020-12-01 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US11126348B2 (en) 2016-05-18 2021-09-21 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US10983689B2 (en) 2016-05-18 2021-04-20 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US11966579B2 (en) 2016-05-18 2024-04-23 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US10331336B2 (en) 2016-05-18 2019-06-25 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US10949081B2 (en) 2016-05-18 2021-03-16 Apple Inc. Devices, methods, and graphical user interfaces for messaging
EP3376358A1 (en) * 2016-05-18 2018-09-19 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US11221751B2 (en) 2016-05-18 2022-01-11 Apple Inc. Devices, methods, and graphical user interfaces for messaging
US11159922B2 (en) 2016-06-12 2021-10-26 Apple Inc. Layers in messaging applications
US11778430B2 (en) 2016-06-12 2023-10-03 Apple Inc. Layers in messaging applications
WO2020024525A1 (en) * 2018-07-31 2020-02-06 珠海格力电器股份有限公司 Electrical appliance control method and apparatus, storage medium and electrical appliance
CN111506752A (en) * 2019-01-30 2020-08-07 阿里巴巴集团控股有限公司 Search method, search device, electronic equipment and computer storage medium
CN110188233B (en) * 2019-05-27 2023-11-14 努比亚技术有限公司 Voice online search processing method, wearable device and storage medium
CN110188233A (en) * 2019-05-27 2019-08-30 努比亚技术有限公司 Method, wearable device and the storage medium of voice on-line search processing
CN113112236A (en) * 2021-04-19 2021-07-13 云南电网有限责任公司迪庆供电局 Intelligent distribution network scheduling system and method based on voice and voiceprint recognition

Also Published As

Publication number Publication date
CN104714981A (en) 2015-06-17
CN104714981B (en) 2020-01-10

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14871088

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC , EPO FORM 1205A DATED 31-10-16

122 Ep: pct application non-entry in european phase

Ref document number: 14871088

Country of ref document: EP

Kind code of ref document: A1