CN113407768B

CN113407768B - Voiceprint retrieval method, voiceprint retrieval device, voiceprint retrieval system, voiceprint retrieval server and storage medium

Info

Publication number: CN113407768B
Application number: CN202110703864.XA
Authority: CN
Inventors: 卢宇机; 唐智; 刘小钊
Original assignee: Voiceai Technologies Co ltd
Current assignee: Voiceai Technologies Co ltd
Priority date: 2021-06-24
Filing date: 2021-06-24
Publication date: 2024-02-02
Anticipated expiration: 2041-06-24
Also published as: CN113407768A

Abstract

The embodiments of the application disclose a voiceprint retrieval method, device, system, server and storage medium. The method comprises the following steps: acquiring voiceprint feature data to be retrieved; performing feature comparison between the voiceprint feature data to be retrieved and voiceprint feature data stored in a memory in advance; and acquiring a feature comparison result corresponding to the voiceprint feature data to be retrieved, and sending the feature comparison result. With this method, once the voiceprint feature data to be retrieved is obtained, it is compared directly with the voiceprint feature data stored in the memory in advance and the feature comparison result is output; directly traversing the voiceprint feature data in the memory can improve the voiceprint feature retrieval speed.

Description

Voiceprint retrieval method, voiceprint retrieval device, voiceprint retrieval system, voiceprint retrieval server and storage medium

Technical Field

The application belongs to the technical field of data processing, and particularly relates to a voiceprint retrieval method, device, system, server and storage medium.

Background

Voiceprint retrieval compares the speech to be retrieved with the speech stored in a database and returns one or more utterances from the same speaker as that speech. As voiceprint retrieval technology develops, its application scenarios keep increasing, especially retrieval over massive amounts of voiceprint feature data. In related voiceprint retrieval methods, the retrieval program and the Redis (Remote Dictionary Server) server exchange data over the network. As a result, when the massive voiceprint feature data being retrieved is read out of the database, network transmission becomes a very large performance bottleneck and slows down retrieval.

Disclosure of Invention

In view of the above problem, the present application proposes a voiceprint retrieval method, apparatus, system, server and storage medium to mitigate it.

In a first aspect, an embodiment of the present application provides a voiceprint retrieval method, which is applied to a voiceprint retrieval module of a voiceprint retrieval server, where the method includes: acquiring voiceprint feature data to be retrieved; performing feature comparison on the voiceprint feature data to be retrieved and voiceprint feature data stored in a memory in advance; and acquiring a feature comparison result corresponding to the voiceprint feature data to be searched, and sending the feature comparison result.

In a second aspect, an embodiment of the present application provides a voiceprint retrieval method, which is applied to a service server, where the method includes: sending voiceprint feature data to be retrieved to a voiceprint retrieval server, so that a voiceprint retrieval module of the voiceprint retrieval server performs feature comparison between the voiceprint feature data to be retrieved and voiceprint feature data stored in a memory in advance; and receiving a feature comparison result corresponding to the voiceprint feature data to be retrieved, and displaying the feature comparison result.

In a third aspect, an embodiment of the present application provides a voiceprint retrieval method, which is applied to a voiceprint retrieval system, where the system includes a voiceprint retrieval server and a service server, and the method includes: the service server sends voiceprint feature data to be retrieved to the voiceprint retrieval server; the voiceprint retrieval server acquires the voiceprint feature data to be retrieved; the voiceprint retrieval server performs feature comparison between the voiceprint feature data to be retrieved and voiceprint feature data stored in the memory in advance; and the service server receives the feature comparison result corresponding to the voiceprint feature data to be retrieved, which is sent by the voiceprint retrieval server, and displays the feature comparison result.

In a fourth aspect, an embodiment of the present application provides a voiceprint retrieval apparatus, running on a voiceprint retrieval module of a voiceprint retrieval server, the apparatus including: a data acquisition unit, configured to acquire voiceprint feature data to be retrieved; a feature comparison unit, configured to perform feature comparison between the voiceprint feature data to be retrieved and voiceprint feature data stored in the memory in advance; and a result acquisition unit, configured to acquire the feature comparison result corresponding to the voiceprint feature data to be retrieved and send the feature comparison result.

In a fifth aspect, an embodiment of the present application provides a voiceprint retrieval apparatus, running on a service server, where the apparatus includes: a data sending unit, configured to send voiceprint feature data to be retrieved to a voiceprint retrieval server, so that a voiceprint retrieval module of the voiceprint retrieval server performs feature comparison between the voiceprint feature data to be retrieved and voiceprint feature data stored in a memory in advance; and a display unit, configured to receive the feature comparison result corresponding to the voiceprint feature data to be retrieved and display the feature comparison result.

In a sixth aspect, embodiments of the present application provide a voiceprint retrieval system, where the system includes a voiceprint retrieval server and a service server; the service server is used for sending voiceprint feature data to be searched to the voiceprint search server; the voiceprint retrieval server is used for acquiring voiceprint feature data to be retrieved; the voiceprint retrieval server is used for comparing the voiceprint characteristic data to be retrieved with voiceprint characteristic data stored in the memory in advance; the service server is used for receiving the feature comparison result corresponding to the voiceprint feature data to be searched, which is sent by the voiceprint search server, and displaying the feature comparison result.

In a seventh aspect, embodiments of the present application provide a server comprising one or more processors and memory; one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the methods described above.

In an eighth aspect, embodiments of the present application provide a computer readable storage medium having program code stored therein, wherein the above-described method is performed when the program code is run.

The embodiments of the application provide a voiceprint retrieval method, device, system, server and storage medium. First, voiceprint feature data to be retrieved is obtained; then, feature comparison is performed between the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance; finally, a feature comparison result corresponding to the voiceprint feature data to be retrieved is obtained and sent. With this method, once the voiceprint feature data to be retrieved is obtained, it is compared directly with the voiceprint feature data stored in the memory in advance and the feature comparison result is output; directly traversing the voiceprint feature data in the memory can improve the voiceprint feature retrieval speed.

Drawings

In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a schematic diagram of an application scenario of a voiceprint retrieval method according to an embodiment of the present application;

FIG. 2 is a flowchart of a voiceprint retrieval method according to one embodiment of the present application;

FIG. 3 is a flowchart of a voiceprint retrieval method according to another embodiment of the present application;

FIG. 4 is a schematic diagram of a scenario in which a callback function is inserted according to another embodiment of the present application;

FIG. 5 is a flowchart of a voiceprint retrieval method according to yet another embodiment of the present application;

FIG. 6 is a flowchart of obtaining voiceprint feature data to be stored in a voiceprint retrieval method according to still another embodiment of the present application;

FIG. 7 is a schematic diagram of a voiceprint feature data synchronization scenario according to still another embodiment of the present application;

FIG. 8 is a flowchart of a voiceprint retrieval method according to yet another embodiment of the present application;

FIG. 9 is a schematic diagram of a voiceprint retrieval scenario according to a further embodiment of the present application;

FIG. 10 is a block diagram of a voiceprint retrieval apparatus according to an embodiment of the present application;

FIG. 11 is a block diagram of another voiceprint retrieval apparatus according to an embodiment of the present application;

FIG. 12 is a block diagram of still another voiceprint retrieval apparatus according to an embodiment of the present application;

FIG. 13 is a block diagram of still another voiceprint retrieval apparatus according to an embodiment of the present application;

FIG. 14 is a block diagram of a voiceprint retrieval system according to an embodiment of the present application;

FIG. 15 is a block diagram of a server for executing a voiceprint retrieval method according to an embodiment of the present application;

FIG. 16 shows a storage unit for storing or carrying program code that implements a voiceprint retrieval method according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.

The voiceprint retrieval is to acquire voiceprint characteristics of a given voice, compare the voiceprint characteristics of the given voice with voiceprint characteristics of voices stored in a database, and then return speaker information corresponding to the voice.

In recent years, with the popularity of devices equipped with microphones, such as mobile phones and personal computers, and the rapid growth of network media, the volume of voice and video data has surged, with hundreds or thousands of hours of video uploaded to the cloud every minute. Voiceprint retrieval is accordingly used more and more widely, for example: recommending similar voices through voiceprint retrieval; detecting infringement through voiceprint retrieval; and, in large-scale voiceprint authentication, where too many speakers can make authentication slow, using retrieval techniques to speed up the authentication process.

In studying related voiceprint retrieval methods, the inventors found that the voiceprint retrieval server and the voiceprint data storage server exchange voiceprint retrieval data over the network. As a result, when the massive voiceprint feature data being retrieved is read out of the database, network transmission introduces a very large performance bottleneck and drags down the retrieval speed.

Therefore, the inventors propose the voiceprint retrieval method, device, system, server and storage medium of the present application. First, voiceprint feature data to be retrieved is obtained; then, it is compared with voiceprint feature data stored in a memory in advance; finally, a feature comparison result corresponding to the voiceprint feature data to be retrieved is obtained and sent. Because the voiceprint feature data to be retrieved is compared directly with the voiceprint feature data stored in the memory in advance, directly traversing the voiceprint feature data in the memory can improve the voiceprint feature retrieval speed.

The following describes an application environment of the voiceprint retrieval method provided by the embodiments of the present invention.

Referring to FIG. 1, the voiceprint retrieval method provided by the embodiment of the present invention may be applied to a retrieval system 100, where the system 100 may include a voiceprint feature extraction server 110, a Redis server 120, a voiceprint retrieval server 710, and a service server 720. The voiceprint retrieval server 710 includes a voiceprint retrieval module 711, which in turn includes a Redis dynamic database module 7111, a memory 7112, and a core algorithm module 7113.

In the embodiment of the present application, the voiceprint feature extraction server 110 may be used to extract voiceprint features of a registrant's voice. When the voiceprint feature extraction server 110 extracts the voiceprint feature of the voice, the voiceprint feature of the voice of the registrant may be extracted using the voiceprint feature extraction model included in the voiceprint feature extraction server 110. The voiceprint feature extraction model may be a pre-trained neural network model, which is configured to output voiceprint features of a registrant's voice according to the input registrant's voice; the registrant may be a user who first sends sound data or audio data to the service server 720.

Redis is an open-source, log-type key-value database written in ANSI C that supports networking and can also persist its data, and it provides application programming interfaces (APIs) for multiple languages. It is commonly called a data structure server because a value can be one of five types, namely a string (String), a hash (Hash/Map), a list (List), a set (Set) and a sorted set (Sorted Set), which makes it convenient to operate. The Redis server 120 may be used to store voiceprint feature data of a registrant's voice.

The voiceprint retrieval module 711 in the voiceprint retrieval server 710 may be a voiceprint retrieval application program, configured to compare voiceprint feature data to be retrieved with voiceprint feature data stored in advance according to a received voiceprint retrieval instruction.

The pre-stored voiceprint feature data may be stored in the Redis dynamic database module 7111 and the memory 7112. Specifically, the Redis dynamic database module 7111 and the memory 7112 may store voiceprint feature data sent by the Redis server 120 according to a received voiceprint feature data synchronization instruction.

The service server 720 may be configured to receive the voiceprint feature data to be retrieved, and then send the voiceprint feature data to be retrieved to the voiceprint retrieval server 710, so that the voiceprint retrieval module 711 performs voiceprint retrieval on the voiceprint feature data to be retrieved.

Embodiments of the present application will be described in detail below with reference to the accompanying drawings.

Referring to fig. 2, a voiceprint retrieval method provided in an embodiment of the present application is applied to a voiceprint retrieval module of a voiceprint retrieval server, and the method includes:

Step S110: Obtaining voiceprint feature data to be retrieved.

In this embodiment of the present application, the voiceprint feature data to be retrieved is the voiceprint feature corresponding to the input audio data of a user for whom voiceprint retrieval is required. A voiceprint is a sound wave spectrum, displayed by an electroacoustic instrument, that carries speech information; it is specific to the speaker and relatively stable. In embodiments of the present application, the voiceprint features may include, but are not limited to, MFCC (Mel Frequency Cepstral Coefficients) features and LPCC (Linear Prediction Cepstral Coefficients) features. MFCC features exploit the non-linear frequency response of human hearing: the spectrum is converted into a non-linear spectrum based on the Mel frequency scale and then transformed into the cepstral domain. This closely simulates human auditory characteristics without any prior assumption, and gives MFCC features good discriminative performance and noise robustness. LPCC features are the cepstral-domain representation of linear prediction coefficients: on the assumption that the speech signal is an autoregressive signal, the cepstral parameters, namely the linear prediction cepstral parameters, are obtained by linear prediction analysis, and they reflect the vocal tract characteristics specific to each person.
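As a small illustration of the non-linear Mel-frequency mapping mentioned above, the following sketch uses one commonly used Hz-to-Mel conversion formula. The application does not state which Mel-scale variant is used, so the formula and the function names `hz_to_mel` and `mel_to_hz` are assumptions made only for illustration.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Map a frequency in Hz to the Mel scale.

    Uses the widely cited 2595 * log10(1 + f / 700) variant; this choice
    is an assumption, not something specified by the application.
    """
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel for the same Mel-scale variant."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


if __name__ == "__main__":
    # Equal steps in Hz shrink on the Mel scale as frequency grows.
    for f in (0.0, 1000.0, 2000.0, 4000.0):
        print(f, round(hz_to_mel(f), 1))
```

The mapping is roughly linear at low frequencies and logarithmic at high frequencies, which is the non-linearity the paragraph above refers to.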

As a way, the voiceprint feature data to be retrieved may be voiceprint feature data corresponding to audio data of a user acquired in real time, or may be voiceprint feature data corresponding to audio data of a user acquired in advance, which is transmitted through an external device. The audio data may be audio data generated in a call process, may be audio data generated in a conference process, or may be audio data input by a user in application software.

If the voiceprint feature data to be searched is voiceprint feature data corresponding to the audio data of the user, which is transmitted through the external device and is acquired in advance, when the audio data of the user is acquired and stored, the external device can acquire time information of occurrence of the audio data, record the time information, the source of the audio data, the text information and the corresponding voiceprint feature data according to a specified format, and further determine the voiceprint feature data to be searched according to the time information, the source of the audio data and the like when the voiceprint search module of the voiceprint search server acquires the voiceprint feature data to be searched. Recording according to a specified format can be understood as storing time information corresponding to audio data, sources of the audio data, text information and corresponding voiceprint feature data in different fields of the same piece of data, and then a voiceprint retrieval module of a voiceprint retrieval server can acquire corresponding data information by reading the different fields of the piece of data.
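The record format described above, in which time information, source, text information and voiceprint feature data occupy different fields of the same piece of data, might be sketched as follows. The class name `VoiceprintRecord`, the field names, the ISO-8601 timestamp strings and the helper `select_records` are all hypothetical; the application only states that the values are stored in separate fields and read back by field.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class VoiceprintRecord:
    """One piece of data holding the four fields named in the description."""
    timestamp: str         # time the audio occurred (ISO-8601 string, assumed)
    source: str            # where the audio came from, e.g. "call" or "conference"
    text: str              # text information associated with the audio
    features: List[float]  # voiceprint feature data


def select_records(records: List[VoiceprintRecord], source: str,
                   start: str, end: str) -> List[VoiceprintRecord]:
    """Pick records from one source whose timestamp lies in [start, end].

    ISO-8601 strings in the same format sort lexicographically, so plain
    string comparison is enough for this sketch.
    """
    return [r for r in records
            if r.source == source and start <= r.timestamp <= end]


if __name__ == "__main__":
    recs = [
        VoiceprintRecord("2021-06-24T09:00:00", "call", "hello", [0.1, 0.2]),
        VoiceprintRecord("2021-06-24T10:30:00", "conference", "agenda", [0.3, 0.4]),
    ]
    print(select_records(recs, "call", "2021-06-24T00:00:00", "2021-06-24T23:59:59"))
```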

When the voiceprint feature data to be retrieved is voiceprint feature data corresponding to pre-collected user audio data transmitted by an external device, the voiceprint retrieval module of the voiceprint retrieval server may send a voiceprint feature data acquisition request to the external device in advance; on receiving the request, the external device returns the voiceprint feature data to the voiceprint retrieval module of the voiceprint retrieval server as the voiceprint feature data to be retrieved. The external device may be an audio collection device in communication with the voiceprint retrieval server, such as a smartphone, a tablet computer, or another smart device with a microphone. In the embodiment of the application, the audio data of the user may be collected through a microphone installed in the external device.

If the voiceprint feature data to be retrieved corresponds to audio data of the user collected in real time, the audio data may be collected by an audio collection device. After the audio collection device collects the user's audio data, the collected audio data may be sent to the voiceprint feature extraction server, which then extracts the voiceprint features from the user's audio data.

Step S120: Performing feature comparison between the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance.

As one mode, voiceprint feature data corresponding to the audio data of a plurality of users are stored in the memory in advance. When the voiceprint retrieval module of the voiceprint retrieval server obtains the voiceprint feature data to be retrieved, the voiceprint feature data to be retrieved can be subjected to feature comparison with the voiceprint feature data which correspond to the audio data of a plurality of users stored in the memory in advance.
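A minimal sketch of this in-memory comparison is given below. The application does not name the scoring function, so cosine similarity is used purely as an assumed stand-in; the plain dictionary playing the role of the memory and the names `cosine_similarity` and `compare_all` are likewise hypothetical.

```python
import math
from typing import Dict, List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two feature vectors (assumed scoring function)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def compare_all(query: List[float],
                memory: Dict[str, List[float]]) -> Dict[str, float]:
    """Traverse every stored voiceprint in memory and score it against the query.

    No network round trip is involved: the stored features are read
    directly from the in-process dictionary.
    """
    return {user_id: cosine_similarity(query, feats)
            for user_id, feats in memory.items()}


if __name__ == "__main__":
    memory = {"alice": [1.0, 0.0], "bob": [0.0, 1.0]}
    print(compare_all([1.0, 0.0], memory))
```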

Step S130: Acquiring a feature comparison result corresponding to the voiceprint feature data to be retrieved, and sending the feature comparison result.

In the embodiment of the present application, the feature comparison result may be the user or users corresponding to a voiceprint comparison score. The voiceprint feature data to be retrieved is compared one by one with the voiceprint feature data corresponding to the audio data of the plurality of users stored in the memory in advance, yielding a plurality of voiceprint comparison scores for the voiceprint feature data to be retrieved. The users whose voiceprint comparison scores exceed the voiceprint score threshold are taken as the feature comparison result, and the feature comparison result is sent.

Alternatively, the voiceprint comparison scores that exceed the voiceprint score threshold may be sorted from high to low, and the users corresponding to a specified number of top-ranked voiceprint comparison scores may be used as the feature comparison result.
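The two ways of forming the feature comparison result, keeping every user above the score threshold or keeping only a specified number of top-ranked users, can be sketched as follows. The function name `select_matches` and its parameters are hypothetical, and the scores are assumed to have been computed already.

```python
from typing import Dict, List, Optional, Tuple


def select_matches(scores: Dict[str, float], threshold: float,
                   top_k: Optional[int] = None) -> List[Tuple[str, float]]:
    """Return (user, score) pairs whose score exceeds the threshold.

    Results are sorted from high to low. If top_k is given, only the
    top_k best-scoring users above the threshold are kept.
    """
    passing = [(user, s) for user, s in scores.items() if s > threshold]
    passing.sort(key=lambda item: item[1], reverse=True)
    return passing if top_k is None else passing[:top_k]


if __name__ == "__main__":
    scores = {"u1": 0.91, "u2": 0.42, "u3": 0.77, "u4": 0.85}
    print(select_matches(scores, threshold=0.7))
    print(select_matches(scores, threshold=0.7, top_k=2))
```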

According to the voiceprint retrieval method, firstly, voiceprint feature data to be retrieved are obtained, then feature comparison is carried out on the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance, finally, feature comparison results corresponding to the voiceprint feature data to be retrieved are obtained, and the feature comparison results are sent. According to the method, when the voiceprint feature data to be searched is obtained, the voiceprint feature data to be searched is directly compared with the voiceprint feature data stored in the memory in advance, the voiceprint feature comparison result is output, and the voiceprint feature searching speed can be improved by directly traversing the voiceprint feature data in the memory.

Referring to fig. 3, a voiceprint retrieval method provided in an embodiment of the present application is applied to a voiceprint retrieval module of a voiceprint retrieval server, and the method includes:

Step S210: Receiving a voiceprint feature data synchronization instruction.

As one way, the voiceprint retrieval module includes a Redis dynamic database module. The voiceprint feature data sync instruction may be sent by a Redis server. Specifically, when the Redis server receives the voiceprint feature data sent by the service server, the Redis server sends a voiceprint feature data synchronization instruction to a voiceprint retrieval module of the voiceprint retrieval server, and further the Redis dynamic database module can acquire and store the voiceprint feature data from the Redis server after receiving the voiceprint feature data synchronization instruction. Optionally, when the Redis server sends the voiceprint feature data synchronization instruction to the voiceprint retrieval module of the voiceprint retrieval server, the voiceprint feature data and the voiceprint feature data synchronization instruction may be sent to the voiceprint retrieval module as one piece of data, and further the Redis dynamic database module in the voiceprint retrieval module may directly obtain the voiceprint feature data from the one piece of data and store the voiceprint feature data in the Redis dynamic database module.
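The optional variant in which the synchronization instruction and the voiceprint feature data arrive as one piece of data might look like the sketch below. The JSON layout, the field names `instruction`, `user_id` and `features`, the instruction value `sync_voiceprint`, and the dictionary standing in for the Redis dynamic database module are all assumptions; the application does not define a wire format.

```python
import json
from typing import Dict, List


def handle_sync_message(raw: str, dynamic_db: Dict[str, List[float]]) -> bool:
    """Parse one combined message and store its features in the dynamic database.

    Returns True when the message carried a synchronization instruction
    and the features were stored, False otherwise.
    """
    message = json.loads(raw)
    if message.get("instruction") != "sync_voiceprint":
        return False
    # The features travel in the same message as the instruction,
    # so no second fetch from the Redis server is needed.
    dynamic_db[message["user_id"]] = message["features"]
    return True


if __name__ == "__main__":
    db: Dict[str, List[float]] = {}
    raw = json.dumps({"instruction": "sync_voiceprint",
                      "user_id": "alice", "features": [0.1, 0.2, 0.3]})
    print(handle_sync_message(raw, db), db)
```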

Step S220: Synchronously storing the voiceprint feature data stored in the Redis dynamic database module into the memory based on the voiceprint feature data synchronization instruction.

The voiceprint feature data stored in the Redis dynamic database module are voiceprint feature data of a registrant acquired in a registration stage.

In the embodiment of the present application, the registrant may be a user who first sends sound data or audio data to the service server. The voiceprint feature data sent by the service server to the Redis server is voiceprint feature data of a registrant acquired in a registration stage. Specifically, the registrant can input audio data through an API interface provided by the audio acquisition device, and after the service server receives the audio data input by the registrant, the service server sends the audio data to the voiceprint feature extraction server so that the voiceprint feature extraction server extracts voiceprint feature data of the audio data input by the registrant.

After the voiceprint feature extraction server extracts voiceprint feature data of the audio data input by the registrant, the voiceprint feature data are sent to the service server, and then the service server sends the voiceprint feature data of the registrant to the Redis server for storage.

After receiving the voiceprint feature data of the registrant, the Redis server synchronizes the voiceprint feature data of the registrant into the Redis dynamic database module based on the voiceprint feature data synchronization instruction, and then the Redis dynamic database module synchronizes the registered voiceprint feature data into the memory.

Optionally, the step of synchronously storing the voiceprint feature data stored in the Redis dynamic database module into the memory includes: when the voiceprint feature data synchronization instruction is received, synchronously storing the voiceprint feature data stored in the Redis dynamic database module into the memory through a callback function.

In the embodiment of the application, voiceprint feature data of a registrant is stored in the Redis dynamic database module in the form of a hash table. The callback function may be used to synchronously store the voiceprint feature data of the registrant into the memory. Specifically, different callback functions can be inserted at different positions of the hash table, so that different callback functions perform different functions when the voiceprint retrieval server is in different running states. The callback functions may include a database state callback function, a database empty callback function, a key value insertion callback function, a key value deletion callback function, a key value update callback function and the like. As shown in FIG. 4, the Redis dynamic database module may access the memory through the database state callback function, the database empty callback function, the key value insertion callback function, the key value deletion callback function, the key value update callback function and other callback functions.

In one implementation, the position at which each callback function is inserted into the hash table may be determined by the role of that callback function. Specifically, the database state callback function is used to start the memory; the database empty callback function is used to clear the voiceprint feature data stored in the memory; and the key value insertion, key value deletion and key value update callback functions are used to handle data modifications when voiceprint feature data is synchronized into the memory. Because the database state callback function starts the memory, it may be inserted at the starting position of the hash table, so that when a voiceprint feature data synchronization instruction is received, the memory is started first and the voiceprint feature data in the Redis dynamic database module can then be synchronized into the memory. Because the database empty, key value insertion, key value deletion and key value update callback functions handle the adding, deleting, changing and clearing of voiceprint feature data during synchronization, they may be inserted at a middle position of the hash table.
Further, when inserted at the middle position of the hash table, these four callback functions may be inserted at the same position or at different positions. When they are inserted at different positions, the positions may follow the order in which the adding, deleting, changing and clearing operations on the voiceprint feature data occur in the actual application.
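The callback-based synchronization described above can be pictured with the following sketch, in which a hash-table store notifies an in-memory mirror on every change. It interprets "inserting callback functions into the hash table" as registering hooks on the table's operations, which is an assumption; the classes `DynamicDatabase` and `MemoryMirror` and the event names are hypothetical and do not correspond to any Redis API.

```python
from typing import Callable, Dict, List


class DynamicDatabase:
    """Hash-table store that fires callbacks on state, insert, update, delete, empty."""

    def __init__(self) -> None:
        self._table: Dict[str, List[float]] = {}
        self._callbacks: Dict[str, Callable[..., None]] = {}

    def register(self, event: str, callback: Callable[..., None]) -> None:
        self._callbacks[event] = callback

    def _fire(self, event: str, *args) -> None:
        callback = self._callbacks.get(event)
        if callback is not None:
            callback(*args)

    def start(self) -> None:
        # Database state callback: lets the memory side start up first.
        self._fire("state", "started")

    def set(self, key: str, value: List[float]) -> None:
        event = "update" if key in self._table else "insert"
        self._table[key] = value
        self._fire(event, key, value)

    def delete(self, key: str) -> None:
        if self._table.pop(key, None) is not None:
            self._fire("delete", key)

    def clear(self) -> None:
        self._table.clear()
        self._fire("empty")


class MemoryMirror:
    """In-memory copy of the voiceprint features, kept in sync by callbacks."""

    def __init__(self) -> None:
        self.ready = False
        self.features: Dict[str, List[float]] = {}

    def attach(self, db: DynamicDatabase) -> None:
        db.register("state", self.on_state)
        db.register("insert", self.on_upsert)
        db.register("update", self.on_upsert)
        db.register("delete", self.on_delete)
        db.register("empty", self.on_empty)

    def on_state(self, state: str) -> None:
        self.ready = state == "started"

    def on_upsert(self, key: str, value: List[float]) -> None:
        self.features[key] = list(value)

    def on_delete(self, key: str) -> None:
        self.features.pop(key, None)

    def on_empty(self) -> None:
        self.features.clear()
```

With this arrangement the retrieval code only ever reads `MemoryMirror.features`, while every write to the dynamic database is propagated by the corresponding callback.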

Step S230: receiving a voiceprint retrieval instruction sent by the service server.

In the voiceprint retrieval stage, the voiceprint retrieval module of the voiceprint retrieval server may start voiceprint retrieval in response to the voiceprint retrieval instruction sent by the service server.

Step S240: acquiring the voiceprint feature data to be retrieved in response to the voiceprint retrieval instruction.

As one way, after the voiceprint retrieval module of the voiceprint retrieval server receives the voiceprint retrieval instruction sent by the service server, it begins to acquire the voiceprint feature data to be retrieved. Optionally, based on the voiceprint retrieval instruction, the voiceprint retrieval module may acquire the voiceprint feature data to be retrieved either from the Redis server or from the service server.

Step S250: performing feature comparison on the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance.

According to the embodiment of the invention, the voiceprint characteristic data corresponding to the audio data of each user can be respectively used as the reference voiceprint characteristic data, and each reference voiceprint characteristic data is stored.

As one way, the memory may store in advance reference voiceprint feature data corresponding to the audio data of a plurality of users. When the voiceprint feature data to be retrieved is acquired, it is compared with this pre-stored reference voiceprint feature data. For example, N pieces of reference voiceprint feature data (N being a positive integer) are stored in the memory in advance. In the feature comparison process, the voiceprint feature data to be retrieved may be compared with the N pieces of reference voiceprint feature data in sequence; as soon as it is found to be consistent with one piece of reference voiceprint feature data, the comparison result is determined to be consistent and no comparison is made with the remaining reference voiceprint feature data. If the voiceprint feature data to be retrieved is not consistent with any piece of reference voiceprint feature data, the comparison result is determined to be inconsistent. Alternatively, the voiceprint feature data to be retrieved may be compared with each of the N pieces of reference voiceprint feature data to obtain N comparison results, each representing the similarity between the voiceprint feature data to be retrieved and the corresponding reference voiceprint feature data. The comparison result with the maximum similarity is then selected. When the maximum similarity exceeds a preset similarity threshold, the voiceprint feature data to be retrieved is determined to be consistent with the corresponding reference voiceprint feature data; when the maximum similarity does not exceed the preset similarity threshold, the voiceprint feature data to be retrieved is determined to be inconsistent with every piece of reference voiceprint feature data.
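The two comparison strategies described above can be sketched in Python as follows. The text does not specify the similarity measure, so cosine similarity over feature vectors is an assumption made here for illustration, and the function names `cosine_similarity`, `first_match` and `best_match` are hypothetical.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Assumed similarity measure; the source does not name one."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def first_match(query: Sequence[float],
                references: Dict[str, Sequence[float]],
                threshold: float) -> Optional[str]:
    """Sequential comparison that stops at the first consistent reference."""
    for speaker_id, reference in references.items():
        if cosine_similarity(query, reference) > threshold:
            return speaker_id  # later references are not compared
    return None


def best_match(query: Sequence[float],
               references: Dict[str, Sequence[float]],
               threshold: float) -> Tuple[Optional[str], float]:
    """Compare against all N references and keep the maximum similarity."""
    best_id: Optional[str] = None
    best_score = float("-inf")
    for speaker_id, reference in references.items():
        score = cosine_similarity(query, reference)
        if score > best_score:
            best_id, best_score = speaker_id, score
    if best_score > threshold:
        return best_id, best_score
    return None, best_score  # no reference exceeds the threshold
```

The early-exit variant does less work per query, while the maximum-similarity variant always traverses all N references and returns the closest one.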

Step S260: acquiring a feature comparison result corresponding to the voiceprint feature data to be retrieved, and sending the feature comparison result to the service server for display.

In the embodiment of the application, the feature comparison result is sent to the client running in the service server for display.

In the voiceprint retrieval method provided by this embodiment, a voiceprint feature data synchronization instruction is first received, and the voiceprint feature data stored in the Redis dynamic database module is synchronously stored into the memory based on that instruction. A voiceprint retrieval instruction sent by the service server is then received, and the voiceprint feature data to be retrieved is acquired in response. Feature comparison is performed on the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance, and the corresponding feature comparison result is acquired and sent to the service server for display. Because the voiceprint feature data to be retrieved is compared directly against the voiceprint feature data held in the memory, and that data is traversed in memory rather than fetched over the network, the voiceprint feature retrieval speed can be improved.

Referring to fig. 5, a voiceprint retrieval method provided in an embodiment of the present application is applied to a service server, and the method includes:

Step S310: sending voiceprint feature data to be retrieved to a voiceprint retrieval server, so that a voiceprint retrieval module of the voiceprint retrieval server performs feature comparison on the voiceprint feature data to be retrieved and voiceprint feature data stored in a memory in advance.

In the embodiment of the application, in the voiceprint retrieval stage, the service server sends voiceprint feature data to be retrieved to the voiceprint retrieval server. The voiceprint feature data to be retrieved may be the voiceprint feature data of a sound to be verified. Optionally, the sound to be verified may be a voice input by the user, and specifically may be a voice file obtained by applying silence suppression to a recording of the user speaking, where silence suppression is the process of identifying and removing long silent segments from the voice signal stream of the voice input by the user.
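Silence suppression as described above can be sketched with a simple energy-based rule. This is only an illustration under stated assumptions: the source does not say how silent segments are detected, so the frame length, the mean-energy threshold and the notion of a "long" silence (more than `max_silent_frames` consecutive quiet frames) are all hypothetical parameters, as is the function name `suppress_silence`.

```python
from typing import List, Sequence


def suppress_silence(samples: Sequence[float],
                     frame_len: int = 160,
                     energy_threshold: float = 1e-4,
                     max_silent_frames: int = 5) -> List[float]:
    """Drop runs of quiet frames longer than max_silent_frames; keep the rest."""
    kept: List[float] = []
    silent_run: List[Sequence[float]] = []

    def flush_silent_run() -> None:
        # Short pauses are kept; long silences are discarded.
        if len(silent_run) <= max_silent_frames:
            for frame in silent_run:
                kept.extend(frame)
        silent_run.clear()

    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame) / len(frame)  # mean energy
        if energy < energy_threshold:
            silent_run.append(frame)
        else:
            flush_silent_run()
            kept.extend(frame)
    flush_silent_run()
    return kept
```

Keeping short pauses preserves natural gaps between words, while removing long silent stretches reduces the amount of audio passed on for voiceprint feature extraction.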

Step S320: receiving a feature comparison result corresponding to the voiceprint feature data to be retrieved, and displaying the feature comparison result.

Further, as shown in fig. 6, before step S310 the method further includes:

Step S301: obtaining voiceprint feature data to be stored.

As one way, the step of acquiring the voiceprint feature data to be stored includes: receiving a sound to be registered; transmitting the voice to be registered to a voiceprint feature extraction server so that the voiceprint feature extraction server extracts voiceprint feature data of the voice to be registered; and receiving voiceprint feature data of the voice to be registered, which is sent by the voiceprint feature extraction server, and taking the voiceprint feature data as the voiceprint feature data to be stored.

The sound to be registered is similar to the sound to be verified: it may be a voice input by the user, specifically a voice file obtained by applying silence suppression to a recording of the user speaking. To ensure the accuracy of the sound to be registered, a plurality of sounds to be registered may be collected.

Step S302: synchronously storing the voiceprint feature data to be stored into the memory in the voiceprint retrieval server.

Specifically, as shown in fig. 7, in the voiceprint feature data synchronization process, when the service server receives a voiceprint registration instruction, it acquires the sound to be registered and then sends a voiceprint feature extraction instruction to the voiceprint feature extraction server. The voiceprint feature extraction server extracts the voiceprint feature data of the sound to be registered based on the voiceprint feature extraction instruction and returns it to the service server.

When the service server receives the voiceprint feature data of the sound to be registered, it caches that data in the Redis server. After the Redis server has cached the data, it returns the storage result to the service server, and the service server may then display the storage result through the client.

Meanwhile, the Redis server sends a voiceprint feature data synchronization instruction to the voiceprint retrieval server, and when the voiceprint retrieval server receives the voiceprint feature data synchronization instruction, the voiceprint feature data of the voice to be registered is stored in the Redis dynamic database module. When the Redis server and the Redis dynamic database module synchronize the voiceprint feature data of the sound to be registered, the voiceprint feature data of the sound to be registered can be synchronized into the memory in real time through a callback function.
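The registration and synchronization path described above can be traced with in-process stand-ins. The following Python sketch is illustrative only: `RedisServerStub`, `VoiceprintRetrievalServer`, `register_voice` and `toy_extract` are hypothetical names, no real Redis client or module API is used, and the toy extractor merely stands in for the voiceprint feature extraction server.

```python
from typing import Callable, Dict, List, Sequence

Feature = List[float]


class VoiceprintRetrievalServer:
    """Holds a stand-in for the Redis dynamic database module plus the memory."""

    def __init__(self) -> None:
        self.dynamic_db: Dict[str, Feature] = {}
        self.memory: Dict[str, Feature] = {}

    def on_sync_instruction(self, key: str, feature: Sequence[float]) -> None:
        # Store into the dynamic database module, then fire the callback.
        self.dynamic_db[key] = list(feature)
        self._key_insert_callback(key)

    def _key_insert_callback(self, key: str) -> None:
        # Mirrors the new entry into memory as soon as it is synchronized.
        self.memory[key] = list(self.dynamic_db[key])


class RedisServerStub:
    """Stand-in for the Redis server that caches features and triggers sync."""

    def __init__(self, retrieval_server: VoiceprintRetrievalServer) -> None:
        self.cache: Dict[str, Feature] = {}
        self.retrieval_server = retrieval_server

    def store(self, key: str, feature: Sequence[float]) -> str:
        self.cache[key] = list(feature)
        self.retrieval_server.on_sync_instruction(key, feature)
        return "stored"  # storage result returned to the service server


def toy_extract(audio: Sequence[float]) -> Feature:
    """Placeholder for the feature extraction server (not a real voiceprint)."""
    return [sum(audio) / len(audio), max(audio)]


def register_voice(user_id: str, audio: Sequence[float],
                   extract: Callable[[Sequence[float]], Feature],
                   redis_server: RedisServerStub) -> str:
    """Service-server side: extract features, cache them, return the result."""
    feature = extract(audio)
    return redis_server.store(user_id, feature)
```

After `register_voice` returns, the same feature vector is present in the Redis stand-in, the dynamic database stand-in and the memory, which is the state the retrieval stage relies on.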

Illustratively, the application procedure of the method of the above embodiment may be as follows:

When user A enters information into a banking system through the client running in the service server in order to register an identity, user A may register by voice. Specifically, when the service server receives the voice input by user A, it sends that voice to the voiceprint feature extraction server for voiceprint feature extraction. Once the voiceprint feature extraction server has extracted the voiceprint feature data of user A's voice, it sends the data to the service server, and the service server may then send the data to the Redis server for storage.

After receiving the voiceprint feature data of the voice of the user A, the Redis server sends a voiceprint feature data synchronization instruction to the voiceprint retrieval server so as to store the voiceprint feature data of the voice of the user A to the Redis dynamic database module.

When the Redis server and the Redis dynamic database module synchronize the voiceprint feature data of the voice of the user A, the voiceprint feature data of the voice of the user A can be synchronized into the memory in real time through a callback function.

In the voiceprint retrieval method provided by this embodiment, voiceprint feature data to be retrieved is sent to the voiceprint retrieval server, so that the voiceprint retrieval module of the voiceprint retrieval server performs feature comparison on the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance; the corresponding feature comparison result is then received and displayed. Because the voiceprint feature data to be retrieved is compared directly against the voiceprint feature data held in the memory, and that data is traversed in memory rather than fetched over the network, the voiceprint feature retrieval speed can be improved.

Referring to fig. 8, a voiceprint retrieval method provided in an embodiment of the present application is applied to a voiceprint retrieval system, where the system includes a voiceprint retrieval server and a service server, and the method includes:

Step S410: the service server sends voiceprint feature data to be retrieved to the voiceprint retrieval server.

Step S420: the voiceprint retrieval server acquires the voiceprint feature data to be retrieved.

Step S430: the voiceprint retrieval server performs feature comparison on the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance.

Step S440: the service server receives the feature comparison result corresponding to the voiceprint feature data to be retrieved, which is sent by the voiceprint retrieval server, and displays the feature comparison result.
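The exchange between the two servers in steps S410 to S440 can be sketched as follows. This is a simplified in-process illustration: the class names `ServiceServer` and `VoiceprintRetrievalServer`, the dot-product score (which assumes unit-length feature vectors) and the display strings are assumptions, and a direct method call stands in for the network request.

```python
from typing import Dict, Optional, Sequence


class VoiceprintRetrievalServer:
    def __init__(self, memory: Dict[str, Sequence[float]],
                 threshold: float) -> None:
        self.memory = memory        # reference features stored in advance
        self.threshold = threshold  # assumed preset similarity threshold

    def retrieve(self, query: Sequence[float]) -> Optional[str]:
        # S420: the query has been acquired; S430: traverse memory and compare.
        best_id: Optional[str] = None
        best_score = float("-inf")
        for speaker_id, reference in self.memory.items():
            # Dot product as a stand-in score, assuming unit-length vectors.
            score = sum(q * r for q, r in zip(query, reference))
            if score > best_score:
                best_id, best_score = speaker_id, score
        return best_id if best_score > self.threshold else None


class ServiceServer:
    def __init__(self, retrieval_server: VoiceprintRetrievalServer) -> None:
        self.retrieval_server = retrieval_server

    def search(self, query: Sequence[float]) -> str:
        # S410: send the query (a direct call stands in for the network hop).
        result = self.retrieval_server.retrieve(query)
        # S440: receive the comparison result and display it.
        return self.display(result)

    @staticmethod
    def display(result: Optional[str]) -> str:
        if result is None:
            return "no matching speaker"
        return f"matched speaker: {result}"
```

For example, with references for `client_a` and `client_b` held in memory, `ServiceServer(...).search(query)` returns a display string that names the matched speaker or reports that none matched.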

For example, the application process of the method of the embodiment of the present application may be as shown in fig. 9:

A mobile terminal user conducts a business negotiation with client A by telephone. When the service server receives the voice of client A, it sends that voice to the voiceprint feature extraction server for voiceprint feature extraction. After the service server receives the voiceprint feature data of client A's voice, it sends a voiceprint retrieval instruction to the voiceprint retrieval server, and the voiceprint retrieval module of the voiceprint retrieval server uses the voiceprint comparison algorithm in the core algorithm module to compare the voiceprint feature data of client A's voice with the voiceprint feature data stored in the memory in advance. If the voiceprint feature data of client A's voice does not match any voiceprint feature data stored in the memory, this indicates that the mobile terminal user is negotiating with client A for the first time. In that case, the voiceprint feature data of client A may be stored in the memory and also sent to the Redis server for storage, so that it is not lost when the mobile terminal is replaced. Since a voiceprint is unique, the mobile terminal can use it to identify client A when a business negotiation with client A takes place again. When the mobile terminal user negotiates with client A again, the voiceprint feature data of client A is extracted as soon as client A's voice is received.
Because the voiceprint feature data of client A has been stored in the memory in advance, the voiceprint is matched successfully, and the mobile terminal automatically starts recording the audio data of the business negotiation so that it can serve as evidence when needed, without the mobile terminal user having to trigger the recording manually. Of course, if the party communicating with the mobile terminal user is client B, whose voiceprint is not stored, the voiceprint matching fails and the mobile terminal does not automatically record audio data.

In the voiceprint retrieval method provided by this embodiment, the service server sends voiceprint feature data to be retrieved to the voiceprint retrieval server; the voiceprint retrieval server acquires the voiceprint feature data to be retrieved and performs feature comparison on it and the voiceprint feature data stored in the memory in advance; and the service server receives the corresponding feature comparison result sent by the voiceprint retrieval server and displays it. Because the voiceprint feature data to be retrieved is compared directly against the voiceprint feature data held in the memory, and that data is traversed in memory rather than fetched over the network, the voiceprint feature retrieval speed can be improved.

Referring to fig. 10, a voiceprint retrieval apparatus 500 provided in an embodiment of the present application runs in a voiceprint retrieval module of a voiceprint retrieval server, and the apparatus 500 includes:

the data obtaining unit 510 is configured to obtain voiceprint feature data to be retrieved.

Optionally, the data obtaining unit 510 is configured to receive a voiceprint retrieval instruction sent by the service server; and responding to the voiceprint retrieval instruction, and acquiring the voiceprint feature data to be retrieved.

The feature comparison unit 520 is configured to perform feature comparison on the voiceprint feature data to be retrieved and voiceprint feature data stored in the memory in advance.

The result obtaining unit 530 is configured to obtain a feature comparison result corresponding to the voiceprint feature data to be retrieved, and send the feature comparison result.

Optionally, the result obtaining unit 530 is configured to obtain a feature comparison result corresponding to the voiceprint feature data to be retrieved, and send the feature comparison result to the service server for display.

Referring to fig. 11, the apparatus 500 further includes:

a data synchronization unit 540, configured to receive a voiceprint feature data synchronization instruction; and synchronously storing the voiceprint feature data stored in the Redis dynamic database module to the memory based on the voiceprint feature data synchronization instruction.

Optionally, the data synchronization unit 540 is configured to store, when the voiceprint feature data synchronization instruction is received, voiceprint feature data stored in the Redis dynamic database module to the memory through a callback function.

Referring to fig. 12, a voiceprint retrieval apparatus 600 provided in an embodiment of the present application is operated in a service server, where the apparatus 600 includes:

The data sending unit 610 is configured to send voiceprint feature data to be searched to a voiceprint search server, so that a voiceprint search module of the voiceprint search server performs feature comparison on the voiceprint feature data to be searched and voiceprint feature data stored in a memory in advance.

And the display unit 620 is configured to receive a feature comparison result corresponding to the voiceprint feature data to be retrieved, and display the feature comparison result.

Referring to fig. 13, the apparatus 600 further includes:

a data storage unit 630, configured to obtain voiceprint feature data to be stored; and synchronously storing the voiceprint characteristic data to be stored into the memory in the voiceprint retrieval server.

Optionally, the data storage unit 630 is configured to receive a sound to be registered; transmitting the voice to be registered to a voiceprint feature extraction server so that the voiceprint feature extraction server extracts voiceprint feature data of the voice to be registered; and receiving voiceprint feature data of the voice to be registered, which is sent by the voiceprint feature extraction server, and taking the voiceprint feature data as the voiceprint feature data to be stored.

Referring to fig. 14, a voiceprint retrieval system 700 provided in an embodiment of the present application, the system 700 includes a voiceprint retrieval server 710 and a service server 720;

The service server 720 is configured to send voiceprint feature data to be retrieved to the voiceprint retrieval server 710.

The voiceprint retrieval server 710 is configured to obtain the voiceprint feature data to be retrieved.

The voiceprint retrieval server 710 is configured to perform feature comparison on the voiceprint feature data to be retrieved and voiceprint feature data stored in the memory in advance.

The service server 720 is configured to receive a feature comparison result corresponding to the voiceprint feature data to be retrieved, which is sent by the voiceprint retrieval server 710, and display the feature comparison result.

It should be noted that, in the present application, the device embodiment and the foregoing method embodiment correspond to each other, and specific principles in the device embodiment may refer to the content in the foregoing method embodiment, which is not described herein again.

A server provided in the present application will be described with reference to fig. 15.

Referring to fig. 15, based on the above-mentioned voiceprint retrieval method and apparatus, another server 800 capable of executing the above-mentioned voiceprint retrieval method is provided in the embodiments of the present application. The server 800 includes one or more (only one shown) processors 802, memory 804, and a network module 806 coupled to each other. The memory 804 stores therein a program capable of executing the contents of the foregoing embodiments, and the processor 802 can execute the program stored in the memory 804.

The processor 802 may include one or more processing cores. The processor 802 uses various interfaces and lines to connect the various parts of the server 800, and performs the various functions of the server 800 and processes data by running or executing instructions, programs, code sets or instruction sets stored in the memory 804 and by invoking data stored in the memory 804. Optionally, the processor 802 may be implemented in hardware in at least one of the following forms: digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 802 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, user interface, application programs and the like; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communications. It will be appreciated that the modem may also not be integrated into the processor 802 and may instead be implemented separately by a communication chip.

The memory 804 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). The memory 804 may be used to store instructions, programs, code, code sets or instruction sets. The memory 804 may include a program storage area and a data storage area. The program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (e.g., a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the foregoing method embodiments, and the like. The data storage area may store data created by the server 800 in use (e.g., phonebook, audio-video data, chat log data), and the like.

The network module 806 is configured to receive and transmit electromagnetic waves, and to implement mutual conversion between electromagnetic waves and electrical signals, so as to communicate with a communication network or other devices, such as an audio playback device. The network module 806 may include various existing circuit elements for performing these functions, such as an antenna, a radio frequency transceiver, a digital signal processor, an encryption/decryption chip, a Subscriber Identity Module (SIM) card, memory, and the like. The network module 806 may communicate with various networks such as the internet, intranets, wireless networks, or with other devices via wireless networks. The wireless network may include a cellular telephone network, a wireless local area network, or a metropolitan area network. For example, the network module 806 may interact with base stations.

Referring to fig. 16, a block diagram of a computer readable storage medium according to an embodiment of the present application is shown. The computer readable storage medium 900 has stored therein program code which can be invoked by a processor to perform the methods described in the foregoing method embodiments.

The computer readable storage medium 900 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable programmable read only memory), an EPROM, a hard disk, or a ROM. Optionally, the computer readable storage medium 900 includes a non-transitory computer-readable storage medium. The computer readable storage medium 900 has storage space for program code 910 that performs any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code 910 may, for example, be compressed in a suitable form.

In the voiceprint retrieval method, device, system, server and storage medium provided by the present application, voiceprint feature data to be retrieved is first acquired; feature comparison is then performed on the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance; and finally the corresponding feature comparison result is acquired and sent. Because the voiceprint feature data to be retrieved is compared directly against the voiceprint feature data held in the memory, and that data is traversed in memory rather than fetched over the network, the voiceprint feature retrieval speed can be improved.

The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the specific embodiments described above, which are merely illustrative and not restrictive. Those of ordinary skill in the art may, in light of the present invention, make many variations without departing from the spirit of the present invention and the scope protected by the claims, and all such variations fall within the protection of the present invention.

Claims

1. A voiceprint retrieval method, characterized by being applied to a voiceprint retrieval module of a voiceprint retrieval server, the voiceprint retrieval module being integrated with a Redis dynamic database module, the method comprising:

the Redis server sends a voiceprint feature data synchronization instruction to the voiceprint retrieval server, and when the voiceprint retrieval server receives the voiceprint feature data synchronization instruction, voiceprint feature data of a sound to be registered are synchronized to the Redis dynamic database module; when the Redis server and the Redis dynamic database module synchronize the voiceprint feature data of the sound to be registered, the voiceprint feature data stored in the Redis dynamic database module are synchronously stored into a memory through a callback function, wherein the callback function comprises a database state callback function and is used for starting the memory;

Acquiring voiceprint feature data to be retrieved;

performing feature comparison on the voiceprint feature data to be retrieved and voiceprint feature data stored in a memory in advance;

and acquiring a feature comparison result corresponding to the voiceprint feature data to be searched, and sending the feature comparison result.

2. The method of claim 1, wherein the voiceprint feature data stored in the Redis dynamic database module is voiceprint feature data of a registrant collected during a registration phase.

3. The method of claim 1, wherein the obtaining voiceprint feature data to be retrieved comprises:

receiving a voiceprint retrieval instruction sent by a service server;

responding to the voiceprint retrieval instruction, and acquiring voiceprint feature data to be retrieved;

the step of obtaining the feature comparison result corresponding to the voiceprint feature data to be retrieved, and sending the feature comparison result comprises the following steps:

and acquiring a feature comparison result corresponding to the voiceprint feature data to be searched, and sending the feature comparison result to the service server for display.

4. A voiceprint retrieval method, for use with a service server, the method comprising:

sending voiceprint feature data to be retrieved to a voiceprint retrieval server, so that a voiceprint retrieval module of the voiceprint retrieval server performs feature comparison on the voiceprint feature data to be retrieved and voiceprint feature data stored in a memory in advance, wherein the voiceprint retrieval module is integrated with a Redis dynamic database module; the voiceprint feature data stored in the memory in advance is obtained as follows: when the voiceprint retrieval module receives a voiceprint feature data synchronization instruction sent by a Redis server, voiceprint feature data of a sound to be registered is synchronized to the Redis dynamic database module, and when the Redis server and the Redis dynamic database module synchronize the voiceprint feature data of the sound to be registered, the voiceprint feature data stored in the Redis dynamic database module is synchronously stored to the memory through a callback function, wherein the callback function comprises a database state callback function used for starting the memory;

and receiving a characteristic comparison result corresponding to the voiceprint characteristic data to be searched, and displaying the characteristic comparison result.

5. The method of claim 4, wherein, before sending the voiceprint feature data to be retrieved to the voiceprint retrieval server, the method further comprises:

Acquiring voiceprint feature data to be stored;

and synchronously storing the voiceprint characteristic data to be stored into the memory in the voiceprint retrieval server.

6. The method of claim 5, wherein the obtaining voiceprint feature data to be stored comprises:

receiving a sound to be registered;

transmitting the voice to be registered to a voiceprint feature extraction server so that the voiceprint feature extraction server extracts voiceprint feature data of the voice to be registered;

and receiving voiceprint feature data of the voice to be registered, which is sent by the voiceprint feature extraction server, and taking the voiceprint feature data as the voiceprint feature data to be stored.

7. A voiceprint retrieval method, characterized by being applied to a voiceprint retrieval system, the system comprising a voiceprint retrieval server and a service server, the method comprising:

the voiceprint retrieval server receives a voiceprint feature data synchronization instruction, wherein the voiceprint feature data synchronization instruction is an instruction sent by a Redis server; synchronizing voiceprint feature data of the sound to be registered to the Redis dynamic database module when the voiceprint feature data synchronization instruction is received; when the Redis server and the Redis dynamic database module synchronize the voiceprint feature data of the sound to be registered, the voiceprint feature data stored in the Redis dynamic database module are synchronously stored into a memory through a callback function, wherein the callback function comprises a database state callback function and is used for starting the memory;

The service server sends voiceprint feature data to be searched to the voiceprint search server;

the voiceprint retrieval server acquires voiceprint feature data to be retrieved;

the voiceprint retrieval server compares the characteristic of the voiceprint characteristic data to be retrieved with the characteristic data of the voiceprint stored in the memory in advance;

and the service server receives the feature comparison result corresponding to the voiceprint feature data to be searched, which is sent by the voiceprint search server, and displays the feature comparison result.

8. A voiceprint retrieval apparatus, characterized by running in a voiceprint retrieval module of a voiceprint retrieval server, the voiceprint retrieval module being integrated with a Redis dynamic database module, the apparatus comprising:

the data synchronization unit is used for receiving a voiceprint feature data synchronization instruction, wherein the voiceprint feature data synchronization instruction is an instruction sent by a Redis server; synchronizing voiceprint feature data of the sound to be registered to the Redis dynamic database module when the voiceprint retrieval server receives the voiceprint feature data synchronization instruction; when the Redis server and the Redis dynamic database module synchronize the voiceprint feature data of the sound to be registered, the voiceprint feature data stored in the Redis dynamic database module are synchronously stored into a memory through a callback function, wherein the callback function comprises a database state callback function and is used for starting the memory;

the data acquisition unit is used for acquiring voiceprint feature data to be retrieved;

the feature comparison unit is used for performing feature comparison between the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance;

and the result acquisition unit is used for acquiring the feature comparison result corresponding to the voiceprint feature data to be retrieved and sending the feature comparison result.
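As a rough sketch of the data synchronization unit described above, the callback-driven copy from the database module into memory could look like the following. `DynamicDatabaseModule` and `MemoryStore` are hypothetical in-process stand-ins introduced for illustration; the sketch does not call any real Redis API, and the callback signature is an assumption.

```python
from typing import Callable, Dict, List

# Assumed callback shape: (speaker_id, feature_vector) -> None
StateCallback = Callable[[str, List[float]], None]


class DynamicDatabaseModule:
    """Hypothetical stand-in for the Redis dynamic database module."""

    def __init__(self) -> None:
        self._records: Dict[str, List[float]] = {}
        self._callbacks: List[StateCallback] = []

    def register_state_callback(self, callback: StateCallback) -> None:
        """Register a database state callback to run on each sync."""
        self._callbacks.append(callback)

    def synchronize(self, speaker_id: str, feature: List[float]) -> None:
        """Store a synchronized record, then notify every callback."""
        self._records[speaker_id] = list(feature)
        for callback in self._callbacks:
            callback(speaker_id, feature)


class MemoryStore:
    """In-memory copy of the features, filled only via the callback."""

    def __init__(self) -> None:
        self.features: Dict[str, List[float]] = {}

    def on_database_state(self, speaker_id: str, feature: List[float]) -> None:
        # Copy the vector so later changes in the database module
        # do not silently alter the in-memory copy.
        self.features[speaker_id] = list(feature)
```

A possible usage: create both objects, call `db.register_state_callback(mem.on_database_state)`, and each later `db.synchronize(...)` call leaves the same record available in `mem.features` for comparison.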

9. A voiceprint retrieval apparatus operable on a service server, the apparatus comprising:

a data sending unit, used for sending voiceprint feature data to be retrieved to the voiceprint retrieval server, so that a voiceprint retrieval module of the voiceprint retrieval server performs feature comparison between the voiceprint feature data to be retrieved and voiceprint feature data stored in a memory in advance, wherein the voiceprint retrieval module is integrated with a Redis dynamic database module; the voiceprint feature data stored in the memory in advance is obtained as follows: when the voiceprint retrieval module receives a voiceprint feature data synchronization instruction sent by a Redis server, the voiceprint feature data of the sound to be registered is synchronized to the Redis dynamic database module; when the Redis server and the Redis dynamic database module synchronize the voiceprint feature data of the sound to be registered, the voiceprint feature data stored in the Redis dynamic database module are synchronously stored into the memory through a callback function, wherein the callback function comprises a database state callback function and is used for starting the memory;

and a display unit, used for receiving the feature comparison result corresponding to the voiceprint feature data to be retrieved and displaying the feature comparison result.

10. A voiceprint retrieval system, the system comprising a voiceprint retrieval server and a service server;

the service server is used for sending voiceprint feature data to be retrieved to the voiceprint retrieval server;

the voiceprint retrieval server is used for receiving a voiceprint feature data synchronization instruction, wherein the voiceprint feature data synchronization instruction is an instruction sent by the Redis server; synchronizing voiceprint feature data of the sound to be registered to the Redis dynamic database module when the voiceprint feature data synchronization instruction is received; when the Redis server and the Redis dynamic database module synchronize the voiceprint feature data of the sound to be registered, the voiceprint feature data stored in the Redis dynamic database module are synchronously stored into a memory through a callback function, wherein the callback function comprises a database state callback function and is used for starting the memory;

the voiceprint retrieval server is used for acquiring voiceprint feature data to be retrieved;

the voiceprint retrieval server is used for performing feature comparison between the voiceprint feature data to be retrieved and the voiceprint feature data stored in the memory in advance;

the service server is used for receiving the feature comparison result, corresponding to the voiceprint feature data to be retrieved, sent by the voiceprint retrieval server, and displaying the feature comparison result.
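The exchange between the two servers of the system above can be pictured with a minimal sketch. Everything here is an assumption for illustration: the message types `RetrievalRequest` and `RetrievalResult`, the class names, the use of squared Euclidean distance as a placeholder metric, and the direct in-process call standing in for whatever transport connects the service server to the voiceprint retrieval server.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class RetrievalRequest:
    request_id: str
    feature: List[float]  # voiceprint feature data to be retrieved


@dataclass
class RetrievalResult:
    request_id: str
    best_match: str
    distance: float


class RetrievalServer:
    """Holds features already synchronized into memory (hypothetical)."""

    def __init__(self, memory: Dict[str, List[float]]) -> None:
        self.memory = memory

    def handle(self, request: RetrievalRequest) -> RetrievalResult:
        if not self.memory:
            raise LookupError("no voiceprint features in memory")

        def sq_dist(stored: List[float]) -> float:
            # Placeholder metric; assumes equal-length vectors.
            return sum((q - s) ** 2 for q, s in zip(request.feature, stored))

        best = min(self.memory, key=lambda sid: sq_dist(self.memory[sid]))
        return RetrievalResult(request.request_id, best, sq_dist(self.memory[best]))


class ServiceServer:
    """Sends the query and formats the returned result for display."""

    def __init__(self, retrieval_server: RetrievalServer) -> None:
        self.retrieval_server = retrieval_server

    def search(self, request_id: str, feature: List[float]) -> str:
        result = self.retrieval_server.handle(RetrievalRequest(request_id, feature))
        return (
            f"{result.request_id}: best match {result.best_match} "
            f"(distance {result.distance:.3f})"
        )
```

Only the query feature and the comparison result cross the boundary between the two servers in this sketch; the stored features themselves never leave the retrieval server's memory.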

11. A server comprising one or more processors and a memory; one or more programs are stored in the memory and configured to be executed by the one or more processors to perform the method of any of claims 1-3, any of claims 4-6, and claim 7.

12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores program code, wherein the program code, when executed by a processor, performs the method of any of claims 1-3, any of claims 4-6, and claim 7.