CN109712635B - Sound data processing method, intelligent terminal and storage medium - Google Patents


Info

Publication number: CN109712635B
Authority: CN (China)
Application number: CN201811629739.3A
Other versions: CN109712635A (Chinese)
Inventors: 付华东, 王余生
Current assignee: Shenzhen Skyworth RGB Electronics Co Ltd
Original assignee: Shenzhen Skyworth RGB Electronics Co Ltd
Application filed by Shenzhen Skyworth RGB Electronics Co Ltd
Priority to CN201811629739.3A
Publication of CN109712635A
Application granted; publication of CN109712635B
Legal status: Active

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention discloses a sound data processing method, an intelligent terminal and a storage medium. The method comprises the following steps: receiving data input by a microphone during recording, monitoring and analyzing the data to form recording samples, and storing the qualified recording samples as voiceprint samples; collecting sound data produced by a user, processing and analyzing the sound data, comparing it with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result; and selecting a preset frequency band as an optimized frequency band, and optimizing the sound data according to the optimized frequency band and the corresponding voiceprint sample to obtain the user's corrected frequency response. The invention classifies and analyzes each user's voiceprint, establishes high-sound-quality samples for different age and gender groups, and performs sound-quality optimization according to the data of these samples, thereby improving each user's sound quality, improving the karaoke (K-song) effect, and meeting the sound-quality requirements of different users when singing karaoke.

Description

Sound data processing method, intelligent terminal and storage medium
Technical Field
The invention relates to the technical field of intelligent terminals, in particular to a sound data processing method, an intelligent terminal and a storage medium.
Background
Since liquid-crystal televisions entered the network era, many entertainment functions connected to the online world have been realized rapidly thanks to their strong network capabilities; especially after the Android smart television appeared, the smart television has become the entertainment center of the family. Although its content is more and more abundant, applications that are truly popular with users are still few. Singing, provided as a karaoke (K-song) function, is one of the main forms of family entertainment, and karaoke applications such as National K-song and Tianlai K-song are promoted as a main selling point of televisions.
Everyone perceives vocal music differently and has a different sense of rhythm. Moreover, the throat acts like a filter: not everyone is born with a golden voice, nearly everyone's vocal cords have defects, and people with professional vocal training sound better when they sing. Because every person's voice characteristics differ, a single set of karaoke parameters cannot fit them all. As a result, the karaoke experience that applications such as National K-song and Tianlai K-song deliver on smart televisions is unsatisfactory, the sound-quality requirements of different users when singing karaoke are not met, and existing intelligent terminals cannot optimize sound quality for the voices of different users.
Accordingly, the prior art is yet to be improved and developed.
Disclosure of Invention
The technical problem to be solved by the invention is to provide, in view of the defects of the prior art, a sound data processing method, an intelligent terminal and a storage medium. The method classifies and analyzes each user's voiceprint, establishes a high-sound-quality sample for each group of different ages and genders, and optimizes sound quality according to the data of the high-sound-quality sample, thereby improving each user's sound quality, improving the singing effect, meeting the sound-quality requirements of different users when singing karaoke, and at the same time improving the competitiveness of intelligent-terminal products in the market.
The technical scheme adopted by the invention for solving the technical problem is as follows:
a sound data processing method, wherein the sound data processing method comprises:
receiving data input by a microphone during recording, monitoring and analyzing the data to form a recording sample, and storing the qualified recording sample as a voiceprint sample;
acquiring voice data sent by a user, processing and analyzing the voice data, comparing the processed and analyzed voice data with the voiceprint sample, and acquiring a voiceprint sample corresponding to the voice of the user according to a comparison result;
and selecting a preset frequency band as an optimized frequency band, and optimizing the sound data according to the optimized frequency band and the corresponding voiceprint sample to obtain the corrected frequency response of the user.
In the sound data processing method, the step of receiving data input by a microphone during recording, monitoring and analyzing the data to form a recording sample, and storing the qualified recording sample as a voiceprint sample specifically comprises:
before collecting a recording sample, grouping collection users meeting vocal music requirements according to age and gender;
recording each group member through a microphone according to different grouping standards, wherein the recorded content is preset content, and the preset content comprises sound content of each frequency band of human voice;
and monitoring and analyzing the data input by the microphone to obtain a recording sample, and selecting the qualified recording sample as a voiceprint sample for storage.
In the sound data processing method, the step of collecting sound data produced by a user, comparing the sound data with the voiceprint samples after processing and analysis, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result specifically comprises:
prompting a user to collect a recording, and collecting sound data according to a sample recording condition during recording;
after the recording is finished, preprocessing the sound data through a band-pass filter, then performing FFT analysis, and comparing the analyzed sound data with all the voiceprint samples;
and selecting the voiceprint sample closest to the sound data from all the voiceprint samples according to a calculation result of a preset algorithm, and classifying the user into the currently selected voiceprint sample.
In the sound data processing method, after the step of preprocessing the sound data with a band-pass filter once recording is finished, performing FFT analysis, and comparing the analyzed sound data with all the voiceprint samples, the method further comprises:
obtaining an approximation value of the sound data and each voiceprint sample through the preset algorithm, and selecting a preset number of voiceprint samples from small to large according to the approximation value;
acquiring identity information corresponding to the preset number of voiceprint samples, and displaying the identity information and the voiceprint samples together to prompt a user to select and confirm;
and after detecting an instruction of selecting and confirming the user, classifying the user into the currently selected voiceprint sample according to the selection.
In the sound data processing method, after the step of obtaining the identity information corresponding to the preset number of voiceprint samples and displaying the identity information together with the voiceprint samples to prompt the user to select and confirm, the method further comprises:
and if the selection confirmation instruction of the user is not received within the preset time, selecting the voiceprint sample with the minimum approximation value by default, and classifying the user into the voiceprint sample with the minimum approximation value.
In the sound data processing method, the step of selecting a preset frequency band as an optimized frequency band and optimizing the sound data according to the optimized frequency band and the corresponding voiceprint sample to obtain the user's corrected frequency response specifically comprises:
selecting a preset frequency band as an optimized frequency band according to requirements, and processing within the range of the optimized frequency band according to a preset rule;
and obtaining the parameter value of each frequency spectrum group needing to be compensated after the preset rule processing, and substituting the frequency spectrum parameter value into the sound data of the user to obtain the corrected frequency response of the user.
The sound data processing method, wherein the sound data processing method further comprises:
when detecting that a new user uses the intelligent terminal for the first time, establishing a user account with the voice collected by the voiceprint recognition module as a distinguishing login name, and simultaneously recording the tone quality optimization parameters corrected for the new user's account;
when the voiceprint recognition module is started, a user account is matched through voiceprint data acquisition and data analysis, and then the user account is set according to the tone quality optimization parameters corrected by the user account.
The sound data processing method, wherein the sound data processing method further comprises:
when detecting that an existing user uses the intelligent terminal again, matching the user's voice with the voice of the user account;
and when the matching is successful, directly setting the tone quality optimization parameters corrected in the user account.
An intelligent terminal, wherein the intelligent terminal comprises: a memory, a processor and a sound data processing program stored on the memory and executable on the processor, the sound data processing program, when executed by the processor, implementing the steps of the sound data processing method as described above.
A storage medium, wherein the storage medium stores a sound data processing program that implements the steps of the sound data processing method as described above when executed by a processor.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of a method for processing sound data according to the present invention;
FIG. 2 is a flowchart of step S10 in the preferred embodiment of the sound data processing method of the present invention;
FIG. 3 is a flowchart of step S20 in the preferred embodiment of the sound data processing method of the present invention;
FIG. 4 is a flowchart of step S30 in the preferred embodiment of the sound data processing method of the present invention;
fig. 5 is a schematic operating environment diagram of an intelligent terminal according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
As shown in fig. 1, the method for processing sound data according to the preferred embodiment of the present invention includes the following steps:
and step S10, receiving data input by a microphone during recording, monitoring and analyzing the data to form a recording sample, and storing the qualified recording sample as a voiceprint sample.
Please refer to fig. 2, which is a flowchart of step S10 in the method for processing audio data according to the present invention.
As shown in fig. 2, the step S10 includes:
s11, grouping the collection users meeting the vocal music requirement according to age and gender before collecting the recording samples;
s12, recording each group member through a microphone according to different grouping standards, wherein the recorded content is preset content, and the preset content comprises sound content of each frequency band of human voice;
and S13, monitoring and analyzing the data input by the microphone to obtain a recording sample, and selecting a qualified recording sample as a voiceprint sample for storage.
For example, when the sound data processing method of the present invention is applied to a karaoke system, there is a requirement on the musical foundation of the collected samples: the collected users are grouped by age, then by gender within each age group, and all groups contain an equal number of members.
Specifically, the age range may be 1-80 years, divided into 5-year age bands (e.g., 1-5 years, 6-10 years, and so on) and grouped by gender within each band, giving 32 groups in total (16 male groups and 16 female groups across the age stages), with an equal number of members in each group, for example 12 members per group.
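As a rough sketch of this grouping scheme, the mapping from age and gender to one of the 32 groups can be written as follows (the function name and the 'M'/'F' encoding are illustrative assumptions, not from the patent):

```python
# Illustrative sketch of the 32-group scheme: ages 1-80 in sixteen 5-year
# bands, each band split by gender. Names and encoding are assumptions.
def group_index(age: int, gender: str) -> int:
    """Map an age (1-80) and gender ('M' or 'F') to a group index 0-31."""
    if not 1 <= age <= 80:
        raise ValueError("age must be between 1 and 80")
    band = (age - 1) // 5                  # 0..15, one band per 5 years
    return band * 2 + (0 if gender == "M" else 1)
```

For instance, a 7-year-old boy falls in the 6-10 band and maps to group index 2.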
Each group member is required to record according to a certain standard; the recorded content is a representative sentence, collected with a microphone. The intelligent terminal receives the data input by the microphone and monitors and analyzes it in real time, using average sound pressure as the measurement index (other indexes may be substituted). Because the age range of 1-80 years covers children, adults and the elderly, the target average sound pressure set for each age band differs, and a recording within a certain range of the target is accepted as a qualified sample. FFT (Fast Fourier Transform) analysis is then performed on the samples collected for each group; the FFT results of the same group are superposed and averaged, 500 Hz-3 kHz is taken as the main reference band, and this band is divided into 6 groups.
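A minimal sketch of how one group's voiceprint sample could be built along these lines: FFT each qualified recording, superpose and average the magnitude spectra, then reduce the 500 Hz-3 kHz reference band to 6 band averages. The function name, sample rate, and the equal-length-recordings assumption are illustrative, not from the patent:

```python
import numpy as np

def build_voiceprint_sample(recordings, sample_rate=16000, n_bands=6,
                            f_lo=500.0, f_hi=3000.0):
    """Average the FFT magnitude spectra of one group's qualified recordings
    and reduce the 500 Hz-3 kHz reference band to n_bands band averages.
    All recordings are assumed to be equal-length 1-D PCM arrays."""
    spectra = [np.abs(np.fft.rfft(x)) for x in recordings]
    avg = np.mean(spectra, axis=0)                 # superpose and average
    freqs = np.fft.rfftfreq(len(recordings[0]), d=1.0 / sample_rate)
    edges = np.linspace(f_lo, f_hi, n_bands + 1)   # 6 equal sub-bands
    return np.array([avg[(freqs >= edges[i]) & (freqs < edges[i + 1])].mean()
                     for i in range(n_bands)])
```

The 6 returned values are the stored voiceprint sample for that group.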
And step S20, collecting voice data sent by a user, comparing the voice data with the voiceprint sample after the voice data is processed and analyzed, and obtaining the voiceprint sample corresponding to the voice of the user according to a comparison result.
Please refer to fig. 3, which is a flowchart of step S20 in the method for processing audio data according to the present invention.
As shown in fig. 3, the step S20 includes:
s21, prompting a user to collect the recording, and collecting sound data according to the recording condition of the sample during recording;
s22, after the recording is finished, preprocessing the sound data through a band-pass filter, then performing FFT analysis, and comparing the analyzed sound data with all the voiceprint samples;
and S23, selecting the voiceprint sample closest to the sound data from all the voiceprint samples according to the calculation result of the preset algorithm, and classifying the user into the currently selected voiceprint sample.
Further, the step S22 is followed by: obtaining an approximation value of the sound data and each voiceprint sample through the preset algorithm, and selecting a preset number of voiceprint samples from small to large according to the approximation value; acquiring identity information corresponding to the preset number of voiceprint samples, and displaying the identity information and the voiceprint samples together to prompt a user to select and confirm; after detecting a selection confirmation instruction of the user, classifying the user into the currently selected voiceprint sample according to the selection; and if the selection confirmation instruction of the user is not received within the preset time, selecting the voiceprint sample with the minimum approximation value by default, and classifying the user into the voiceprint sample with the minimum approximation value.
Specifically, a representative sentence is obtained, and the user is prompted by the operation logic (for example, on a smart television) to record, with sound collected under the same conditions as the sample recordings. The smart television starts the voiceprint recognition module to collect the sounds made by the user; the module comprises a voiceprint data collection module, which collects the user's voice, and a voiceprint data analysis module, which compares and analyzes it.
Sound is collected under the sample-recording conditions. After recording is finished, the sound is preprocessed by a 500 Hz-3 kHz band-pass filter (a filter that passes a specific frequency band and attenuates others), then FFT analysis is performed, and the analyzed data is compared with the voiceprint samples: with 500 Hz-3 kHz as the main reference band divided into 6 groups, the 6 groups of collected data are subtracted from the 6 groups of data of each of the 32 samples in the library, yielding 32 sets of spectrum differences.
Because each frequency band of the human voice influences timbre differently, the 6 groups of data are weighted by coefficients: for example, the first group's coefficient is A, the second group's is B, ..., the sixth group's is N, with A + B + ... + N = 1. The absolute values of the 6 spectrum differences are taken, multiplied by their coefficients and summed, and the arithmetic mean over the 6 groups gives an approximation value x. By analogy, with 32 sample libraries there are 32 approximation values, and the minimum is found among them. The smaller the approximation value, the closer the voiceprint sample is to the user's voice; that is, the user's collected voiceprint data is closest to that voiceprint sample, and the user is classified into it.
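Under the weighting just described, the approximation value against each stored sample can be sketched as follows (function and variable names are illustrative; `coeffs` holds the six band coefficients A...N summing to 1):

```python
import numpy as np

def approximation_values(user_bands, sample_library, coeffs):
    """For each stored voiceprint sample, take the absolute 6-band spectrum
    differences, weight them with coefficients summing to 1, sum, and take
    the arithmetic mean over the 6 groups."""
    coeffs = np.asarray(coeffs, dtype=float)
    assert abs(coeffs.sum() - 1.0) < 1e-9, "coefficients A..N must sum to 1"
    user = np.asarray(user_bands, dtype=float)
    return [float(np.mean(np.abs(user - np.asarray(s, dtype=float)) * coeffs))
            for s in sample_library]
```

With the 32-sample library this yields 32 approximation values; the index of the smallest value identifies the closest voiceprint sample.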
Further, the 32 approximation values obtained by the above preset algorithm can be used to get closer to the user's identity information: the age and gender information of the voiceprint samples corresponding to the 5 smallest of the 32 approximation values is displayed to the user for selection and confirmation. If the user does not confirm a selection, the voiceprint sample with the smallest approximation value is used by default and the user is classified into it; if the user confirms one of the 5 voiceprint samples, the user is classified into the corresponding sample according to that choice.
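The top-5 presentation and the default fallback can be sketched as a simple ranking of the 32 approximation values (names are illustrative assumptions):

```python
# Illustrative sketch: rank voiceprint samples by approximation value.
# The first index in the result doubles as the default choice used when
# the user does not confirm a selection within the preset time.
def rank_candidates(approx_values, top_n=5):
    """Return the indices of the top_n closest samples, closest first."""
    order = sorted(range(len(approx_values)), key=lambda i: approx_values[i])
    return order[:top_n]
```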
And step S30, selecting a preset frequency band as an optimized frequency band, and performing optimization processing on the sound data according to the optimized frequency band and the corresponding voiceprint sample to obtain the corrected frequency response of the user.
Please refer to fig. 4, which is a flowchart of step S30 in the method for processing audio data according to the present invention.
As shown in fig. 4, the step S30 includes:
s31, selecting a preset frequency band as an optimized frequency band according to requirements, and processing within the optimized frequency band according to a preset rule;
and S32, obtaining each frequency spectrum group parameter value needing to be compensated after the preset rule processing, and substituting the frequency spectrum parameter value into the voice data of the user to obtain the corrected frequency response of the user.
Specifically, 200 Hz-8 kHz is selected as the optimized frequency band according to requirements. The voiceprint sample's FFT data in this band is subtracted from the user's FFT data to obtain the spectrum difference that needs reverse compensation; each group's spectrum difference is multiplied by -1 to obtain the compensation parameter value of each spectrum group, and these parameter values are applied to the user's sound data to obtain the user's corrected frequency response.
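The reverse-compensation step can be sketched per FFT bin as follows (names and the per-bin granularity are illustrative assumptions; the patent describes the same subtract-and-negate operation per spectrum group):

```python
import numpy as np

def correction_parameters(user_fft, sample_fft, freqs, f_lo=200.0, f_hi=8000.0):
    """Within the 200 Hz-8 kHz optimized band, subtract the voiceprint
    sample's FFT data from the user's to get the spectrum difference,
    then multiply by -1 to obtain the reverse-compensation values."""
    freqs = np.asarray(freqs, dtype=float)
    sel = (freqs >= f_lo) & (freqs <= f_hi)
    diff = (np.asarray(user_fft, dtype=float)[sel]
            - np.asarray(sample_fft, dtype=float)[sel])
    return -1.0 * diff
```

Where the user's spectrum exceeds the sample's, the compensation is negative (a cut); where it falls short, the compensation is positive (a boost).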
Further, when a new user is detected using the intelligent terminal for the first time, a user account is established with the voice collected by the voiceprint recognition module serving as a distinguishing login name, and the tone quality optimization parameters corrected for the new user's account are recorded at the same time. When the voiceprint recognition module is started, a user account is matched through voiceprint data collection and data analysis, and the terminal is then set according to the tone quality optimization parameters corrected for that account.
Further, when an existing user is detected using the intelligent terminal again, the user's voice is matched with the voice of the user account; when the matching succeeds, the tone quality optimization parameters corrected in that user account are applied directly.
The sound data processing method is used to improve sound quality. It is mainly applied to the karaoke application of home televisions, improving the user's karaoke effect and sound quality, and it can also be applied to other electronic devices to improve users' sound quality. According to each user's distinct voiceprint characteristics, the method classifies and analyzes each person's voiceprint, establishes voiceprint samples for different age and gender groups, and optimizes according to the voiceprint sample data, improving each person's sound quality. This can improve the user's karaoke experience at home as well as karaoke demonstrations in stores, and improves the market competitiveness of products such as televisions.
As shown in fig. 5, based on the above sound data processing method, the present invention correspondingly provides an intelligent terminal (e.g., a smart television), which includes a processor 10, a memory 20, and a display 30 (where a display screen is built in or externally connected). Fig. 5 shows only some of the components of the intelligent terminal; it should be understood that not all of the shown components are required, and more or fewer components may be implemented instead.
The memory 20 may in some embodiments be an internal storage unit of the intelligent terminal, such as a hard disk or internal memory of the terminal. In other embodiments the memory 20 may be an external storage device of the intelligent terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card equipped on the terminal. Further, the memory 20 may include both an internal storage unit and an external storage device. The memory 20 is used for storing application software installed on the intelligent terminal and various data, such as the program code installed on the terminal, and may also be used to temporarily store data that has been output or is to be output. In one embodiment, the memory 20 stores a sound data processing program 40, which can be executed by the processor 10 so as to implement the sound data processing method of the present application.
The processor 10 may in some embodiments be a central processing unit (CPU), microprocessor or other data processing chip, used to run the program code stored in the memory 20 or to process data, for example to execute the sound data processing method.
The display 30 may in some embodiments be an LED display, a liquid crystal display, a touch liquid crystal display, an OLED (Organic Light-Emitting Diode) touch panel, or the like. The display 30 is used to display information on the intelligent terminal and to present a visual user interface. The components 10-30 of the intelligent terminal communicate with each other via a system bus.
In one embodiment, when the processor 10 executes the sound data processing program 40 in the memory 20, the following steps are implemented:
receiving data input by a microphone during recording, monitoring and analyzing the data to form a recording sample, and storing the qualified recording sample as a voiceprint sample;
acquiring voice data sent by a user, processing and analyzing the voice data, comparing the processed and analyzed voice data with the voiceprint sample, and acquiring a voiceprint sample corresponding to the voice of the user according to a comparison result;
and selecting a preset frequency band as an optimized frequency band, and optimizing the sound data according to the optimized frequency band and the corresponding voiceprint sample to obtain the corrected frequency response of the user.
The step of receiving data input by a microphone during recording, monitoring and analyzing the data to form a recording sample, and storing the qualified recording sample as a voiceprint sample specifically includes:
before collecting a recording sample, grouping collection users meeting vocal music requirements according to age and gender;
recording each group member through a microphone according to different grouping standards, wherein the recorded content is preset content, and the preset content comprises sound content of each frequency band of human voice;
and monitoring and analyzing the data input by the microphone to obtain a recording sample, and selecting the qualified recording sample as a voiceprint sample for storage.
The step of collecting sound data produced by a user, comparing the sound data with the voiceprint samples after processing and analysis, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result specifically includes:
prompting a user to collect a recording, and collecting sound data according to a sample recording condition during recording;
after the recording is finished, preprocessing the sound data through a band-pass filter, then performing FFT analysis, and comparing the analyzed sound data with all the voiceprint samples;
and selecting the voiceprint sample closest to the sound data from all the voiceprint samples according to a calculation result of a preset algorithm, and classifying the user into the currently selected voiceprint sample.
After the step of preprocessing the sound data with a band-pass filter once the recording is finished, performing FFT analysis, and comparing the analyzed sound data with all the voiceprint samples, the method further comprises:
obtaining an approximation value of the sound data and each voiceprint sample through the preset algorithm, and selecting a preset number of voiceprint samples from small to large according to the approximation value;
acquiring identity information corresponding to the preset number of voiceprint samples, and displaying the identity information and the voiceprint samples together to prompt a user to select and confirm;
and after detecting an instruction of selecting and confirming the user, classifying the user into the currently selected voiceprint sample according to the selection.
After the step of obtaining the identity information corresponding to the preset number of voiceprint samples and displaying the identity information together with the voiceprint samples to prompt the user to select and confirm, the method further comprises:
and if the selection confirmation instruction of the user is not received within the preset time, selecting the voiceprint sample with the minimum approximation value by default, and classifying the user into the voiceprint sample with the minimum approximation value.
The step of selecting a preset frequency band as an optimized frequency band and optimizing the sound data according to the optimized frequency band and the corresponding voiceprint sample to obtain the user's corrected frequency response specifically comprises:
selecting a preset frequency band as an optimized frequency band according to requirements, and processing within the range of the optimized frequency band according to a preset rule;
and obtaining the parameter value of each frequency spectrum group needing to be compensated after the preset rule processing, and substituting the frequency spectrum parameter value into the sound data of the user to obtain the corrected frequency response of the user.
When a new user is detected using the intelligent terminal for the first time, a user account is created with the sound collected by the voiceprint recognition module serving as a distinguishing login credential, and the tone quality optimization parameters corrected under the new user account are recorded at the same time;
when the voiceprint recognition module is started, a user account is matched through voiceprint data acquisition and analysis, and settings are then applied according to the tone quality optimization parameters corrected under that user account.
When an existing user is detected using the intelligent terminal again, the user's voice is matched against the voice bound to the user account;
and when the matching succeeds, the tone quality optimization parameters corrected in the user account are applied directly.
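The account flow above can be sketched as a small store that enrolls a voiceprint feature together with the account's corrected tone quality optimization parameters, and later matches a returning user by voiceprint distance. The distance metric, the match threshold, and all names here are illustrative assumptions, not details from the patent.

```python
import numpy as np

class AccountStore:
    """Maps voiceprint feature vectors to user accounts and their saved
    tone quality optimization parameters (hypothetical structure)."""

    def __init__(self, threshold=0.3):
        self.accounts = {}          # name -> (feature, params)
        self.threshold = threshold  # max distance that counts as a match

    def enroll(self, name, feature, params):
        """Create a new user account keyed by its voiceprint feature."""
        self.accounts[name] = (np.asarray(feature, float), dict(params))

    def match(self, feature):
        """Return (name, params) for the closest enrolled voiceprint,
        or None if no account is within the match threshold."""
        feature = np.asarray(feature, float)
        best = None
        for name, (feat, params) in self.accounts.items():
            d = np.linalg.norm(feature - feat)
            if best is None or d < best[0]:
                best = (d, name, params)
        if best is not None and best[0] <= self.threshold:
            return best[1], best[2]
        return None
```

On a successful match the stored parameters would be applied directly; a `None` result corresponds to the new-user branch, where an account is first created.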
The present invention also provides a storage medium, wherein the storage medium stores a sound data processing program which, when executed by a processor, implements the steps of the sound data processing method described above.
In summary, the present invention provides a sound data processing method, an intelligent terminal and a storage medium, wherein the method comprises: receiving data input by a microphone during recording, monitoring and analyzing the data to form a recording sample, and saving the qualified recording sample as a voiceprint sample; acquiring sound data emitted by a user, processing and analyzing the sound data, comparing the processed and analyzed sound data with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result; and selecting a preset frequency band as an optimized frequency band, and optimizing the sound data according to the optimized frequency band and the corresponding voiceprint sample to obtain the corrected frequency response of the user. According to the invention, each user's voiceprint is classified and analyzed, high-sound-quality samples are established for different age and gender groups, and sound quality is optimized according to the data of those samples, thereby improving each user's sound quality and karaoke singing effect and meeting the sound quality requirements of different users during karaoke.
Of course, it will be understood by those skilled in the art that all or part of the processes of the methods of the above embodiments may be implemented by a computer program instructing relevant hardware (such as a processor, a controller, etc.), and the program may be stored in a computer readable storage medium, and when executed, the program may include the processes of the above method embodiments. The storage medium may be a memory, a magnetic disk, an optical disk, etc.
It is to be understood that the invention is not limited to the examples described above, but that modifications and variations may be effected thereto by those of ordinary skill in the art in light of the foregoing description, and that all such modifications and variations are intended to be within the scope of the invention as defined by the appended claims.

Claims (10)

1. A sound data processing method, characterized by comprising:
receiving data input by a microphone during recording, monitoring and analyzing the data to form a recording sample, and storing the qualified recording sample as a voiceprint sample;
acquiring voice data sent by a user, processing and analyzing the voice data, comparing the processed and analyzed voice data with the voiceprint sample, and acquiring a voiceprint sample corresponding to the voice of the user according to a comparison result;
and selecting a preset frequency band as an optimized frequency band, and optimizing the sound data according to the optimized frequency band and the corresponding voiceprint sample to obtain the corrected frequency response of the user.
2. The method of claim 1, wherein the receiving the data input by the microphone during the recording, monitoring and analyzing the data to form a recording sample, and saving the qualified recording sample as a voiceprint sample specifically comprises:
before collecting recording samples, grouping the collected users who meet vocal music requirements by age and gender;
recording each group member through a microphone according to different grouping standards, wherein the recorded content is preset content, and the preset content comprises sound content of each frequency band of human voice;
and monitoring and analyzing the data input by the microphone to obtain a recording sample, and selecting the qualified recording sample as a voiceprint sample for storage.
3. The sound data processing method according to claim 2, wherein the acquiring sound data emitted by the user, processing and analyzing the sound data, comparing the sound data with the voiceprint sample, and obtaining the voiceprint sample corresponding to the sound of the user according to the comparison result specifically comprises:
prompting the user to make a recording, and collecting the sound data under the same conditions as the sample recordings;
after the recording is finished, preprocessing the sound data through a band-pass filter, then performing FFT analysis, and comparing the analyzed sound data with all the voiceprint samples;
and selecting the voiceprint sample closest to the sound data from all the voiceprint samples according to a calculation result of a preset algorithm, and classifying the user into the currently selected voiceprint sample.
4. The method of claim 3, wherein after the recording finishes, the sound data is preprocessed by a band-pass filter and then subjected to FFT analysis, and after the analyzed sound data is compared with all the voiceprint samples, the method further comprises:
obtaining an approximation value between the sound data and each voiceprint sample through the preset algorithm, and selecting a preset number of voiceprint samples in ascending order of approximation value;
acquiring the identity information corresponding to each of the preset number of voiceprint samples, and displaying the identity information together with the voiceprint samples to prompt the user to select and confirm;
and after detecting the user's selection confirmation instruction, classifying the user under the currently selected voiceprint sample.
5. The method according to claim 4, wherein the obtaining the identity information corresponding to each of the preset number of voiceprint samples and displaying the identity information together with the voiceprint samples to prompt the user to confirm the selection further comprises:
and if no selection confirmation instruction from the user is received within a preset time, selecting by default the voiceprint sample with the smallest approximation value, and classifying the user under that voiceprint sample.
6. The method according to claim 3, wherein the selecting a preset frequency band as an optimized frequency band, and performing optimization processing on the sound data according to the optimized frequency band and a corresponding voiceprint sample to obtain a corrected frequency response of a user specifically comprises:
selecting a preset frequency band as an optimized frequency band according to requirements, and processing within the range of the optimized frequency band according to a preset rule;
and obtaining the parameter value of each spectrum group that needs compensation after the preset-rule processing, and substituting the spectrum group parameter values into the user's sound data to obtain the corrected frequency response of the user.
7. The sound data processing method according to claim 1, further comprising:
when a new user is detected using the intelligent terminal for the first time, establishing a user account with the sound collected by the voiceprint recognition module serving as a distinguishing login credential, and simultaneously recording the tone quality optimization parameters corrected under the new user account;
when the voiceprint recognition module is started, matching a user account through voiceprint data acquisition and analysis, and then applying settings according to the tone quality optimization parameters corrected under that user account.
8. The sound data processing method according to claim 7, further comprising:
when an existing user is detected using the intelligent terminal again, matching the user's voice against the voice bound to the user account;
and when the matching succeeds, directly applying the tone quality optimization parameters corrected in the user account.
9. An intelligent terminal, characterized in that the intelligent terminal comprises: a memory, a processor, and a sound data processing program stored on the memory and executable on the processor, the sound data processing program, when executed by the processor, implementing the steps of the sound data processing method according to any one of claims 1-8.
10. A storage medium, characterized in that the storage medium stores a sound data processing program which, when executed by a processor, implements the steps of the sound data processing method according to any one of claims 1-8.
CN201811629739.3A 2018-12-28 2018-12-28 Sound data processing method, intelligent terminal and storage medium Active CN109712635B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811629739.3A CN109712635B (en) 2018-12-28 2018-12-28 Sound data processing method, intelligent terminal and storage medium


Publications (2)

Publication Number Publication Date
CN109712635A CN109712635A (en) 2019-05-03
CN109712635B true CN109712635B (en) 2020-10-09

Family

ID=66259173


Country Status (1)

Country Link
CN (1) CN109712635B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112489609A (en) * 2020-11-03 2021-03-12 Shenzhen Skyworth-RGB Electronics Co Ltd Microphone-based method, device, terminal and medium for automatically jumping to a karaoke segment
CN113542983B (en) * 2021-07-09 2023-06-27 安徽聆思智能科技有限公司 Audio signal processing method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10268875A (en) * 1997-03-24 1998-10-09 Yamaha Corp Karaoke device
CN102404278A (en) * 2010-09-08 2012-04-04 盛乐信息技术(上海)有限公司 Song request system based on voiceprint recognition and application method thereof
CN105989842A (en) * 2015-01-30 2016-10-05 福建星网视易信息***有限公司 Method and device for voiceprint similarity comparison and application thereof in digital entertainment on-demand system
CN107656983A (en) * 2017-09-08 2018-02-02 广州索答信息科技有限公司 A kind of intelligent recommendation method and device based on Application on Voiceprint Recognition
CN108074557A (en) * 2017-12-11 2018-05-25 深圳Tcl新技术有限公司 Tone regulating method, device and storage medium
CN109087623A (en) * 2018-08-14 2018-12-25 Wuxi Binghe Computer Technology Development Co Ltd Accompaniment adjustment method and device for opposite-sex singing, and KTV jukebox



Similar Documents

Publication Publication Date Title
RU2743315C1 (en) Method of music classification and a method of detecting music beat parts, a data medium and a computer device
JP6855527B2 (en) Methods and devices for outputting information
JP6027087B2 (en) Acoustic signal processing system and method for performing spectral behavior transformations
US20050143997A1 (en) Method and apparatus using spectral addition for speaker recognition
US20100191733A1 (en) Music linked photocasting service system and method
CN104768049B (en) Method, system and computer readable storage medium for synchronizing audio data and video data
CN104123938A (en) Voice control system, electronic device and voice control method
CN107533848B System and method for speech restoration
CN110880329A (en) Audio identification method and equipment and storage medium
US9058384B2 (en) System and method for identification of highly-variable vocalizations
CN109712635B (en) Sound data processing method, intelligent terminal and storage medium
CN109584904B (en) Video-song audio-song name recognition modeling method applied to basic music video-song education
US20230050565A1 (en) Audio detection method and apparatus, computer device, and readable storage medium
US20120237040A1 (en) System and Method for Automated Audio Mix Equalization and Mix Visualization
CN110473552A (en) Speech recognition authentication method and system
CN110853606A (en) Sound effect configuration method and device and computer readable storage medium
CN109410972B (en) Method, device and storage medium for generating sound effect parameters
CN112866770B (en) Equipment control method and device, electronic equipment and storage medium
CN110689885A (en) Machine-synthesized speech recognition method, device, storage medium and electronic equipment
CN110739006B (en) Audio processing method and device, storage medium and electronic equipment
CN105895079A (en) Voice data processing method and device
CN112927713B (en) Audio feature point detection method, device and computer storage medium
US20130322645A1 (en) Data recognition and separation engine
CN111916074A (en) Cross-device voice control method, system, terminal and storage medium
CN113593604A (en) Method, device and storage medium for detecting audio quality

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant