CN109712635A - Voice data processing method, intelligent terminal and storage medium

Info

Publication number: CN109712635A
Application number: CN201811629739.3A
Authority: CN
Inventors: 付华东; 王余生
Original assignee: Shenzhen Skyworth RGB Electronics Co Ltd
Current assignee: Shenzhen Skyworth RGB Electronics Co Ltd
Priority date: 2018-12-28
Filing date: 2018-12-28
Publication date: 2019-05-03
Anticipated expiration: 2038-12-28
Also published as: CN109712635B

Abstract

The invention discloses a voice data processing method, an intelligent terminal and a storage medium. The method comprises: receiving data input through a microphone during recording, monitoring and analyzing the data to form recording samples, and saving the qualified recording samples as voiceprint samples; collecting voice data uttered by a user, processing and analyzing the voice data and comparing it with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice from the comparison result; and selecting a preset frequency band as the optimization band, optimizing the voice data according to the optimization band and the corresponding voiceprint sample, and obtaining the user's corrected frequency response. By classifying and analyzing each user's voiceprint and establishing high-quality sound samples for each age bracket and gender group, the invention optimizes sound quality according to the data of those samples, improving each user's sound quality, enhancing the karaoke effect, and meeting the sound quality requirements of different users when singing karaoke.

Description

Voice data processing method, intelligent terminal and storage medium

Technical field

The present invention relates to the technical field of intelligent terminals, and in particular to a voice data processing method, an intelligent terminal and a storage medium.

Background

Since entering the network era, LCD TVs have relied on powerful network functions to quickly implement many entertainment functions related to the online world. Especially since the emergence of Android smart TVs, the smart TV has become the entertainment center of the home. Although content is increasingly rich, applications that users genuinely enjoy are still few. Singing is one of the main forms of home entertainment, and karaoke functions, such as the "Whole People K Song" and "Sounds of Nature K Song" applications, have begun to serve as a main selling point of TVs.

Everyone perceives vocal music differently and has a different sense of rhythm. The throat acts like a filter; not everyone is born with a beautiful voice, and almost everyone's vocal cords have imperfections, so people who have studied vocal technique professionally sing more pleasantly. Because everyone's voice characteristics differ and cannot be matched with suitably adapted karaoke parameters, the karaoke effect of applications such as "Whole People K Song" and "Sounds of Nature K Song" on smart TVs gives a poor user experience and cannot meet the sound quality requirements of different users when singing. In other words, existing intelligent terminals cannot optimize sound quality for the voices of different users.

Therefore, the existing technology needs to be improved and developed.

Summary of the invention

The technical problem to be solved by the invention is that, in view of the defects of the prior art, the invention provides a voice data processing method, an intelligent terminal and a storage medium. The aim is to classify and analyze each user's voiceprint, establish high-quality sound samples for each age bracket and gender group, and optimize sound quality according to the data of those samples, thereby improving each user's sound quality, enhancing the karaoke effect, meeting the sound quality requirements of different users when singing karaoke, and improving the market competitiveness of intelligent terminal products.

The technical proposal for solving the technical problem of the invention is as follows:

A voice data processing method, wherein the voice data processing method includes:

Receiving data input through a microphone during recording, monitoring and analyzing the data to form recording samples, and saving the qualified recording samples as voiceprint samples;

Collecting voice data uttered by a user, processing and analyzing the voice data and comparing it with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result;

Selecting a preset frequency band as the optimization band, optimizing the voice data according to the optimization band and the corresponding voiceprint sample, and obtaining the user's corrected frequency response.

In the voice data processing method, receiving the data input through the microphone during recording, monitoring and analyzing the data to form recording samples, and saving the qualified recording samples as voiceprint samples specifically includes:

Before recording samples are collected, grouping the users to be recorded, who meet the vocal music requirements, by age and gender;

Recording each group member through the microphone according to the respective grouping standard, the recorded content being preset content that includes sound in each frequency band of the human voice;

Monitoring and analyzing the microphone input data to obtain recording samples, and selecting the qualified recording samples to save as voiceprint samples.

In the voice data processing method, collecting the voice data uttered by the user, processing and analyzing the voice data and comparing it with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result specifically includes:

Prompting the user to make a recording, and collecting the voice data under the same conditions as the sample recordings;

After the recording ends, pre-processing the voice data with a band-pass filter and then performing FFT analysis, and comparing the analyzed voice data with all the voiceprint samples;

According to the result calculated by a preset algorithm, selecting from all the voiceprint samples the one closest to the voice data, and assigning the user to the currently selected voiceprint sample.

In the voice data processing method, after the recording ends and the voice data has been pre-processed with the band-pass filter, analyzed by FFT and compared with all the voiceprint samples, the method further includes:

Obtaining, through the preset algorithm, the approach value between the voice data and each voiceprint sample, and selecting a preset number of voiceprint samples in ascending order of approach value;

Obtaining the identity information corresponding to the preset number of voiceprint samples, and displaying it together with the voiceprint samples to prompt the user to select and confirm;

After the user's selection confirmation instruction is detected, assigning the user to the currently selected voiceprint sample according to the selection.

In the voice data processing method, after obtaining the identity information corresponding to the preset number of voiceprint samples and displaying it together with the voiceprint samples to prompt the user to select and confirm, the method further includes:

If no selection confirmation instruction is received from the user within a preset time, selecting by default the voiceprint sample with the smallest approach value, and assigning the user to that voiceprint sample.

In the voice data processing method, selecting the preset frequency band as the optimization band, optimizing the voice data according to the optimization band and the corresponding voiceprint sample, and obtaining the user's corrected frequency response specifically includes:

Selecting a preset frequency band as the optimization band as required, and processing the data within the optimization band according to preset rules;

Obtaining, after the preset-rule processing, the parameter value of each spectrum group that needs compensation, and substituting these spectral parameter values into the user's voice data to obtain the user's corrected frequency response.

In the voice data processing method, the voice data processing method further includes:

When a new user is detected using the intelligent terminal for the first time, creating a user account that uses the voice collected by the voiceprint recognition module as the distinguishing login name, and recording the corrected sound-quality optimization parameters of the new user account;

When the voiceprint recognition module is started, matching the user account through voiceprint data acquisition and data analysis, and then configuring the terminal according to the corrected sound-quality optimization parameters of that user account.

When a returning user is detected using the intelligent terminal again, matching the voice uttered by the user against the voice stored for the user account;

After a successful match, directly applying the corrected sound-quality optimization parameters stored in the user account.

An intelligent terminal, wherein the intelligent terminal includes a memory, a processor, and a voice data processing program that is stored in the memory and can run on the processor; when the voice data processing program is executed by the processor, the steps of the voice data processing method described above are implemented.

A storage medium, wherein the storage medium stores a voice data processing program; when the voice data processing program is executed by a processor, the steps of the voice data processing method described above are implemented.

The invention discloses a voice data processing method, an intelligent terminal and a storage medium. The method comprises: receiving data input through a microphone during recording, monitoring and analyzing the data to form recording samples, and saving the qualified recording samples as voiceprint samples; collecting voice data uttered by a user, processing and analyzing the voice data and comparing it with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result; and selecting a preset frequency band as the optimization band, optimizing the voice data according to the optimization band and the corresponding voiceprint sample, and obtaining the user's corrected frequency response. By classifying and analyzing each user's voiceprint and establishing high-quality sound samples for each age bracket and gender group, the invention optimizes sound quality according to the data of those samples, improving each user's sound quality, enhancing the karaoke effect, and meeting the sound quality requirements of different users when singing karaoke.

Brief description of the drawings

Fig. 1 is a flow chart of the preferred embodiment of the voice data processing method of the invention;

Fig. 2 is a flow chart of step S10 in the preferred embodiment of the voice data processing method of the invention;

Fig. 3 is a flow chart of step S20 in the preferred embodiment of the voice data processing method of the invention;

Fig. 4 is a flow chart of step S30 in the preferred embodiment of the voice data processing method of the invention;

Fig. 5 is a schematic diagram of the operating environment of the preferred embodiment of the intelligent terminal of the invention.

Detailed description of the embodiments

To make the objectives, technical solutions and advantages of the invention clearer and more explicit, the invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here merely explain the invention and are not intended to limit it.

As shown in Fig. 1, the voice data processing method of the preferred embodiment of the invention includes the following steps:

Step S10: receive data input through a microphone during recording, monitor and analyze the data to form recording samples, and save the qualified recording samples as voiceprint samples.

For the detailed process, refer to Fig. 2, which is a flow chart of step S10 in the voice data processing method provided by the invention.

As shown in Fig. 2, step S10 includes:

S11: before recording samples are collected, group the users to be recorded, who meet the vocal music requirements, by age and gender;

S12: record each group member through the microphone according to the respective grouping standard, the recorded content being preset content that includes sound in each frequency band of the human voice;

S13: monitor and analyze the microphone input data to obtain recording samples, and select the qualified recording samples to save as voiceprint samples.

For example, when the voice data processing method of the invention is applied to a karaoke system, the collected samples are required to have a grounding in vocal music. The samples (the users being recorded) are first grouped by age and then, within each age bracket, by gender; all groups have the same number of members.

Specifically, the age range may be 1-80 years, divided into 5-year brackets (for example 1-5, 6-10, ..., 76-80). Each age bracket is split by gender, for 32 groups in total (16 male groups and 16 female groups across the age brackets). Every group must have the same number of members, 12 people per group.
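The grouping described above can be sketched as a simple index calculation. This is only an illustration: the function name `group_index` and the ordering of the groups (male before female within each bracket) are assumptions and do not come from the source.

```python
def group_index(age: int, is_male: bool) -> int:
    """Map an age in 1..80 and a gender to one of 32 group indices (0..31).

    Assumed layout: 16 five-year brackets (1-5, 6-10, ..., 76-80), each split
    into a male group followed by a female group.
    """
    if not 1 <= age <= 80:
        raise ValueError("age must be in the range 1..80")
    bracket = (age - 1) // 5  # 0 for ages 1-5, 1 for ages 6-10, ..., 15 for ages 76-80
    return bracket * 2 + (0 if is_male else 1)
```

Every (age, gender) pair in the stated range maps to exactly one of the 32 groups, so the same function could be used both when filing sample recordings and when labelling the group finally chosen for a user.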

Each group member is required to record according to a common standard. The recorded content is a representative passage captured with a microphone. The intelligent terminal receives the microphone input and monitors and analyzes it in real time, using average sound pressure as the measurement index (the index may be changed). Because the age range of 1-80 covers children, adults and the elderly, the average sound pressure set for each age bracket differs; a recording whose average sound pressure falls within the set range is a qualified sample recording. The samples collected for each group undergo FFT (fast Fourier transform, a fast algorithm for the discrete Fourier transform) analysis, and the FFT data within the same group are superimposed and averaged. 500 Hz ~ 3 kHz is taken as the primary reference band, and this band is divided into 6 groups.
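The analysis of one group's recordings can be sketched as follows. This is a minimal sketch under stated assumptions: the source does not say how the 500 Hz ~ 3 kHz band is split, so six equal-width sub-bands are assumed; a naive DFT stands in for the FFT (it produces the same values, only more slowly); and all function names are hypothetical.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Normalized DFT magnitudes for bins 0..N/2 (naive O(N^2) stand-in for an FFT)."""
    n = len(samples)
    return [
        abs(sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))) / n
        for k in range(n // 2 + 1)
    ]


def band_levels(samples, sample_rate, lo=500.0, hi=3000.0, bands=6):
    """Mean magnitude in each of `bands` equal-width sub-bands of [lo, hi)."""
    spectrum = magnitude_spectrum(samples)
    n = len(samples)
    width = (hi - lo) / bands
    sums = [0.0] * bands
    counts = [0] * bands
    for k, mag in enumerate(spectrum):
        freq = k * sample_rate / n
        if lo <= freq < hi:
            b = min(bands - 1, int((freq - lo) // width))
            sums[b] += mag
            counts[b] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def group_template(recordings, sample_rate):
    """Average the band levels of all qualified recordings in one group."""
    levels = [band_levels(r, sample_rate) for r in recordings]
    return [sum(column) / len(column) for column in zip(*levels)]
```

Running `group_template` once per group would give 32 six-value templates, which is the shape of the sample database that the later comparison step relies on.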

Step S20: collect voice data uttered by a user, process and analyze the voice data and compare it with the voiceprint samples, and obtain the voiceprint sample corresponding to the user's voice according to the comparison result.

For the detailed process, refer to Fig. 3, which is a flow chart of step S20 in the voice data processing method provided by the invention.

As shown in Fig. 3, step S20 includes:

S21: prompt the user to make a recording, and collect the voice data under the same conditions as the sample recordings;

S22: after the recording ends, pre-process the voice data with a band-pass filter and then perform FFT analysis, and compare the analyzed voice data with all the voiceprint samples;

S23: according to the result calculated by a preset algorithm, select from all the voiceprint samples the one closest to the voice data, and assign the user to the currently selected voiceprint sample.

Further, after step S22 the method also includes: obtaining, through the preset algorithm, the approach value between the voice data and each voiceprint sample, and selecting a preset number of voiceprint samples in ascending order of approach value; obtaining the identity information corresponding to the preset number of voiceprint samples, and displaying it together with the voiceprint samples to prompt the user to select and confirm; after the user's selection confirmation instruction is detected, assigning the user to the currently selected voiceprint sample according to the selection; and, if no selection confirmation instruction is received from the user within a preset time, selecting by default the voiceprint sample with the smallest approach value and assigning the user to that voiceprint sample.

Specifically, a representative passage is obtained and the user is guided through the operation; for example, the smart TV prompts the user to make a recording, and the voice is collected under the same conditions as the sample recordings. The smart TV starts the voiceprint recognition module to collect the user's voice. The voiceprint recognition module comprises a voiceprint data acquisition module and a voiceprint data analysis module: the acquisition module collects the voice uttered by the user, and the analysis module compares and analyzes it.

Voice is collected under the same conditions as the sample recordings. After the recording ends, the data are first pre-processed by a 500 Hz ~ 3 kHz band-pass filter (a band-pass filter is a device that passes waves in a specific frequency band while blocking other bands) and then analyzed by FFT. The analyzed data are compared with the voiceprint samples above: 500 Hz ~ 3 kHz is taken as the primary reference band and divided into 6 groups, and the 6 groups of collected data are subtracted from the 6 groups of data of each of the 32 sample groups, giving 32 sets of spectral differences.
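The subtraction step can be sketched as follows, assuming the band-pass filtering and FFT analysis have already reduced the user's recording and each sample group to six band levels. The function name and the sign convention (user minus sample) are assumptions; the source only says the two sets of data are subtracted.

```python
def spectral_differences(user_levels, templates):
    """Return one list of per-band differences (user minus template) per template.

    `user_levels` holds the user's 6 band levels; `templates` holds the 6 band
    levels of each sample group (32 groups in the embodiment described above).
    """
    differences = []
    for template in templates:
        if len(template) != len(user_levels):
            raise ValueError("band count mismatch between user data and template")
        differences.append([u - t for u, t in zip(user_levels, template)])
    return differences
```

With 32 templates the result is 32 lists of 6 differences each, matching the "32 sets of spectral differences" in the text.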

Because each frequency band of the human voice influences timbre to a different degree, the 6 groups of data are weighted by coefficients: for example, the first group takes coefficient A, the second group takes coefficient B, ..., and the sixth group takes coefficient N, with A + B + ... + N = 1. The absolute values of the 6 spectral differences are multiplied by their coefficients and summed, and the arithmetic mean over the 6 groups is then taken, giving an approach value x. Likewise, since there are 32 sample groups in total, 32 approach values are obtained, and the minimum among them is found. A smaller approach value indicates that the voiceprint sample is closer to the user's voice; the user is therefore assigned to the voiceprint sample whose approach value is smallest.
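Read literally, the approach value is x = (A·|d1| + B·|d2| + ... + N·|d6|) / 6, where d1..d6 are the spectral differences for one sample group. A minimal sketch of that calculation follows; the function names are hypothetical, and the actual coefficient values are not given in the source, so any weights passed in are illustrative.

```python
def approach_value(differences, weights):
    """Weighted sum of absolute spectral differences, divided by the band count."""
    if len(differences) != len(weights):
        raise ValueError("one weight is needed per band")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    weighted_sum = sum(w * abs(d) for w, d in zip(weights, differences))
    return weighted_sum / len(differences)


def closest_group(all_differences, weights):
    """Return (index of the smallest approach value, list of all approach values)."""
    values = [approach_value(d, weights) for d in all_differences]
    best = min(range(len(values)), key=values.__getitem__)
    return best, values
```

Dividing by the band count scales every approach value by the same constant, so it does not change which sample group comes out closest; it is kept here only to follow the wording of the text.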

Further, the preset algorithm above yields 32 approach values. To get closer to the user's identity information (age and gender), the age and gender information of the voiceprint samples corresponding to the 5 smallest of the 32 approach values is displayed to the user for selection and confirmation. If the user does not confirm a selection, the voiceprint sample with the smallest approach value is used by default and the user is assigned to it; if the user selects and confirms one of the 5 voiceprint samples, the user is assigned to the selected voiceprint sample.
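The shortlist-and-confirm logic can be sketched as follows. The names are hypothetical, `user_choice=None` stands for "no confirmation received", and treating a choice outside the shortlist the same as no choice is an assumption not stated in the source.

```python
def shortlist(values, k=5):
    """Indices of the k smallest approach values, smallest first."""
    return sorted(range(len(values)), key=values.__getitem__)[:k]


def resolve_group(values, user_choice=None, k=5):
    """Pick the confirmed candidate, or fall back to the smallest approach value."""
    candidates = shortlist(values, k)
    if user_choice in candidates:
        return user_choice
    return candidates[0]
```

In a real terminal the candidates would be shown with their age and gender labels, and `resolve_group` would be called once the user confirms or the waiting period ends.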

Step S30: select a preset frequency band as the optimization band, optimize the voice data according to the optimization band and the corresponding voiceprint sample, and obtain the user's corrected frequency response.

For the detailed process, refer to Fig. 4, which is a flow chart of step S30 in the voice data processing method provided by the invention.

As shown in Fig. 4, step S30 includes:

S31: select a preset frequency band as the optimization band as required, and process the data within the optimization band according to preset rules;

S32: obtain, after the preset-rule processing, the parameter value of each spectrum group that needs compensation, and substitute these spectral parameter values into the user's voice data to obtain the user's corrected frequency response.

Specifically, 200 Hz ~ 8 kHz is chosen as the optimization band as required. Within this band, the user's FFT data and the voiceprint sample's FFT data are subtracted to obtain the spectral differences that need inverse compensation. Each spectral difference is multiplied by -1 to give the parameter value of each spectrum group to be compensated, and substituting these spectral parameter values into the user's voice data yields the user's corrected frequency response.
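A minimal sketch of the inverse compensation follows, assuming per-band levels within 200 Hz ~ 8 kHz are already available, that the difference is taken as user minus sample, and that "substituting" the parameters means adding them to the user's levels. The function names are hypothetical.

```python
def compensation(user_levels, template_levels):
    """Per-band compensation: the spectral difference (user minus template) times -1."""
    if len(user_levels) != len(template_levels):
        raise ValueError("band count mismatch")
    return [-(u - t) for u, t in zip(user_levels, template_levels)]


def apply_compensation(user_levels, comp):
    """Add the compensation values to the user's band levels."""
    return [u + c for u, c in zip(user_levels, comp)]
```

Under this literal reading the corrected levels coincide with the voiceprint sample's levels in every band, which is consistent with the stated goal of moving the user's frequency response toward the high-quality sample.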

Further, when a new user is detected using the intelligent terminal for the first time, a user account is created that uses the voice collected by the voiceprint recognition module as the distinguishing login name, and the corrected sound-quality optimization parameters of the new user account are recorded. When the voiceprint recognition module is started, the user account is matched through voiceprint data acquisition and data analysis, and the terminal is then configured according to the corrected sound-quality optimization parameters of that user account.

Further, when a returning user is detected using the intelligent terminal again, the voice uttered by the user is matched against the voice stored for the user account; after a successful match, the corrected sound-quality optimization parameters stored in the user account are applied directly.
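The account handling for new and returning users can be sketched as follows. The source does not describe how a voice is matched to an account, so everything here is an assumption made for illustration: the in-memory `AccountStore` class, the mean-absolute-difference distance, and the `threshold` that decides whether a match succeeds.

```python
class AccountStore:
    """In-memory store of (voiceprint, parameters) pairs; purely illustrative."""

    def __init__(self, threshold):
        self.threshold = threshold  # hypothetical maximum distance for a match
        self._accounts = []

    @staticmethod
    def _distance(a, b):
        # Mean absolute difference between two equal-length voiceprints.
        return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

    def lookup(self, voiceprint):
        """Return the stored parameters of the closest matching account, or None."""
        best_params, best_dist = None, None
        for stored, params in self._accounts:
            dist = self._distance(stored, voiceprint)
            if dist <= self.threshold and (best_dist is None or dist < best_dist):
                best_params, best_dist = params, dist
        return best_params

    def register(self, voiceprint, params):
        """Create an account keyed by the voiceprint and record its parameters."""
        self._accounts.append((list(voiceprint), list(params)))


def parameters_for(store, voiceprint, compute_params):
    """Reuse stored parameters for a returning user; compute and store them for a new one."""
    params = store.lookup(voiceprint)
    if params is None:
        params = compute_params(voiceprint)
        store.register(voiceprint, params)
    return params
```

The point of the design is that the matching and compensation steps run once per user; later sessions only need a voiceprint match to retrieve the recorded parameters.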

The voice data processing method of the invention improves sound quality. It is mainly used in karaoke applications for home audio and video to improve the user's karaoke performance and sound quality, and it can of course also be applied to other electronic devices to improve the user's sound quality. Based on differences in each user's voiceprint characteristics, the invention classifies and analyzes each person's voiceprint, establishes voiceprint samples for each age bracket and gender group, and optimizes according to the voiceprint sample data to improve each person's sound quality. This not only improves the user's karaoke experience at home but also improves the karaoke effect demonstrated in stores, increasing the market competitiveness of products such as television sets.

As shown in Fig. 5, based on the voice data processing method above, the invention correspondingly provides an intelligent terminal (for example a smart TV). The intelligent terminal includes a processor 10, a memory 20 and a display 30 (either a built-in display screen or an externally connected one). Fig. 5 shows only some components of the intelligent terminal; it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.

In some embodiments the memory 20 may be an internal storage unit of the intelligent terminal, such as its hard disk or internal memory. In other embodiments the memory 20 may be an external storage device of the intelligent terminal, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card fitted to the intelligent terminal. Further, the memory 20 may include both an internal storage unit of the intelligent terminal and an external storage device. The memory 20 stores the application software installed on the intelligent terminal and various kinds of data, such as the program code installed on the intelligent terminal. The memory 20 may also be used to temporarily store data that has been output or is about to be output. In one embodiment, a voice data processing program 40 is stored in the memory 20, and the voice data processing program 40 can be executed by the processor 10 to implement the voice data processing method of this application.

In some embodiments the processor 10 may be a central processing unit (CPU), a microprocessor or another data processing chip, used to run the program code stored in the memory 20 or to process data, for example to execute the voice data processing method.

In some embodiments the display 30 may be an LED display, a liquid crystal display, a touch-control liquid crystal display, an OLED (organic light-emitting diode) touch device, or the like. The display 30 is used to display the information processed in the intelligent terminal and to display a visual user interface. The components 10-30 of the intelligent terminal communicate with one another through a system bus.

In one embodiment, the following steps are implemented when the processor 10 executes the voice data processing program 40 in the memory 20:

Receiving the data input through the microphone during recording, monitoring and analyzing the data to form recording samples, and saving the qualified recording samples as voiceprint samples specifically includes:

Collecting the voice data uttered by the user, processing and analyzing the voice data and comparing it with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result specifically includes:

After the recording ends and the voice data has been pre-processed with the band-pass filter, analyzed by FFT and compared with all the voiceprint samples, the method further includes:

After obtaining the identity information corresponding to the preset number of voiceprint samples and displaying it together with the voiceprint samples to prompt the user to select and confirm, the method further includes:

Selecting the preset frequency band as the optimization band, optimizing the voice data according to the optimization band and the corresponding voiceprint sample, and obtaining the user's corrected frequency response specifically includes:

When a new user is detected using the intelligent terminal for the first time, creating a user account that uses the voice collected by the voiceprint recognition module as the distinguishing login name, and recording the corrected sound-quality optimization parameters of the new user account;

When a returning user is detected using the intelligent terminal again, matching the voice uttered by the user against the voice stored for the user account;

The invention also provides a storage medium, wherein the storage medium stores a voice data processing program; when the voice data processing program is executed by a processor, the steps of the voice data processing method described above are implemented.

In summary, the invention provides a voice data processing method, an intelligent terminal and a storage medium. The method comprises: receiving data input through a microphone during recording, monitoring and analyzing the data to form recording samples, and saving the qualified recording samples as voiceprint samples; collecting voice data uttered by a user, processing and analyzing the voice data and comparing it with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result; and selecting a preset frequency band as the optimization band, optimizing the voice data according to the optimization band and the corresponding voiceprint sample, and obtaining the user's corrected frequency response. By classifying and analyzing each user's voiceprint and establishing high-quality sound samples for each age bracket and gender group, the invention optimizes sound quality according to the data of those samples, improving each user's sound quality, enhancing the karaoke effect, and meeting the sound quality requirements of different users when singing karaoke.

Of course, those of ordinary skill in the art will understand that all or part of the processes in the method embodiments above can be completed by a computer program instructing the relevant hardware (such as a processor or controller). The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments above. The storage medium may be a memory, a magnetic disk, an optical disc, or the like.

It should be understood that the application of the invention is not limited to the examples above. Those of ordinary skill in the art can make improvements or changes based on the description above, and all such improvements and changes shall fall within the scope of protection of the appended claims.

Claims

1. A voice data processing method, characterized in that the voice data processing method includes:

2. The voice data processing method according to claim 1, characterized in that receiving the data input through the microphone during recording, monitoring and analyzing the data to form recording samples, and saving the qualified recording samples as voiceprint samples specifically includes:

3. The voice data processing method according to claim 2, characterized in that collecting the voice data uttered by the user, processing and analyzing the voice data and comparing it with the voiceprint samples, and obtaining the voiceprint sample corresponding to the user's voice according to the comparison result specifically includes:

4. The voice data processing method according to claim 3, characterized in that, after the recording ends and the voice data has been pre-processed with the band-pass filter, analyzed by FFT and compared with all the voiceprint samples, the method further includes:

5. The voice data processing method according to claim 4, characterized in that, after obtaining the identity information corresponding to the preset number of voiceprint samples and displaying it together with the voiceprint samples to prompt the user to select and confirm, the method further includes:

6. The voice data processing method according to claim 3, characterized in that selecting the preset frequency band as the optimization band, optimizing the voice data according to the optimization band and the corresponding voiceprint sample, and obtaining the user's corrected frequency response specifically includes:

7. The voice data processing method according to claim 1, characterized in that the voice data processing method further includes:

8. The voice data processing method according to claim 7, characterized in that the voice data processing method further includes:

9. An intelligent terminal, characterized in that the intelligent terminal includes a memory, a processor, and a voice data processing program that is stored in the memory and can run on the processor; when the voice data processing program is executed by the processor, the steps of the voice data processing method according to any one of claims 1-8 are implemented.

10. A storage medium, characterized in that the storage medium stores a voice data processing program; when the voice data processing program is executed by a processor, the steps of the voice data processing method according to any one of claims 1-8 are implemented.