CN107958672A - The method and apparatus for obtaining pitch waveform data - Google Patents
- Publication number
- CN107958672A (application CN201711337024.6A)
- Authority
- CN
- China
- Prior art keywords
- audio
- frequency
- target
- audio frame
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/003—Changing voice quality, e.g. pitch or formants
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Quality & Reliability (AREA)
- Spectroscopy & Molecular Physics (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The disclosure relates to a method and apparatus for obtaining pitch waveform data, and belongs to the field of audio technology. The method includes: performing pitch extraction on each audio frame of a target audio to obtain the target frequency corresponding to each audio frame; for each audio frame, determining, in the spectrum data of the audio frame, the target amplitude corresponding to the target frequency of the audio frame; and determining the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame. Because pitch is proportional to the vibration frequency of the fundamental tone, the disclosure determines the average frequency of the fundamental tone in each audio frame from the pitch of that frame, obtains the pitch waveform data of each frame from that average frequency, and finally obtains the pitch waveform data of the target audio, so that the vibration of the fundamental tone of the target audio can be obtained accurately.
Description
Technical field
The disclosure relates to the field of audio technology, and in particular to a method and apparatus for obtaining pitch waveform data.
Background
As the pace of life accelerates, singing has become one of the common leisure activities with which people relax. For users who often sing out of tune, a multimedia device can adjust the tone of their singing so that it approaches the standard pitch data of the corresponding song. The standard pitch data of songs is usually stored in the multimedia device in advance, and the device can adjust the tone of the collected vocal audio sung by the user based on that standard pitch data.
Sound is produced by vibration, which includes the vibration of the fundamental tone and the vibration of the overtones; the tone is determined by the vibration of the fundamental tone. The key to changing the tone is therefore to obtain the fundamental tone of the vocal audio, compare its vibration with the standard pitch data, and adjust the vocal audio accordingly, so that the tone of the vocal audio is changed while its timbre is preserved. In other words, the key to pitch shifting is accurately obtaining the vibration of the fundamental tone. In the prior art, a band-pass filter is typically used to filter the audio in the time domain, with the passband set to the frequency range of the fundamental tone of typical vocal audio.
In the course of implementing the disclosure, the inventors found at least the following problem:
The frequency of the fundamental tone fluctuates considerably over a complete song: for example, it is relatively low in the opening stage and relatively high in the climax in the middle of the song. The passband of the band-pass filter must therefore be set wide enough to cover the frequencies of all fundamental tones, but a passband that wide also covers some overtone frequencies, so the vibration of the fundamental tone cannot be obtained accurately.
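To make the problem concrete, the following minimal sketch checks which overtones of a low fundamental tone fall inside a passband wide enough for a whole song. The passband limits (100 Hz to 800 Hz), the 200 Hz fundamental tone and the function name are illustrative assumptions, and overtones are assumed to be integer multiples of the fundamental tone; none of these values come from the disclosure.

```python
def harmonics_in_passband(f0_hz, low_hz, high_hz, max_harmonic=5):
    """Return the harmonic numbers k >= 2 whose frequency k * f0_hz lies in the passband."""
    return [k for k in range(2, max_harmonic + 1)
            if low_hz <= k * f0_hz <= high_hz]

# Hypothetical passband chosen to cover fundamental tones from 100 Hz to 800 Hz.
LOW_HZ, HIGH_HZ = 100.0, 800.0

# A low note at 200 Hz: its overtones at 400, 600 and 800 Hz also pass the filter.
leaked_low_note = harmonics_in_passband(200.0, LOW_HZ, HIGH_HZ)

# A high note at 500 Hz: its first overtone (1000 Hz) is already outside the passband.
leaked_high_note = harmonics_in_passband(500.0, LOW_HZ, HIGH_HZ)
```

Under these assumptions the filter output for the low note still mixes the fundamental tone with three overtones, which is the inaccuracy described above.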
Summary of the invention
To overcome the problem in the related art, the present disclosure provides a method and apparatus for obtaining pitch waveform data. The technical solution is as follows:
According to an embodiment of the present disclosure, a method for obtaining pitch waveform data is provided, the method including:
performing pitch extraction on each audio frame of a target audio to obtain the target frequency corresponding to each audio frame;
for each audio frame, determining, in the spectrum data of the audio frame, the corresponding target amplitude based on the target frequency of the audio frame; and
determining the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame.
Optionally, determining, for each audio frame, the corresponding target amplitude in the spectrum data of the audio frame based on the target frequency of the audio frame includes:
performing a Fourier transform on the audio waveform data of each audio frame to obtain the spectrum data of each audio frame; and
determining, in the spectrum data of each audio frame, the target amplitude corresponding to the target frequency.
Optionally, determining the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame includes:
in the spectrum data of each audio frame, keeping the target amplitude corresponding to the target frequency unchanged and setting the amplitudes corresponding to all other frequencies to zero, to obtain the adjusted spectrum data of each audio frame; and
performing an inverse Fourier transform on the adjusted spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
Optionally, determining the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame includes:
generating the adjusted spectrum data of each audio frame based on the target amplitude and target frequency corresponding to that audio frame; and
performing an inverse Fourier transform on the adjusted spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
Optionally, the method further includes:
performing tone adjustment on the target audio based on the pitch waveform data of the target audio and pre-stored standard pitch data corresponding to the target audio.
According to an embodiment of the present disclosure, an audio processing method is provided, the method including:
comparing the frequency value corresponding to each period in the pitch waveform data described above with the standard frequency value at the corresponding time in the standard pitch data, and, if the absolute value of the difference between the frequency value and the standard frequency value is greater than a preset value, adjusting the target audio in the period to which the frequency value belongs.
According to an embodiment of the present disclosure, an apparatus for obtaining pitch waveform data is provided, the apparatus including:
an extraction module, configured to perform pitch extraction on each audio frame of a target audio to obtain the target frequency corresponding to each audio frame;
a first determining module, configured to determine, for each audio frame, the corresponding target amplitude in the spectrum data of the audio frame based on the target frequency of the audio frame; and
a second determining module, configured to determine the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame.
Optionally, the first determining module is specifically configured to:
perform a Fourier transform on the audio waveform data of each audio frame to obtain the spectrum data of each audio frame; and
determine, in the spectrum data of each audio frame, the target amplitude corresponding to the target frequency.
Optionally, the second determining module is specifically configured to:
in the spectrum data of each audio frame, keep the target amplitude corresponding to the target frequency unchanged and set the amplitudes corresponding to all other frequencies to zero, to obtain the adjusted spectrum data of each audio frame; and
perform an inverse Fourier transform on the adjusted spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
Optionally, the second determining module is specifically configured to:
generate the adjusted spectrum data of each audio frame based on the target amplitude and target frequency corresponding to that audio frame; and
perform an inverse Fourier transform on the adjusted spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
Optionally, the apparatus further includes:
an adjusting module, configured to perform tone adjustment on the target audio based on the pitch waveform data of the target audio and pre-stored standard pitch data corresponding to the target audio.
According to an embodiment of the present disclosure, an audio processing apparatus is provided, the apparatus including an audio adjusting module configured to:
compare the frequency value corresponding to each period in the pitch waveform data described above with the standard frequency value at the corresponding time in the standard pitch data, and, if the absolute value of the difference between the frequency value and the standard frequency value is greater than a preset value, adjust the target audio in the period to which the frequency value belongs.
According to an embodiment of the present disclosure, a terminal is provided. The terminal includes a processor and a memory; the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the method for obtaining pitch waveform data described above.
According to a first aspect of the embodiments of the present disclosure, a computer-readable storage medium is provided. The storage medium stores at least one instruction, and the instruction is loaded and executed by a processor to implement the method for obtaining pitch waveform data described above.
The technical solution provided by the embodiments of the present disclosure can have the following beneficial effects:
In the embodiments of the present disclosure, a terminal such as a multimedia device uses the above method to first perform pitch extraction on each audio frame of the target audio and obtain the target frequency corresponding to each audio frame; for each audio frame, it determines the corresponding target amplitude in the spectrum data of the audio frame based on the target frequency of the audio frame; and it determines the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame. Because pitch is proportional to the vibration frequency of the fundamental tone, this method determines the average frequency of the fundamental tone in each audio frame from the pitch of that frame, obtains the pitch waveform data of each frame from that average frequency, and finally obtains the pitch waveform data of the target audio, so the vibration of the fundamental tone of the target audio can be obtained accurately.
It should be understood that the above general description and the following detailed description are exemplary and explanatory only and do not limit the disclosure.
Brief description of the drawings
The accompanying drawings, which are incorporated in and form part of this specification, show embodiments consistent with the disclosure and, together with the specification, serve to explain the principles of the disclosure. In the drawings:
Fig. 1 is a flowchart of a method for obtaining pitch waveform data according to an exemplary embodiment;
Fig. 2 is a schematic diagram of an apparatus for obtaining pitch waveform data according to an exemplary embodiment;
Fig. 3 is a schematic diagram of an apparatus for obtaining pitch waveform data according to an exemplary embodiment;
Fig. 4 is a structural diagram of a terminal according to an exemplary embodiment.
The above drawings show specific embodiments of the disclosure, which are described in more detail below. The drawings and the written description are not intended to limit the scope of the disclosed concept in any way, but to illustrate the concept of the disclosure to those skilled in the art by reference to specific embodiments.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the disclosure. Rather, they are merely examples of apparatus and methods consistent with some aspects of the disclosure as detailed in the appended claims.
An embodiment of the present invention provides a method for obtaining pitch waveform data, which can be implemented by a terminal. The terminal can be a tablet computer, a desktop computer, a notebook computer, or the like, and can include components such as a processor and a memory. The processor can be a CPU (Central Processing Unit) or the like, and can be used for processing such as performing pitch extraction on each audio frame of the target audio to obtain the target frequency corresponding to each audio frame. The memory can be RAM (Random Access Memory), Flash (flash memory), or the like, and can be used to store data such as audio, including data required for processing and data generated during processing.
The terminal can also include a transceiver, an input unit, a display unit, an audio output unit, and so on. The transceiver can be used for data transmission with a server and can include a Bluetooth component, a WiFi (Wireless Fidelity) component, an antenna, a matching circuit, a modem, and so on. The input unit can be a touch screen, a keyboard, a mouse, or the like. The audio output unit can be a speaker, earphones, or the like.
An embodiment of the present disclosure provides a method for obtaining pitch waveform data, where pitch waveform data is data describing the relationship between the amplitude of the fundamental tone and time. As shown in Fig. 1, the processing flow of the method can include the following steps:
In step 101, pitch extraction is performed on each audio frame of the target audio to obtain the target frequency corresponding to each audio frame.
The target audio can be vocal audio or accompaniment audio; this embodiment uses vocal audio as the example.
A sound is generally a combination of a series of vibrations of different frequencies and amplitudes emitted by the sounding body. Among these vibrations there is one with the lowest frequency; the sound it produces is the fundamental tone, and the rest are overtones. Pitch refers to how high or low a sound is. It is determined by the vibration frequency of the fundamental tone, and the two are proportional.
In implementation, the terminal performs time-domain analysis on the target audio and splits it into audio frames, each typically between 10 ms and 30 ms long. A pitch extraction algorithm is applied to each audio frame, and the pitch obtained is the average pitch of that frame. Because pitch is proportional to the vibration frequency of the fundamental tone, the target frequency corresponding to each audio frame can then be obtained; this target frequency is the average frequency of the fundamental tone in the frame. Common pitch extraction algorithms include the autocorrelation function method, the cepstrum method, and the YIN algorithm, described here as combining the autocorrelation function method with the cepstrum method.
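Step 101 can be sketched as follows. This is a minimal illustration using a plain autocorrelation peak search, one of the methods named above, and not the implementation of the disclosure. The function names, the 20 ms frame length, the 8000 Hz sample rate and the 80 Hz to 1000 Hz search range are all assumptions made for the example.

```python
import math

def split_into_frames(samples, sample_rate, frame_ms=20):
    """Cut samples into consecutive, non-overlapping frames of frame_ms milliseconds."""
    frame_len = int(sample_rate * frame_ms / 1000)
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def estimate_target_frequency(frame, sample_rate, fmin=80.0, fmax=1000.0):
    """Estimate the frame's fundamental frequency from its autocorrelation peak."""
    min_lag = int(sample_rate / fmax)                      # shortest period searched
    max_lag = min(int(sample_rate / fmin), len(frame) - 1)  # longest period searched
    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation of the frame at this lag.
        score = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # Return 0.0 when no positive peak was found (for example, a silent frame).
    return sample_rate / best_lag if best_lag else 0.0

# Hypothetical test signal: one fifth of a second of a 200 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
samples = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(1600)]
frames = split_into_frames(samples, SAMPLE_RATE)
target_frequencies = [estimate_target_frequency(f, SAMPLE_RATE) for f in frames]
```

Real vocal audio would normally call for a more robust estimator (normalization, octave-error checks); the sketch only shows how one target frequency per frame is produced.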
In step 102, for each audio frame, the corresponding target amplitude is determined in the spectrum data of the audio frame based on the target frequency of the audio frame.
Optionally, after determining the target frequency of each frame, the terminal can further determine the target amplitude corresponding to the target frequency. The corresponding processing can be: performing a Fourier transform on the audio waveform data of each audio frame to obtain the spectrum data of each audio frame; and determining, in the spectrum data of each audio frame, the target amplitude corresponding to the target frequency. The target amplitude is the amplitude corresponding to the average frequency of the fundamental tone in each audio frame.
Specifically, the audio waveform data is converted into spectrum data using the following Fourier transform formula:
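The formula itself is not reproduced in this text. As a hedged stand-in, and assuming the transform is applied as a discrete Fourier transform to a frame of N samples, the standard form is shown below. The symbols x[n] (the n-th sample of the frame), X[k] (the k-th spectrum value), N and the sample rate f_s are notation chosen for this example and are not taken from the original.

```latex
X[k] = \sum_{n=0}^{N-1} x[n]\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \dots, N-1
```

In this notation, spectrum value X[k] corresponds to the frequency k f_s / N, and its magnitude |X[k]| is the amplitude associated with that frequency.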
In implementation, after determining the target frequency of each audio frame, the terminal first uses the above Fourier transform to convert each audio frame from time-domain data into short-time frequency-domain data for that frame; spectrum data is the data that represents the correspondence between amplitude and frequency. The terminal then determines, in the spectrum data of each audio frame, the target amplitude corresponding to the target frequency. Because the target frequency of each audio frame is the average frequency of the fundamental tone, the corresponding target amplitude is the amplitude of the fundamental tone.
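Step 102 can be sketched as follows, assuming a plain discrete Fourier transform and assuming that the target amplitude is read from the spectrum bin closest to the target frequency (the disclosure does not say how the bin is chosen). The function names and the two-tone test frame are hypothetical.

```python
import cmath
import math

def dft(frame):
    """Plain O(N^2) discrete Fourier transform of one audio frame."""
    n_samples = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                for n in range(n_samples))
            for k in range(n_samples)]

def target_amplitude(spectrum, target_hz, sample_rate):
    """Return (bin index, magnitude) of the spectrum bin closest to target_hz."""
    n_samples = len(spectrum)
    k = round(target_hz * n_samples / sample_rate) % n_samples
    return k, abs(spectrum[k])

# Hypothetical 20 ms frame: a 200 Hz fundamental tone plus a 400 Hz overtone.
SAMPLE_RATE = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
         + 0.2 * math.sin(2 * math.pi * 400 * n / SAMPLE_RATE)
         for n in range(160)]
spectrum = dft(frame)
target_bin, amplitude = target_amplitude(spectrum, 200.0, SAMPLE_RATE)
```

With 160 samples at 8000 Hz the bins are 50 Hz apart, so the 200 Hz target frequency falls on bin 4. A production implementation would use an FFT routine instead of this quadratic loop.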
In step 103, the pitch waveform data of the target audio is determined based on the target amplitude and target frequency corresponding to each audio frame.
Here, pitch waveform data is data that represents the correspondence between the amplitude of the fundamental tone and time.
In implementation, the terminal can further obtain the pitch waveform data of the target audio based on the target frequency and target amplitude of each audio frame. This step amounts to spectral filtering of each frame of the target audio: the amplitude corresponding to the fundamental-tone frequency in each frame is retained, and the amplitudes corresponding to the overtone frequencies are attenuated to zero. Specifically, either of the two approaches described below can be used.
Obtaining the pitch waveform data of each frame requires an inverse Fourier transform, whose formula is:
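The formula itself is not reproduced in this text. As a hedged stand-in, and assuming a discrete transform over a frame of N samples, the standard inverse discrete Fourier transform is shown below. The symbols X[k] (the k-th spectrum value of the frame), x[n] (the n-th reconstructed sample) and N are notation chosen for this example and are not taken from the original.

```latex
x[n] = \frac{1}{N} \sum_{k=0}^{N-1} X[k]\, e^{j 2\pi k n / N}, \qquad n = 0, 1, \dots, N-1
```

When every X[k] except the target-frequency value (and its conjugate mirror at index N - k) is zero, the sum reduces to a single sinusoid at the target frequency.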
Approach one: in the spectrum data of each audio frame, the terminal keeps the target amplitude corresponding to the target frequency unchanged and sets the amplitudes corresponding to the other frequencies to zero, obtaining the adjusted spectrum data of each audio frame; it then performs an inverse Fourier transform on the adjusted spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
In implementation, setting the amplitudes of the other frequencies to zero means that the terminal attenuates those amplitudes to zero. For each audio frame, the terminal thus obtains spectrum data in which the target frequency retains its target amplitude and every non-target frequency has zero amplitude. The terminal then applies the above inverse Fourier transform to this spectrum data, which contains only the target amplitude, and obtains waveform data that contains only the target frequency, that is, the pitch waveform data of each audio frame. In this way the pitch waveform data of the target audio is finally obtained.
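Approach one can be sketched as follows. The function names and test frame are hypothetical. One detail is an assumption not stated in the disclosure: for a real-valued frame, the discrete spectrum holds the target frequency in two conjugate bins, k and N - k, so the sketch keeps both in order to get a real waveform back.

```python
import cmath
import math

def dft(frame):
    """Plain O(N^2) discrete Fourier transform."""
    n_samples = len(frame)
    return [sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                for n in range(n_samples))
            for k in range(n_samples)]

def inverse_dft(spectrum):
    """Plain O(N^2) inverse transform; returns the real part of each sample."""
    n_samples = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_samples)
                 for k in range(n_samples)) / n_samples).real
            for n in range(n_samples)]

def keep_only_target(spectrum, k):
    """Zero every bin except k and its mirror bin N - k (assumption: real input)."""
    n_samples = len(spectrum)
    keep = {k % n_samples, (-k) % n_samples}
    return [value if i in keep else 0j for i, value in enumerate(spectrum)]

# Hypothetical frame: 200 Hz fundamental tone plus a 400 Hz overtone at 8000 Hz.
SAMPLE_RATE = 8000
frame = [0.5 * math.sin(2 * math.pi * 200 * n / SAMPLE_RATE)
         + 0.2 * math.sin(2 * math.pi * 400 * n / SAMPLE_RATE)
         for n in range(160)]
adjusted = keep_only_target(dft(frame), k=4)   # bin 4 corresponds to 200 Hz here
pitch_waveform = inverse_dft(adjusted)
```

Under these assumptions `pitch_waveform` contains only the 200 Hz component of the frame, with the 400 Hz overtone removed.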
Approach two: based on the target amplitude and target frequency corresponding to each audio frame, the terminal generates the adjusted spectrum data of each audio frame; it then performs an inverse Fourier transform on the adjusted spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
In implementation, for each audio frame, after determining the target frequency and target amplitude, the terminal can generate spectrum data in which the amplitude corresponding to the target frequency is the target amplitude and the amplitude corresponding to every other frequency is zero. Applying the above inverse Fourier transform to this spectrum data likewise yields waveform data that contains only the target frequency, that is, the pitch waveform data of each audio frame, and the pitch waveform data of the target audio is finally obtained.
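Approach two can be sketched as follows: instead of editing an existing spectrum, a new all-zero spectrum is created and only the target bin is filled in. The function names are hypothetical. Two points are assumptions that the disclosure does not spell out: the stored target value is taken to be the complex spectrum value (amplitude and phase), and its conjugate is written to the mirror bin N - k so that the inverse transform is real.

```python
import cmath
import math

def build_adjusted_spectrum(n_samples, k, target_value):
    """Create a spectrum that is zero everywhere except bin k and its mirror bin."""
    spectrum = [0j] * n_samples
    spectrum[k % n_samples] = target_value
    spectrum[(-k) % n_samples] = target_value.conjugate()
    return spectrum

def inverse_dft(spectrum):
    """Plain O(N^2) inverse transform; returns the real part of each sample."""
    n_samples = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * n / n_samples)
                 for k in range(n_samples)) / n_samples).real
            for n in range(n_samples)]

# Hypothetical values: a 160-sample frame at 8000 Hz whose 200 Hz component
# (bin 4) has the complex spectrum value -40j, i.e. 0.5 * sin(...) in time.
N_SAMPLES, TARGET_BIN, TARGET_VALUE = 160, 4, -40j
adjusted = build_adjusted_spectrum(N_SAMPLES, TARGET_BIN, TARGET_VALUE)
pitch_waveform = inverse_dft(adjusted)
```

Because the generated spectrum has the same non-zero bins as the one produced by zeroing the other frequencies, the resulting waveform is the same sinusoid at the target frequency.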
Although the mathematical procedures of the two approaches differ, their final results are the same. Approach one obtains the adjusted spectrum data by setting the amplitudes of the non-target frequencies in each frame's spectrum data to zero. In approach two, after determining the target frequency and target amplitude, the terminal extracts them and generates the adjusted spectrum data from them together with the other frequencies, whose amplitudes are zero. The adjusted spectrum data obtained by the two approaches is therefore identical, and so is the pitch waveform data obtained by the inverse Fourier transform. From the pitch waveform data of the target audio, the terminal can then obtain the vibration periods of the target audio and the start and end time points of each period.
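One possible way to read the periods off the pitch waveform data is sketched below. The disclosure does not say how the period boundaries are located; marking them at upward zero crossings is an assumption of this example, as are the function name and the test waveform.

```python
import math

def find_periods(waveform, sample_rate):
    """Return (start_s, end_s) for each span between consecutive upward zero crossings."""
    crossings = [n for n in range(1, len(waveform))
                 if waveform[n - 1] < 0.0 <= waveform[n]]
    return [(a / sample_rate, b / sample_rate)
            for a, b in zip(crossings, crossings[1:])]

# Hypothetical pitch waveform: a 500 Hz sinusoid sampled at 8000 Hz (16 samples
# per period), offset by half a sample so that no sample lies exactly on zero.
SAMPLE_RATE = 8000
waveform = [math.sin(2 * math.pi * 500 * (n + 0.5) / SAMPLE_RATE) for n in range(80)]
periods = find_periods(waveform, SAMPLE_RATE)
```

Each returned pair gives the start and end time of one vibration period, and the reciprocal of their difference gives that period's frequency value.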
In summary, using the above method, the terminal first performs pitch extraction on each audio frame of the target audio and obtains the target frequency corresponding to each audio frame. Because pitch is proportional to the vibration frequency of the fundamental tone, the average frequency of the fundamental tone in each audio frame can be determined and is recorded as the target frequency of that frame. Next, for each audio frame, the terminal determines the corresponding target amplitude in the spectrum data of the audio frame based on the target frequency of the audio frame. Finally, the terminal determines the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame. This method determines the average frequency of the fundamental tone in each audio frame from the pitch of that frame, filters each frame in the frequency domain based on that average frequency to obtain the pitch waveform data of each frame, and finally obtains the pitch waveform data of the target audio, so the vibration of the fundamental tone of the target audio can be obtained accurately.
Optionally, after obtaining the pitch waveform data of the target audio using the above method, the terminal performs tone adjustment on the target audio. The corresponding processing can be: the terminal performs tone adjustment on the target audio based on the pitch waveform data of the target audio and pre-stored standard pitch data corresponding to the target audio.
The standard pitch data is stored in the terminal in the form of note data. A note usually consists of three items of data: the pitch of the note, the start time of the pitch, and the end time of the pitch. The pitch is represented by a frequency value, and each pitch usually lasts a few seconds, for example 3 seconds. In the pitch waveform data obtained by the above method, each frame of pitch waveform data contains a single frequency, so each frame is periodic and may contain several periods of waveform data; the period length within each frame of pitch waveform data is therefore usually a few milliseconds. The duration of each pitch in the standard pitch data can thus cover many periods of the pitch waveform data, so when comparing the pitch waveform data with the standard pitch data, only the frequency values of the two within the corresponding time span need to be compared.
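The note data described above can be sketched as a small data structure with a lookup by time. The class name, field names, the half-open time interval and the example notes are all hypothetical; the disclosure only states that a note holds a pitch (as a frequency value), a start time and an end time.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Note:
    frequency_hz: float  # pitch of the note, represented by a frequency value
    start_s: float       # start time of the pitch, in seconds
    end_s: float         # end time of the pitch, in seconds

def standard_frequency_at(notes: List[Note], time_s: float) -> Optional[float]:
    """Return the standard frequency of the note that covers time_s, if any."""
    for note in notes:
        if note.start_s <= time_s < note.end_s:
            return note.frequency_hz
    return None

# Hypothetical standard pitch data for a short stretch of a song.
standard_pitch_data = [Note(440.0, 12.0, 15.0), Note(494.0, 15.0, 16.0)]
frequency_in_note = standard_frequency_at(standard_pitch_data, 15.051)
frequency_outside = standard_frequency_at(standard_pitch_data, 20.0)
```

A period of the pitch waveform data that lasts a few milliseconds falls entirely inside one such note, so a single lookup by the period's start time is enough to find the standard frequency to compare against.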
The tone adjustment of the target audio can use an algorithm that changes the tone of the target audio while preserving its timbre (also known as the Lent algorithm). In implementation, the terminal uses this algorithm to adjust the tone of the target audio based on the pitch waveform data of the target audio and the standard pitch data. The principle of the algorithm can be as follows: the frequency value corresponding to each period in the pitch waveform data is compared with the standard frequency value in the corresponding time span of the standard pitch data. If the frequency value of a period differs from the corresponding standard frequency value, the target audio in that period is adjusted; if it does not differ, no tone adjustment is made to the target audio in that period. The specific comparison between the pitch waveform data and the standard pitch data can be as follows:
For example, if a period of the pitch waveform data starts at 15.050 seconds and ends at 15.052 seconds, the period lasts 0.002 seconds, so the frequency value corresponding to that period is 1/0.002 = 500 Hz. If the pitch frequency in the standard pitch data between 15 seconds and 16 seconds is ω0, then 500 Hz is compared with ω0. If the absolute value of the difference between 500 Hz and ω0 is within a preset range, where the preset range is a range close to zero, the frequency value of the period can be considered equal to the pitch frequency in the standard pitch data, and the terminal does not adjust the target audio in that period. If the absolute value of the difference is not within the preset range, the terminal adjusts the target audio in that period using the timbre-preserving pitch-shifting algorithm.
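The comparison in this example can be sketched as follows. The function names, the 5 Hz tolerance and the two standard frequencies are hypothetical; the disclosure only requires that the preset range be close to zero.

```python
def period_frequency(start_s, end_s):
    """Frequency value of one period: the reciprocal of its duration."""
    return 1.0 / (end_s - start_s)

def needs_adjustment(frequency_hz, standard_hz, tolerance_hz):
    """True when the frequency deviates from the standard by more than the tolerance."""
    return abs(frequency_hz - standard_hz) > tolerance_hz

# The period from the example above: 15.050 s to 15.052 s.
frequency = period_frequency(15.050, 15.052)

# Hypothetical standard frequencies for the note covering 15 s to 16 s.
TOLERANCE_HZ = 5.0
adjust_when_far = needs_adjustment(frequency, 523.25, TOLERANCE_HZ)
adjust_when_close = needs_adjustment(frequency, 502.0, TOLERANCE_HZ)
```

Only the periods for which the check returns `True` would be passed to the pitch-shifting step; the others are left unchanged.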
A practical application scenario of the above method can be as follows:
When a user sings with a multimedia device, the microphone of the multimedia device sends the collected vocal audio to the processor of the multimedia device. The processor first divides the vocal audio into multiple audio frames and performs pitch extraction on each audio frame with a pitch extraction algorithm to obtain the target frequency of each audio frame. The processor then applies a Fourier transform to each audio frame to obtain frequency spectrum data, and determines the target amplitude corresponding to the target frequency in the frequency spectrum data of each frame. Finally, the processor determines the pitch waveform data of the vocal audio based on the target frequency and the target amplitude. After determining the pitch waveform data of the vocal audio, the multimedia device adjusts the waveform data of the vocal audio based on the pitch waveform data and the standard pitch data of the song, so that the singing output by the multimedia device is closer to the standard rendition of the song.
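The first two steps of this scenario, framing and per-frame pitch extraction, can be sketched as below. The text does not name a pitch extraction algorithm, so plain autocorrelation is used here as a stand-in; the function names, the non-overlapping frames and the 80 Hz to 1000 Hz search range are all assumptions made for the illustration.

```python
import math


def split_into_frames(samples, frame_len):
    """Split samples into consecutive non-overlapping frames, dropping any tail."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def estimate_pitch_hz(frame, sample_rate, min_hz=80.0, max_hz=1000.0):
    """Estimate the fundamental frequency of one frame by picking the lag
    with the largest autocorrelation (a stand-in pitch extractor)."""
    min_lag = max(int(sample_rate / max_hz), 1)
    max_lag = min(int(sample_rate / min_hz), len(frame) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        score = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # 0.0 signals that no periodicity was found in the search range.
    return sample_rate / best_lag if best_lag else 0.0


# Usage: a synthetic 500 Hz tone sampled at 8 kHz, split into 50 ms frames.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(1000)]
target_frequencies = [estimate_pitch_hz(f, sample_rate)
                      for f in split_into_frames(tone, 400)]
```

A real singing voice would need a more robust estimator than this, but the output has the shape the scenario describes: one target frequency per audio frame.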
For example, when a user sings "Qinghai-Tibet Plateau" with a multimedia device, the pitch in the climax of the song is relatively high and the user may be unable to reach it. In this case, the multimedia device can use the above method to adjust the collected tone data of the user in the climax, so that the user's singing is closer to the standard.
As another example, for a user who often sings out of tune, the multimedia device can use the above method to adjust the collected tone data of the user while the user sings, so that the user's singing is closer to the standard.
In the embodiments of the present disclosure, a terminal such as a multimedia device uses the above method to first perform pitch extraction on each audio frame in the target audio, obtaining the target frequency corresponding to each audio frame; for each audio frame, it determines the corresponding target amplitude in the frequency spectrum data of the audio frame based on the target frequency corresponding to that frame; and it determines the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame. Because pitch is directly proportional to the vibration frequency of the fundamental tone, this method determines the average frequency of the fundamental tone in each audio frame from the pitch of that frame, obtains the pitch waveform data of each frame from that average frequency, and finally obtains the pitch waveform data of the target audio, so that the vibration of the fundamental tone of the target audio can be obtained accurately.
The embodiments of the present disclosure further provide a device for obtaining pitch waveform data, which may be the terminal in the above embodiments. As shown in Fig. 2, the device includes:
An extraction module 210, configured to perform pitch extraction on each audio frame in target audio to obtain the target frequency corresponding to each audio frame;
A first determining module 220, configured to determine, for each audio frame, the corresponding target amplitude in the frequency spectrum data of the audio frame based on the target frequency corresponding to the audio frame;
A second determining module 230, configured to determine the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame.
Optionally, the first determining module 220 is specifically configured to:
Perform a Fourier transform on the audio waveform data of each audio frame separately to obtain the frequency spectrum data of each audio frame;
Determine, in the frequency spectrum data of each audio frame, the target amplitude corresponding to the target frequency.
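The two steps of the first determining module can be illustrated with a naive discrete Fourier transform. This is a sketch under assumptions: the function names are hypothetical, the target frequency is mapped to the nearest spectral bin, and the result is scaled so that it equals the amplitude of the underlying sinusoid; none of these details are stated in the text.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued audio frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def target_amplitude(spectrum, sample_rate, target_hz):
    """Amplitude at the bin nearest target_hz, scaled to a sinusoid amplitude."""
    n = len(spectrum)
    k = round(target_hz * n / sample_rate)  # index of the nearest frequency bin
    # For 0 < k < n/2, a real sinusoid of amplitude A yields |X[k]| = A * n / 2.
    return 2.0 * abs(spectrum[k]) / n


# Usage: one 20 ms frame of a 500 Hz tone with amplitude 0.8, sampled at 8 kHz.
sample_rate = 8000
frame = [0.8 * math.sin(2 * math.pi * 500 * t / sample_rate) for t in range(160)]
amplitude = target_amplitude(dft(frame), sample_rate, 500.0)
```

When the target frequency falls between two bins, the nearest-bin lookup underestimates the amplitude; a production implementation would likely use a window function or interpolation.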
Optionally, the second determining module 230 is specifically configured to:
In the frequency spectrum data of each audio frame, keep the target amplitude corresponding to the target frequency unchanged and set the amplitudes corresponding to the other frequencies to zero, obtaining the adjusted frequency spectrum data of each audio frame;
Perform an inverse Fourier transform on the adjusted frequency spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
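The zeroing and inverse transform can be sketched as follows. The function names are hypothetical, and one detail is an assumption beyond the text: besides the target bin, its mirror bin at the negative frequency is also kept, so that the inverse transform returns a real-valued waveform.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one real-valued audio frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                for t in range(n)) for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]


def keep_target_bin(spectrum, k):
    """Keep bin k (and its mirror bin n - k) and zero every other bin."""
    n = len(spectrum)
    kept = [0j] * n
    kept[k] = spectrum[k]
    kept[(n - k) % n] = spectrum[(n - k) % n]  # assumed: keeps the output real
    return kept


# Usage: a 500 Hz fundamental plus a 1000 Hz harmonic, sampled at 8 kHz.
sample_rate, n = 8000, 160
frame = [0.8 * math.sin(2 * math.pi * 500 * t / sample_rate)
         + 0.3 * math.sin(2 * math.pi * 1000 * t / sample_rate)
         for t in range(n)]
k = round(500 * n / sample_rate)  # bin of the target frequency
pitch_waveform = idft(keep_target_bin(dft(frame), k))
```

In this example the harmonic is removed and only the 500 Hz component with amplitude 0.8 remains in `pitch_waveform`.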
Optionally, the second determining module 230 is specifically configured to:
Generate the adjusted frequency spectrum data of each audio frame separately, based on the target amplitude and target frequency corresponding to each audio frame;
Perform an inverse Fourier transform on the adjusted frequency spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
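In this variant the adjusted spectrum is built directly from the two numbers instead of being derived from the original spectrum. The sketch below assumes zero phase, since the text mentions only the amplitude and the frequency, so the resulting waveform is a cosine; the function names and the mirror-bin handling are likewise assumptions.

```python
import cmath
import math


def synthesize_spectrum(n, sample_rate, target_hz, target_amp):
    """Build an n-bin spectrum holding only the target frequency.
    Assumes zero phase and 0 < target_hz < sample_rate / 2."""
    k = round(target_hz * n / sample_rate)  # nearest frequency bin
    spectrum = [0j] * n
    spectrum[k] = complex(target_amp * n / 2, 0.0)
    spectrum[n - k] = complex(target_amp * n / 2, 0.0)  # mirror bin
    return spectrum


def idft(spectrum):
    """Naive inverse DFT, returning the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]


# Usage: one 160-sample frame for a 500 Hz target with amplitude 0.8 at 8 kHz.
frame_waveform = idft(synthesize_spectrum(160, 8000, 500.0, 0.8))
```

Concatenating the per-frame waveforms in time order would then give the pitch waveform data of the whole target audio; how phase continuity across frame boundaries is handled is not described in the text.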
Optionally, as shown in Fig. 3, the device further includes:
An adjustment module 240, configured to perform tone adjustment on the target audio based on the pitch waveform data of the target audio and pre-stored standard pitch data corresponding to the target audio.
In the embodiments of the present disclosure, a terminal such as a multimedia device uses the above device to first perform pitch extraction on each audio frame in the target audio, obtaining the target frequency corresponding to each audio frame; for each audio frame, it determines the corresponding target amplitude in the frequency spectrum data of the audio frame based on the target frequency corresponding to that frame; and it determines the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame. Because pitch is directly proportional to the vibration frequency of the fundamental tone, this approach determines the average frequency of the fundamental tone in each audio frame from the pitch of that frame, obtains the pitch waveform data of each frame from that average frequency, and finally obtains the pitch waveform data of the target audio, so that the vibration of the fundamental tone of the target audio can be obtained accurately.
It should be noted that, when the device for obtaining pitch waveform data provided by the above embodiment obtains pitch waveform data, the division into the above function modules is only used as an example. In practical applications, the above functions may be assigned to different function modules as needed; that is, the internal structure of the device may be divided into different function modules to complete all or part of the functions described above. In addition, the device for obtaining pitch waveform data provided by the above embodiment and the method embodiment for obtaining pitch waveform data belong to the same concept; for the specific implementation process, refer to the method embodiment, which is not repeated here.
According to the embodiments of the present disclosure, an audio processing device is further provided. The device includes an audio adjustment module, configured to:
Compare the frequency value corresponding to each cycle in the pitch waveform data described above with the temporally corresponding standard frequency value in the standard pitch data, and, if the absolute value of the difference between a frequency value and the standard frequency value is greater than a preset value, adjust the target audio in the cycle in which that frequency value is located.
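One way the audio adjustment module could locate cycles and select those needing adjustment is sketched below. The text does not say how cycles are delimited, so upward zero crossings with linear interpolation are assumed here; the function names and the `(start_s, end_s, hz)` segment format for the standard pitch data are also hypothetical.

```python
import math


def cycles_from_waveform(samples, sample_rate):
    """Return (start_s, end_s, hz) for each cycle of the pitch waveform.
    Cycles are assumed to be bounded by successive upward zero crossings."""
    crossings = []
    for i in range(1, len(samples)):
        if samples[i - 1] < 0 <= samples[i]:
            # Linearly interpolate the crossing instant between two samples.
            frac = -samples[i - 1] / (samples[i] - samples[i - 1])
            crossings.append((i - 1 + frac) / sample_rate)
    return [(a, b, 1.0 / (b - a)) for a, b in zip(crossings, crossings[1:])]


def cycles_to_adjust(cycles, standard_pitch, threshold_hz):
    """Select cycles whose frequency differs from the temporally matching
    standard frequency by more than threshold_hz.
    standard_pitch is a list of (start_s, end_s, hz) segments."""
    flagged = []
    for start, end, hz in cycles:
        for seg_start, seg_end, std_hz in standard_pitch:
            if seg_start <= start < seg_end:
                if abs(hz - std_hz) > threshold_hz:
                    flagged.append((start, end, hz))
                break
    return flagged


# Usage: a 500 Hz pitch waveform compared against a 440 Hz standard segment.
sample_rate = 8000
wave = [math.sin(2 * math.pi * 500 * t / sample_rate + 0.3) for t in range(200)]
cycles = cycles_from_waveform(wave, sample_rate)
off_pitch = cycles_to_adjust(cycles, [(0.0, 1.0, 440.0)], threshold_hz=5.0)
```

Each flagged cycle identifies a span of the target audio that would then be passed to the pitch-shifting step.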
In the embodiments of the present disclosure, after a terminal such as a multimedia device accurately obtains the vibration of the fundamental tone of the target audio using the above device for obtaining pitch waveform data, the terminal performs tone adjustment on the target audio based on the pitch waveform data of the target audio and the standard pitch data, so that the tone of the target audio is closer to the tone of the standard pitch data.
It should be noted that, when the audio processing device provided by the above embodiment performs audio processing, the division into the above function modules is only used as an example. In practical applications, the above functions may be assigned to different function modules as needed; that is, the internal structure of the device may be divided into different function modules to complete all or part of the functions described above. In addition, the audio processing device provided by the above embodiment and the audio processing method embodiment belong to the same concept; for the specific implementation process, refer to the method embodiment, which is not repeated here.
According to the present disclosure, a terminal is further provided. The terminal includes a processor and a memory, the memory stores at least one instruction, and the instruction is loaded and executed by the processor to implement the method for obtaining pitch waveform data described above.
Fig. 4 shows a structural block diagram of a terminal 400 provided by an exemplary embodiment of the present invention. The terminal 400 may be a smartphone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a laptop computer or a desktop computer. The terminal 400 may also be called by other names such as user equipment, portable terminal, laptop terminal or desktop terminal.
In general, terminal 400 includes:Processor 401 and memory 402.
Processor 401 may include one or more processing cores, for example a 4-core processor or an 8-core processor. Processor 401 may be implemented in at least one hardware form among DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). Processor 401 may also include a main processor and a coprocessor: the main processor, also called the CPU (Central Processing Unit), processes data in the awake state, while the coprocessor is a low-power processor that processes data in the standby state. In some embodiments, processor 401 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content to be shown on the display screen. In some embodiments, processor 401 may also include an AI (Artificial Intelligence) processor, which handles computing operations related to machine learning.
Memory 402 may include one or more computer-readable storage media, which may be non-transitory. Memory 402 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory devices. In some embodiments, the non-transitory computer-readable storage medium in memory 402 stores at least one instruction, which is executed by processor 401 to implement the XXXX method provided in the method embodiments of the present application.
In some embodiments, terminal 400 optionally further includes a peripheral device interface 403 and at least one peripheral device. Processor 401, memory 402 and peripheral device interface 403 may be connected by a bus or signal lines. Each peripheral device may be connected to peripheral device interface 403 by a bus, a signal line or a circuit board. Specifically, the peripheral devices include at least one of radio frequency circuit 404, touch display screen 405, camera 406, audio circuit 407, positioning component 408 and power supply 409.
Peripheral device interface 403 may be used to connect at least one I/O (Input/Output) related peripheral device to processor 401 and memory 402. In some embodiments, processor 401, memory 402 and peripheral device interface 403 are integrated on the same chip or circuit board; in some other embodiments, any one or two of processor 401, memory 402 and peripheral device interface 403 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
Radio frequency circuit 404 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. Radio frequency circuit 404 communicates with communication networks and other communication devices through electromagnetic signals. It converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, radio frequency circuit 404 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card and so on. Radio frequency circuit 404 may communicate with other terminals through at least one wireless communication protocol. The wireless communication protocols include but are not limited to: the World Wide Web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G and 5G), wireless local area networks and/or WiFi (Wireless Fidelity) networks. In some embodiments, radio frequency circuit 404 may also include circuits related to NFC (Near Field Communication), which is not limited in the present application.
Display screen 405 is used to display a UI (User Interface). The UI may include graphics, text, icons, video and any combination thereof. When display screen 405 is a touch display screen, it is also able to collect touch signals on or above its surface. The touch signal may be input to processor 401 as a control signal for processing. In this case, display screen 405 may also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 405, arranged on the front panel of terminal 400; in other embodiments, there may be at least two display screens 405, arranged on different surfaces of terminal 400 or in a folded design; in still other embodiments, display screen 405 may be a flexible display screen arranged on a curved surface or a folded surface of terminal 400. Display screen 405 may even be set to a non-rectangular irregular shape, that is, a special-shaped screen. Display screen 405 may be made of materials such as LCD (Liquid Crystal Display) or OLED (Organic Light-Emitting Diode).
Camera assembly 406 is used to capture images or video. Optionally, camera assembly 406 includes a front camera and a rear camera. Generally, the front camera is arranged on the front panel of the terminal and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, the main camera and the wide-angle camera can be fused to realize panoramic shooting and VR (Virtual Reality) shooting, or other fused shooting functions can be realized. In some embodiments, camera assembly 406 may also include a flash. The flash may be a single color temperature flash or a dual color temperature flash. A dual color temperature flash is a combination of a warm-light flash and a cold-light flash and can be used for light compensation at different color temperatures.
Audio circuit 407 may include a microphone and a loudspeaker. The microphone collects sound waves from the user and the environment, converts them into electrical signals, and inputs the signals to processor 401 for processing or to radio frequency circuit 404 for voice communication. For stereo collection or noise reduction, there may be multiple microphones, arranged at different parts of terminal 400. The microphone may also be an array microphone or an omnidirectional microphone. The loudspeaker converts electrical signals from processor 401 or radio frequency circuit 404 into sound waves. The loudspeaker may be a traditional thin-film loudspeaker or a piezoelectric ceramic loudspeaker. A piezoelectric ceramic loudspeaker can convert electrical signals not only into sound waves audible to humans but also into sound waves inaudible to humans, for purposes such as ranging. In some embodiments, audio circuit 407 may also include an earphone jack.
Positioning component 408 is used to determine the current geographic location of terminal 400, in order to implement navigation or LBS (Location Based Service). Positioning component 408 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia or the Galileo system of the European Union.
Power supply 409 is used to supply power to the components in terminal 400. Power supply 409 may be alternating current, direct current, a disposable battery or a rechargeable battery. When power supply 409 includes a rechargeable battery, the rechargeable battery may be a wired charging battery or a wireless charging battery. A wired charging battery is charged through a wired line, and a wireless charging battery is charged through a wireless coil. The rechargeable battery may also support fast charging technology.
In some embodiments, terminal 400 further includes one or more sensors 410. The one or more sensors 410 include but are not limited to: acceleration sensor 411, gyroscope sensor 412, pressure sensor 413, fingerprint sensor 414, optical sensor 415 and proximity sensor 416.
Acceleration sensor 411 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with terminal 400. For example, acceleration sensor 411 may be used to detect the components of gravitational acceleration on the three coordinate axes. Processor 401 may, according to the gravitational acceleration signal collected by acceleration sensor 411, control touch display screen 405 to display the user interface in landscape view or portrait view. Acceleration sensor 411 may also be used to collect motion data of a game or of the user.
Gyroscope sensor 412 can detect the body orientation and rotation angle of terminal 400, and can cooperate with acceleration sensor 411 to collect the user's 3D actions on terminal 400. Based on the data collected by gyroscope sensor 412, processor 401 can implement the following functions: motion sensing (for example, changing the UI according to a tilt operation by the user), image stabilization during shooting, game control and inertial navigation.
Pressure sensor 413 may be arranged on the side frame of terminal 400 and/or beneath touch display screen 405. When pressure sensor 413 is arranged on the side frame of terminal 400, it can detect the user's grip signal on terminal 400, and processor 401 performs left-hand or right-hand recognition or shortcut operations according to the grip signal collected by pressure sensor 413. When pressure sensor 413 is arranged beneath touch display screen 405, processor 401 controls the operable controls on the UI according to the user's pressure operation on touch display screen 405. The operable controls include at least one of a button control, a scroll bar control, an icon control and a menu control.
Fingerprint sensor 414 is used to collect the user's fingerprint. Either processor 401 identifies the user according to the fingerprint collected by fingerprint sensor 414, or fingerprint sensor 414 identifies the user according to the collected fingerprint. When the identity of the user is recognized as a trusted identity, processor 401 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments and changing settings. Fingerprint sensor 414 may be arranged on the front, back or side of terminal 400. When a physical button or a manufacturer logo is provided on terminal 400, fingerprint sensor 414 may be integrated with the physical button or the manufacturer logo.
Optical sensor 415 is used to collect the ambient light intensity. In one embodiment, processor 401 may control the display brightness of touch display screen 405 according to the ambient light intensity collected by optical sensor 415. Specifically, when the ambient light intensity is high, the display brightness of touch display screen 405 is increased; when the ambient light intensity is low, the display brightness of touch display screen 405 is decreased. In another embodiment, processor 401 may also dynamically adjust the shooting parameters of camera assembly 406 according to the ambient light intensity collected by optical sensor 415.
Proximity sensor 416, also called a distance sensor, is generally arranged on the front panel of terminal 400. Proximity sensor 416 is used to collect the distance between the user and the front of terminal 400. In one embodiment, when proximity sensor 416 detects that the distance between the user and the front of terminal 400 is gradually decreasing, processor 401 controls touch display screen 405 to switch from the screen-on state to the screen-off state; when proximity sensor 416 detects that the distance between the user and the front of terminal 400 is gradually increasing, processor 401 controls touch display screen 405 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Fig. 4 does not constitute a limitation on terminal 400, which may include more or fewer components than shown, combine some components, or use a different arrangement of components.
Another embodiment of the present disclosure provides a non-transitory computer-readable storage medium. When the instructions in the storage medium are executed by the processor of a terminal, the terminal is able to perform:
Performing pitch extraction on each audio frame in target audio to obtain the target frequency corresponding to each audio frame;
For each audio frame, determining the corresponding target amplitude in the frequency spectrum data of the audio frame based on the target frequency corresponding to the audio frame;
Determining the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame.
Optionally, determining, for each audio frame, the corresponding target amplitude in the frequency spectrum data of the audio frame based on the target frequency corresponding to the audio frame includes:
Performing a Fourier transform on the audio waveform data of each audio frame separately to obtain the frequency spectrum data of each audio frame;
Determining, in the frequency spectrum data of each audio frame, the target amplitude corresponding to the target frequency.
Optionally, determining the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame includes:
In the frequency spectrum data of each audio frame, keeping the target amplitude corresponding to the target frequency unchanged and setting the amplitudes corresponding to the other frequencies to zero, obtaining the adjusted frequency spectrum data of each audio frame;
Performing an inverse Fourier transform on the adjusted frequency spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
Optionally, determining the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame includes:
Generating the adjusted frequency spectrum data of each audio frame separately, based on the target amplitude and target frequency corresponding to each audio frame;
Performing an inverse Fourier transform on the adjusted frequency spectrum data of each audio frame to obtain the pitch waveform data of the target audio.
Optionally, the method further includes:
Performing tone adjustment on the target audio based on the pitch waveform data of the target audio and pre-stored standard pitch data corresponding to the target audio.
In the embodiments of the present disclosure, a terminal such as a multimedia device uses the above method to first perform pitch extraction on each audio frame in the target audio, obtaining the target frequency corresponding to each audio frame; for each audio frame, it determines the corresponding target amplitude in the frequency spectrum data of the audio frame based on the target frequency corresponding to that frame; and it determines the pitch waveform data of the target audio based on the target amplitude and target frequency corresponding to each audio frame. Because pitch is directly proportional to the vibration frequency of the fundamental tone, this method determines the average frequency of the fundamental tone in each audio frame from the pitch of that frame, obtains the pitch waveform data of each frame from that average frequency, and finally obtains the pitch waveform data of the target audio, so that the vibration of the fundamental tone of the target audio can be obtained accurately.
Other embodiments of the present disclosure will readily occur to those skilled in the art after considering the specification and practicing the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow the general principles of the present disclosure and include common knowledge or conventional techniques in the art not disclosed herein. The description and embodiments are to be considered exemplary only, with the true scope and spirit of the present disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
Claims (14)
- A kind of 1. method for obtaining pitch waveform data, it is characterised in that the described method includes:Pitch extraction is carried out to each audio frame in target audio, obtains the corresponding target frequency of each audio frame;For each audio frame, based on the corresponding target frequency of the audio frame, in the frequency spectrum data of the audio frame, determine Corresponding target amplitude;Based on the corresponding target amplitude of each audio frame and target frequency, the pitch waveform data of the target audio are determined.
- 2. according to the method described in claim 1, it is characterized in that, described for each audio frame, based on the audio frame pair The target frequency answered, in the frequency spectrum data of the audio frame, determines corresponding target amplitude, including:To the audio waveform data of each audio frame, Fourier transformation is carried out respectively, obtains the frequency spectrum data of each audio frame;In the frequency spectrum data of each audio frame, the corresponding target amplitude of target frequency is determined.
- 3. according to the method described in claim 2, it is characterized in that, described be based on the corresponding target amplitude of each audio frame and mesh Frequency is marked, determines the pitch waveform data of the target audio, including:In the frequency spectrum data of each audio frame, keep the corresponding target amplitude of target frequency constant, and other frequencies are corresponded to Amplitude zero setting, obtain the frequency spectrum data after the adjustment of each audio frame;Frequency spectrum data after adjustment to each audio frame carries out inverse Fourier transform, obtains the pitch waveform of the target audio Data.
- 4. according to the method described in claim 2, it is characterized in that, described be based on the corresponding target amplitude of each audio frame and mesh Frequency is marked, determines the pitch waveform data of the target audio, including:Based on the corresponding target amplitude of each audio frame and target frequency, the spectrum number after the adjustment of each audio frame is generated respectively According to;Frequency spectrum data after adjustment to each audio frame carries out inverse Fourier transform, obtains the pitch waveform of the target audio Data.
- 5. according to claim 1-4 any one of them methods, it is characterised in that the method further includes:Pitch waveform data based on the target audio, store in advance with the corresponding standard pitch number of the target audio According to target audio progress tone adjustment.
- A kind of 6. method of audio frequency process, it is characterised in that the described method includes:By each cycle corresponding frequency values in claim 1-5 any one of them pitch waveform data, respectively with standard pronunciation Corresponding standard frequency value is compared in time in high data, if the absolute value of the difference of frequency values and standard frequency value More than default value, then the target audio in cycle where the frequency values is adjusted.
- 7. a kind of device for obtaining pitch waveform data, it is characterised in that described device includes:Extraction module, for carrying out pitch extraction to each audio frame in target audio, obtains the corresponding target of each audio frame Frequency;First determining module, for for each audio frame, based on the corresponding target frequency of the audio frame, in the audio frame Frequency spectrum data in, determine corresponding target amplitude;Second determining module, for based on the corresponding target amplitude of each audio frame and target frequency, determining the target audio Pitch waveform data.
- 8. device according to claim 7, it is characterised in that first determining module, is specifically used for:To the audio waveform data of each audio frame, Fourier transformation is carried out respectively, obtains the frequency spectrum data of each audio frame;In the frequency spectrum data of each audio frame, the corresponding target amplitude of target frequency is determined.
- 9. device according to claim 8, it is characterised in that second determining module, is specifically used for:In the frequency spectrum data of each audio frame, keep the corresponding target amplitude of target frequency constant, and other frequencies are corresponded to Amplitude zero setting, obtain the frequency spectrum data after the adjustment of each audio frame;Frequency spectrum data after adjustment to each audio frame carries out inverse Fourier transform, obtains the pitch waveform of the target audio Data.
- 10. device according to claim 8, it is characterised in that second determining module, is specifically used for:Based on the corresponding target amplitude of each audio frame and target frequency, the spectrum number after the adjustment of each audio frame is generated respectively According to;Frequency spectrum data after adjustment to each audio frame carries out inverse Fourier transform, obtains the pitch waveform of the target audio Data.
- 11. according to claim 7-10 any one of them devices, it is characterised in that described device further includes:Adjust module, for the pitch waveform data based on the target audio, store in advance it is opposite with the target audio The standard pitch data answered, tone adjustment is carried out to the target audio.
- 12. An audio processing device, characterized in that the device comprises an audio adjustment module configured to: compare the frequency value corresponding to each period in the pitch waveform data of any one of claims 7-11 with the standard frequency value at the corresponding time in the standard pitch data; and, if the absolute value of the difference between a frequency value and the standard frequency value is greater than a preset value, adjust the target audio in the period in which that frequency value lies.
- 13. A terminal, characterized in that the terminal comprises a processor and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor to implement the method for obtaining pitch waveform data according to any one of claims 1 to 5.
- 14. A computer-readable storage medium, characterized in that the storage medium stores at least one instruction, the instruction being loaded and executed by a processor to implement the method for obtaining pitch waveform data according to any one of claims 1 to 5.
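The per-frame processing recited in claims 7 to 9, and the comparison recited in claim 12, can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not the applicant's implementation: the claims do not fix a pitch extraction algorithm, so a strongest-spectral-bin stand-in is used here; a naive DFT takes the place of an FFT library; the conjugate mirror bin is kept alongside the target bin so that the inverse transform yields a real waveform; and all function names (`pitch_waveform`, `frames_needing_adjustment`, and so on) are hypothetical.

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one audio frame (O(n^2))."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; returns the real part of each output sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def estimate_pitch_bin(spectrum):
    """Stand-in pitch extractor (assumption): strongest positive-frequency bin."""
    half = len(spectrum) // 2
    return max(range(1, half), key=lambda k: abs(spectrum[k]))


def pitch_waveform(frame, sample_rate):
    """Return (target_frequency, target_amplitude, pitch_wave) for one frame."""
    n = len(frame)
    spectrum = dft(frame)                      # spectrum data of the frame
    k = estimate_pitch_bin(spectrum)           # bin of the target frequency
    target_frequency = k * sample_rate / n
    target_amplitude = abs(spectrum[k])        # amplitude at the target frequency
    # Keep the target bin (and its mirror bin, an assumption needed for a
    # real-valued result) and set every other bin to zero.
    adjusted = [0j] * n
    adjusted[k] = spectrum[k]
    adjusted[n - k] = spectrum[n - k]
    return target_frequency, target_amplitude, idft(adjusted)


def frames_needing_adjustment(frequencies, standard_frequencies, threshold_hz):
    """Indices where |frequency - standard frequency| exceeds the preset value."""
    return [i for i, (f, s) in enumerate(zip(frequencies, standard_frequencies))
            if abs(f - s) > threshold_hz]
```

Calling `pitch_waveform` on each frame of the target audio and concatenating the returned waveforms gives one possible reading of "pitch waveform data"; the per-frame frequencies can then be passed to `frames_needing_adjustment` together with the stored standard frequencies and a preset threshold.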
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711337024.6A CN107958672A (en) | 2017-12-12 | 2017-12-12 | The method and apparatus for obtaining pitch waveform data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107958672A (en) | 2018-04-24 |
Family
ID=61958918
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711337024.6A Pending CN107958672A (en) | 2017-12-12 | 2017-12-12 | The method and apparatus for obtaining pitch waveform data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107958672A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1473325A (en) * | 2001-08-31 | 2004-02-04 | 株式会社建伍 | Pitch waveform signal generation apparatus, pitch waveform signal generation method, and program |
CN1514931A (en) * | 2002-06-07 | 2004-07-21 | 株式会社建伍 | Voice signal interpolation device, method and program |
CN101399044A (en) * | 2007-09-29 | 2009-04-01 | 国际商业机器公司 | Voice conversion method and system |
CN102227770A (en) * | 2009-07-06 | 2011-10-26 | 松下电器产业株式会社 | Voice tone converting device, voice pitch converting device, and voice tone converting method |
CN103258539A (en) * | 2012-02-15 | 2013-08-21 | 展讯通信(上海)有限公司 | Method and device for transforming voice signal characteristics |
CN105118523A (en) * | 2015-07-13 | 2015-12-02 | 努比亚技术有限公司 | Audio processing method and device |
CN105513605A (en) * | 2015-12-01 | 2016-04-20 | 南京师范大学 | Voice enhancement system and method for cellphone microphone |
- 2017-12-12: CN application CN201711337024.6A (publication CN107958672A), status: Pending
Non-Patent Citations (1)
Title |
---|
腾旭 等: "《电子***抗干扰实用技术》", 31 July 2004 * |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110661760A (en) * | 2018-06-29 | 2020-01-07 | 视联动力信息技术股份有限公司 | Data processing method and device |
CN109712220A (en) * | 2018-11-15 | 2019-05-03 | 贵阳语玩科技有限公司 | A kind of end iOS drawing audio waveforms method and apparatus and computer readable storage medium |
CN110176242A (en) * | 2019-07-10 | 2019-08-27 | 广州荔支网络技术有限公司 | A kind of recognition methods of tone color, device, computer equipment and storage medium |
WO2021164267A1 (en) * | 2020-02-21 | 2021-08-26 | 平安科技(深圳)有限公司 | Anomaly detection method and apparatus, and terminal device and storage medium |
CN111883147A (en) * | 2020-07-23 | 2020-11-03 | 北京达佳互联信息技术有限公司 | Audio data processing method and device, computer equipment and storage medium |
CN111883147B (en) * | 2020-07-23 | 2024-05-07 | 北京达佳互联信息技术有限公司 | Audio data processing method, device, computer equipment and storage medium |
CN112885374A (en) * | 2021-01-27 | 2021-06-01 | 吴怡然 | Sound accuracy judgment method and system based on spectrum analysis |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107958672A (en) | The method and apparatus for obtaining pitch waveform data | |
CN107967706A (en) | Multimedia data processing method, device, and computer-readable storage medium | |
CN108008930A (en) | Method and apparatus for determining karaoke score values | |
US11574009B2 (en) | Method, apparatus and computer device for searching audio, and storage medium | |
CN108829881A (en) | Video title generation method and device | |
CN108401124A (en) | Method and apparatus for video recording | |
CN109379643A (en) | Image synthesizing method, device, terminal and storage medium | |
CN109003621A (en) | Audio processing method, device, and storage medium | |
CN109147757A (en) | Song synthesis method and device | |
CN110491358A (en) | Method, apparatus, device, system, and storage medium for audio recording | |
CN108848394A (en) | Live video broadcast method, apparatus, terminal, and storage medium | |
CN108965757A (en) | Video recording method, device, terminal and storage medium | |
CN108965922A (en) | Video cover generation method, device and storage medium | |
CN109346111A (en) | Data processing method, device, terminal and storage medium | |
CN109192218A (en) | Method and apparatus for audio processing | |
CN108320756A (en) | Method and apparatus for detecting whether audio is pure-music audio | |
CN109635133A (en) | Visualized audio playback method, device, electronic equipment, and storage medium | |
CN109547843A (en) | Method and apparatus for processing audio and video | |
CN107871012A (en) | Audio processing method, device, storage medium, and terminal | |
CN110139143A (en) | Virtual object display method, device, computer equipment, and storage medium | |
CN109192223A (en) | Method and apparatus for audio alignment | |
CN109065068A (en) | Audio processing method, device, and storage medium | |
CN109982129A (en) | Playback control method, device, and storage medium for short videos | |
CN108364660A (en) | Accent identification method, device and computer readable storage medium | |
CN109102811A (en) | Audio fingerprint generation method, device, and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180424 |