CN104036788B - The acoustic fidelity identification method of audio file and device - Google Patents

Publication number
CN104036788B
Authority
CN
China
Prior art keywords
tonequality
audio file
data
voice data
channel
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201410235733.3A
Other languages
Chinese (zh)
Other versions
CN104036788A (en
Inventor
田彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Taile Culture Technology Co.,Ltd.
Original Assignee
Beijing Yinzhibang Culture Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yinzhibang Culture Technology Co Ltd
Priority to CN201410235733.3A
Publication of CN104036788A
Application granted
Publication of CN104036788B
Legal status: Active

Landscapes

  • Stereophonic System (AREA)

Abstract

The present invention provides a method and a device for identifying the sound quality of an audio file. In embodiments of the invention, a target audio file to be identified is obtained, and at least one of a time-domain waveform feature and a frequency-domain spectral line feature of the target audio file is then obtained from it. Based on at least one of these two features, the sound quality of the target audio file is identified as a first sound quality or a second sound quality, the first sound quality being higher than the second. In this way, users can be provided with audio files of genuinely high sound quality and can enjoy them.

Description

Method and device for identifying the sound quality of an audio file
[Technical Field]
The present invention relates to audio processing technology, and in particular to a method and a device for identifying the sound quality of an audio file.
[Background Art]
The sound quality of an audio file refers to the fidelity of the original audio data after compression. A high-quality audio file can restore the original audio data completely, without any distortion; a low-quality audio file cannot restore the original audio data completely, and partial distortion results. Conversion techniques have now appeared that can convert a low-quality audio file into a pseudo-high-quality audio file. In fact, the sound quality of such a pseudo-high-quality audio file is the same as that of the audio file before conversion; it is not genuinely high quality. Users who obtain these pseudo-high-quality audio files through music applications cannot enjoy genuinely high sound quality at all, which can damage the brand image of those applications and may even lead to legal disputes.
Therefore, in order to provide users with audio files of genuinely high sound quality that they can enjoy, effectively identifying the sound quality of an audio file is a problem in urgent need of a solution.
[Summary of the Invention]
Aspects of the present invention provide a method and a device for identifying the sound quality of an audio file.
One aspect of the present invention provides a method for identifying the sound quality of an audio file, comprising:
obtaining a target audio file to be identified;
obtaining, according to the target audio file, at least one of a time-domain waveform feature of the target audio file and a frequency-domain spectral line feature of the target audio file;
identifying, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the sound quality of the target audio file as a first sound quality or a second sound quality, the first sound quality being higher than the second sound quality.
In the above aspect and any possible implementation, an implementation is further provided in which obtaining, according to the target audio file, at least one of the time-domain waveform feature and the frequency-domain spectral line feature of the target audio file comprises:
determining the number of channels of the target audio file;
decoding the data blocks of the target audio file to obtain original audio data;
obtaining, according to the number of channels and the original audio data, the channel audio data corresponding to each channel.
In the above aspect and any possible implementation, an implementation is further provided in which identifying, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the sound quality of the target audio file as the first sound quality or the second sound quality comprises:
if the number of channels is greater than or equal to 2, obtaining, according to the channel audio data corresponding to each channel, first channel audio data and second channel audio data corresponding to at least two channels;
adding the first channel audio data and the second channel audio data to obtain mixed channel audio data;
if the mixed channel audio data is greater than or equal to the first channel audio data/N or the second channel audio data/M, identifying the sound quality of the target audio file as the first sound quality;
if the mixed channel audio data is less than the first channel audio data/N or the second channel audio data/M, identifying the sound quality of the target audio file as the second sound quality; wherein
N is a number greater than 1, and M is a number greater than 1.
In the above aspect and any possible implementation, an implementation is further provided in which identifying, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the sound quality of the target audio file as the first sound quality or the second sound quality comprises:
if, among the values of a specified number of consecutive target channel audio data, every pairwise difference is less than or equal to a first amplitude threshold, identifying the sound quality of the target audio file as the second sound quality, the target channel audio data comprising the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel; or
if the difference between the values of two consecutive target channel audio data is greater than or equal to a second amplitude threshold, and the two values have opposite signs, identifying the sound quality of the target audio file as the second sound quality, the target channel audio data comprising the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel.
In the above aspect and any possible implementation, an implementation is further provided in which, after obtaining the channel audio data corresponding to each channel according to the number of channels and the original audio data, the method further comprises:
dividing the target channel audio data into frames to obtain at least one frame of audio data, the target channel audio data comprising the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel;
applying a frequency-domain transform to the at least one frame of audio data to obtain the frequency-domain data corresponding to each frame of audio data.
In the above aspect and any possible implementation, an implementation is further provided in which identifying, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the sound quality of the target audio file as the first sound quality or the second sound quality comprises:
obtaining, according to the frequency-domain data corresponding to each frame of audio data, the energy component of that frequency-domain data at each frequency bin;
if, for at least one frequency bin, every pairwise difference between the energy components of the frames at that same bin is less than or equal to an energy threshold, identifying the sound quality of the target audio file as the second sound quality.
In the above aspect and any possible implementation, an implementation is further provided in which, before obtaining the target audio file to be identified, the method further comprises:
obtaining the format parameters of a candidate audio file;
according to the format parameters, either determining that the candidate audio file is the target audio file, or identifying the sound quality of the candidate audio file as the second sound quality.
In the above aspect and any possible implementation, an implementation is further provided in which the format parameters comprise at least one of compression format, sample rate, sampling depth and bitrate.
Another aspect of the present invention provides a device for identifying the sound quality of an audio file, comprising:
an acquiring unit, configured to obtain a target audio file to be identified;
a feature unit, configured to obtain, according to the target audio file, at least one of a time-domain waveform feature of the target audio file and a frequency-domain spectral line feature of the target audio file;
a recognition unit, configured to identify, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the sound quality of the target audio file as a first sound quality or a second sound quality, the first sound quality being higher than the second sound quality.
In the above aspect and any possible implementation, an implementation is further provided in which the feature unit is specifically configured to:
determine the number of channels of the target audio file;
decode the data blocks of the target audio file to obtain original audio data; and
obtain, according to the number of channels and the original audio data, the channel audio data corresponding to each channel.
In the above aspect and any possible implementation, an implementation is further provided in which the recognition unit is specifically configured to:
if the number of channels is greater than or equal to 2, obtain, according to the channel audio data corresponding to each channel, first channel audio data and second channel audio data corresponding to at least two channels;
add the first channel audio data and the second channel audio data to obtain mixed channel audio data; and
if the mixed channel audio data is greater than or equal to the first channel audio data/N or the second channel audio data/M, identify the sound quality of the target audio file as the first sound quality;
if the mixed channel audio data is less than the first channel audio data/N or the second channel audio data/M, identify the sound quality of the target audio file as the second sound quality; wherein
N is a number greater than 1, and M is a number greater than 1.
In the above aspect and any possible implementation, an implementation is further provided in which the recognition unit is specifically configured to:
if, among the values of a specified number of consecutive target channel audio data, every pairwise difference is less than or equal to a first amplitude threshold, identify the sound quality of the target audio file as the second sound quality, the target channel audio data comprising the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel; or
if the difference between the values of two consecutive target channel audio data is greater than or equal to a second amplitude threshold, and the two values have opposite signs, identify the sound quality of the target audio file as the second sound quality, the target channel audio data comprising the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel.
In the above aspect and any possible implementation, an implementation is further provided in which the feature unit is further configured to:
divide the target channel audio data into frames to obtain at least one frame of audio data, the target channel audio data comprising the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel; and
apply a frequency-domain transform to the at least one frame of audio data to obtain the frequency-domain data corresponding to each frame of audio data.
In the above aspect and any possible implementation, an implementation is further provided in which the recognition unit is specifically configured to:
obtain, according to the frequency-domain data corresponding to each frame of audio data, the energy component of that frequency-domain data at each frequency bin; and
if, for at least one frequency bin, every pairwise difference between the energy components of the frames at that same bin is less than or equal to an energy threshold, identify the sound quality of the target audio file as the second sound quality.
In the above aspect and any possible implementation, an implementation is further provided in which the recognition unit is further configured to:
obtain the format parameters of a candidate audio file; and
according to the format parameters, either determine that the candidate audio file is the target audio file, or identify the sound quality of the candidate audio file as the second sound quality.
In the above aspect and any possible implementation, an implementation is further provided in which the format parameters comprise at least one of compression format, sample rate, sampling depth and bitrate.
As can be seen from the above technical solutions, embodiments of the present invention obtain a target audio file to be identified and then obtain from it at least one of a time-domain waveform feature and a frequency-domain spectral line feature. Based on at least one of these features, the sound quality of the target audio file is identified as a first sound quality or a second sound quality, the first being higher than the second. In this way, users can be provided with audio files of genuinely high sound quality and can enjoy them.
In addition, the technical solution provided by the present invention is simple to operate and can effectively improve the efficiency of sound quality identification for audio files.
[Brief Description of the Drawings]
To illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the method for identifying the sound quality of an audio file provided by an embodiment of the present invention;
Fig. 2 is a schematic time-domain waveform of the original audio data, i.e. the target channel audio data, in the embodiment corresponding to Fig. 1;
Fig. 3 is another schematic time-domain waveform of the original audio data, i.e. the target channel audio data, in the embodiment corresponding to Fig. 1;
Fig. 4 is a schematic energy spectrum of the frequency-domain data corresponding to the original audio data, i.e. the target channel audio data, in the embodiment corresponding to Fig. 1;
Fig. 5 is a schematic structural diagram of the device for identifying the sound quality of an audio file provided by another embodiment of the present invention.
[Detailed Description]
To make the purpose, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
It should be noted that the terminals involved in the embodiments of the present invention may include, but are not limited to, mobile phones, personal digital assistants (PDA), wireless handheld devices, wireless netbooks, portable computers, personal computers (PC), MP3 players, MP4 players and the like.
In addition, the term "and/or" herein describes an association between objects and covers three cases: "A and/or B" can mean A alone, both A and B, or B alone. The character "/" herein generally indicates an "or" relationship between the objects before and after it.
Fig. 1 is a schematic flowchart of a method for identifying the sound quality of an audio file provided by an embodiment of the present invention.
101. Obtain a target audio file to be identified.
The target audio file may be an audio file in any of the coding formats known in the prior art, for example an MPEG (Moving Picture Experts Group) Layer-3 (MP3) audio file, a WMA (Windows Media Audio) audio file, an Advanced Audio Coding (AAC) audio file, a Free Lossless Audio Codec (FLAC) audio file or an APE audio file; this embodiment places no particular limitation on this.
102. According to the target audio file, obtain at least one of the time-domain waveform feature of the target audio file and the frequency-domain spectral line feature of the target audio file.
The time-domain waveform feature of the target audio file may include, but is not limited to, the amplitude information of the original audio data.
Original audio data is the digital signal obtained by converting a sound signal, for example by sampling, quantizing and encoding the sound signal to obtain pulse code modulation (PCM) data. It can be obtained by parsing the data blocks of the target audio file.
The frequency-domain spectral line feature of the target audio file may include, but is not limited to, the spectrum information of the original audio data.
103. According to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, identify the sound quality of the target audio file as a first sound quality or a second sound quality, the first sound quality being higher than the second sound quality.
It should be noted that steps 101 to 103 may be executed by a processing device, which may be located in a local application (App) such as Baidu Music, or in a server on the network side, or partly in a local application and partly in a server on the network side.
It can be understood that the application may be a native application (native app) installed on the terminal, or a web page (web app) in a browser on the terminal; any concrete form capable of processing audio data is acceptable, and this embodiment places no limitation on this.
Thus, by obtaining a target audio file to be identified and then obtaining from it at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the sound quality of the target audio file can be identified as a first sound quality or a second sound quality, the first being higher than the second. In this way, users can be provided with audio files of genuinely high sound quality and can enjoy them.
Optionally, in one possible implementation of this embodiment, before step 101 the processing device may also obtain the format parameters of a candidate audio file. According to the format parameters, the processing device may then either determine that the candidate audio file is the target audio file, or identify the sound quality of the candidate audio file as the second sound quality.
The format parameters may include, but are not limited to, at least one of compression format, sample rate, sampling depth and bitrate.
The compression format is the method by which the original audio data is compressed by a given program, for example MP3, WMA, AAC, FLAC or APE.
The sample rate, also called the sampling frequency, is the number of samples per second extracted from a continuous signal to form a discrete signal; it is expressed in hertz (Hz).
The sampling depth is the number of bits used to represent the value of one sample point, for example 8 bits, 16 bits or 24 bits.
The bitrate is the number of bits processed per unit of time; its unit is bits per second (bps).
Specifically, the processing device may parse the frame header of the candidate audio file to obtain its format parameters.
For example, if the sampling depth is 8 bits, the sound quality of the candidate audio file is identified as the second sound quality; if the sampling depth is 16 bits, the candidate audio file is determined to be the target audio file.
As another example, if the sample rate is less than 44100 Hz, the sound quality of the candidate audio file is identified as the second sound quality; if the sample rate is greater than or equal to 44100 Hz, the candidate audio file is determined to be the target audio file.
As a further example, if the compression format is MP3 and the bitrate is less than 320 kilobits per second (kbps), the sound quality of the candidate audio file is identified as the second sound quality; if the compression format is MP3 and the bitrate is greater than or equal to 320 kbps, the candidate audio file is determined to be the target audio file.
Thus, by obtaining the format parameters of a candidate audio file, its sound quality can be identified in advance as the second sound quality, so that the candidate audio file need not be treated as a target audio file for further identification. This effectively improves the efficiency of sound quality identification.
Moreover, because the candidate audio file does not need to be decoded and its format parameters can be obtained by parsing the frame header alone, the efficiency of sound quality identification is improved further.
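The format-parameter pre-screen described above can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the `FormatParams` type and the `prescreen` function are hypothetical names, and treating every depth below 16 bits like the 8-bit example is an assumption that generalizes the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FormatParams:
    """Hypothetical container for values read from a file's frame header."""
    compression: str               # e.g. "MP3", "FLAC"
    sample_rate_hz: int
    bit_depth: Optional[int]       # None when the header does not report it
    bitrate_kbps: Optional[int]    # None when the header does not report it


def prescreen(p: FormatParams) -> str:
    """Return "second" (lower quality) or "target" (needs further analysis)."""
    # Assumption: any depth below 16 bits is handled like the 8-bit example.
    if p.bit_depth is not None and p.bit_depth < 16:
        return "second"
    if p.sample_rate_hz < 44100:
        return "second"
    if (p.compression.upper() == "MP3"
            and p.bitrate_kbps is not None
            and p.bitrate_kbps < 320):
        return "second"
    return "target"
```

A file that passes this screen is only a candidate for the waveform and spectrum checks; passing does not by itself mean the first sound quality.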
Optionally, in one possible implementation of this embodiment, in step 102 the processing device may determine the number of channels of the target audio file and decode the data blocks of the target audio file to obtain original audio data. The processing device may then obtain the channel audio data corresponding to each channel according to the number of channels and the original audio data. For details of the parsing and decoding methods, refer to the relevant prior art; they are not repeated here.
For example, the processing device may parse the frame header of the target audio file to determine the number of channels of the target audio file.
As another example, the processing device may parse the file header of the target audio file to determine the number of channels of the target audio file.
As another example, the processing device may parse other parts of the target audio file to determine the number of channels; this embodiment places no particular limitation on this.
As a further example, the processing device may obtain the number of channels of the target audio file from a configuration file.
It can be understood that the two steps "determine the number of channels of the target audio file" and "decode the data blocks of the target audio file to obtain original audio data" have no fixed order. The processing device may determine the number of channels first and then decode, decode first and then determine the number of channels, or perform both steps at the same time; this embodiment places no particular limitation on this.
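Once the number of channels and the decoded PCM samples are both available, the per-channel data can be separated. The sketch below assumes the decoder returns interleaved samples (L, R, L, R, ... for stereo), which is a common layout but is not stated in the text; `split_channels` is a hypothetical helper name.

```python
from typing import List, Sequence


def split_channels(samples: Sequence[int], num_channels: int) -> List[List[int]]:
    """De-interleave PCM samples into one list per channel.

    Assumes an interleaved layout: sample k of channel c sits at
    index k * num_channels + c.
    """
    if num_channels < 1:
        raise ValueError("num_channels must be at least 1")
    if len(samples) % num_channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    return [list(samples[c::num_channels]) for c in range(num_channels)]
```

For a stereo input `[1, -1, 2, -2, 3, -3]` this yields one list for the first channel and one for the second.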
Correspondingly, in one possible implementation of this embodiment, in step 103, if the number of channels is greater than or equal to 2, the processing device may obtain, according to the channel audio data corresponding to each channel, first channel audio data and second channel audio data corresponding to at least two channels, and then add the first channel audio data and the second channel audio data to obtain mixed channel audio data.
If the mixed channel audio data is greater than or equal to the first channel audio data/N or the second channel audio data/M, the processing device may identify the sound quality of the target audio file as the first sound quality, where N is a number greater than 1 and M is a number greater than 1.
If the mixed channel audio data is less than the first channel audio data/N or the second channel audio data/M, the processing device may identify the sound quality of the target audio file as the second sound quality, where N is a number greater than 1 and M is a number greater than 1.
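The text does not say how a whole block of mixed samples is compared with a block of channel samples. The sketch below assumes the comparison is made on RMS level and reads the two conditions as "first quality if the mix reaches either bound, otherwise second quality"; both choices, the default values of `n` and `m`, and the function names are assumptions for illustration only.

```python
import math
from typing import Sequence


def rms(x: Sequence[float]) -> float:
    """Root-mean-square level of a block of samples (0.0 for an empty block)."""
    return math.sqrt(sum(v * v for v in x) / len(x)) if x else 0.0


def classify_by_mix(left: Sequence[float], right: Sequence[float],
                    n: float = 4.0, m: float = 4.0) -> str:
    """Compare the level of left+right with left/N and right/M.

    Assumption: "level" means RMS; n and m are illustrative values > 1.
    """
    mixed = [l + r for l, r in zip(left, right)]
    mix_level = rms(mixed)
    if mix_level >= rms(left) / n or mix_level >= rms(right) / m:
        return "first"
    return "second"
```

Under these assumptions, a file whose second channel is a sign-inverted copy of the first sums to near silence and is flagged as the second sound quality, while two identical channels sum to twice the level and pass.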
Correspondingly, in one possible implementation of this embodiment, in step 103, if among the values of a specified number (for example 3) of consecutive target channel audio data every pairwise difference is less than or equal to a first amplitude threshold, the processing device may identify the sound quality of the target audio file as the second sound quality. The waveform corresponding to this case may be as shown in Fig. 2. The target channel audio data may be the channel audio data corresponding to any one channel; this embodiment places no particular limitation on this. In Fig. 2, the abscissa represents time and the ordinate represents amplitude.
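A minimal sketch of this flat-run check follows. It relies on the fact that all pairwise differences in a window are within a threshold exactly when the window's maximum minus its minimum is within that threshold. The function name and default values are hypothetical.

```python
from typing import Sequence


def has_flat_run(samples: Sequence[float], run_length: int = 3,
                 amp_threshold: float = 0.0) -> bool:
    """True if some run of `run_length` consecutive samples is nearly constant.

    "Nearly constant" means every pairwise difference in the run is
    <= amp_threshold, which is equivalent to max(run) - min(run) <= amp_threshold.
    """
    if run_length < 2:
        raise ValueError("run_length must be at least 2")
    for start in range(len(samples) - run_length + 1):
        window = samples[start:start + run_length]
        if max(window) - min(window) <= amp_threshold:
            return True
    return False
```

With a threshold of 0, a stretch of digital silence would also match; whether such stretches should be excluded is not addressed in the text, so a practical version may need an additional rule.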
Correspondingly, in one possible implementation of this embodiment, in step 103, if the difference between the values of two consecutive target channel audio data is greater than or equal to a second amplitude threshold, and the two values have opposite signs, the processing device may identify the sound quality of the target audio file as the second sound quality. The waveform corresponding to this case may be as shown in Fig. 3. The target channel audio data may be the channel audio data corresponding to any one channel; this embodiment places no particular limitation on this. In Fig. 3, the abscissa represents time and the ordinate represents amplitude.
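The sign-flip jump check can be sketched as a single pass over adjacent sample pairs. The function name is hypothetical, and the threshold used in any real system would have to be chosen for the sample format in use.

```python
from typing import Sequence


def has_sign_flip_jump(samples: Sequence[float], amp_threshold: float) -> bool:
    """True if two adjacent samples have opposite signs and differ by
    at least amp_threshold."""
    for a, b in zip(samples, samples[1:]):
        if a * b < 0 and abs(a - b) >= amp_threshold:
            return True
    return False
```

For 16-bit PCM, for instance, a jump from 20000 to -20000 between neighbouring samples has a difference of 40000 and would be flagged with a threshold of 30000.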
Optionally, in a possible implementation of the present embodiment, in 102, after obtaining the channel audio data corresponding to each channel, the processing device may further divide the target channel audio data into frames to obtain at least one frame of audio data, where the target channel audio data includes the channel audio data corresponding to any one of the channels. The processing device may then apply a frequency-domain transform to the at least one frame of audio data to obtain the frequency domain data corresponding to each frame of audio data. The target channel audio data may be the channel audio data corresponding to any one channel; the present embodiment places no particular limitation on this.
Specifically, the frequency-domain transform may include, but is not limited to, the fast Fourier transform (FFT).
For example, the processing device may divide the target channel audio data into frames at intervals of 20 ms, with 50% data overlap between adjacent frames, to obtain at least one frame of audio data. The processing device may then apply an FFT to the at least one frame of audio data to obtain the frequency domain data corresponding to each frame of audio data, denoted $A_{i,j}$, where i is the index of the frequency bin, j is the index of the frame, and $A_{i,j}$ is the frequency domain data of the j-th frame at the i-th frequency bin.
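The framing and transform steps in this example can be sketched as follows, assuming 20 ms frames with 50% overlap as stated above. The names `split_frames` and `dft` are hypothetical. To keep the sketch dependent on the standard library only, a naive discrete Fourier transform stands in for the FFT: it computes the same transform but in quadratic time, so a real implementation would more likely call an FFT routine. In the code the result is indexed as `spectra[j][i]`, that is, frame first and frequency bin second.

```python
import cmath


def split_frames(samples, sample_rate, frame_ms=20, overlap=0.5):
    """Split samples into frames of `frame_ms` milliseconds; adjacent
    frames share the fraction `overlap` of their samples."""
    frame_len = int(sample_rate * frame_ms / 1000)
    hop = max(1, int(frame_len * (1 - overlap)))
    # Trailing samples that do not fill a whole frame are dropped.
    return [samples[s:s + frame_len]
            for s in range(0, len(samples) - frame_len + 1, hop)]


def dft(frame):
    """Naive DFT of one frame (stand-in for an FFT, O(n^2))."""
    n = len(frame)
    return [sum(x * cmath.exp(-2j * cmath.pi * i * k / n)
                for k, x in enumerate(frame))
            for i in range(n)]


# Hypothetical input: 1000 samples at 8 kHz give 160-sample frames, hop 80.
frames = split_frames(list(range(1000)), sample_rate=8000)
spectra = [dft(frame) for frame in frames[:2]]  # spectra[j][i] ~ A_{i,j}
```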
Correspondingly, in a possible implementation of the present embodiment, in 103, the processing device may obtain, from the frequency domain data corresponding to each frame of audio data, the energy component of that frequency domain data at each frequency bin. If, at one or more frequency bins, the energy components of the frames at that same bin differ pairwise by no more than an energy threshold, a situation whose energy spectrum may be as shown in Fig. 4, the processing device may identify the tone quality of the target audio file as the second tone quality. In Fig. 4, the abscissa represents time, the ordinate represents frequency, and the color of each point represents energy.
For example, from the obtained frequency domain data corresponding to each frame of audio data, denoted $A_{i,j}$, the processing device obtains the energy component $E_{i,j}$ of that data at each frequency bin, where i is the index of the frequency bin, j is the index of the frame, and $E_{i,j}$ is the energy component of the j-th frame at the i-th frequency bin.
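The energy check described above can be sketched as follows. The source does not state how $E_{i,j}$ is computed from $A_{i,j}$; the squared magnitude used here is an assumption, as are the function names and the requirement of at least two frames (with a single frame every bin would trivially count as constant).

```python
def energy_components(spectra):
    """Assumed definition: E[j][i] = |A[j][i]|^2 for frame j, bin i."""
    return [[abs(a) ** 2 for a in frame] for frame in spectra]


def has_constant_energy_bin(energies, energy_threshold):
    """Return True if, at some frequency bin, the energies of all frames
    differ pairwise by at most `energy_threshold`."""
    if len(energies) < 2:
        return False  # assumption: one frame alone is not evidence
    for i in range(len(energies[0])):
        column = [frame[i] for frame in energies]
        # Pairwise differences within the threshold <=> spread within it.
        if max(column) - min(column) <= energy_threshold:
            return True
    return False


# Hypothetical spectra with two bins; bin 1 has magnitude 5 in every frame.
spectra = [[1 + 0j, 3 + 4j], [2 + 0j, 3 - 4j], [5j, -5 + 0j]]
energies = energy_components(spectra)
suspicious = has_constant_energy_bin(energies, energy_threshold=0.0)
```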
In the present embodiment, a target audio file to be identified is obtained, and at least one of a time-domain waveform feature and a frequency-domain spectral line feature of the target audio file is obtained from it, so that the tone quality of the target audio file can be identified, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, as a first tone quality or a second tone quality, the first tone quality being higher than the second tone quality. In this way, audio files of genuinely high tone quality can be provided to users, allowing them to enjoy such files.
In addition, the technical solution provided by the present invention is simple to operate and can effectively improve the efficiency of identifying the tone quality of audio files.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are all preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
Fig. 5 is a schematic structural diagram of a device for identifying the tone quality of an audio file provided by another embodiment of the present invention. As shown in Fig. 5, the device of the present embodiment may include an acquiring unit 51, a feature unit 52, and a recognition unit 53, where:
The acquiring unit 51 is configured to obtain a target audio file to be identified.
The target audio file may be an audio file in any of the coding formats of the prior art, for example a Moving Picture Experts Group (MPEG) Audio Layer 3 (MP3) file, a Windows Media Audio (WMA) file, an Advanced Audio Coding (AAC) file, a Free Lossless Audio Codec (FLAC) file, or an APE file; the present embodiment places no particular limitation on this.
The feature unit 52 is configured to obtain, according to the target audio file, at least one of a time-domain waveform feature of the target audio file and a frequency-domain spectral line feature of the target audio file.
The time-domain waveform feature of the target audio file may include, but is not limited to, amplitude information of the original audio data.
Original audio data is a digital signal obtained by converting a sound signal, for example by sampling, quantizing, and encoding the sound signal to obtain pulse code modulation (PCM) data. It can be obtained by parsing the data block of the target audio file.
The frequency-domain spectral line feature of the target audio file may include, but is not limited to, spectrum information of the original audio data.
The recognition unit 53 is configured to identify, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the tone quality of the target audio file as a first tone quality or a second tone quality, the first tone quality being higher than the second tone quality.
It should be noted that the device for identifying the tone quality of an audio file provided by the present embodiment may be a processing device. It may be located in a local application (App), for example Baidu Music, or in a server on the network side, or partly in the local application and partly in a server on the network side.
It can be understood that the application may be a native application (nativeAPP) installed on the terminal or a web page (webAPP) of a browser on the terminal; any objectively existing form capable of processing audio data is acceptable, and the present embodiment places no limitation on this.
In this way, the acquiring unit obtains the target audio file to be identified, and the feature unit obtains, according to the target audio file, at least one of the time-domain waveform feature and the frequency-domain spectral line feature of the target audio file, so that the recognition unit can identify, according to at least one of these features, the tone quality of the target audio file as the first tone quality or the second tone quality, the first tone quality being higher than the second tone quality. Audio files of genuinely high tone quality can thus be provided to users, allowing them to enjoy such files.
Optionally, in a possible implementation of the present embodiment, the recognition unit may further be configured to obtain the format parameters of a candidate audio file and, according to the format parameters, either determine that the candidate audio file is the target audio file or identify the tone quality of the candidate audio file as the second tone quality.
The format parameters may include, but are not limited to, at least one of compression format, sample rate, sampling depth, and bit rate.
The compression format is the compression method by which a program compresses the original audio data, for example MP3, WMA, AAC, FLAC, or APE.
The sample rate, also called the sampling rate or sampling frequency, is the number of samples extracted per second from a continuous signal to form a discrete signal, expressed in hertz (Hz).
The sampling depth is the number of bits used to represent the value of one sample, for example 8 bits, 16 bits, or 24 bits.
The bit rate is the number of bits processed per unit of time, expressed in bits per second (bps).
Specifically, the recognition unit 53 may parse the frame header of the candidate audio file to obtain the format parameters of the candidate audio file.
For example, if the sampling depth is 8 bits, the tone quality of the candidate audio file is identified as the second tone quality; if the sampling depth is 16 bits, the candidate audio file is determined to be the target audio file.
As another example, if the sample rate is less than 44100 Hz, the tone quality of the candidate audio file is identified as the second tone quality; if the sample rate is greater than or equal to 44100 Hz, the candidate audio file is determined to be the target audio file.
As a further example, if the compression format is MP3 and the bit rate is less than 320 kilobits per second (kbps), the tone quality of the candidate audio file is identified as the second tone quality; if the compression format is MP3 and the bit rate is greater than or equal to 320 kbps, the candidate audio file is determined to be the target audio file.
In this way, the recognition unit obtains the format parameters of the candidate audio file and can identify in advance, according to those parameters, that the tone quality of the candidate audio file is the second tone quality, so that the candidate audio file need not be treated as a target audio file for further identification. This effectively improves the efficiency of identifying the tone quality of audio files.
Furthermore, because the candidate audio file does not need to be decoded and its format parameters can be obtained by parsing the frame header alone, the efficiency of identifying the tone quality of audio files is further improved.
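The pre-screening by format parameters can be sketched as follows. The thresholds (16-bit depth, 44100 Hz, 320 kbps for MP3) come from the examples above; combining the three checks into one function, treating them as lower bounds, and the names `prefilter`, `"second"`, and `"target"` are assumptions made for illustration. Parsing the frame header itself is format-specific and is not shown.

```python
def prefilter(compression_format, sample_rate_hz, bit_depth, bitrate_kbps):
    """Return "second" if the format parameters alone indicate the second
    tone quality, otherwise "target" (the file goes on to feature analysis)."""
    if bit_depth < 16:          # e.g. 8-bit samples
        return "second"
    if sample_rate_hz < 44100:
        return "second"
    if compression_format.upper() == "MP3" and bitrate_kbps < 320:
        return "second"
    return "target"


# Hypothetical candidates described only by their header parameters.
low_bitrate_mp3 = prefilter("MP3", 44100, 16, 128)
full_rate_mp3 = prefilter("MP3", 44100, 16, 320)
```

Because only header fields are inspected, a candidate rejected here never has to be decoded.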
Optionally, in a possible implementation of the present embodiment, the feature unit 52 may specifically be configured to determine the number of channels of the target audio file; decode the data block of the target audio file to obtain original audio data; and obtain, according to the number of channels and the original audio data, the channel audio data corresponding to each channel. For detailed descriptions of the parsing and decoding methods, refer to the relevant prior art; they are not repeated here.
For example, the feature unit 52 may parse the frame header of the target audio file to determine the number of channels of the target audio file.
As another example, the feature unit 52 may parse the file header of the target audio file to determine the number of channels of the target audio file.
As another example, the feature unit 52 may parse other parts of the target audio file to determine the number of channels of the target audio file; the present embodiment places no particular limitation on this.
As a further example, the feature unit 52 may obtain the number of channels of the target audio file from a configuration file.
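Once the number of channels is known, the per-channel audio data can be separated from the decoded original audio data. The sketch below assumes that the decoded PCM samples are interleaved (one sample per channel in turn), which is a common layout but is not stated in the source; the name `split_channels` is hypothetical.

```python
def split_channels(interleaved, num_channels):
    """Split interleaved PCM samples into one list per channel.

    Assumes the layout ch0, ch1, ..., ch0, ch1, ... (not stated in the
    source; other layouts would need a different split).
    """
    return [interleaved[c::num_channels] for c in range(num_channels)]


# Hypothetical stereo stream: left = 1, 2, 3 and right = -1, -2, -3.
left, right = split_channels([1, -1, 2, -2, 3, -3], num_channels=2)
```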
Correspondingly, in a possible implementation of the present embodiment, the recognition unit 53 may specifically be configured to: if the number of channels is greater than or equal to 2, obtain, from the channel audio data corresponding to each channel, first channel audio data and second channel audio data corresponding to at least two channels; add the first channel audio data and the second channel audio data to obtain mixed channel audio data; if the mixed channel audio data is greater than or equal to the first channel audio data/N or the second channel audio data/M, identify the tone quality of the target audio file as the first tone quality; and if the mixed channel audio data is less than the first channel audio data/N or the second channel audio data/M, identify the tone quality of the target audio file as the second tone quality, where N is a number greater than 1 and M is a number greater than 1.
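The channel-mixing check can be sketched as follows. The source compares "audio data" directly without saying how a whole channel is reduced to a comparable number, and its two conditions overlap; this sketch assumes the mean absolute amplitude as the measure and reads the rule as "second tone quality if the mix falls below either scaled channel level, first tone quality otherwise". The function names and the default values of N and M are likewise assumptions.

```python
def mean_abs(samples):
    """Mean absolute amplitude, the assumed level measure."""
    return sum(abs(s) for s in samples) / len(samples)


def mix_test(first, second, n=2.0, m=2.0):
    """Classify by comparing the summed channels with each channel scaled
    by 1/N and 1/M (one possible reading of the rule above)."""
    mixed = [a + b for a, b in zip(first, second)]
    level = mean_abs(mixed)
    if level < mean_abs(first) / n or level < mean_abs(second) / m:
        return "second"
    return "first"


# Hypothetical data: the second channel is the first one inverted, so the
# sum cancels to silence.
cancelled = mix_test([100, -200, 300], [-100, 200, -300])
similar = mix_test([100, -200, 300], [90, -210, 310])
```

Under this reading, the check flags files whose channels largely cancel each other when summed.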
Correspondingly, in a possible implementation of the present embodiment, the recognition unit 53 may specifically be configured to identify the tone quality of the target audio file as the second tone quality if a specified number (for example, 3) of consecutive values of the target channel audio data differ pairwise by no more than a first amplitude threshold, where the target channel audio data includes the channel audio data corresponding to any one of the channels. The waveform corresponding to this situation may be as shown in Fig. 2. The target channel audio data may be the channel audio data corresponding to any one channel; the present embodiment places no particular limitation on this.
Correspondingly, in a possible implementation of the present embodiment, the recognition unit 53 may specifically be configured to identify the tone quality of the target audio file as the second tone quality if the difference between two consecutive values of the target channel audio data is greater than or equal to a second amplitude threshold and the two consecutive values have opposite signs, where the target channel audio data includes the channel audio data corresponding to any one of the channels. The waveform corresponding to this situation may be as shown in Fig. 3. The target channel audio data may be the channel audio data corresponding to any one channel; the present embodiment places no particular limitation on this.
Optionally, in a possible implementation of the present embodiment, the feature unit 52 may further be configured to divide the target channel audio data into frames to obtain at least one frame of audio data, where the target channel audio data includes the channel audio data corresponding to any one of the channels, and to apply a frequency-domain transform to the at least one frame of audio data to obtain the frequency domain data corresponding to each frame of audio data. The target channel audio data may be the channel audio data corresponding to any one channel; the present embodiment places no particular limitation on this.
Specifically, the frequency-domain transform may include, but is not limited to, the fast Fourier transform (FFT).
For example, the feature unit 52 may divide the target channel audio data into frames at intervals of 20 ms, with 50% data overlap between adjacent frames, to obtain at least one frame of audio data. The feature unit 52 may then apply an FFT to the at least one frame of audio data to obtain the frequency domain data corresponding to each frame of audio data, denoted $A_{i,j}$, where i is the index of the frequency bin, j is the index of the frame, and $A_{i,j}$ is the frequency domain data of the j-th frame at the i-th frequency bin.
Correspondingly, in a possible implementation of the present embodiment, the recognition unit 53 may specifically be configured to obtain, from the frequency domain data corresponding to each frame of audio data, the energy component of that frequency domain data at each frequency bin and, if at one or more frequency bins the energy components of the frames at that same bin differ pairwise by no more than an energy threshold, to identify the tone quality of the target audio file as the second tone quality. The energy spectrum corresponding to this situation may be as shown in Fig. 4.
For example, from the obtained frequency domain data corresponding to each frame of audio data, denoted $A_{i,j}$, the recognition unit 53 obtains the energy component $E_{i,j}$ of that data at each frequency bin, where i is the index of the frequency bin, j is the index of the frame, and $E_{i,j}$ is the energy component of the j-th frame at the i-th frequency bin.
In the present embodiment, the acquiring unit obtains the target audio file to be identified, and the feature unit obtains, according to the target audio file, at least one of the time-domain waveform feature and the frequency-domain spectral line feature of the target audio file, so that the recognition unit can identify, according to at least one of these features, the tone quality of the target audio file as the first tone quality or the second tone quality, the first tone quality being higher than the second tone quality. Audio files of genuinely high tone quality can thus be provided to users, allowing them to enjoy such files.
In addition, the technical solution provided by the present invention is simple to operate and can effectively improve the efficiency of identifying the tone quality of audio files.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the system, device, and units described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided by the present invention, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, an audio processing engine, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. A method for identifying the tone quality of an audio file, characterized by comprising:
obtaining a target audio file to be identified;
obtaining, according to the target audio file, at least one of a time-domain waveform feature of the target audio file and a frequency-domain spectral line feature of the target audio file;
identifying, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the tone quality of the target audio file as a first tone quality or a second tone quality, the first tone quality being higher than the second tone quality; wherein
the obtaining, according to the target audio file, at least one of a time-domain waveform feature of the target audio file and a frequency-domain spectral line feature of the target audio file comprises:
determining the number of channels of the target audio file;
decoding the data block of the target audio file to obtain original audio data;
obtaining, according to the number of channels and the original audio data, the channel audio data corresponding to each channel;
after the obtaining, according to the number of channels and the original audio data, the channel audio data corresponding to each channel, the method further comprises:
dividing target channel audio data into frames to obtain at least one frame of audio data, the target channel audio data including the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel;
applying a frequency-domain transform to the at least one frame of audio data to obtain the frequency domain data corresponding to each frame of audio data.
2. The method according to claim 1, characterized in that the identifying, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the tone quality of the target audio file as a first tone quality or a second tone quality comprises:
if the number of channels is greater than or equal to 2, obtaining, from the channel audio data corresponding to each channel, first channel audio data and second channel audio data corresponding to at least two channels;
adding the first channel audio data and the second channel audio data to obtain mixed channel audio data;
if the mixed channel audio data is greater than or equal to the first channel audio data/N or the second channel audio data/M, identifying the tone quality of the target audio file as the first tone quality;
if the mixed channel audio data is less than the first channel audio data/N or the second channel audio data/M, identifying the tone quality of the target audio file as the second tone quality; wherein
N is a number greater than 1; M is a number greater than 1.
3. The method according to claim 1, characterized in that the identifying, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the tone quality of the target audio file as a first tone quality or a second tone quality comprises:
if a specified number of consecutive values of the target channel audio data differ pairwise by no more than a first amplitude threshold, identifying the tone quality of the target audio file as the second tone quality, the target channel audio data including the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel; or
if the difference between two consecutive values of the target channel audio data is greater than or equal to a second amplitude threshold and the two consecutive values have opposite signs, identifying the tone quality of the target audio file as the second tone quality, the target channel audio data including the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel.
4. The method according to claim 1, characterized in that the identifying, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the tone quality of the target audio file as a first tone quality or a second tone quality comprises:
obtaining, from the frequency domain data corresponding to each frame of audio data, the energy component of that frequency domain data at each frequency bin;
if, at one or more frequency bins, the energy components of the frames at that same bin differ pairwise by no more than an energy threshold, identifying the tone quality of the target audio file as the second tone quality.
5. The method according to any one of claims 1 to 4, characterized in that, before the obtaining a target audio file to be identified, the method further comprises:
obtaining the format parameters of a candidate audio file;
according to the format parameters, determining that the candidate audio file is the target audio file, or identifying the tone quality of the candidate audio file as the second tone quality.
6. The method according to claim 5, characterized in that the format parameters include at least one of compression format, sample rate, sampling depth, and bit rate.
7. A device for identifying the tone quality of an audio file, characterized by comprising:
an acquiring unit, configured to obtain a target audio file to be identified;
a feature unit, configured to obtain, according to the target audio file, at least one of a time-domain waveform feature of the target audio file and a frequency-domain spectral line feature of the target audio file;
a recognition unit, configured to identify, according to at least one of the time-domain waveform feature and the frequency-domain spectral line feature, the tone quality of the target audio file as a first tone quality or a second tone quality, the first tone quality being higher than the second tone quality; wherein
the feature unit is specifically configured to:
determine the number of channels of the target audio file;
decode the data block of the target audio file to obtain original audio data; and
obtain, according to the number of channels and the original audio data, the channel audio data corresponding to each channel;
the feature unit is further configured to:
divide target channel audio data into frames to obtain at least one frame of audio data, the target channel audio data including the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel; and
apply a frequency-domain transform to the at least one frame of audio data to obtain the frequency domain data corresponding to each frame of audio data.
8. The device according to claim 7, characterized in that the recognition unit is specifically configured to:
if the number of channels is greater than or equal to 2, obtain, from the channel audio data corresponding to each channel, first channel audio data and second channel audio data corresponding to at least two channels;
add the first channel audio data and the second channel audio data to obtain mixed channel audio data; and
if the mixed channel audio data is greater than or equal to the first channel audio data/N or the second channel audio data/M, identify the tone quality of the target audio file as the first tone quality;
if the mixed channel audio data is less than the first channel audio data/N or the second channel audio data/M, identify the tone quality of the target audio file as the second tone quality; wherein
N is a number greater than 1; M is a number greater than 1.
9. The device according to claim 7, characterized in that the recognition unit is specifically configured to:
if a specified number of consecutive values of the target channel audio data differ pairwise by no more than a first amplitude threshold, identify the tone quality of the target audio file as the second tone quality, the target channel audio data including the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel; or
if the difference between two consecutive values of the target channel audio data is greater than or equal to a second amplitude threshold and the two consecutive values have opposite signs, identify the tone quality of the target audio file as the second tone quality, the target channel audio data including the channel audio data corresponding to any one channel among the channel audio data corresponding to each channel.
10. The device according to claim 7, characterized in that the recognition unit is specifically configured to:
According to the frequency-domain data corresponding to each frame of audio data, obtain the energy component at each frequency of the frequency-domain data corresponding to each frame of audio data; and
If, at at least one same frequency, the difference between any two of the energy components of the frequency-domain data corresponding to each frame of audio data is less than or equal to the energy threshold, identify the sound quality of the target audio file as the second sound quality.
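The frequency-domain check in the claim above might look like the sketch below. It assumes each frame is a short list of samples and uses a naive discrete Fourier transform from the standard library for clarity rather than speed; the squared magnitude as the "energy component", the function names, and the rule that at least two frames are needed are assumptions of this sketch.

```python
import cmath


def frame_energies(frame):
    """Energy (squared DFT magnitude) at each non-negative frequency bin."""
    n = len(frame)
    energies = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(frame))
        energies.append(abs(coeff) ** 2)
    return energies


def has_constant_energy_bin(frames, energy_threshold):
    """True if, at some bin, the energy barely changes across all frames.

    Pairwise differences at a bin are all <= threshold exactly when the
    spread (max - min) of that bin across frames is <= threshold.
    """
    if len(frames) < 2:
        return False  # assumed: a single frame gives nothing to compare
    per_frame = [frame_energies(f) for f in frames]
    bins = min(len(e) for e in per_frame)
    for k in range(bins):
        column = [e[k] for e in per_frame]
        if max(column) - min(column) <= energy_threshold:
            return True
    return False
```

A return value of True corresponds to identifying the second sound quality, for instance when some frequency bin stays essentially empty in every frame.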
11. The device according to any one of claims 7 to 10, characterized in that the recognition unit is further configured to:
Obtain the format parameter of a candidate audio file; and
According to the format parameter, determine that the candidate audio file is the target audio file, or identify the sound quality of the candidate audio file as the second sound quality.
12. The device according to claim 11, characterized in that the format parameter comprises at least one of compression format, sample rate, sampling depth and bit rate.
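The format-parameter pre-filter of claims 11 and 12 can be illustrated with a small sketch. The claims name the four parameters but give no concrete values, so the set of lossless formats and every threshold below are hypothetical choices made only for this example.

```python
from dataclasses import dataclass


@dataclass
class FormatParams:
    compression_format: str
    sample_rate_hz: int
    bit_depth: int
    bit_rate_kbps: float


# Assumed set of formats treated as lossless containers; not taken from the claims.
LOSSLESS_FORMATS = {"flac", "ape", "wav"}


def prefilter(params, min_rate_hz=44100, min_depth=16, min_kbps=800.0):
    """Return "target" if the candidate deserves further analysis, else "second".

    All thresholds are illustrative defaults, not values stated in the patent.
    """
    looks_lossless = (
        params.compression_format.lower() in LOSSLESS_FORMATS
        and params.sample_rate_hz >= min_rate_hz
        and params.bit_depth >= min_depth
        and params.bit_rate_kbps >= min_kbps
    )
    return "target" if looks_lossless else "second"
```

A candidate that passes becomes the target audio file for the waveform and spectrum checks; one that fails is identified as the second sound quality without further analysis.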
CN201410235733.3A 2014-05-29 2014-05-29 The acoustic fidelity identification method of audio file and device Active CN104036788B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410235733.3A CN104036788B (en) 2014-05-29 2014-05-29 The acoustic fidelity identification method of audio file and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410235733.3A CN104036788B (en) 2014-05-29 2014-05-29 The acoustic fidelity identification method of audio file and device

Publications (2)

Publication Number Publication Date
CN104036788A CN104036788A (en) 2014-09-10
CN104036788B true CN104036788B (en) 2016-10-05

Family

ID=51467534

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410235733.3A Active CN104036788B (en) 2014-05-29 2014-05-29 The acoustic fidelity identification method of audio file and device

Country Status (1)

Country Link
CN (1) CN104036788B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105047200A (en) * 2015-07-21 2015-11-11 重庆邮电大学 FPGA-based FLAC hardware decoder and decoding method
CN105050021B (en) * 2015-08-05 2019-02-22 Oppo广东移动通信有限公司 Earphone sound quality detection method, system and terminal
CN105719661B (en) * 2016-01-29 2019-06-11 西安交通大学 A kind of stringed musical instrument performance sound quality automatic distinguishing method
CN106228994B (en) * 2016-07-26 2019-02-26 广州酷狗计算机科技有限公司 A kind of method and apparatus detecting sound quality
CN107895571A (en) * 2016-09-29 2018-04-10 亿览在线网络技术(北京)有限公司 Lossless audio file identification method and device
CN107886956B (en) * 2017-11-13 2020-12-11 广州酷狗计算机科技有限公司 Audio recognition method and device and computer storage medium
CN108111908A (en) * 2017-12-25 2018-06-01 深圳Tcl新技术有限公司 Audio quality determines method, equipment and computer readable storage medium
CN111554320A (en) * 2020-03-31 2020-08-18 紫光云技术有限公司 Audio stream Fourier analysis method based on Windows platform

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1213135A (en) * 1997-08-26 1999-04-07 三星电子株式会社 High quality audio encoding/decoding apparatus and digital versatile disc
CN1777891A (en) * 2003-04-24 2006-05-24 皇家飞利浦电子股份有限公司 Parameterized temporal feature analysis
CN1802696A (en) * 2003-06-05 2006-07-12 松下电器产业株式会社 Sound quality adjusting apparatus and sound quality adjusting method
EP1691348A1 (en) * 2005-02-14 2006-08-16 Ecole Polytechnique Federale De Lausanne Parametric joint-coding of audio sources
CN101479787A (en) * 2006-09-29 2009-07-08 Lg电子株式会社 Method for encoding and decoding object-based audio signal and apparatus thereof
CN101645265A (en) * 2008-08-05 2010-02-10 中兴通讯股份有限公司 Method and device for identifying audio category in real time
CN101762320A (en) * 2009-12-18 2010-06-30 深圳市万兴软件有限公司 Method for drawing audio waveform under MAC desktop and system thereof
CN102253987A (en) * 2011-07-01 2011-11-23 中山大学 Method and system for sequencing network MP3 (moving picture experts group audio layer-3) tone qualities
CN102510541A (en) * 2011-12-30 2012-06-20 Tcl数码科技(深圳)有限责任公司 Multi-screen interaction video and audio content switching method and media player
CN102568470A (en) * 2012-01-11 2012-07-11 广州酷狗计算机科技有限公司 Acoustic fidelity identification method and system for audio files
CN103262159A (en) * 2010-10-05 2013-08-21 华为技术有限公司 Method and apparatus for encoding/decoding multichannel audio signal

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20110049068A (en) * 2009-11-04 2011-05-12 삼성전자주식회사 Method and apparatus for encoding/decoding multichannel audio signal

Also Published As

Publication number Publication date
CN104036788A (en) 2014-09-10

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C41 Transfer of patent application or patent right or utility model
TA01 Transfer of patent application right

Effective date of registration: 20160321

Address after: 2108, Floor 2, Building 23, No. 18, Anningzhuang East Road, Qinghe, Haidian District, Beijing 100027

Applicant after: BEIJING YINZHIBANG CULTURE TECHNOLOGY Co.,Ltd.

Address before: Baidu Building, No. 10 Shangdi 10th Street, Haidian District, Beijing 100085

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.

C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20220414

Address after: 3305, Floor 3, Building 1, Aerospace Building, No. 51, Gaoxin South Ninth Road, High-tech Zone Community, Yuehai Street, Nanshan District, Shenzhen, Guangdong 518057

Patentee after: Shenzhen Taile Culture Technology Co.,Ltd.

Address before: 2108, Floor 2, Building 23, No. 18, Anningzhuang East Road, Qinghe, Haidian District, Beijing 100027

Patentee before: BEIJING YINZHIBANG CULTURE TECHNOLOGY Co.,Ltd.