CN104123949A - Clamped frame detection method and device - Google Patents

Info

Publication number
CN104123949A
Authority
CN
China
Prior art keywords
frame
section
frame section
card
quiet
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201410036425.8A
Other languages
Chinese (zh)
Other versions
CN104123949B (en)
Inventor
邹连平
张文婷
何航
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201410036425.8A
Publication of CN104123949A
Application granted
Publication of CN104123949B
Current legal status: Active

Landscapes

  • Auxiliary Devices For Music (AREA)

Abstract

The invention discloses a clamped frame detection method and device. The method comprises: performing feature detection on an audio signal under test to obtain feature values of the frames in the audio signal; searching the frames for, and marking, frame sections with abnormal feature values, wherein the marked information of a frame section includes at least one of time information of the start frame of the frame section and the length of the frame section; selecting the frame sections in which frames are clamped by judging whether the frame sections are mute sections; and outputting the marked information of the clamped frame sections. This solves the technical problem of low accuracy of audio stutter detection in the prior art, eliminates erroneous judgment when clamped frame sections are detected, and achieves the technical effect of accurately and efficiently detecting the clamped frame sections of stuttering audio in an audio communication system.

Description

Stuck frame detection method and device
Technical field
The present invention relates to the communications field, and in particular to a stuck frame (also rendered "clamped frame" or "card frame") detection method and device.
Background
With the development of computer and multimedia communication technology, real-time audio communication is increasingly used in Internet telephony, streaming media, in-game VoIP, and live audio/video entertainment. Because network conditions are complex, delay, jitter, and packet loss are unavoidable, and these factors make audio playback choppy, that is, they cause stuck audio frames. At present, however, the industry reflects audio non-fluency only through the overall quality scores of objective evaluation standards such as ITU-T P.862 (PESQ) and ITU-R BS.1387 (PEAQ): stuck frames and choppiness are folded into the overall sound quality score, so the notion of fluency is neither prominent nor clearly defined. Dedicated assessments of audio fluency are rare.
For delay and jitter, the industry has a relatively mature solution based on a jitter buffer (JitterBuffer), which smooths out and absorbs delay. This only partly solves the audio stutter problem. Stutter can also be assessed at this layer, either from the time interval between packets stored in the jitter buffer or from whether the jitter buffer currently holds a packet. However, between jitter buffer processing and final playback, intermediate stages may unavoidably process the audio further, for example by emptying or resetting the jitter buffer (JitterBuffer) data, zeroing the relevant packets, or discarding high-energy audio packets. The audio frame loss caused by these intermediate stages seriously affects the accuracy of stuck frame assessment. For stutter caused by packet loss, the relatively mature industry approach is to compensate for lost audio frames by forward/backward copying or by inter-frame interpolation and overlap. However, the audio frames repaired by such packet loss concealment, or the discontinuity between them and their neighbors, may themselves cause stutter. For stuck frames of this kind, a PESQ/PEAQ-based assessment again only folds the stuck frames into overall audio quality.
Audio non-fluency, or stutter, is an extremely important indicator in audio services, and its severity affects user experience. It is therefore necessary to quantify audio fluency (stutter) as a dedicated indicator, so that the fluency of a third-party end-to-end audio solution, or of competing products, can be compared, the fluency of an audio product can be assessed, and improvements to the fluency experience of audio products can be promoted.
Existing audio fluency assessment methods are divided into subjective assessment and objective assessment.
In assessing audio fluency, developers may have their own code-based criteria. For example, at the jitter-buffer processing layer they may check whether the arrival interval between adjacent audio packets exceeds a predetermined threshold (such as 200 ms, 200 ms × 2, 200 ms × 3, 200 ms × 4, ..., 200 ms × 10) to decide whether a stutter has occurred. For an evaluator, however, the audio system under test may be a black box, so frames that are not stuttering are easily counted as stutter, which lowers the accuracy of this detection approach.
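The developer-side check described above can be sketched as follows. This is a minimal illustration, not code from the patent: the function name `count_stalls` and the sample arrival times are hypothetical, and only the 200 ms base threshold comes from the text.

```python
def count_stalls(arrival_times_ms, threshold_ms=200):
    """Count gaps between adjacent packet arrivals that exceed threshold_ms.

    arrival_times_ms: packet arrival times in milliseconds, in arrival order.
    threshold_ms: predetermined threshold (200 ms is the base value in the text).
    """
    stalls = 0
    for prev, cur in zip(arrival_times_ms, arrival_times_ms[1:]):
        if cur - prev > threshold_ms:
            stalls += 1
    return stalls


# Hypothetical arrival times; the gaps are 20, 20, 260, 20, and 680 ms.
arrivals = [0, 20, 40, 300, 320, 1000]
print(count_stalls(arrivals))           # threshold 200 ms
print(count_stalls(arrivals, 200 * 2))  # threshold 200 ms x 2
```

A check like this needs access to the jitter-buffer layer, which is exactly what a black-box evaluator lacks.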
At present, fluency is mostly assessed by subjective listening. Subjective assessment requires a panel of listeners to compare their subjective impressions. On the one hand, the labor cost is high; on the other hand, stuttering audio easily makes listeners irritated or bored, which both invites misjudgment and greatly reduces evaluator efficiency. Existing objective assessment techniques do not quantify audio fluency, that is, stutter severity, as a separate indicator; they treat it only as part of overall audio quality. They therefore cannot report the number of stuck frames or their duration per unit time in an audio communication system. Such an assessment of audio product fluency is coarse and inefficient, reflects the severity of audio choppiness poorly, and hinders timely verification and improvement of the fluency experience of audio products.
No effective solution to the above problems has yet been proposed.
Summary of the invention
Embodiments of the present invention provide a stuck frame detection method and device, to solve at least the technical problem that audio stutter detection in the prior art has low accuracy.
According to one aspect of the embodiments of the present invention, a stuck frame detection method is provided, comprising: performing feature detection on an audio signal under test to obtain feature values of each frame in the audio signal under test; searching the frames for, and marking, frame sections whose feature values are abnormal, wherein the marked information of a frame section comprises at least one of: time information of the start frame of the frame section and the frame length of the frame section; selecting, from the frame sections, the frame sections in which a stuck frame occurs according to whether each frame section is a silent section; and outputting the marked information of the frame sections in which a stuck frame occurs.
Optionally, selecting the frame sections in which a stuck frame occurs according to whether each frame section is a silent section comprises: if the frame section is a silent section, judging whether the frame section belonging to the silent section meets a first stuck frame condition; if the frame section belonging to the silent section does not meet the first stuck frame condition, determining that it is not a frame section in which a stuck frame occurs; and if the frame section belonging to the silent section meets the first stuck frame condition, determining that it is a frame section in which a stuck frame occurs.
Optionally, judging whether the frame section belonging to the silent section meets the first stuck frame condition comprises: judging whether the number of frames in the frame section belonging to the silent section is greater than a first predetermined threshold; if the number of frames is greater than the first predetermined threshold, determining that the frame section meets the first stuck frame condition; and if the number of frames is less than or equal to the first predetermined threshold, determining that the frame section does not meet the first stuck frame condition.
Optionally, determining that the frame section belonging to the silent section is not a frame section in which a stuck frame occurs comprises: detecting characteristic parameters of the frame section belonging to the silent section; judging, according to the detection result, whether the frame section meets the natural silence condition of the first stuck frame condition; and if the frame section meets the natural silence condition, determining that it does not meet the first stuck frame condition.
Optionally, after judging, according to the detection result, whether the frame section belonging to the silent section meets the natural silence condition of the first stuck frame condition, the method further comprises: if the frame section does not meet the natural silence condition, judging whether it meets the audio hit condition of the first stuck frame condition; if the frame section meets the audio hit condition, judging whether the number of frames in the frame section is greater than a second predetermined threshold; if the number of frames is greater than the second predetermined threshold, determining that the frame section meeting the audio hit condition meets the first stuck frame condition; and if the number of frames is less than or equal to the second predetermined threshold, determining that the frame section meeting the audio hit condition does not meet the first stuck frame condition.
Optionally, after judging whether the frame section belonging to the silent section meets the audio hit condition of the first stuck frame condition, the method further comprises: if the frame section does not meet the audio hit condition, judging whether it meets the sharp drop/time-domain truncation condition of the first stuck frame condition; if the frame section does not meet the sharp drop/time-domain truncation condition, determining that it does not meet the first stuck frame condition; if the frame section meets the sharp drop/time-domain truncation condition, judging whether the number of frames in the frame section is greater than a third predetermined threshold; if the number of frames is greater than the third predetermined threshold, determining that the frame section meeting the sharp drop/time-domain truncation condition meets the first stuck frame condition; and if the number of frames is less than or equal to the third predetermined threshold, determining that the frame section meeting the sharp drop/time-domain truncation condition does not meet the first stuck frame condition.
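The cascade of optional checks for a silent frame section can be read as a small decision tree. The sketch below is one possible reading, not the patent's implementation: the class `SilentSection`, its boolean fields, and the threshold values are hypothetical, and the final fallback (no sub-condition matched means the first condition is not met) is an assumption about the intended logic.

```python
from dataclasses import dataclass


@dataclass
class SilentSection:
    """Hypothetical summary of one silent frame section."""
    frame_count: int
    natural_silence: bool           # result of the natural silence check
    audio_hit: bool                 # result of the audio hit check
    sharp_drop_or_truncation: bool  # result of the sharp drop/truncation check


def meets_first_condition(section, second_threshold=5, third_threshold=5):
    """Return True if the silent section counts as a stuck frame section.

    The threshold values are placeholders; the text does not give numbers.
    """
    if section.natural_silence:
        return False  # natural silence is not a stuck frame
    if section.audio_hit:
        return section.frame_count > second_threshold
    if section.sharp_drop_or_truncation:
        return section.frame_count > third_threshold
    return False  # assumed fallback: no sub-condition matched


print(meets_first_condition(SilentSection(12, False, True, False)))
print(meets_first_condition(SilentSection(12, True, True, True)))
```

Checking natural silence first lets ordinary pauses in speech exit early, before any frame-count threshold is consulted.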
Optionally, selecting the frame sections in which a stuck frame occurs according to whether each frame section is a silent section comprises: if the frame section is not a silent section, judging whether the frame section meets a second stuck frame condition; if the frame section does not meet the second stuck frame condition, determining that it is not a frame section in which a stuck frame occurs; and if the frame section meets the second stuck frame condition, determining that it is a frame section in which a stuck frame occurs.
Optionally, judging whether the frame section meets the second stuck frame condition comprises: judging whether the frame section meets the stress condition of the second stuck frame condition; if the frame section does not meet the stress condition, judging whether it meets the magnetized/mechanical sound condition of the second stuck frame condition; and if the frame section does not meet the magnetized/mechanical sound condition of the second stuck frame condition, determining that it does not meet the second stuck frame condition.
Optionally, if the frame section meets the stress condition or meets the magnetized/mechanical sound condition, the method further comprises: judging whether the number of frames in the frame section is greater than a fourth predetermined threshold; if the number of frames is greater than the fourth predetermined threshold, determining that the frame section meets the second stuck frame condition; and if the number of frames is less than or equal to the fourth predetermined threshold, determining that the frame section does not meet the second stuck frame condition.
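For non-silent frame sections, the checks above reduce to a short predicate. This is a sketch under stated assumptions: the function name, the boolean inputs (which stand in for the stress and magnetized/mechanical sound detectors), and the default threshold are all hypothetical.

```python
def meets_second_condition(frame_count, stress, magnetized_or_mechanical,
                           fourth_threshold=5):
    """Return True if a non-silent section counts as a stuck frame section.

    stress, magnetized_or_mechanical: outcomes of the two sub-condition checks.
    fourth_threshold: placeholder value; the text does not give a number.
    """
    if not (stress or magnetized_or_mechanical):
        return False  # neither sub-condition holds
    return frame_count > fourth_threshold


print(meets_second_condition(8, stress=True, magnetized_or_mechanical=False))
print(meets_second_condition(3, stress=False, magnetized_or_mechanical=True))
```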
Optionally, searching the frames for, and marking, frame sections whose feature values are abnormal comprises: when, for each of multiple consecutive frames among the frames, at least one feature value is outside its corresponding threshold range, marking the frame section formed by those consecutive frames as a frame section whose feature values are abnormal, wherein the threshold ranges corresponding to the individual feature values may be the same or different.
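The marking step can be sketched as a single pass that groups consecutive out-of-range frames. The function name, the dictionary layout, and the `min_frames` cutoff are assumptions made for illustration; the 20 ms frame duration matches the value quoted for the Fig. 15 output.

```python
def mark_abnormal_sections(features, ranges, frame_ms=20, min_frames=2):
    """Group consecutive abnormal frames into marked sections.

    features: one dict per frame, mapping feature name to value.
    ranges: feature name -> (low, high) threshold range; ranges may differ.
    Returns a list of {"start_ms": ..., "frames": ...} dicts.
    """
    def abnormal(frame):
        # A frame is abnormal if at least one feature is outside its range.
        return any(not (low <= frame[name] <= high)
                   for name, (low, high) in ranges.items())

    sections, start = [], None
    # A trailing None acts as a sentinel that closes any open section.
    for index, frame in enumerate(list(features) + [None]):
        bad = frame is not None and abnormal(frame)
        if bad and start is None:
            start = index
        elif not bad and start is not None:
            length = index - start
            if length >= min_frames:
                sections.append({"start_ms": start * frame_ms, "frames": length})
            start = None
    return sections


energies = [0.5, 0.0, 0.0, 0.0, 0.6, 0.0]
frames = [{"energy": e} for e in energies]
print(mark_abnormal_sections(frames, {"energy": (0.1, 1.0)}))
```

Each returned entry carries the two pieces of marked information named in the text: the start time of the section and its length in frames.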
Optionally, the feature values comprise at least one of: energy envelope value, spectral flux, spectral flatness, spectral skewness, and spectral kurtosis.
According to another aspect of the embodiments of the present invention, a stuck frame detection device is also provided, comprising: a detection unit, configured to perform feature detection on an audio signal under test to obtain feature values of each frame in the audio signal under test; a search-and-mark unit, configured to search the frames for, and mark, frame sections whose feature values are abnormal, wherein the marked information of a frame section comprises at least one of: time information of the start frame of the frame section and the frame length of the frame section; a selection unit, configured to select, from the frame sections, the frame sections in which a stuck frame occurs according to whether each frame section is a silent section; and an output unit, configured to output the marked information of the frame sections in which a stuck frame occurs.
Optionally, the selection unit comprises: a first judging module, configured to: when the frame section is a silent section, judge whether the frame section belonging to the silent section meets a first stuck frame condition; when the frame section belonging to the silent section does not meet the first stuck frame condition, determine that it is not a frame section in which a stuck frame occurs; and when the frame section belonging to the silent section meets the first stuck frame condition, determine that it is a frame section in which a stuck frame occurs.
Optionally, the first judging module comprises: a first judging submodule, configured to: judge whether the number of frames in the frame section belonging to the silent section is greater than a first predetermined threshold; when the number of frames is greater than the first predetermined threshold, determine that the frame section meets the first stuck frame condition; and when the number of frames is less than or equal to the first predetermined threshold, determine that the frame section does not meet the first stuck frame condition.
Optionally, the first judging module comprises: a detection submodule, configured to detect characteristic parameters of the frame section belonging to the silent section; and a second judging submodule, configured to: judge, according to the detection result of the detection submodule, whether the frame section belonging to the silent section meets the natural silence condition of the first stuck frame condition; and when the frame section meets the natural silence condition, determine that it does not meet the first stuck frame condition.
Optionally, the first judging module comprises: a third judging submodule, configured to judge, when the frame section belonging to the silent section does not meet the natural silence condition, whether it meets the audio hit condition of the first stuck frame condition; and a fourth judging submodule, configured to: when the frame section meets the audio hit condition, judge whether the number of frames in the frame section is greater than a second predetermined threshold; when the number of frames is greater than the second predetermined threshold, determine that the frame section meeting the audio hit condition meets the first stuck frame condition; and when the number of frames is less than or equal to the second predetermined threshold, determine that the frame section meeting the audio hit condition does not meet the first stuck frame condition.
Optionally, the first judging module comprises: a fifth judging submodule, configured to: when the frame section belonging to the silent section does not meet the audio hit condition, judge whether it meets the sharp drop/time-domain truncation condition of the first stuck frame condition; and when the frame section does not meet the sharp drop/time-domain truncation condition, determine that it does not meet the first stuck frame condition; and a sixth judging submodule, configured to: when the frame section meets the sharp drop/time-domain truncation condition, judge whether the number of frames in the frame section is greater than a third predetermined threshold; when the number of frames is greater than the third predetermined threshold, determine that the frame section meeting the sharp drop/time-domain truncation condition meets the first stuck frame condition; and when the number of frames is less than or equal to the third predetermined threshold, determine that the frame section meeting the sharp drop/time-domain truncation condition does not meet the first stuck frame condition.
Optionally, the selection unit comprises: a second judging module, configured to: when the frame section is not a silent section, judge whether the frame section meets a second stuck frame condition; when the frame section does not meet the second stuck frame condition, determine that it is not a frame section in which a stuck frame occurs; and when the frame section meets the second stuck frame condition, determine that it is a frame section in which a stuck frame occurs.
Optionally, the second judging module comprises: a seventh judging submodule, configured to judge whether the frame section meets the stress condition of the second stuck frame condition; and an eighth judging submodule, configured to: when the frame section does not meet the stress condition, judge whether it meets the magnetized/mechanical sound condition of the second stuck frame condition; and when the frame section does not meet the magnetized/mechanical sound condition of the second stuck frame condition, determine that it does not meet the second stuck frame condition.
Optionally, the second judging module comprises: a ninth judging submodule, configured to: when the frame section meets the stress condition or meets the magnetized/mechanical sound condition, judge whether the number of frames in the frame section is greater than a fourth predetermined threshold; when the number of frames is greater than the fourth predetermined threshold, determine that the frame section meets the second stuck frame condition; and when the number of frames is less than or equal to the fourth predetermined threshold, determine that the frame section does not meet the second stuck frame condition.
Optionally, the search-and-mark unit comprises: a marking module, configured to: when, for each of multiple consecutive frames among the frames, at least one feature value is outside its corresponding threshold range, mark the frame section formed by those consecutive frames as a frame section whose feature values are abnormal, wherein the threshold ranges corresponding to the individual feature values may be the same or different.
Optionally, the feature values comprise at least one of: energy envelope value, spectral flux, spectral flatness, spectral skewness, and spectral kurtosis.
In the embodiments of the present invention, the frame sections in which stutter occurs are extracted from the abnormal frame sections, and the other frame sections are ignored. This eliminates misjudgment when detecting stuttering frame sections, solves the technical problem that audio stutter detection in the prior art has low accuracy, and achieves the technical effect of accurately and efficiently detecting the frame sections in which audio stutters in an audio communication system.
Brief description of the drawings
The accompanying drawings described herein provide a further understanding of the present invention and form a part of this application. The illustrative embodiments of the present invention and their descriptions are used to explain the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a flowchart of an optional stuck frame detection method according to an embodiment of the present invention;
Fig. 2 is a flowchart of another optional stuck frame detection method according to an embodiment of the present invention;
Fig. 3 is a flowchart of yet another optional stuck frame detection method according to an embodiment of the present invention;
Fig. 4 is a flowchart of yet another optional stuck frame detection method according to an embodiment of the present invention;
Fig. 5 is a flowchart of yet another optional stuck frame detection method according to an embodiment of the present invention;
Fig. 6 is a flowchart of yet another optional stuck frame detection method according to an embodiment of the present invention;
Fig. 7 is a flowchart of the decision algorithm for the silence condition in an optional stuck frame detection method according to an embodiment of the present invention;
Fig. 8 is a flowchart of the decision algorithm for the audio hit condition in an optional stuck frame detection method according to an embodiment of the present invention;
Fig. 9 is a flowchart of the decision algorithm for the sharp drop/time-domain truncation condition in an optional stuck frame detection method according to an embodiment of the present invention;
Fig. 10 is a flowchart of the decision algorithm for the stress condition in an optional stuck frame detection method according to an embodiment of the present invention;
Fig. 11 is a flowchart of the decision algorithm for the magnetized/mechanical sound condition in an optional stuck frame detection method according to an embodiment of the present invention;
Fig. 12 is a schematic diagram of an optional stuck frame detection device according to an embodiment of the present invention;
Fig. 13 is a schematic diagram of another optional stuck frame detection device according to an embodiment of the present invention;
Fig. 14 is a schematic diagram of yet another optional stuck frame detection device according to an embodiment of the present invention; and
Fig. 15 is a schematic diagram of the output of an optional stuck frame detection according to an embodiment of the present invention.
Detailed description of the embodiments
First, some of the nouns and terms that appear in describing the embodiments of the present invention are explained below:
To help those skilled in the art better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort shall fall within the scope of protection of the present invention.
It should be noted that the terms "first", "second", and so on in the specification, the claims, and the above drawings are used to distinguish similar objects and do not necessarily describe a specific order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described here. In addition, the terms "comprise" and "have" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that comprises a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Embodiment 1
According to an embodiment of the present invention, a stuck frame detection method is provided. As shown in Fig. 1, the method comprises:
S102, performing feature detection on an audio signal under test to obtain feature values of each frame in the audio signal under test;
Optionally, the stuck frame detection method provided in this embodiment may be, but is not limited to being, applied to an audio system. As shown in Fig. 2, the audio system under test comprises a local test initiator 202, a remote test receiver 204, and a test logic server (TestLogic Server) 206. The output of the audio system under test is recorded, and feature analysis and detection are performed on the recorded audio content to obtain the feature values of each frame. Optionally, the audio file in this embodiment may be an audio file with a format header (containing, for example, the sampling rate, the number of channels, and the number of bits per sample), and the format of the audio file may include, but is not limited to, at least one of: wav, wma, mp3.
Optionally, in this embodiment, time-domain, time-frequency transform, and frequency-domain feature analysis is performed on the audio signal segment to obtain the feature values of each frame in the corresponding domain, wherein the feature values of each frame include, but are not limited to, at least one of: energy envelope value, spectral flux, spectral flatness, spectral skewness, and spectral kurtosis.
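A few of the listed per-frame features can be computed with the standard library alone. The sketch below uses textbook definitions, which the patent text does not spell out, so treat them as assumptions: the energy envelope is approximated by per-frame RMS, spectral flatness is the geometric mean of the power spectrum divided by its arithmetic mean, and spectral flux is the squared difference between adjacent magnitude spectra. All function names are hypothetical, and the naive DFT is only practical for short frames.

```python
import cmath
import math


def split_frames(samples, size):
    """Split samples into non-overlapping frames of `size` samples."""
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def rms_energy(frame):
    """Root-mean-square level, used here as a stand-in for the energy envelope."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2, computed with a naive O(n^2) DFT."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def spectral_flatness(mag, eps=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (0..1)."""
    power = [m * m + eps for m in mag]
    geometric = math.exp(sum(math.log(p) for p in power) / len(power))
    return geometric / (sum(power) / len(power))


def spectral_flux(mag, prev_mag):
    """Sum of squared differences between two magnitude spectra."""
    return sum((a - b) ** 2 for a, b in zip(mag, prev_mag))


# A 32-sample sine (tonal, low flatness) followed by 32 samples of silence.
sine = [math.sin(2 * math.pi * 4 * t / 32) for t in range(32)]
first, second = split_frames(sine + [0.0] * 32, 32)
print(rms_energy(first), rms_energy(second))
print(spectral_flatness(magnitude_spectrum(first)))
print(spectral_flux(magnitude_spectrum(second), magnitude_spectrum(first)))
```

An abrupt switch from tone to silence, as in this toy signal, shows up as a drop in RMS and a spike in spectral flux, which is the kind of per-frame evidence the later thresholding step consumes.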
Optionally, if at least one of the above feature values of the current frame under detection is abnormal, the current frame is determined to be a frame whose feature values are abnormal.
For example, as shown in Fig. 2, the recording and test procedure for the audio under test comprises:
1) The stuck frame test App on the local side and the stuck frame test App on the remote side each log in to the test logic server (TestLogic Server) 206 and stay online;
2) The local test initiator 202 configures the network delay/jitter/packet loss for this simulation, enables the corresponding delay/packet-loss simulation, and notifies the remote test receiver 204 of the current delay/packet-loss model;
3) The local test initiator 202 starts playing the audio codebook with loop playback enabled. The output codebook signal is captured and processed by the audio system under test, transmitted to the far end of the system, and played out there; it is then captured by the remote test App and saved in a format with an audio header, such as wav/wma/mp3;
4) Within the set time, after capturing the audio sent through the simulated network delay/packet-loss environment, the remote test receiver 204 automatically analyzes the feature values of each frame of audio in the recording file for stuck frames.
S104, searching the frames for, and marking, frame sections whose feature values are abnormal;
Optionally, the marked information of a frame section comprises at least one of: time information of the start frame of the frame section and the frame length of the frame section.
Optionally, in this embodiment, after the feature values of each frame of the audio under test are detected, the frame sections whose feature values are abnormal are marked, and these frame sections are marked as first stuck frame sections.
Optionally, a frame section in this embodiment is a section made up of multiple consecutive frames whose feature values are abnormal.
For example, the audio signal comprises eight frame sections A, B, C, D, E, F, G, and H. After the feature values of each frame are detected, the abnormal frame sections are found to be A, B, C, D, and E. For each of these frame sections, the time information of its start frame (for example, time t) and its frame length (for example, N frames) are marked.
S106, selecting, from the frame sections, the frame sections in which a stuck frame occurs according to whether each frame section is a silent section;
Optionally, in this embodiment, the way of judging whether a frame section in the audio signal segment is a silent section includes, but is not limited to, performing voice activity detection (VAD).
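The text names VAD only as one possible way to decide whether a frame section is silent and does not specify an algorithm. The sketch below substitutes the simplest energy-only rule, which is an assumption rather than the patent's method: a section is treated as silent when every frame's RMS stays below a threshold. The function name and the threshold value are hypothetical, and practical VADs use more than energy.

```python
import math


def is_silent_section(section_frames, rms_threshold=0.01):
    """Return True if every frame's RMS is below rms_threshold.

    section_frames: list of frames, each a non-empty list of samples in [-1, 1].
    rms_threshold: placeholder value chosen for illustration only.
    """
    for frame in section_frames:
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms >= rms_threshold:
            return False
    return True


quiet = [[0.0] * 160 for _ in range(3)]
loud = quiet + [[0.5] * 160]
print(is_silent_section(quiet))
print(is_silent_section(loud))
```

The result of this check decides which branch applies next: silent sections go to the first stuck frame condition, and all others go to the second.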
Alternatively, to occurring that abnormal frame section further judges, judge whether quiet section, and then therefrom select to occur the frame section of card frame.
Alternatively, in the present embodiment, can from be designated as the frame section of the first card frame section, select the frame section that occurs card frame, and the frame segment mark of selecting is designated as to the second card frame section.
S108: output the marking information of the frame segments in which clamped frames occur.
Optionally, the marking information of the frame segments in which clamped frames occur is output. For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G and H. After the feature values of each frame are detected, the abnormal frame segments (for example, segments whose energy envelope value is abnormal) are found to be A, B, C, D and E. Further analysis shows that frame segments C, D and E are genuinely effective clamped frames, while frame segments A and B are misjudged segments. The time information of the start frame (for example, time t) and the frame length (for example, N frames) of the effective clamped-frame segments C, D and E are then output.
Figure 15 shows the marking information of the frame segments in which clamped frames occur. The file "Summery_KaDuninfo" lists six audio files (WavFile), the number of clamped-frame segments that occur in each audio file within 5 minutes (5Min_KaDunTImes), and the total duration of clamped frames in each audio file (5MinContinousKaSeconds). Taking the audio file "6.wav" as an example, 7 clamped-frame segments occur within 5 minutes, with a total duration of 0.76 s.
In addition, the file "6_KaDuninfo" in Figure 15 shows the detailed clamped-frame information of the audio file "6.wav", for example: the sequence number of each clamped-frame segment (KaDunNo), the timestamp of the start frame of each clamped-frame segment (KaPos[Min:Seconds]), the total number of frames in each clamped-frame segment (ContinousKaFrames (Frames/20ms), where each frame lasts 20 ms), and the duration of each clamped-frame segment (ContinousKaSeconds). Taking the clamped-frame segment with sequence number 1 as an example, the timestamp of its start frame is 53.439999 s, the segment contains 10 frames, and the total duration of the 10 frames is 0.200000 s. Figure 15 also shows the detailed clamped-frame information of the audio files "5.wav" and "4.wav", which is not described again here.
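The relationship between the frame count and the duration columns of such a report can be illustrated with a minimal Python sketch. The function names and the tuple layout are hypothetical and chosen only for illustration; the 20 ms frame duration is the value stated in the report header.

```python
FRAME_MS = 20  # frame duration taken from the report column "Frames/20ms"


def segment_duration_seconds(frame_count, frame_ms=FRAME_MS):
    """Duration of one clamped-frame segment in seconds."""
    return frame_count * frame_ms / 1000.0


def summarize(segments, frame_ms=FRAME_MS):
    """Summarize a list of (start_time_seconds, frame_count) tuples.

    Returns (number_of_segments, total_duration_seconds), i.e. the two
    per-file figures that a summary report of this kind would list.
    """
    total = sum(segment_duration_seconds(n, frame_ms) for _, n in segments)
    return len(segments), total


# Segment 1 of "6.wav": 10 frames of 20 ms each.
print(segment_duration_seconds(10))
```

For the segment described above, 10 frames at 20 ms per frame give 0.2 s, which agrees with the reported duration of 0.200000 s.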
According to the embodiment provided by this application, the feature values of the audio signal are extracted and detected, the abnormal frame segments are marked, the frame segments in which clamped frames occur are selected after further judgment, and their marking information is then output, so that the clamped-frame segments in the audio of an audio communication system are detected accurately and efficiently.
As an optional scheme, as shown in Figure 3, selecting the frame segments in which clamped frames occur according to whether each frame segment is a silent segment comprises:
S302: if the frame segment is a silent segment, judge whether the frame segment belonging to a silent segment satisfies the first clamped-frame condition;
Optionally, the first clamped-frame condition in this embodiment includes, but is not limited to, at least one of the following: the clamped-frame count, the natural silence condition, the audio hit condition, and the sharp-drop/time-domain truncation condition.
For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G and H, and the frame segments judged to belong to silent segments are A, B, C, D and E. It is then judged whether the clamped-frame count of frame segments A, B, C, D and E satisfies a preset threshold condition (for example, the frame count is greater than M).
S304: if the frame segment belonging to a silent segment does not satisfy the first clamped-frame condition, judge that the frame segment belonging to a silent segment is not a frame segment in which clamped frames occur;
For example, among the frame segments A, B, C, D and E that belong to silent segments, the clamped-frame count of frame segments D and E does not satisfy the first clamped-frame condition (for example, the frame count is less than or equal to M). Frame segments D and E are therefore judged not to be frame segments in which clamped frames occur, and are not marked as second clamped-frame segments.
S306: if the frame segment belonging to a silent segment satisfies the first clamped-frame condition, judge that the frame segment belonging to a silent segment is a frame segment in which clamped frames occur.
For example, among the frame segments A, B, C, D and E that belong to silent segments, the clamped-frame count of frame segments A, B and C satisfies the first clamped-frame condition (for example, the frame count is greater than M). Frame segments A, B and C are therefore judged to be frame segments in which clamped frames occur, and are marked as second clamped-frame segments.
It should be noted that the resolving ability of the human ear is limited and the window applied to each frame is on the millisecond scale. When the number of consecutive clamped frames is too small, the ear can hardly perceive such an extremely short audio region, so clamped frames of this kind can be ignored.
According to the embodiment provided by this application, the frame segments belonging to silent segments are judged in finer detail to determine whether they satisfy the first clamped-frame condition, so that the clamped-frame segments that can actually be perceived in an audio communication system are obtained accurately.
As an optional scheme, as shown in Figure 4, judging whether the frame segment belonging to a silent segment satisfies the first clamped-frame condition comprises:
S402: judge whether the number of frames in the frame segment belonging to a silent segment is greater than a first predetermined threshold;
Optionally, the first predetermined threshold in this embodiment is set according to the human ear's ability to perceive audio stutter. In actual evaluation, the first predetermined threshold may be obtained through training or determined according to the strictness level of the product quality requirements.
For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G and H, and the frame segments judged to belong to silent segments are A, B, C, D and E. It is then judged whether the clamped-frame count of frame segments A, B, C, D and E is greater than the first predetermined threshold (for example, whether the frame count is greater than M).
S404: if the frame count is greater than the first predetermined threshold, judge that the frame segment belonging to a silent segment satisfies the first clamped-frame condition;
For example, among the frame segments A, B, C, D and E that belong to silent segments, the clamped-frame count of frame segments A, B and C is greater than the first predetermined threshold (for example, greater than M), so frame segments A, B and C are judged to satisfy the first clamped-frame condition.
S406: if the frame count is less than or equal to the first predetermined threshold, judge that the frame segment belonging to a silent segment does not satisfy the first clamped-frame condition.
For example, among the frame segments A, B, C, D and E that belong to silent segments, the clamped-frame count of frame segments D and E is less than or equal to the first predetermined threshold (for example, less than or equal to M), so frame segments D and E are judged not to satisfy the first clamped-frame condition.
According to the embodiment provided by this application, setting a threshold on the frame count of a clamped-frame segment makes it possible to select more accurately the clamped-frame segments in the audio system that the human ear can perceive.
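Steps S402 to S406 amount to a strict greater-than comparison against the first predetermined threshold. The following Python sketch illustrates this with the segment names from the example; the frame counts and the value of M are invented for illustration and are not taken from the source.

```python
def meets_frame_count_threshold(frame_count, threshold_m):
    """S402-S406: a silent segment passes only if its frame count exceeds M."""
    return frame_count > threshold_m


# Hypothetical frame counts for the silent segments A..E and a hypothetical M.
silent_segments = {"A": 12, "B": 9, "C": 7, "D": 3, "E": 5}
M = 5

second_clamped = [
    name
    for name, frame_count in silent_segments.items()
    if meets_frame_count_threshold(frame_count, M)
]
print(second_clamped)  # ['A', 'B', 'C']
```

A segment whose frame count equals M is rejected, matching the "less than or equal to" branch in S406.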
As an optional scheme, judging that the frame segment belonging to a silent segment is not a frame segment in which clamped frames occur comprises:
S1: detect the feature parameters of the frame segment belonging to a silent segment;
Optionally, the feature parameters in this embodiment include, but are not limited to, at least one of the following: the length, energy and mean value of the current silent segment.
For example, as shown in Figure 5, feature parameter detection is performed on the length, energy and mean value of the current frame segment that has been judged to belong to a silent segment in the audio signal under test.
As another example, the audio signal comprises eight frame segments A, B, C, D, E, F, G and H, and the frame segments judged to belong to silent segments are A, B, C, D and E. The feature parameters (for example, the length, energy and mean value of the current frame segment) of frame segments A, B, C, D and E are then detected.
S2: judge, according to the detection result, whether the frame segment belonging to a silent segment satisfies the natural silence condition in the first clamped-frame condition;
Optionally, the first clamped-frame condition in this embodiment includes, but is not limited to, the natural silence condition. For example, Figure 7 is a flowchart of the decision algorithm for judging whether a frame segment in the audio signal satisfies the natural silence condition. The figure shows the decision flow for the natural silence condition only as an example, and this application does not limit it.
Optionally, whether the frame segment belonging to a silent segment satisfies the natural silence condition is judged according to the above detection result.
It should be noted that not every silent segment is a clamped frame. In a voice call, some silences in the conversation are natural pauses. Such natural silence does not mean that audio stutter has occurred, and is therefore not treated as an effective clamped frame (for example, a second clamped-frame segment).
S3: if the frame segment belonging to a silent segment satisfies the natural silence condition, judge that the frame segment does not satisfy the first clamped-frame condition.
For example, as shown in Figure 5, among the frame segments A, B, C, D and E that belong to silent segments, frame segment E satisfies the natural silence condition in the first clamped-frame condition. That is, the silence of frame segment E is normal silence, and frame segment E is not marked as a second clamped-frame segment.
According to the embodiment provided by this application, judging whether the frame segments belonging to silent segments in the audio signal satisfy the natural silence condition excludes the cases in which natural silence would be misjudged as a clamped frame, so that the clamped frames in the audio signal are obtained more effectively and accurately.
As an optional scheme, after judging according to the detection result whether the frame segment belonging to a silent segment satisfies the natural silence condition in the first clamped-frame condition, the method further comprises:
S1: if the frame segment belonging to a silent segment does not satisfy the natural silence condition, judge whether the frame segment belonging to a silent segment satisfies the audio hit condition in the first clamped-frame condition;
Optionally, the first clamped-frame condition in this embodiment includes, but is not limited to, the audio hit condition. For example, Figure 8 is a flowchart of the decision algorithm for judging whether a frame segment in the audio signal satisfies the audio hit condition. The figure shows the decision flow for the audio hit condition only as an example, and this application does not limit it.
For example, as shown in Figure 5, among the frame segments A, B, C, D and E that belong to silent segments, the segments that do not satisfy the natural silence condition are A, B, C and D. It is then judged whether frame segments A, B, C and D satisfy the audio hit condition in the first clamped-frame condition.
It should be noted that an audio hit is caused by a hit in the sound. If the sound shows no hit phenomenon, the segment is probably not an effective clamped-frame segment of the audio system (for example, a second clamped-frame segment), so the audio under test needs to be judged against the audio hit condition.
S2: if the frame segment belonging to a silent segment satisfies the audio hit condition, judge whether the number of frames in the frame segment satisfying the audio hit condition is greater than a second predetermined threshold;
Optionally, the second predetermined threshold in this embodiment is likewise set according to the human ear's ability to perceive audio stutter. In actual evaluation, the second predetermined threshold may be obtained through training or determined according to the strictness level of the product quality requirements.
For example, as shown in Figure 5, when the audio signal comprises eight frame segments A, B, C, D, E, F, G and H, the segments that do not satisfy the natural silence condition among the silent segments A, B, C, D and E are A, B, C and D. Of these, the segments judged to satisfy the audio hit condition are A and B, and it is then judged whether the clamped-frame count of frame segments A and B is greater than the second predetermined threshold (for example, the threshold is P).
S3: if the frame count is greater than the second predetermined threshold, judge that the frame segment satisfying the audio hit condition satisfies the first clamped-frame condition; if the frame count is less than or equal to the second predetermined threshold, judge that the frame segment satisfying the audio hit condition does not satisfy the first clamped-frame condition.
For example, the segments that do not satisfy the natural silence condition among the silent segments A, B, C, D and E are A, B, C and D, and of these the segments judged to satisfy the audio hit condition are A and B. If the frame count of frame segment B is found to be greater than the second predetermined threshold (for example, greater than P), frame segment B, which satisfies the audio hit condition, is judged to satisfy the first clamped-frame condition and is marked as a second clamped-frame segment. If the frame count of frame segment A is found to be less than or equal to the second predetermined threshold (for example, less than or equal to P), frame segment A, which satisfies the audio hit condition, is judged not to satisfy the first clamped-frame condition and is not marked as a second clamped-frame segment.
According to the embodiment provided by this application, the frame segments belonging to silent segments in the audio signal are judged for an audio hit, and it is further judged whether their frame count satisfies the threshold setting, so that the clamped frames in the audio signal are obtained more effectively and accurately.
As an optional scheme, after judging whether the frame segment belonging to a silent segment satisfies the audio hit condition in the first clamped-frame condition, the method further comprises:
S1: if the frame segment belonging to a silent segment does not satisfy the audio hit condition, judge whether the frame segment belonging to a silent segment satisfies the sharp-drop/time-domain truncation condition in the first clamped-frame condition;
Optionally, the first clamped-frame condition in this embodiment includes, but is not limited to, the sharp-drop/time-domain truncation condition. For example, Figure 9 is a flowchart of the decision algorithm for judging whether a frame segment in the audio signal satisfies the sharp-drop/time-domain truncation condition. The figure shows the decision flow for a sharp drop/time-domain truncation of the audio signal only as an example, and this application does not limit it.
For example, as shown in Figure 5, among the frame segments A, B, C, D and E that belong to silent segments, the segments that do not satisfy the natural silence condition are A, B, C and D. Of these, frame segments C and D are judged not to satisfy the audio hit condition in the first clamped-frame condition, and frame segments C and D are then judged against the sharp-drop/time-domain truncation condition.
It should be noted that a sharp drop/time-domain truncation is caused by an abrupt truncation in the time domain. If the silence of a frame segment is caused neither by an audio hit nor by a sharp drop/time-domain truncation, the segment is probably not an effective clamped-frame segment of the audio system (for example, a second clamped-frame segment), so the audio under test needs to be judged against the sharp-drop/time-domain truncation condition.
S2: if the frame segment belonging to a silent segment does not satisfy the sharp-drop/time-domain truncation condition, judge that the frame segment not satisfying the sharp-drop/time-domain truncation condition does not satisfy the first clamped-frame condition;
For example, as shown in Figure 5, frame segments C and D, which do not satisfy the audio hit condition in the first clamped-frame condition, are judged against the sharp-drop/time-domain truncation condition. Frame segment D is found not to satisfy the sharp-drop/time-domain truncation condition, so frame segment D is not marked as a second clamped-frame segment.
S3: if the frame segment belonging to a silent segment satisfies the sharp-drop/time-domain truncation condition, judge whether the number of frames in the frame segment satisfying the sharp-drop/time-domain truncation condition is greater than a third predetermined threshold;
Optionally, the third predetermined threshold in this embodiment is likewise set according to the human ear's ability to perceive audio stutter. In actual evaluation, the third predetermined threshold may be obtained through training or determined according to the strictness level of the product quality requirements.
For example, as shown in Figure 5, frame segments C and D, which do not satisfy the audio hit condition in the first clamped-frame condition, are judged against the sharp-drop/time-domain truncation condition. Frame segment C is found to satisfy the sharp-drop/time-domain truncation condition, and it is then judged whether the clamped-frame count of frame segment C is greater than the third predetermined threshold (for example, the threshold is Q).
S4: if the frame count is greater than the third predetermined threshold, judge that the frame segment satisfying the sharp-drop/time-domain truncation condition satisfies the first clamped-frame condition; if the frame count is less than or equal to the third predetermined threshold, judge that the frame segment satisfying the sharp-drop/time-domain truncation condition does not satisfy the first clamped-frame condition.
For example, as shown in Figure 5, the clamped-frame count of frame segment C, which satisfies the sharp-drop/time-domain truncation condition, is judged. If the frame count of frame segment C is found to be greater than the third predetermined threshold (for example, greater than Q), frame segment C is judged to satisfy the first clamped-frame condition and is marked as a second clamped-frame segment. If the frame count of frame segment C is found to be less than or equal to the third predetermined threshold (for example, less than or equal to Q), frame segment C is judged not to satisfy the first clamped-frame condition and is not marked as a second clamped-frame segment.
According to the embodiment provided by this application, the frame segments belonging to silent segments in the audio signal are judged for a sharp drop/time-domain truncation, and it is further judged whether their frame count satisfies the threshold setting, so that the clamped frames in the audio signal are obtained more effectively and accurately.
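The decision order described for silent segments (natural silence first, then audio hit with threshold P, then sharp drop/time-domain truncation with threshold Q) can be sketched in Python as follows. This is only an illustration under stated assumptions: the three boolean fields stand in for the outcomes of the decision algorithms of Figures 7 to 9, which are not reproduced here, and the class name, field names, frame counts and threshold values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SilentSegment:
    name: str
    frame_count: int
    natural_silence: bool  # assumed outcome of the natural-silence test
    audio_hit: bool        # assumed outcome of the audio-hit test
    sharp_drop: bool       # assumed outcome of the sharp-drop/truncation test


def satisfies_first_condition(seg, p, q):
    """Return True if a silent segment should be marked as a second clamped-frame segment."""
    if seg.natural_silence:
        return False                 # natural pause, not a clamped frame
    if seg.audio_hit:
        return seg.frame_count > p   # second predetermined threshold
    if seg.sharp_drop:
        return seg.frame_count > q   # third predetermined threshold
    return False                     # neither hit nor truncation


# Hypothetical values mirroring the A..E example in the text.
P, Q = 5, 4
segments = [
    SilentSegment("A", 3, False, True, False),    # hit, but too short
    SilentSegment("B", 8, False, True, False),    # hit and long enough
    SilentSegment("C", 6, False, False, True),    # truncation and long enough
    SilentSegment("D", 9, False, False, False),   # neither hit nor truncation
    SilentSegment("E", 20, True, False, False),   # natural silence
]
marked = [s.name for s in segments if satisfies_first_condition(s, P, Q)]
print(marked)  # ['B', 'C']
```

With these invented inputs the result matches the worked example: B and C are marked as second clamped-frame segments, while A, D and E are not.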
As an optional scheme, selecting the frame segments in which clamped frames occur according to whether each frame segment is a silent segment comprises:
S1: if the frame segment is not a silent segment, judge whether the frame segment satisfies the second clamped-frame condition;
Optionally, as shown in Figure 5, the second clamped-frame condition in this embodiment includes, but is not limited to, judgments of the correlation and periodicity of audio features, for example, the stress condition and the magnetization/mechanical sound condition.
For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G and H, and the frame segments judged not to belong to silent segments are F, G and H. It is then judged whether frame segments F, G and H are stress.
S2: if the frame segment does not satisfy the second clamped-frame condition, judge that the frame segment is not a frame segment in which clamped frames occur;
For example, as shown in Figure 5, suppose that among the non-silent frame segments F, G and H, frame segments G and H do not satisfy the second clamped-frame condition, for example because they are judged not to be stress and their magnetization/mechanical sound frequency components do not exceed the preset ratio. Frame segments G and H are then judged not to be frame segments in which clamped frames occur, and are not marked as second clamped-frame segments.
S3: if the frame segment satisfies the second clamped-frame condition, judge that the frame segment is a frame segment in which clamped frames occur.
For example, as shown in Figure 5, suppose that among the non-silent frame segments F, G and H, frame segment F satisfies the second clamped-frame condition, for example because it is judged to be stress and its clamped-frame count satisfies the condition of being perceptible to the human ear. Frame segment F is then judged to be a frame segment in which clamped frames occur, and is marked as a second clamped-frame segment.
According to the embodiment provided by this application, the frame segments not belonging to silent segments are judged against the second clamped-frame condition, so that non-silent frame segments are also discriminated and the clamped-frame segments that can be perceived in an audio communication system are obtained accurately.
As an optional scheme, judging whether the frame segment satisfies the second clamped-frame condition comprises:
S1: judge whether the frame segment satisfies the stress condition in the second clamped-frame condition;
Optionally, the second clamped-frame condition in this embodiment includes, but is not limited to, the stress condition and the magnetization/mechanical sound condition.
For example, as shown in Figure 5, after the non-silent frame segments F, G and H have been determined, it is judged whether these frame segments satisfy the stress condition in the second clamped-frame condition. For example, Figure 10 is a flowchart of the decision algorithm for judging whether a frame segment in the audio signal satisfies the stress condition. The figure shows the decision flow for stress in the audio signal only as an example, and this application does not limit it.
S2: if the frame segment does not satisfy the stress condition, judge whether the frame segment satisfies the magnetization/mechanical sound condition in the second clamped-frame condition;
For example, as shown in Figure 5, if frame segments G and H are judged not to satisfy the stress condition, it is judged whether frame segments G and H satisfy the magnetization/mechanical sound condition in the second clamped-frame condition, that is, whether the magnetization/mechanical sound frequency components of frame segments G and H exceed the preset ratio. For example, Figure 11 is a flowchart of the decision algorithm for judging whether a frame segment in the audio signal satisfies the magnetization/mechanical sound condition. The figure shows the decision flow for magnetization/mechanical sound only as an example, and this application does not limit it.
S3: if the frame segment does not satisfy the magnetization/mechanical sound condition in the second clamped-frame condition, judge that the frame segment does not satisfy the second clamped-frame condition.
It should be noted that, as shown in Figure 5, a non-silent frame segment that satisfies neither the stress condition nor the magnetization/mechanical sound condition is not a genuinely effective clamped-frame segment but a misjudged segment, and is therefore not treated as an effective clamped frame (for example, a second clamped-frame segment).
For example, as shown in Figure 5, suppose that of frame segments G and H, which do not satisfy the stress condition in the second clamped-frame condition, frame segment H also does not satisfy the magnetization/mechanical sound condition, that is, the magnetization/mechanical sound frequency components of frame segment H do not exceed the preset ratio. Frame segment H is then judged not to satisfy the second clamped-frame condition and is not marked as a second clamped-frame segment.
According to the embodiment provided by this application, the frame segments not belonging to silent segments are discriminated in finer detail by judging whether they satisfy the stress condition and the magnetization/mechanical sound condition in the second clamped-frame condition, so that non-silent frame segments are also discriminated and the clamped-frame segments that can be perceived in an audio communication system are obtained accurately.
As an optional scheme, if the frame segment satisfies the stress condition or satisfies the magnetization/mechanical sound condition, the method further comprises:
S1: judge whether the number of frames in the frame segment is greater than a fourth predetermined threshold;
Optionally, the fourth predetermined threshold in this embodiment is likewise set according to the human ear's ability to perceive audio stutter. In actual evaluation, the fourth predetermined threshold may be obtained through training or determined according to the strictness level of the product quality requirements.
For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G and H, and the frame segment judged to satisfy the stress condition or the magnetization/mechanical sound condition is G. It is then judged whether the clamped-frame count of frame segment G is greater than the fourth predetermined threshold (for example, the fourth predetermined threshold is S).
S2: if the frame count is greater than the fourth predetermined threshold, judge that the frame segment satisfies the second clamped-frame condition; if the frame count is less than or equal to the fourth predetermined threshold, judge that the frame segment does not satisfy the second clamped-frame condition.
For example, if the clamped-frame count of frame segment G is greater than the fourth predetermined threshold (for example, greater than S), frame segment G is judged to satisfy the second clamped-frame condition and is marked as a second clamped-frame segment. If the clamped-frame count of frame segment G is less than or equal to the fourth predetermined threshold (for example, less than or equal to S), frame segment G is judged not to satisfy the second clamped-frame condition and is not marked as a second clamped-frame segment.
According to the embodiment provided by this application, for the frame segments in the audio signal that do not belong to silent segments and that satisfy the stress condition or the magnetization/mechanical sound condition, it is further judged whether the frame count satisfies the threshold setting, so that the clamped frames in the audio signal are obtained more effectively and accurately.
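The handling of non-silent segments can be sketched in the same spirit: a segment is kept only if it satisfies the stress condition or the magnetization/mechanical sound condition, and its frame count exceeds the fourth predetermined threshold. In the Python sketch below the two boolean arguments stand in for the outcomes of the decision algorithms of Figures 10 and 11, and the function name and all numeric values are hypothetical.

```python
def satisfies_second_condition(is_stress, is_mechanical, frame_count, s):
    """Return True if a non-silent segment should be marked as a second clamped-frame segment.

    is_stress and is_mechanical are assumed outcomes of the stress test and
    the magnetization/mechanical sound test; s is the fourth predetermined threshold.
    """
    if not (is_stress or is_mechanical):
        return False             # misjudged segment, not a clamped frame
    return frame_count > s       # must also be long enough to be perceptible


# Hypothetical non-silent segments: (name, is_stress, is_mechanical, frame_count).
S = 4
non_silent = [
    ("F", True, False, 7),
    ("G", False, True, 3),
    ("H", False, False, 10),
]
marked_non_silent = [
    name
    for name, stress, mech, n in non_silent
    if satisfies_second_condition(stress, mech, n, S)
]
print(marked_non_silent)  # ['F']
```

Here G passes the mechanical-sound test but is too short, and H fails both tests, so only F is marked.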
As an optional scheme, searching the frames for frame segments whose feature values are abnormal and marking them comprises:
S602: if, for each of multiple consecutive frames, at least one feature value is not within its corresponding threshold range, mark the frame segment formed by these consecutive frames as a frame segment whose feature values are abnormal;
Optionally, the threshold ranges corresponding to the respective feature values in this embodiment may be the same or different.
For example, when searching the frames for abnormal frame segments, multiple consecutive frames are found in each of which at least one feature value is not within its corresponding threshold range, and the frame segment formed by these consecutive frames is marked as a frame segment whose feature values are abnormal.
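Step S602 can be illustrated by a minimal Python sketch that scans per-frame feature values and groups consecutive out-of-range frames into segments. The data layout (one dict of feature values per frame), the function names, the 20 ms frame duration and the `min_frames` parameter are assumptions made for illustration; the source only requires "multiple consecutive frames".

```python
def is_abnormal(frame, ranges):
    """A frame is abnormal if at least one feature lies outside its threshold range."""
    return any(not (low <= frame[name] <= high) for name, (low, high) in ranges.items())


def find_abnormal_segments(frames, ranges, frame_ms=20, min_frames=2):
    """Return a list of (start_time_seconds, frame_count) for runs of abnormal frames."""
    segments = []
    start = None
    # A trailing None acts as a sentinel that closes a run reaching the last frame.
    for i, frame in enumerate(list(frames) + [None]):
        abnormal = frame is not None and is_abnormal(frame, ranges)
        if abnormal and start is None:
            start = i
        elif not abnormal and start is not None:
            length = i - start
            if length >= min_frames:
                segments.append((start * frame_ms / 1000.0, length))
            start = None
    return segments


# Hypothetical envelope values in dB and a hypothetical normal range.
frames = [{"env": v} for v in (-20, -80, -85, -90, -22, -95, -21)]
ranges = {"env": (-60.0, 0.0)}
print(find_abnormal_segments(frames, ranges))
```

Each returned tuple carries the two pieces of marking information named in S104: the time of the start frame and the frame length of the segment.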
As an optional scheme, the feature values in this embodiment include at least one of the following: energy envelope value, spectral flux, spectral flatness, spectral skewness, and spectral kurtosis.
Optionally, in this embodiment the above feature values may be computed as follows:
1) Energy envelope value: represents the variation of the short-time energy of the audio. The window function applied includes at least one of the following: rectangular window, Hamming window, Hanning window, triangular window, and Bartlett window. The rectangular window function is expressed as follows:
$$ w(n) = \begin{cases} 1, & 0 \le n < N \\ 0, & \text{otherwise} \end{cases} \tag{1} $$
Here the k-th frame of the windowed audio signal is $X_k(n) = w(n) \cdot x(kN + n)$, where N is the number of audio samples in the time window of each frame. The average energy of the k-th frame signal $X_k(n)$ is denoted $E(k)$ and is computed as follows:
$$ E(k) = \frac{1}{N} \sum_{n=1}^{N-1} X_k(n) \cdot X_k(n) \tag{2} $$
The envelope is the logarithm of the square root of the audio energy after normalization, and is an indicator of how the short-time energy of the audio varies. The envelope of the k-th frame of the audio signal is denoted $\mathrm{Env}(k)$ and is given by:
$$ \mathrm{Env}(k) = 20 \log_{10}\left( \sqrt{\frac{1}{N} \sum_{n=1}^{N-1} X_k(n) \cdot X_k(n)} \Big/ 32768 \right) \tag{3} $$
2) Spectral flux: reflects the dynamic characteristics of the audio signal. It can be obtained as the 2-norm of the difference between the normalized spectral vectors of two adjacent frames, as expressed by the following formula:
$$ v_{SF}(n) = \frac{\sqrt{\sum_{k=0}^{\kappa/2-1} \left( |X(k,n)| - |X(k,n-1)| \right)^2}}{\kappa/2} \tag{4} $$
Here $0 \le v_{SF}(n) \le A$, where A is a preset spectral amplitude decision threshold. A small $v_{SF}(n)$ indicates that adjacent audio signals are relatively stationary, or alternatively that the input signal level is low. When the audio signal undergoes a non-stationary abrupt change, $v_{SF}(n)$ rises sharply and reaches the abnormality threshold.
3) Spectral flatness: identifies the pauses or abrupt changes caused by clamped frames. It can be obtained by the following formula:
$$ v_{Tf}(n) = \frac{\sqrt[\kappa/2]{\prod_{k=0}^{\kappa/2-1} |X(k,n)|}}{\frac{2}{\kappa} \sum_{k=0}^{\kappa/2-1} |X(k,n)|} = \frac{\exp\left( \frac{2}{\kappa} \sum_{k=0}^{\kappa/2-1} \log |X(k,n)| \right)}{\frac{2}{\kappa} \sum_{k=0}^{\kappa/2-1} |X(k,n)|} \tag{5} $$
Here, in smooth audio regions $v_{Tf}(n)$ is smaller and the signal is more likely to be tonal; when a pause or abrupt change is caused by a stuck frame, $v_{Tf}(n)$ spikes, forming a peak that exceeds the abnormality threshold.
4) Spectral skewness, which characterizes the symmetry of the probability density function (PDF) of the audio signal. It can be obtained by dividing the third-order central moment of the audio signal statistics by the cube of the standard deviation, as expressed by the following formula:
$$v_{Sk}(n) = \frac{1}{\sigma_x^3(n) \cdot \kappa} \sum_{i=i_s(n)}^{i_e(n)} \left( x(i) - \mu_x(n) \right)^3 \tag{6}$$
Here, $\mu_x(n)$ is the mean of the signal over one frame, and $\sigma_x(n)$ is the corresponding standard deviation.
5) Spectral kurtosis, which characterizes the non-Gaussianity of the PDF of the audio signal, that is, the flatness of the distribution of input signal values compared with a Gaussian distribution. It can be obtained by dividing the fourth-order central moment of the audio signal statistics by the fourth power of the standard deviation, as expressed by the following formula:
$$v_K(n) = \frac{1}{\sigma_x^4(n) \cdot I} \sum_{i=i_s(n)}^{i_e(n)} \left( x(i) - \mu_x(n) \right)^4 - 3 \tag{7}$$
Here, $\mu_x(n)$ is the mean of the signal over one frame, and $\sigma_x(n)$ is the corresponding standard deviation.
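The energy envelope, spectral flux, and spectral flatness described above can be illustrated with a minimal Python sketch. It is only a sketch under stated assumptions: the function names are hypothetical, a naive DFT from the standard library stands in for whatever transform an implementation would use, the energy sum runs over every sample of the frame, and the `1e-12` floor is an assumption added to avoid taking the logarithm of zero.

```python
import cmath
import math

def envelope_db(frame, full_scale=32768.0, floor=1e-12):
    """Envelope of one frame: RMS relative to 16-bit full scale, in dB."""
    energy = sum(v * v for v in frame) / len(frame)  # average energy of the frame
    return 20.0 * math.log10(max(math.sqrt(energy), floor) / full_scale)

def half_spectrum(frame):
    """Magnitudes |X(k)| for k = 0 .. K/2 - 1, using a naive O(K^2) DFT."""
    size = len(frame)
    return [
        abs(sum(v * cmath.exp(-2j * cmath.pi * k * n / size)
                for n, v in enumerate(frame)))
        for k in range(size // 2)
    ]

def spectral_flux(prev_frame, cur_frame):
    """2-norm of the magnitude difference of adjacent frames, divided by K/2."""
    prev_mag, cur_mag = half_spectrum(prev_frame), half_spectrum(cur_frame)
    squared = sum((c - p) ** 2 for c, p in zip(cur_mag, prev_mag))
    return math.sqrt(squared) / len(cur_mag)

def spectral_flatness(frame, floor=1e-12):
    """Geometric mean of the magnitudes divided by their arithmetic mean."""
    mags = [max(m, floor) for m in half_spectrum(frame)]
    geometric = math.exp(sum(math.log(m) for m in mags) / len(mags))
    arithmetic = sum(mags) / len(mags)
    return geometric / arithmetic
```

With these definitions, an impulse frame has a flat magnitude spectrum and a flatness close to 1, while a pure tone gives a flatness close to 0, which matches the statement that tonal regions yield small values.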
It should be noted that the spectral skewness and spectral kurtosis are computed for each frame of the audio signal; these two feature values characterize the degree of distortion of the audio signal. An audio frame whose spectral skewness and spectral kurtosis are below the preset decision thresholds is classified as a distorted audio frame according to the spectral distortion measure.
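As a rough illustration of the skewness and kurtosis features and the distortion decision, the following Python sketch computes both values for one frame. It assumes that the normalizing factor in the formulas is the number of samples in the frame and that a constant frame is treated as neutral; the function names and thresholds are hypothetical.

```python
import math

def skewness_and_kurtosis(frame):
    """Return (skewness, excess kurtosis) of the samples in one frame."""
    count = len(frame)
    mean = sum(frame) / count
    variance = sum((v - mean) ** 2 for v in frame) / count
    if variance == 0.0:
        return 0.0, 0.0  # assumption: a constant frame is reported as neutral
    std = math.sqrt(variance)
    third = sum((v - mean) ** 3 for v in frame) / count   # 3rd central moment
    fourth = sum((v - mean) ** 4 for v in frame) / count  # 4th central moment
    return third / std ** 3, fourth / std ** 4 - 3.0

def is_distorted(frame, skew_threshold, kurtosis_threshold):
    """Flag a frame whose skewness and kurtosis are both below the thresholds."""
    skew, kurt = skewness_and_kurtosis(frame)
    return skew < skew_threshold and kurt < kurtosis_threshold
```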
Optionally, before the stuck-frame detection method of this embodiment is performed, the audio signal is also preprocessed, where the preprocessing includes but is not limited to: DC removal, normalization, and channel separation.
Optionally, the above preprocessing may include the following steps:
1) DC removal. The DC component interference at a specific frequency can be removed with a comb filter (notch filter); alternatively, if an audio segment of length τ is available, the following calculation can also be used:
$$x(i) = x_{DC}(i) - \frac{1}{\tau} \sum_{i=0}^{\tau-1} x_{DC}(i) \tag{8}$$
2) Normalization. A simple processing method is:
$$x(i) = \frac{x_s(i)}{\max(|x_s(i)|)} \tag{9}$$
Here, i ranges from 0 to T, where T is the length of the audio segment.
In this embodiment, normalization is based on AGC (automatic gain control) dynamic gain control: the gain of the audio segment is adjusted with different dynamic gain factors according to changes in the microphone volume level.
3) Channel separation. The multi-channel audio data is separated by channel, and the audio data of a single channel is finally obtained. A simple down-mixing (DownMixing) process can follow the formula below:
$$x(i) = \frac{1}{C} \sum_{c=0}^{C-1} x_c(i) \tag{10}$$
Here, C is the number of channels.
In this embodiment, the down-mixing (DownMixing) process uses the following channel-weight-based processing:
$$x(i) = \frac{1}{C} \sum_{c=0}^{C-1} w_c(i) \cdot x_c(i) \tag{11}$$
Here, $w_c(i)$ is the weight of the c-th channel and C is the number of channels; $w_c(i)$ is obtained by weighting according to the ratio of the average energy of each channel to the sum of the average energies of all channels.
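The three preprocessing steps can be sketched in Python as follows. This is a minimal illustration rather than the embodiment's implementation: the function names are hypothetical, peak normalization is shown instead of AGC-based gain control, and each channel weight is assumed to be constant over the segment (the ratio of that channel's average energy to the total).

```python
def remove_dc(samples):
    """Subtract the mean of the segment from every sample."""
    mean = sum(samples) / len(samples)
    return [v - mean for v in samples]

def peak_normalize(samples):
    """Divide every sample by the largest absolute value in the segment."""
    peak = max(abs(v) for v in samples)
    if peak == 0:
        return list(samples)  # assumption: leave an all-zero segment unchanged
    return [v / peak for v in samples]

def weighted_downmix(channels):
    """Mix equal-length channels into one, weighting each by its energy share."""
    count = len(channels)
    energies = [sum(v * v for v in ch) / len(ch) for ch in channels]
    total = sum(energies)
    if total == 0:
        weights = [1.0] * count  # assumption: fall back to a plain average
    else:
        weights = [e / total for e in energies]
    length = len(channels[0])
    return [
        sum(w * ch[i] for w, ch in zip(weights, channels)) / count
        for i in range(length)
    ]
```

For example, `weighted_downmix([[1.0, 1.0], [0.0, 0.0]])` gives the silent channel zero weight, so only the active channel contributes to the mono output.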
It should be noted that, for simplicity of description, each of the foregoing method embodiments is expressed as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the method according to the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the essence of the technical solution of the present invention, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the methods described in the embodiments of the present invention.
Embodiment 2
According to an embodiment of the present invention, a stuck-frame detection device is also provided. As shown in Figure 12, the device comprises:
1) a detecting unit 1202, configured to perform feature detection on the audio signal under test to obtain the feature value of each frame in the audio signal under test;
Optionally, the stuck-frame detection method provided in this embodiment can be, but is not limited to being, applied to an audio system. As shown in Figure 2, the audio system under test comprises a local test initiating end 202, a remote test receiving end 204, and a test logic server (TestLogic Server) 206. The output of the audio system under test is recorded, and feature analysis and detection are performed on the recorded audio content to obtain the feature value of each frame. Optionally, the audio file in this embodiment may be an audio file with a format header (for example, containing information such as the sampling rate, the number of channels, and the number of bits per sample), and the format of the audio file may include but is not limited to at least one of the following: wav, wma, mp3.
Optionally, in this embodiment, time-domain, time-frequency transform, and frequency-domain feature analysis is performed on the audio signal segment to obtain the feature value of each frame in the corresponding domain, where the feature value of each frame includes but is not limited to at least one of the following: energy envelope value, spectral flux, spectral flatness, spectral skewness, spectral kurtosis.
Optionally, if at least one of the above feature values of the current frame under detection is abnormal, the current frame under detection is determined to be a frame with an abnormal feature value.
For example, as shown in Figure 2, the recording and testing flow for the audio under test comprises:
1) the stuck-frame test App at the local end and the stuck-frame test App at the remote end each log in to the test logic server (TestLogicServer) 206 and stay online;
2) the local test initiating end 202 configures the network delay/jitter/packet loss for this simulation, starts the corresponding delay/packet-loss simulation, and notifies the peer remote test receiving end 204 of the current delay/packet-loss model;
3) the local test initiating end 202 starts playing the audio codebook with loop playback enabled. The output codebook signal is captured and processed by the audio system under test, transmitted to the remote end of the audio system under test and played out, then captured by the remote test App and saved as an audio file with an audio header, in a format such as wav/wma/mp3;
4) after the remote test receiving end 204 has, within the set time, captured the audio sent through the simulated network delay/packet-loss environment, it performs automated stuck-frame analysis on the feature value of each frame of the audio in the recorded file.
2) a search-and-mark unit 1204, configured to search the frames for frame segments whose feature values are abnormal and to mark them, where the mark information of a frame segment includes at least one of the following: the time information of the start frame of the frame segment and the frame length of the frame segment;
Optionally, in this embodiment, after the feature value of each frame of the audio under test has been detected, the frame segments whose feature values are abnormal are marked, and these frame segments are marked as first stuck-frame segments.
Optionally, a frame segment in this embodiment is a segment composed of multiple consecutive frames whose feature values are abnormal.
For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G, and H. After the feature value of each frame is detected, the frame segments found to be abnormal are A, B, C, D, and E; for each of these frame segments, the time information of its start frame (for example, time t) and its frame length (for example, frame length N) are marked.
3) a selecting unit 1206, configured to select, from the frame segments, the frame segments in which stuck frames occur according to whether each frame segment is a silent segment;
Optionally, in this embodiment, the way of determining whether a frame segment in the audio signal is a silent segment includes but is not limited to: performing voice activity detection (VAD, Voice Activity Detection).
Optionally, the frame segments found to be abnormal are further examined to determine whether each is a silent segment, and the frame segments in which stuck frames occur are then selected from them.
Optionally, in this embodiment, the frame segments in which stuck frames occur can be selected from the frame segments marked as first stuck-frame segments, and the selected frame segments are marked as second stuck-frame segments.
4) an output unit 1208, configured to output the mark information of the frame segments in which stuck frames occur.
Optionally, the mark information of the frame segments in which stuck frames occur is output. For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G, and H. After the feature value of each frame is detected, the abnormal frame segments (for example, frame segments with abnormal energy envelope values) are A, B, C, D, and E. Further analysis then shows that frame segments C, D, and E are genuinely valid stuck frames and that frame segments A and B are misjudged. The time information of the start frames of the valid stuck-frame segments C, D, and E (for example, time t) and the frame lengths of these frame segments (for example, frame length N) are output.
Figure 15 shows the mark information of the frame segments in which stuck frames occur. The file "Summery_KaDuninfo" lists 6 audio files (WavFile), the number of stuck-frame segments occurring in each audio file within 5 minutes (5Min_KaDunTImes), and the total duration of stuck frames in each audio file (5MinContinousKaSeconds). Taking the audio file "6.wav" as an example, 7 stuck-frame segments occurred within 5 minutes, with a total duration of 0.76 s.
In addition, the file "6_KaDuninfo" in Figure 15 shows the detailed stuck-frame information of the audio file "6.wav": the sequence number of each stuck-frame segment (KaDunNo), the timestamp of the start frame of each stuck-frame segment (KaPos[Min:Seconds]), the total number of frames in each stuck-frame segment (ContinousKaFrames (Frames/20ms), where each frame lasts 20 ms), and the duration of each stuck-frame segment (ContinousKaSeconds). Taking the stuck-frame segment with sequence number 1 as an example, the timestamp of its start frame is 53.439999 s, the segment has 10 frames, and the total duration of the 10 frames is 0.200000 s. Figure 15 also shows the detailed stuck-frame information of the audio files "5.wav" and "4.wav", which is not described further here.
With the embodiment provided by this application, the feature values of the audio signal are extracted and detected, the abnormal frame segments are marked, the valid stuck frames are selected after further judgment, and the mark information of the stuck-frame segments is then output, so that stuttering frame segments in the audio of an audio communication system are detected accurately and efficiently.
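The marking and reporting steps can be sketched as follows. This is a minimal Python illustration, assuming a per-frame list of abnormal flags as input and the 20 ms frame duration mentioned for Figure 15; the function names `mark_segments` and `summarize` are hypothetical.

```python
def mark_segments(abnormal_flags, frame_ms=20):
    """Group consecutive abnormal frames into (start_seconds, frame_count) pairs."""
    segments, start = [], None
    # A trailing False acts as a sentinel that closes a run ending at the last frame.
    for index, flag in enumerate(list(abnormal_flags) + [False]):
        if flag and start is None:
            start = index
        elif not flag and start is not None:
            segments.append((start * frame_ms / 1000.0, index - start))
            start = None
    return segments

def summarize(segments, frame_ms=20):
    """Return (number of segments, total duration in seconds)."""
    total_frames = sum(frames for _, frames in segments)
    return len(segments), total_frames * frame_ms / 1000.0
```

A segment of 10 frames at 20 ms per frame therefore contributes 0.2 s to the total, in line with the first entry described for "6.wav".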
As an optional scheme, as shown in Figure 13, the selecting unit 1206 comprises:
1) a first judging module 1302, configured to: when a frame segment is a silent segment, judge whether the frame segment belonging to a silent segment satisfies a first stuck-frame condition; when the frame segment belonging to a silent segment is judged not to satisfy the first stuck-frame condition, determine that it is not a frame segment in which stuck frames occur; and when the frame segment belonging to a silent segment is judged to satisfy the first stuck-frame condition, determine that it is a frame segment in which stuck frames occur.
Optionally, the first stuck-frame condition in this embodiment includes but is not limited to at least one of the following: the number of stuck frames, a natural silence condition, an audio hit condition, and a sharp-drop/time-domain truncation condition.
For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G, and H, the frame segments judged to be silent segments are A, B, C, D, and E, and it is judged whether the number of stuck frames in each of A, B, C, D, and E satisfies a preset threshold condition (for example, the frame count is greater than M).
1) If, among the silent frame segments A, B, C, D, and E, the number of stuck frames in frame segments D and E does not satisfy the first stuck-frame condition (for example, the frame count is less than or equal to M), frame segments D and E are determined not to be frame segments in which stuck frames occur and are not marked as second stuck-frame segments.
2) If, among the silent frame segments A, B, C, D, and E, the number of stuck frames in frame segments A, B, and C satisfies the first stuck-frame condition (for example, the frame count is greater than M), frame segments A, B, and C are determined to be frame segments in which stuck frames occur and are marked as second stuck-frame segments.
It should be noted that the discrimination ability of the human ear is limited and the window applied to each frame is on the order of milliseconds. When the number of consecutive stuck frames is too small, the resulting extremely short audio region is difficult to perceive subjectively, so such stuck frames can be ignored.
With the embodiment provided by this application, a refined judgment is made on the frame segments belonging to silent segments to determine whether the first stuck-frame condition is satisfied, so that the stuck-frame segments that can be perceived in the audio communication system are obtained accurately.
As an optional scheme, the first judging module 1302 comprises:
1) a first judging submodule, configured to judge whether the frame count of the frame segment belonging to a silent segment is greater than a first predetermined threshold; when the frame count is greater than the first predetermined threshold, determine that the frame segment belonging to a silent segment satisfies the first stuck-frame condition; and when the frame count is less than or equal to the first predetermined threshold, determine that the frame segment belonging to a silent segment does not satisfy the first stuck-frame condition.
Optionally, the setting of the first predetermined threshold in this embodiment is related to the human ear's ability to recognize stuttering in audio; in actual evaluation, the first predetermined threshold can be obtained by training or determined according to the strictness level of product quality.
For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G, and H, the frame segments judged to be silent segments are A, B, C, D, and E, and it is judged whether the number of stuck frames in each of A, B, C, D, and E is greater than the first predetermined threshold (for example, the frame count is greater than M).
1) If, among the silent frame segments A, B, C, D, and E, the number of stuck frames in frame segments A, B, and C is greater than the first predetermined threshold (for example, the frame count is greater than M), the silent frame segments A, B, and C are determined to satisfy the first stuck-frame condition.
2) If, among the silent frame segments A, B, C, D, and E, the number of stuck frames in frame segments D and E is less than or equal to the first predetermined threshold (for example, the frame count is less than or equal to M), the silent frame segments D and E are determined not to satisfy the first stuck-frame condition.
With the embodiment provided by this application, a threshold is set on the frame count of stuck-frame segments, which allows the stuck-frame segments that the human ear can recognize in the audio system to be selected more accurately.
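The frame-count check against the first predetermined threshold amounts to a simple filter, sketched below in Python. The function name and the frame counts in the usage note are hypothetical and serve only to mirror the A to E example.

```python
def segments_over_threshold(frame_counts, first_threshold):
    """Keep only the segments whose frame count exceeds the threshold M."""
    return {
        name: frames
        for name, frames in frame_counts.items()
        if frames > first_threshold  # strictly greater; equal counts are dropped
    }
```

With illustrative counts `{"A": 12, "B": 9, "C": 8, "D": 3, "E": 5}` and a threshold of 5, segments A, B, and C remain and D and E are discarded.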
As an optional scheme, the first judging module 1302 comprises:
1) a detecting submodule, configured to detect the characteristic parameters of the frame segment belonging to a silent segment;
Optionally, the characteristic parameters in this embodiment include but are not limited to at least one of the following: the length, energy, and mean of the current silent segment.
For example, as shown in Figure 5, characteristic-parameter detection is performed on the length, energy, and mean of the current frame segment that has been judged to belong to a silent segment in the audio signal under test.
As another example, the audio signal comprises eight frame segments A, B, C, D, E, F, G, and H, the frame segments judged to be silent segments are A, B, C, D, and E, and the characteristic parameters (for example, the length, energy, and mean of the current frame segment) of the silent frame segments A, B, C, D, and E are detected.
2) a second judging submodule, configured to judge, according to the detection result of the detecting submodule, whether the frame segment belonging to a silent segment satisfies the natural silence condition of the first stuck-frame condition; and when the frame segment belonging to a silent segment satisfies the natural silence condition, determine that the frame segment does not satisfy the first stuck-frame condition.
Optionally, the first stuck-frame condition in this embodiment includes but is not limited to: natural silence.
Optionally, whether the frame segment belonging to a silent segment satisfies the natural silence condition is judged according to the above detection result.
It should be noted that not every silent segment is a stuck frame. In a voice call, some silences between exchanges are natural pauses; such natural silence does not mean that audio stuttering has occurred and is therefore not treated as a valid stuck frame (for example, a second stuck-frame segment).
For example, as shown in Figure 5, among the silent frame segments A, B, C, D, and E, frame segment E satisfies the natural silence condition in the first stuck-frame condition; that is, the silence of frame segment E is normal silence, and frame segment E is not marked as a second stuck-frame segment.
With the embodiment provided by this application, judging whether the silent frame segments in the audio signal satisfy the natural silence condition excludes the cases in which natural silence is misjudged as stuck frames, so that the stuck frames in the audio signal are obtained more effectively and accurately.
As an optional scheme, the first judging module 1302 comprises:
1) a third judging submodule, configured to: when the frame segment belonging to a silent segment does not satisfy the natural silence condition, judge whether it satisfies the audio hit condition in the first stuck-frame condition;
Optionally, the first stuck-frame condition in this embodiment includes but is not limited to: the audio hit condition.
For example, as shown in Figure 5, among the silent frame segments A, B, C, D, and E, those that do not satisfy the natural silence condition are A, B, C, and D, and it is judged whether frame segments A, B, C, and D satisfy the audio hit condition of the first stuck-frame condition.
It should be noted that an audio hit is caused by a sound hit. If the sound shows no hit phenomenon, the frame segment may not be a valid stuck-frame segment of the audio system (for example, a second stuck-frame segment), so it is necessary to judge the audio hit condition for the audio under test.
2) a fourth judging submodule, configured to: when the frame segment belonging to a silent segment satisfies the audio hit condition, judge whether the frame count of the frame segment satisfying the audio hit condition is greater than a second predetermined threshold;
Optionally, the setting of the second predetermined threshold in this embodiment is also related to the human ear's ability to recognize stuttering in audio; in actual evaluation, the second predetermined threshold can be obtained by training or determined according to the strictness level of product quality.
For example, as shown in Figure 5, the audio signal comprises eight frame segments A, B, C, D, E, F, G, and H. Among the silent frame segments A, B, C, D, and E, those that do not satisfy the natural silence condition are A, B, C, and D, and of these the frame segments found to satisfy the audio hit condition are A and B. It is then judged whether the number of stuck frames in frame segments A and B is greater than the second predetermined threshold (for example, a frame count of P).
Optionally, when the frame count is greater than the second predetermined threshold, the frame segment satisfying the audio hit condition is determined to satisfy the first stuck-frame condition; when the frame count is less than or equal to the second predetermined threshold, the frame segment satisfying the audio hit condition is determined not to satisfy the first stuck-frame condition.
For example, among the silent frame segments A, B, C, D, and E, those that do not satisfy the natural silence condition are A, B, C, and D, and of these the frame segments found to satisfy the audio hit condition are A and B. If the frame count of frame segment B is found to be greater than the second predetermined threshold (for example, greater than P), frame segment B is determined to satisfy the first stuck-frame condition and is marked as a second stuck-frame segment. If the frame count of frame segment A is found to be less than or equal to the second predetermined threshold (for example, less than or equal to P), frame segment A is determined not to satisfy the first stuck-frame condition and is not marked as a second stuck-frame segment.
With the embodiment provided by this application, the silent frame segments in the audio signal are checked for audio hits and their frame counts are then checked against the threshold setting, so that the stuck frames in the audio signal are obtained more effectively and accurately.
As an optional scheme, the first judging module 1302 comprises:
1) a fifth judging submodule, configured to: when the frame segment belonging to a silent segment does not satisfy the audio hit condition, judge whether it satisfies the sharp-drop/time-domain truncation condition in the first stuck-frame condition; and when the frame segment belonging to a silent segment does not satisfy the sharp-drop/time-domain truncation condition, determine that the frame segment does not satisfy the first stuck-frame condition;
Optionally, the first stuck-frame condition in this embodiment includes but is not limited to: the sharp-drop/time-domain truncation condition.
For example, as shown in Figure 5, among the silent frame segments A, B, C, D, and E, those that do not satisfy the natural silence condition are A, B, C, and D. Of these, the frame segments found not to satisfy the audio hit condition of the first stuck-frame condition are C and D, and it is then judged whether frame segments C and D satisfy the sharp-drop/time-domain truncation condition.
It should be noted that a sharp drop/time-domain truncation is caused by abrupt truncation in the time domain. If the silence of a frame segment is caused neither by an audio hit nor by a sharp drop/time-domain truncation, the frame segment may not be a valid stuck-frame segment of the audio system (for example, a second stuck-frame segment), so it is necessary to judge the sharp-drop/time-domain truncation condition for the audio under test.
As another example, as shown in Figure 5, the sharp-drop/time-domain truncation condition is judged for frame segments C and D, which do not satisfy the audio hit condition in the first stuck-frame condition. Frame segment D is found not to satisfy the sharp-drop/time-domain truncation condition and is not marked as a second stuck-frame segment.
2) a sixth judging submodule, configured to: when the frame segment belonging to a silent segment satisfies the sharp-drop/time-domain truncation condition, judge whether the frame count of the frame segment satisfying the sharp-drop/time-domain truncation condition is greater than a third predetermined threshold; when the frame count is greater than the third predetermined threshold, determine that the frame segment satisfying the sharp-drop/time-domain truncation condition satisfies the first stuck-frame condition; and when the frame count is less than or equal to the third predetermined threshold, determine that the frame segment satisfying the sharp-drop/time-domain truncation condition does not satisfy the first stuck-frame condition.
Optionally, the setting of the third predetermined threshold in this embodiment is also related to the human ear's ability to recognize stuttering in audio; in actual evaluation, the third predetermined threshold can be obtained by training or determined according to the strictness level of product quality.
For example, as shown in Figure 5, the sharp-drop/time-domain truncation condition is judged for frame segments C and D, which do not satisfy the audio hit condition in the first stuck-frame condition. Frame segment C is found to satisfy the sharp-drop/time-domain truncation condition, and it is judged whether the number of stuck frames in frame segment C is greater than the third predetermined threshold (for example, a frame count of Q).
As another example, as shown in Figure 5, the number of stuck frames in frame segment C, which satisfies the sharp-drop/time-domain truncation condition, is judged. If the frame count of frame segment C is found to be greater than the third predetermined threshold (for example, greater than Q), frame segment C is determined to satisfy the first stuck-frame condition and is marked as a second stuck-frame segment; if the frame count of frame segment C is found to be less than or equal to the third predetermined threshold (for example, less than or equal to Q), frame segment C is determined not to satisfy the first stuck-frame condition and is not marked as a second stuck-frame segment.
With the embodiment provided by this application, the silent frame segments in the audio signal are checked for sharp drops/time-domain truncation and their frame counts are then checked against the threshold setting, so that the stuck frames in the audio signal are obtained more effectively and accurately.
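The judgments described for silent frame segments can be read as one decision cascade. The following minimal Python sketch assumes that the sub-conditions are chained in the order natural silence, audio hit, sharp drop/time-domain truncation, as the Figure 5 examples suggest. The `SilentSegment` class, its field names, and the threshold values are hypothetical, and the boolean flags stand in for detectors that the text does not specify.

```python
from dataclasses import dataclass

@dataclass
class SilentSegment:
    frames: int            # number of frames in the segment
    natural_silence: bool  # result of the natural silence check
    audio_hit: bool        # result of the audio hit check
    sharp_drop: bool       # result of the sharp-drop/time-domain truncation check

def is_stuck_silent_segment(seg, hit_threshold, drop_threshold):
    """Return True if a silent segment should be marked as a stuck-frame segment."""
    if seg.natural_silence:
        return False                        # natural pauses are never stuck frames
    if seg.audio_hit:
        return seg.frames > hit_threshold   # second predetermined threshold (P)
    if seg.sharp_drop:
        return seg.frames > drop_threshold  # third predetermined threshold (Q)
    return False                            # neither cause applies
```

In terms of the running example, a segment such as B (audio hit, frame count above P) or C (sharp drop, frame count above Q) is marked, while A (audio hit but too short), D (neither cause), and E (natural silence) are not.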
As an optional scheme, as shown in Figure 13, the selecting unit 1206 comprises:
1) a second judging module 1304, configured to: when a frame segment is not a silent segment, judge whether the frame segment satisfies a second stuck-frame condition; when the frame segment does not satisfy the second stuck-frame condition, determine that it is not a frame segment in which stuck frames occur; and when the frame segment satisfies the second stuck-frame condition, determine that it is a frame segment in which stuck frames occur.
Optionally, as shown in Figure 5, the second stuck-frame condition in this embodiment includes but is not limited to judgments of the correlation and periodicity of audio features, for example, a stress condition and a magnetized/mechanical sound condition.
For example, the audio signal comprises eight frame segments A, B, C, D, E, F, G, and H, the frame segments judged not to be silent segments are F, G, and H, and it is judged whether frame segments F, G, and H are stress.
As another example, as shown in Figure 5, if frame segments G and H among the non-silent frame segments F, G, and H do not satisfy the second stuck-frame condition (for example, frame segments G and H are found not to be stress and their magnetized/mechanical sound component does not exceed a preset ratio), frame segments G and H are determined not to be frame segments in which stuck frames occur and are not marked as second stuck-frame segments.
As another example, as shown in Figure 5, if frame segment F among the non-silent frame segments F, G, and H satisfies the second stuck-frame condition (for example, frame segment F is found to be stress and its number of stuck frames satisfies the condition of being recognizable by the human ear), frame segment F is determined to be a frame segment in which stuck frames occur and is marked as a second stuck-frame segment.
With the embodiment provided by this application, the frame segments that do not belong to silent segments are judged against the second stuck-frame condition, so that non-silent frame segments are discriminated and the stuck-frame segments that can be perceived in the audio communication system are obtained accurately.
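The judgment for non-silent frame segments can be sketched in the same way. This minimal Python example assumes that a segment qualifies when it is stress or when its magnetized/mechanical sound component exceeds a preset ratio, and that the frame count must then exceed a threshold to be audible; the function name, parameter names, and numeric values are hypothetical.

```python
def is_stuck_non_silent_segment(frames, is_stress, mechanical_ratio,
                                preset_ratio, frame_threshold):
    """Return True if a non-silent segment should be marked as a stuck-frame segment."""
    # Either sound condition is enough to keep the segment as a candidate.
    candidate = is_stress or mechanical_ratio > preset_ratio
    # A candidate only counts if it is long enough for the human ear to notice.
    return candidate and frames > frame_threshold
```

Under these assumptions, a segment like F (stress, long enough) is marked, while segments like G and H (no stress, mechanical component within the preset ratio) are not.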
As an optional scheme, the second judging module 1304 comprises:
1) a seventh judging submodule, configured to judge whether the frame segment satisfies the stress condition of the second stuck-frame condition;
Optionally, the second stuck-frame condition in this embodiment includes but is not limited to: the stress condition and the magnetized/mechanical sound condition.
For example, as shown in Figure 5, after the frame segments F, G, and H are found not to belong to silent segments, it is judged whether these frame segments satisfy the stress condition in the second stuck-frame condition.
2) an eighth judging submodule, configured to: when the frame segment does not satisfy the stress condition, judge whether the frame segment satisfies the magnetized/mechanical sound condition in the second stuck-frame condition; and when the frame segment does not satisfy the magnetized/mechanical sound condition in the second stuck-frame condition, determine that the frame segment does not satisfy the second stuck-frame condition.
For example, as shown in Figure 5, if frame segments G and H are found not to satisfy the stress condition, it is judged whether frame segments G and H satisfy the magnetized/mechanical sound condition in the second stuck-frame condition, that is, whether the magnetized/mechanical sound component of frame segments G and H exceeds a preset ratio.
It should be noted that, as shown in Figure 5, a non-silent frame segment that satisfies neither the stress condition nor the magnetized/mechanical sound condition is not a genuinely valid stuck-frame segment but a misjudged frame segment, and is therefore not treated as a valid stuck frame (for example, a second stuck-frame segment).
For example, as shown in Figure 5, if frame segment H, one of the frame segments G and H that do not satisfy the stress condition in the second stuck-frame condition, also does not satisfy the magnetized/mechanical sound condition in the second stuck-frame condition (that is, the magnetized/mechanical sound component of frame segment H does not exceed the preset ratio), frame segment H is determined not to satisfy the second stuck-frame condition and is not marked as a second stuck-frame segment.
With the embodiment provided by this application, a refined discrimination is made on the frame segments that do not belong to silent segments by judging whether they satisfy the stress condition and the magnetized/mechanical sound condition in the second stuck-frame condition, so that non-silent frame segments are discriminated and the stuck-frame segments that can be perceived in the audio communication system are obtained accurately.
As an optional scheme, the second judging module 1304 comprises:
1) a ninth judging submodule, configured to judge, when the frame section meets the stress condition or meets the magnetization/mechanical sound condition, whether the number of frames in the frame section is greater than a fourth predetermined threshold; to determine, when the number of frames is greater than the fourth predetermined threshold, that the frame section meets the second card-frame condition; and to determine, when the number of frames is less than or equal to the fourth predetermined threshold, that the frame section does not meet the second card-frame condition.
Optionally, in this embodiment the setting of the fourth predetermined threshold is also related to the ability of the human ear to recognize stuttering in audio; in an actual assessment, the fourth predetermined threshold may be obtained by training or determined according to the strictness level of the product quality requirements.
For example, the audio signal comprises eight frame sections A, B, C, D, E, F, G and H, and the frame section determined to meet the stress condition or the magnetization/mechanical sound condition is G; it is then judged whether the number of card frames in frame section G is greater than the fourth predetermined threshold (for example, the fourth predetermined threshold is S).
As another example, if the number of card frames in frame section G is greater than the fourth predetermined threshold, for example greater than S, frame section G is determined to meet the second card-frame condition and is marked as a second card-frame section; if the number of card frames in frame section G is less than or equal to the fourth predetermined threshold, for example less than or equal to S, frame section G is determined not to meet the second card-frame condition and is not marked as a second card-frame section.
In the embodiment provided by this application, for the frame sections in the audio signal that do not belong to a quiet section and that meet the stress condition or the magnetization/mechanical sound condition, it is further judged whether the number of frames satisfies the threshold setting, so that the card frames in the audio signal are obtained more effectively and accurately.
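The judgment order described above for frame sections outside a quiet section (stress condition, then magnetization/mechanical sound condition, then the fourth predetermined threshold S) can be sketched as follows. This is only an illustrative sketch: the `FrameSection` type, its boolean fields, the function name and the sample values are assumptions introduced here, and the way the stress and magnetization/mechanical sound conditions are actually evaluated is not shown.

```python
from dataclasses import dataclass


@dataclass
class FrameSection:
    """Hypothetical container for one marked frame section outside a quiet section."""
    name: str
    num_frames: int                     # number of frames in the section
    is_stress: bool                     # assumed result of the stress condition
    is_magnetized_or_mechanical: bool   # assumed result of the magnetization/mechanical sound condition


def meets_second_condition(section: FrameSection, fourth_threshold: int) -> bool:
    """Return True if the section is treated as a second card-frame section."""
    # Neither the stress condition nor the magnetization/mechanical sound
    # condition holds: the section is regarded as a misjudged section.
    if not (section.is_stress or section.is_magnetized_or_mechanical):
        return False
    # Otherwise the frame count must be strictly greater than the threshold S.
    return section.num_frames > fourth_threshold


# Illustrative values only (not taken from the patent).
sections = [
    FrameSection("F", 12, True, False),
    FrameSection("G", 3, False, True),
    FrameSection("H", 20, False, False),
]
S = 5
second_card_frame_sections = [s.name for s in sections if meets_second_condition(s, S)]
print(second_card_frame_sections)  # prints ['F']
```

With these sample values, section G meets the magnetization/mechanical sound condition but is too short, and section H meets neither condition, so only F is kept.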
As an optional scheme, as shown in Fig. 14, the search-and-mark unit 1204 comprises:
1) a marking module 1402, configured to, when each of multiple consecutive frames among the frames has at least one feature value outside its corresponding threshold range, mark the frame section formed by these consecutive frames as a frame section in which an abnormality occurs, wherein the threshold ranges corresponding to the respective feature values are the same or different.
For example, when frame sections whose feature values are abnormal are searched for and marked among the frames, multiple consecutive frames in each of which at least one feature value is outside its corresponding threshold range are found, and the frame section formed by these consecutive frames is marked as a frame section whose feature values are abnormal.
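The marking step can be illustrated with a minimal sketch that scans per-frame feature values and groups consecutive out-of-range frames into sections, recording the start time and length of each section as label information. The function name, the dictionary layout, the 20 ms frame duration and the minimum run length of two frames are assumptions made for this example.

```python
def mark_abnormal_sections(frames, ranges, frame_ms=20, min_frames=2):
    """Group consecutive abnormal frames into labelled frame sections.

    frames: list of dicts mapping a feature name to its value for one frame.
    ranges: dict mapping a feature name to its (low, high) threshold range.
    A frame is abnormal if at least one feature lies outside its range.
    """
    def is_abnormal(frame):
        return any(not (ranges[name][0] <= value <= ranges[name][1])
                   for name, value in frame.items())

    sections = []
    start = None
    # A trailing None acts as a sentinel that closes a run reaching the end.
    for index, frame in enumerate(list(frames) + [None]):
        abnormal = frame is not None and is_abnormal(frame)
        if abnormal and start is None:
            start = index
        elif not abnormal and start is not None:
            length = index - start
            if length >= min_frames:
                sections.append({"start_ms": start * frame_ms, "length": length})
            start = None
    return sections


# Illustrative ranges and feature values only.
ranges = {"envelope": (-60.0, 0.0), "flux": (0.0, 1.0)}
frames = [
    {"envelope": -20.0, "flux": 0.1},
    {"envelope": -80.0, "flux": 0.1},   # envelope out of range
    {"envelope": -20.0, "flux": 2.0},   # flux out of range
    {"envelope": -20.0, "flux": 0.1},
    {"envelope": -90.0, "flux": 0.2},   # isolated abnormal frame, not a section
]
marked = mark_abnormal_sections(frames, ranges)
```

Each feature has its own range here, which corresponds to the case in which the threshold ranges of the feature values differ.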
As an optional scheme, the feature values comprise at least one of: energy envelope value, spectral flux, spectral flatness, spectral skewness, and spectral kurtosis.
Optionally, in this embodiment the above feature values may be computed as follows:
1) Energy envelope value, used to represent the variation of the short-time energy of the audio, wherein the applied window function comprises at least one of: rectangular window, Hamming window, Hanning window, triangular window, Abbado Lay window. The window function of the rectangular window is expressed as follows:
$$ w(n) = \begin{cases} 1, & 0 \le n < N \\ 0, & \text{otherwise} \end{cases} \tag{12} $$
The k-th frame of the windowed audio signal is $X_k(n) = w(n) \cdot x(kN + n)$, where N is the number of audio samples in the time window of each frame. The average energy of the k-th frame signal $X_k(n)$ is denoted E(k) and is computed as follows:
$$ E(k) = \frac{1}{N} \sum_{n=1}^{N-1} X_k(n) \cdot X_k(n) \tag{13} $$
The envelope is obtained by taking the square root of the audio energy signal, normalizing it, and then taking the logarithm; it is a marker of the variation of the short-time energy of the audio. The envelope of the k-th frame of the audio signal is denoted Env(k) and is given by:
$$ \mathrm{Env}(k) = 20 \log_{10} \left( \frac{\sqrt{\frac{1}{N} \sum_{n=1}^{N-1} X_k(n) \cdot X_k(n)}}{32768} \right) \tag{14} $$
2) Spectral flux, used to reflect the dynamic characteristics of the audio signal; it can be obtained as the 2-norm of the difference between the normalized spectral vectors of two adjacent frames, as expressed by the following formula:
$$ v_{SF}(n) = \frac{\sqrt{\sum_{k=0}^{\kappa/2-1} \left( |X(k,n)| - |X(k,n-1)| \right)^{2}}}{\kappa/2} \tag{15} $$
Here $0 \le v_{SF}(n) \le A$, where A is the preset decision threshold for the spectral amplitude. A small $v_{SF}(n)$ indicates that adjacent audio signals are relatively stationary, or alternatively that the level of the input signal is low. When the audio signal undergoes a non-stationary abrupt change, $v_{SF}(n)$ rises sharply and reaches the abnormality threshold.
3) Spectral flatness, used to identify the pause or abrupt change caused by a card frame; it can be obtained by the following formula:
$$ v_{Tf}(n) = \frac{\sqrt[\kappa/2]{\prod_{k=0}^{\kappa/2-1} |X(k,n)|}}{\frac{2}{\kappa} \sum_{k=0}^{\kappa/2-1} |X(k,n)|} = \frac{\exp \left( \frac{2}{\kappa} \sum_{k=0}^{\kappa/2-1} \log |X(k,n)| \right)}{\frac{2}{\kappa} \sum_{k=0}^{\kappa/2-1} |X(k,n)|} \tag{16} $$
In a smooth audio region, $v_{Tf}(n)$ is small and the signal is more likely to be tonal; at a pause or abrupt change caused by a card frame, $v_{Tf}(n)$ rises sharply, forming a spike that exceeds the abnormality threshold.
4) Spectral skewness, used to characterize the symmetry of the probability density function (PDF) of the audio signal distribution; it can be obtained by dividing the third-order central moment of the audio signal statistics by the cube of the standard deviation, as expressed by the following formula:
$$ v_{Sk}(n) = \frac{1}{\sigma_x^{3}(n) \cdot \kappa} \sum_{i=i_s(n)}^{i_e(n)} \left( x(i) - \mu_x(n) \right)^{3} \tag{17} $$
Here $\mu_x(n)$ is the mean of the signal over one frame, and $\sigma_x(n)$ is the corresponding standard deviation.
5) Spectral kurtosis, used to characterize the non-Gaussianity of the PDF of the audio signal distribution, that is, the flatness of the distribution of the input signal values compared with a Gaussian distribution; it can be obtained by dividing the fourth-order central moment of the audio signal statistics by the fourth power of the standard deviation, as expressed by the following formula:
$$ v_{K}(n) = \frac{1}{\sigma_x^{4}(n) \cdot \kappa} \sum_{i=i_s(n)}^{i_e(n)} \left( x(i) - \mu_x(n) \right)^{4} - 3 \tag{18} $$
Here $\mu_x(n)$ is the mean of the signal over one frame, and $\sigma_x(n)$ is the corresponding standard deviation.
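As an illustration of features 1) to 3), the following sketch evaluates formulas (13) to (16) on small frames in pure Python. It makes several assumptions that are not stated in the text: a rectangular window, a naive DFT in place of an FFT, energy summed over every sample of the frame, a small `eps` guard against taking the logarithm of zero, and illustrative function names.

```python
import cmath
import math


def envelope_db(frame, full_scale=32768.0):
    """Envelope as in formula (14): 20*log10 of the frame RMS over full scale."""
    mean_energy = sum(x * x for x in frame) / len(frame)  # mean energy, cf. formula (13)
    rms = math.sqrt(mean_energy)
    return 20.0 * math.log10(max(rms, 1e-12) / full_scale)  # guard against log10(0)


def half_spectrum(frame):
    """Magnitudes |X(k)| for k = 0 .. K/2 - 1, computed with a naive DFT."""
    size = len(frame)
    return [abs(sum(x * cmath.exp(-2j * cmath.pi * k * n / size)
                    for n, x in enumerate(frame)))
            for k in range(size // 2)]


def spectral_flux(mags, prev_mags):
    """Formula (15): 2-norm of the magnitude difference divided by K/2."""
    squared = sum((a - b) ** 2 for a, b in zip(mags, prev_mags))
    return math.sqrt(squared) / len(mags)


def spectral_flatness(mags, eps=1e-12):
    """Formula (16): geometric mean of the magnitudes over their arithmetic mean."""
    log_mean = sum(math.log(m + eps) for m in mags) / len(mags)
    arithmetic_mean = sum(mags) / len(mags)
    return math.exp(log_mean) / (arithmetic_mean + eps)


# A pure tone has a peaky spectrum (flatness near 0); an impulse has a flat one.
tone = half_spectrum([math.sin(2 * math.pi * 2 * n / 16) for n in range(16)])
impulse = half_spectrum([1.0] + [0.0] * 15)
flux_between = spectral_flux(impulse, tone)
```

In a real detector these values would be computed per frame and compared with the threshold ranges of the respective features.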
It should be noted that the spectral skewness and the spectral kurtosis are computed for each frame of the audio signal, and these two feature values characterize the degree of distortion of the audio signal. An audio frame whose spectral skewness and spectral kurtosis are lower than the preset decision thresholds is classified as a distorted audio frame according to the spectral distortion measure.
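Formulas (17) and (18) and the distortion check described above can be sketched as follows. The function names and the two threshold parameters are hypothetical, the statistics are taken over the time-domain samples of one frame, and returning zeros for a constant frame (where the standard deviation is zero and the formulas are undefined) is a convention chosen only for this example.

```python
import math


def skewness_and_kurtosis(frame):
    """Return (skewness, excess kurtosis) of one frame, cf. formulas (17) and (18)."""
    count = len(frame)
    mean = sum(frame) / count
    variance = sum((x - mean) ** 2 for x in frame) / count
    if variance == 0.0:
        # Constant frame: the formulas are undefined; zeros are an assumed convention.
        return 0.0, 0.0
    std = math.sqrt(variance)
    third_moment = sum((x - mean) ** 3 for x in frame) / count
    fourth_moment = sum((x - mean) ** 4 for x in frame) / count
    return third_moment / std ** 3, fourth_moment / std ** 4 - 3.0


def is_distorted(frame, skew_threshold, kurt_threshold):
    """Flag a frame whose skewness and kurtosis are both below the given thresholds."""
    skew, kurt = skewness_and_kurtosis(frame)
    return skew < skew_threshold and kurt < kurt_threshold


# A symmetric two-level frame has zero skewness and an excess kurtosis of -2.
skew, kurt = skewness_and_kurtosis([-1.0, 1.0, -1.0, 1.0])
```

The thresholds passed to `is_distorted` stand in for the preset decision thresholds, whose values are not given in the text.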
Optionally, in this embodiment, the audio signal is also preprocessed before the card frame detection method is performed, wherein the preprocessing includes but is not limited to: DC removal, normalization, and channel separation.
Optionally, for the above preprocessing method, reference may be made to the processing described in Embodiment 1.
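The three preprocessing steps named above can be sketched as follows. The actual procedure is the one described in Embodiment 1 and may differ; the interleaved sample layout, the order of the steps, peak normalization as the normalization method, and all function names are assumptions made for this example.

```python
def separate_channels(interleaved, num_channels):
    """Split interleaved samples (L, R, L, R, ...) into one list per channel."""
    return [list(interleaved[c::num_channels]) for c in range(num_channels)]


def remove_dc(samples):
    """Remove the DC component by subtracting the mean of the samples."""
    mean = sum(samples) / len(samples)
    return [x - mean for x in samples]


def normalize_peak(samples):
    """Scale the samples so that the largest absolute value becomes 1."""
    peak = max(abs(x) for x in samples)
    if peak == 0:
        return list(samples)  # all-zero input is left unchanged
    return [x / peak for x in samples]


# Illustrative stereo input: channel separation, then DC removal, then normalization.
left, right = separate_channels([1, 10, 3, 10, 5, 10], 2)
left_ready = normalize_peak(remove_dc(left))
right_ready = normalize_peak(remove_dc(right))
```

Here the right channel is a constant signal, so it becomes all zeros after DC removal and is left unchanged by the normalization step.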
The serial numbers of the above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the above embodiments of the present invention, the description of each embodiment has its own emphasis; for parts not described in detail in one embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed client may be implemented in other ways. The device embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be realized through some interfaces, and the indirect coupling or communication connection between units or modules may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a portable hard disk, a magnetic disk or an optical disc.
The above are only preferred embodiments of the present invention. It should be pointed out that those of ordinary skill in the art may make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (22)

1. A card frame detection method, characterized by comprising:
performing feature detection on an audio signal to be tested to obtain feature values of each frame in the audio signal to be tested;
searching the frames for, and marking, frame sections whose feature values are abnormal, wherein the label information of a frame section comprises at least one of: time information of the start frame of the frame section and the frame length of the frame section;
selecting, from the frame sections, the frame sections in which a card frame occurs according to whether each frame section is a quiet section;
outputting the label information of the frame sections in which a card frame occurs.
2. The method according to claim 1, characterized in that selecting, from the frame sections, the frame sections in which a card frame occurs according to whether each frame section is a quiet section comprises:
if the frame section is a quiet section, judging whether the frame section belonging to the quiet section meets a first card-frame condition;
if the frame section belonging to the quiet section does not meet the first card-frame condition, determining that the frame section belonging to the quiet section is not a frame section in which a card frame occurs;
if the frame section belonging to the quiet section meets the first card-frame condition, determining that the frame section belonging to the quiet section is a frame section in which a card frame occurs.
3. The method according to claim 2, characterized in that judging whether the frame section belonging to the quiet section meets the first card-frame condition comprises:
judging whether the number of frames of the frame section belonging to the quiet section is greater than a first predetermined threshold;
if the number of frames is greater than the first predetermined threshold, determining that the frame section belonging to the quiet section meets the first card-frame condition; if the number of frames is less than or equal to the first predetermined threshold, determining that the frame section belonging to the quiet section does not meet the first card-frame condition.
4. The method according to claim 2, characterized in that determining that the frame section belonging to the quiet section is not a frame section in which a card frame occurs comprises:
detecting characteristic parameters of the frame section belonging to the quiet section;
judging, according to the detection result, whether the frame section belonging to the quiet section meets a natural quiet condition in the first card-frame condition;
if the frame section belonging to the quiet section meets the natural quiet condition, determining that the frame section does not meet the first card-frame condition.
5. The method according to claim 4, characterized in that, after judging, according to the detection result, whether the frame section belonging to the quiet section meets the natural quiet condition in the first card-frame condition, the method further comprises:
if the frame section belonging to the quiet section does not meet the natural quiet condition, judging whether the frame section belonging to the quiet section meets an audio hit condition in the first card-frame condition;
if the frame section belonging to the quiet section meets the audio hit condition, judging whether the number of frames of the frame section meeting the audio hit condition is greater than a second predetermined threshold;
if the number of frames is greater than the second predetermined threshold, determining that the frame section meeting the audio hit condition meets the first card-frame condition; if the number of frames is less than or equal to the second predetermined threshold, determining that the frame section meeting the audio hit condition does not meet the first card-frame condition.
6. The method according to claim 5, characterized in that, after judging whether the frame section belonging to the quiet section meets the audio hit condition in the first card-frame condition, the method further comprises:
if the frame section belonging to the quiet section does not meet the audio hit condition, judging whether the frame section belonging to the quiet section meets a sharp drop/time-domain truncation condition in the first card-frame condition;
if the frame section belonging to the quiet section does not meet the sharp drop/time-domain truncation condition, determining that the frame section not meeting the sharp drop/time-domain truncation condition meets the first card-frame condition;
if the frame section belonging to the quiet section meets the sharp drop/time-domain truncation condition, judging whether the number of frames of the frame section meeting the sharp drop/time-domain truncation condition is greater than a third predetermined threshold;
if the number of frames is greater than the third predetermined threshold, determining that the frame section meeting the sharp drop/time-domain truncation condition meets the first card-frame condition; if the number of frames is less than or equal to the third predetermined threshold, determining that the frame section meeting the sharp drop/time-domain truncation condition does not meet the first card-frame condition.
7. The method according to claim 1, characterized in that selecting, from the frame sections, the frame sections in which a card frame occurs according to whether each frame section is a quiet section comprises:
if the frame section is not a quiet section, judging whether the frame section meets a second card-frame condition;
if the frame section does not meet the second card-frame condition, determining that the frame section is not a frame section in which a card frame occurs;
if the frame section meets the second card-frame condition, determining that the frame section is a frame section in which a card frame occurs.
8. The method according to claim 7, characterized in that judging whether the frame section meets the second card-frame condition comprises:
judging whether the frame section meets a stress condition in the second card-frame condition;
if the frame section does not meet the stress condition, judging whether the frame section meets a magnetization/mechanical sound condition in the second card-frame condition;
if the frame section does not meet the magnetization/mechanical sound condition in the second card-frame condition, determining that the frame section does not meet the second card-frame condition.
9. The method according to claim 8, characterized in that, if the frame section meets the stress condition or meets the magnetization/mechanical sound condition, the method further comprises:
judging whether the number of frames of the frame section is greater than a fourth predetermined threshold;
if the number of frames is greater than the fourth predetermined threshold, determining that the frame section meets the second card-frame condition; if the number of frames is less than or equal to the fourth predetermined threshold, determining that the frame section does not meet the second card-frame condition.
10. The method according to any one of claims 1 to 9, characterized in that searching the frames for, and marking, frame sections whose feature values are abnormal comprises: if each of multiple consecutive frames among the frames has at least one feature value outside its corresponding threshold range, marking the frame section formed by the consecutive frames as a frame section whose feature values are abnormal, wherein the threshold ranges corresponding to the respective feature values are the same or different.
11. The method according to claim 10, characterized in that the feature values comprise at least one of: energy envelope value, spectral flux, spectral flatness, spectral skewness, and spectral kurtosis.
12. A card frame detection device, characterized by comprising:
a detecting unit, configured to perform feature detection on an audio signal to be tested to obtain feature values of each frame in the audio signal to be tested;
a search-and-mark unit, configured to search the frames for, and mark, frame sections whose feature values are abnormal, wherein the label information of a frame section comprises at least one of: time information of the start frame of the frame section and the frame length of the frame section;
a selecting unit, configured to select, from the frame sections, the frame sections in which a card frame occurs according to whether each frame section is a quiet section;
an output unit, configured to output the label information of the frame sections in which a card frame occurs.
13. The device according to claim 12, characterized in that the selecting unit comprises:
a first judging module, configured to judge, when the frame section is a quiet section, whether the frame section belonging to the quiet section meets a first card-frame condition; to determine, when the frame section belonging to the quiet section does not meet the first card-frame condition, that the frame section belonging to the quiet section is not a frame section in which a card frame occurs; and to determine, when the frame section belonging to the quiet section meets the first card-frame condition, that the frame section belonging to the quiet section is a frame section in which a card frame occurs.
14. The device according to claim 13, characterized in that the first judging module comprises:
a first judging submodule, configured to judge whether the number of frames of the frame section belonging to the quiet section is greater than a first predetermined threshold; to determine, when the number of frames is greater than the first predetermined threshold, that the frame section belonging to the quiet section meets the first card-frame condition; and to determine, when the number of frames is less than or equal to the first predetermined threshold, that the frame section belonging to the quiet section does not meet the first card-frame condition.
15. The device according to claim 13, characterized in that the first judging module comprises:
a detection submodule, configured to detect characteristic parameters of the frame section belonging to the quiet section;
a second judging submodule, configured to judge, according to the detection result of the detection submodule, whether the frame section belonging to the quiet section meets a natural quiet condition in the first card-frame condition, and to determine, when the frame section belonging to the quiet section meets the natural quiet condition, that the frame section does not meet the first card-frame condition.
16. The device according to claim 15, characterized in that the first judging module comprises:
a third judging submodule, configured to judge, when the frame section belonging to the quiet section does not meet the natural quiet condition, whether the frame section belonging to the quiet section meets an audio hit condition in the first card-frame condition;
a fourth judging submodule, configured to judge, when the frame section belonging to the quiet section meets the audio hit condition, whether the number of frames of the frame section meeting the audio hit condition is greater than a second predetermined threshold; to determine, when the number of frames is greater than the second predetermined threshold, that the frame section meeting the audio hit condition meets the first card-frame condition; and to determine, when the number of frames is less than or equal to the second predetermined threshold, that the frame section meeting the audio hit condition does not meet the first card-frame condition.
17. The device according to claim 16, characterized in that the first judging module comprises:
a fifth judging submodule, configured to judge, when the frame section belonging to the quiet section does not meet the audio hit condition, whether the frame section belonging to the quiet section meets a sharp drop/time-domain truncation condition in the first card-frame condition, and to determine, when the frame section belonging to the quiet section does not meet the sharp drop/time-domain truncation condition, that the frame section not meeting the sharp drop/time-domain truncation condition meets the first card-frame condition;
a sixth judging submodule, configured to judge, when the frame section belonging to the quiet section meets the sharp drop/time-domain truncation condition, whether the number of frames of the frame section meeting the sharp drop/time-domain truncation condition is greater than a third predetermined threshold; to determine, when the number of frames is greater than the third predetermined threshold, that the frame section meeting the sharp drop/time-domain truncation condition meets the first card-frame condition; and to determine, when the number of frames is less than or equal to the third predetermined threshold, that the frame section meeting the sharp drop/time-domain truncation condition does not meet the first card-frame condition.
18. The device according to claim 12, characterized in that the selecting unit comprises:
a second judging module, configured to judge, when the frame section is not a quiet section, whether the frame section meets a second card-frame condition; to determine, when the frame section does not meet the second card-frame condition, that the frame section is not a frame section in which a card frame occurs; and to determine, when the frame section meets the second card-frame condition, that the frame section is a frame section in which a card frame occurs.
19. The device according to claim 18, characterized in that the second judging module comprises:
a seventh judging submodule, configured to judge whether the frame section meets a stress condition in the second card-frame condition;
an eighth judging submodule, configured to judge, when the frame section does not meet the stress condition, whether the frame section meets a magnetization/mechanical sound condition in the second card-frame condition, and to determine, when the frame section does not meet the magnetization/mechanical sound condition in the second card-frame condition, that the frame section does not meet the second card-frame condition.
20. The device according to claim 19, characterized in that the second judging module comprises:
a ninth judging submodule, configured to judge, when the frame section meets the stress condition or meets the magnetization/mechanical sound condition, whether the number of frames of the frame section is greater than a fourth predetermined threshold; to determine, when the number of frames is greater than the fourth predetermined threshold, that the frame section meets the second card-frame condition; and to determine, when the number of frames is less than or equal to the fourth predetermined threshold, that the frame section does not meet the second card-frame condition.
21. The device according to any one of claims 12 to 20, characterized in that the search-and-mark unit comprises:
a marking module, configured to, when each of multiple consecutive frames among the frames has at least one feature value outside its corresponding threshold range, mark the frame section formed by the consecutive frames as a frame section whose feature values are abnormal, wherein the threshold ranges corresponding to the respective feature values are the same or different.
22. The device according to claim 21, characterized in that the feature values comprise at least one of: energy envelope value, spectral flux, spectral flatness, spectral skewness, and spectral kurtosis.
CN201410036425.8A 2014-01-24 2014-01-24 card frame detection method and device Active CN104123949B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410036425.8A CN104123949B (en) 2014-01-24 2014-01-24 card frame detection method and device

Publications (2)

Publication Number Publication Date
CN104123949A true CN104123949A (en) 2014-10-29
CN104123949B CN104123949B (en) 2015-08-12

Family

ID=51769336

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410036425.8A Active CN104123949B (en) 2014-01-24 2014-01-24 card frame detection method and device

Country Status (1)

Country Link
CN (1) CN104123949B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106847307A (en) * 2016-12-21 2017-06-13 广州酷狗计算机科技有限公司 Signal detecting method and device
CN109346061A (en) * 2018-09-28 2019-02-15 腾讯音乐娱乐科技(深圳)有限公司 Audio-frequency detection, device and storage medium
CN110430425A (en) * 2019-07-31 2019-11-08 北京奇艺世纪科技有限公司 A kind of video fluency determines method, apparatus, electronic equipment and medium
CN111770413A (en) * 2020-06-30 2020-10-13 浙江大华技术股份有限公司 Multi-sound-source sound mixing method and device and storage medium
CN112802453A (en) * 2020-12-30 2021-05-14 深圳飞思通科技有限公司 Method, system, terminal and storage medium for fast self-adaptive prediction fitting voice
CN113496705A (en) * 2021-08-19 2021-10-12 杭州华橙软件技术有限公司 Audio processing method and device, storage medium and electronic equipment

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000341322A (en) * 1999-05-25 2000-12-08 Nippon Telegr & Teleph Corp <Ntt> Stream information distributor
CN102610228A (en) * 2011-01-19 2012-07-25 上海弘视通信技术有限公司 Audio exception event detection system and calibration method for the same
CN103475906A (en) * 2012-06-08 2013-12-25 华为技术有限公司 Measuring method and measuring device for multimedia flows

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李维,杨付正: "《考虑包内容特性的网络语音质量评价模型》", 《西安电子科技大学学报(自然科学版)》, vol. 38, no. 2, 30 April 2011 (2011-04-30), pages 1 *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106847307A (en) * 2016-12-21 2017-06-13 广州酷狗计算机科技有限公司 Signal detecting method and device
CN106847307B (en) * 2016-12-21 2020-07-10 广州酷狗计算机科技有限公司 Signal detection method and device
CN109346061A (en) * 2018-09-28 2019-02-15 腾讯音乐娱乐科技(深圳)有限公司 Audio-frequency detection, device and storage medium
CN109346061B (en) * 2018-09-28 2021-04-20 腾讯音乐娱乐科技(深圳)有限公司 Audio detection method, device and storage medium
CN110430425A (en) * 2019-07-31 2019-11-08 北京奇艺世纪科技有限公司 A kind of video fluency determines method, apparatus, electronic equipment and medium
CN110430425B (en) * 2019-07-31 2021-02-05 北京奇艺世纪科技有限公司 Video fluency determination method and device, electronic equipment and medium
CN111770413A (en) * 2020-06-30 2020-10-13 浙江大华技术股份有限公司 Multi-sound-source sound mixing method and device and storage medium
CN111770413B (en) * 2020-06-30 2021-08-27 浙江大华技术股份有限公司 Multi-sound-source sound mixing method and device and storage medium
CN112802453A (en) * 2020-12-30 2021-05-14 深圳飞思通科技有限公司 Method, system, terminal and storage medium for fast self-adaptive prediction fitting voice
CN112802453B (en) * 2020-12-30 2024-04-26 深圳飞思通科技有限公司 Fast adaptive prediction voice fitting method, system, terminal and storage medium
CN113496705A (en) * 2021-08-19 2021-10-12 杭州华橙软件技术有限公司 Audio processing method and device, storage medium and electronic equipment
CN113496705B (en) * 2021-08-19 2024-03-08 杭州华橙软件技术有限公司 Audio processing method and device, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN104123949B (en) 2015-08-12

Similar Documents

Publication Publication Date Title
CN104123949B (en) card frame detection method and device
Ahmed et al. Void: A fast and light voice liveness detection system
US8005675B2 (en) Apparatus and method for audio analysis
KR20080059246A (en) Neural network classifier for separating audio sources from a monophonic audio signal
CN109065069B (en) Audio detection method, device, equipment and storage medium
Paul et al. Countermeasure to handle replay attacks in practical speaker verification systems
JP2005532582A (en) Method and apparatus for assigning acoustic classes to acoustic signals
EP2927906B1 (en) Method and apparatus for detecting voice signal
CN102394062A (en) Method and system for automatically identifying voice recording equipment source
CN103632680A (en) Speech quality assessment method, network element and system
CN105513598A (en) Playback voice detection method based on distribution of information quantity in frequency domain
CN108364656B (en) Feature extraction method and device for voice playback detection
CN111835784A (en) Data generalization method and system for replay attack detection system
CN102915740B (en) Phonetic empathy Hash content authentication method capable of implementing tamper localization
Vieira et al. A speech quality classifier based on tree-cnn algorithm that considers network degradations
CN111161746B (en) Voiceprint registration method and system
Kim et al. Enhanced perceptual model for non-intrusive speech quality assessment
CN116884427A (en) Embedded vector processing method based on end-to-end deep learning voice re-etching model
JP4761391B2 (en) Listening quality evaluation method and apparatus
CN110556114A (en) Speaker identification method and device based on attention mechanism
CN111161759B (en) Audio quality evaluation method and device, electronic equipment and computer storage medium
Borrelli et al. Automatic reliability estimation for speech audio surveillance recordings
Elizalde et al. Detection of robocall and spam calls using acoustic features of incoming voicemails
Pop et al. On forensic speaker recognition case pre-assessment
Hajipour et al. Listening to sounds of silence for audio replay attack detection

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant