CN108074586A - Method and device for locating voice problems - Google Patents

Method and device for locating voice problems

Info

Publication number
CN108074586A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611013656.2A
Other languages
Chinese (zh)
Other versions
CN108074586B (en)
Inventor
王威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Academy of Telecommunications Technology CATT
Original Assignee
China Academy of Telecommunications Technology CATT
Application filed by China Academy of Telecommunications Technology (CATT)
Priority to CN201611013656.2A
Publication of CN108074586A
Application granted
Publication of CN108074586B
Legal status: Active

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/60: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals

Landscapes

  • Engineering & Computer Science (AREA)
  • Quality & Reliability (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The present invention provides a method and device for locating voice problems. The method may include: obtaining voice data to be analyzed and parsing it to obtain parsed voice data; establishing, for each speech frame in the parsed voice data, a correspondence that includes source frame number information; searching the parsed voice data for problem frames, that is, frames in which a voice problem exists; and outputting result information that includes the source frame number correspondence of the problem frames. Embodiments of the present invention can quickly locate the frames in which a voice problem exists, which speeds up locating voice problems.

Description

Method and device for locating voice problems
Technical field
The present invention relates to the field of communication technology, and in particular to a method and device for locating voice problems.
Background
When analyzing voice problems on a terminal, a large amount of data often has to be examined before the voice problem can be located, that is, found. Take narrowband speech as an example: the sampling frequency is 8000 Hz and one frame lasts 20 milliseconds (ms), so one second contains 50 frames. Voice tests are usually measured in hours, and the reported time of a problem is often off by several seconds, tens of seconds, or even a few minutes. As a result, the analyst has to search a few minutes before and after the reported point to locate the voice problem, which means examining thousands to tens of thousands of frames. In addition, voice problems are currently analyzed with local software simulation projects: the bitstream first has to be parsed and separated into different data types, and then a different codec simulation project has to be swapped in and adjusted according to the codec mode and the details of each file. These operations are extremely time-consuming and seriously reduce working efficiency, so locating a voice problem takes a long time. The current process of locating voice problems is therefore too slow.
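The frame counts mentioned in the background can be checked with a little arithmetic. The following is a minimal sketch; the sampling rate and frame length come from the text, while the search window lengths are assumed values chosen only for illustration.

```python
# Narrowband figures taken from the text above.
SAMPLE_RATE_HZ = 8000
FRAME_MS = 20

samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS // 1000   # samples in one 20 ms frame
frames_per_second = 1000 // FRAME_MS                    # frames in one second


def frames_to_search(window_minutes_each_side: float) -> int:
    """Frames covered when searching this many minutes before and after a reported point."""
    seconds = 2 * window_minutes_each_side * 60
    return int(seconds * frames_per_second)


# Assumed windows of 1 and 3 minutes on each side of the reported time.
one_minute = frames_to_search(1)
three_minutes = frames_to_search(3)
```

With these assumed windows the search covers several thousand to well over ten thousand frames, which matches the order of magnitude stated above.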
Summary of the invention
An object of the present invention is to provide a method and device for locating voice problems that solve the problem of the locating process being too slow.
To achieve the above object, an embodiment of the present invention provides a method for locating voice problems, including:
obtaining voice data to be analyzed, and parsing the voice data to be analyzed to obtain parsed voice data;
establishing, for each speech frame in the parsed voice data, a correspondence that includes source frame number information;
searching the parsed voice data for problem frames in which a voice problem exists;
outputting result information that includes the source frame number correspondence of the problem frames.
Optionally, establishing, for each speech frame in the parsed voice data, a correspondence that includes source frame number information includes:
reading the source frame number information of each speech frame in the parsed voice data and the timestamp information of the channel protocol layer, and establishing a correspondence between the source frame number information of each speech frame and the timestamp information of the channel protocol layer; and/or
reading the source frame number information of each speech frame in the parsed voice data and the frame number information of the channel protocol layer, and establishing a correspondence between the source frame number information of each speech frame and the frame number information of the channel protocol layer.
Optionally, searching the parsed voice data for problem frames in which a voice problem exists includes:
searching the parsed voice data for problem frames in which a voice problem exists during a cell handover or channel handover, and marking their positions with a label; and/or
searching for bad frames in the parsed voice data, and marking them with bad frame indication information.
Optionally, the voice data to be analyzed is voice data of a terminal under test after channel decoding and before source decoding, and the method further includes:
selecting a target decoding mode;
decoding the parsed voice data using the target decoding mode to obtain offline pulse code modulation data;
comparing the offline pulse code modulation data with obtained online pulse code modulation data of the terminal under test to determine whether the terminal under test has a decoding problem, where the online pulse code modulation data is the data produced by source decoding on the terminal under test.
Optionally, the voice data to be analyzed includes:
voice data after source coding and before channel coding; or
voice data after channel decoding and before source decoding.
An embodiment of the present invention also provides a device for locating voice problems, including:
a parsing module, configured to obtain voice data to be analyzed and to parse the voice data to be analyzed to obtain parsed voice data;
an establishing module, configured to establish, for each speech frame in the parsed voice data, a correspondence that includes source frame number information;
a searching module, configured to search the parsed voice data for problem frames in which a voice problem exists;
an output module, configured to output result information that includes the source frame number correspondence of the problem frames.
Optionally, the establishing module is configured to read the source frame number information of each speech frame in the parsed voice data and the timestamp information of the channel protocol layer, and to establish a correspondence between the source frame number information of each speech frame and the timestamp information of the channel protocol layer; and/or
the establishing module is configured to read the source frame number information of each speech frame in the parsed voice data and the frame number information of the channel protocol layer, and to establish a correspondence between the source frame number information of each speech frame and the frame number information of the channel protocol layer.
Optionally, the searching module is configured to search the parsed voice data for problem frames in which a voice problem exists during a cell handover or channel handover, and to mark their positions with a label; and/or
the searching module is configured to search for bad frames in the parsed voice data and to mark them with bad frame indication information.
Optionally, the voice data to be analyzed is voice data of a terminal under test after channel decoding and before source decoding, and the device further includes:
a selecting module, configured to select a target decoding mode;
a decoding module, configured to decode the parsed voice data using the target decoding mode to obtain offline pulse code modulation data;
an analysis module, configured to compare the offline pulse code modulation data with obtained online pulse code modulation data of the terminal under test to determine whether the terminal under test has a decoding problem, where the online pulse code modulation data is the data produced by source decoding on the terminal under test.
Optionally, the voice data to be analyzed includes:
voice data after source coding and before channel coding; or
voice data after channel decoding and before source decoding.
The above technical solution of the present invention has at least the following advantages:
Embodiments of the present invention obtain voice data to be analyzed and parse it to obtain parsed voice data; establish, for each speech frame in the parsed voice data, a correspondence that includes source frame number information; search the parsed voice data for problem frames in which a voice problem exists; and output result information that includes the source frame number correspondence of the problem frames. Because a correspondence that includes source frame number information is established for each speech frame, the frames in which a voice problem exists can be located quickly through that correspondence, which speeds up locating voice problems.
Brief description of the drawings
Fig. 1 is a schematic diagram of a digital baseband communication system model to which embodiments of the present invention can be applied;
Fig. 2 is a flow diagram of a method for locating voice problems provided by an embodiment of the present invention;
Fig. 3 is a schematic diagram of reported data provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of a speech frame analysis provided by an embodiment of the present invention;
Fig. 5 is a schematic diagram of a speech frame correspondence provided by an embodiment of the present invention;
Fig. 6 is a schematic diagram of another speech frame analysis provided by an embodiment of the present invention;
Fig. 7 is a schematic diagram of another speech frame correspondence provided by an embodiment of the present invention;
Fig. 8 is a schematic diagram of a software module implementation process provided by an embodiment of the present invention;
Fig. 9 is a structural diagram of a device for locating voice problems provided by an embodiment of the present invention;
Fig. 10 is a structural diagram of a device for locating voice problems provided by an embodiment of the present invention;
Fig. 11 is a structural diagram of a device for locating voice problems provided by an embodiment of the present invention.
Detailed description
To make the technical problem to be solved, the technical solution, and the advantages of the present invention clearer, a detailed description is given below with reference to the drawings and specific embodiments.
Referring to Fig. 1, Fig. 1 is a schematic diagram of a digital baseband communication system model to which embodiments of the present invention can be applied. As shown in Fig. 1, the process of transferring voice data from the source to the sink includes: analog-to-digital conversion, source coding, channel coding, transmission, channel decoding, source decoding, and digital-to-analog conversion, where noise from a noise source is often added during transmission. The source can be understood as the originator or transmitting end of the voice data, and the sink as the recipient or receiving end of the voice data.
In addition, in embodiments of the present invention, the source may be a user terminal, for example a mobile phone, computer, home appliance, tablet computer (Tablet Personal Computer), laptop computer (Laptop Computer), personal digital assistant (PDA), mobile Internet device (MID), or wearable device (Wearable Device). Alternatively, the source may be a network-side device, for example a base station or a server. Similarly, the sink may be a user terminal or a network-side device. It should be noted that embodiments of the present invention do not limit the specific types of the source and the sink.
Referring to Fig. 2, an embodiment of the present invention provides a method for locating voice problems. As shown in Fig. 2, the method includes the following steps:
201. Obtain voice data to be analyzed, and parse the voice data to be analyzed to obtain parsed voice data;
202. Establish, for each speech frame in the parsed voice data, a correspondence that includes source frame number information;
203. Search the parsed voice data for problem frames in which a voice problem exists;
204. Output result information that includes the source frame number correspondence of the problem frames.
In embodiments of the present invention, the voice data to be analyzed may be voice data obtained from a terminal under test, where the terminal under test may be the source or the sink. The voice data to be analyzed may also be the voice data at a specific position or in a specific time period. For example, a start frame number and an end frame number can be used to obtain the voice data at a specific position on the terminal under test, or a start time and an end time can be used to obtain the voice data in a specific time period on the terminal under test.
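Selecting data by frame number range or by time range can be sketched as a simple filter. This is an illustrative sketch only: the `SpeechFrame` type, its field names, and the use of milliseconds are assumptions, not details given in the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SpeechFrame:
    frame_no: int   # frame number of the speech frame
    time_ms: int    # time of the frame in milliseconds (hypothetical field)


def select_frames(
    frames: List[SpeechFrame],
    start_frame: Optional[int] = None,
    end_frame: Optional[int] = None,
    start_ms: Optional[int] = None,
    end_ms: Optional[int] = None,
) -> List[SpeechFrame]:
    """Keep the frames that satisfy every bound that was given (bounds are inclusive)."""
    selected = []
    for frame in frames:
        if start_frame is not None and frame.frame_no < start_frame:
            continue
        if end_frame is not None and frame.frame_no > end_frame:
            continue
        if start_ms is not None and frame.time_ms < start_ms:
            continue
        if end_ms is not None and frame.time_ms > end_ms:
            continue
        selected.append(frame)
    return selected


# Ten 20 ms frames, numbered 0 to 9.
frames = [SpeechFrame(n, n * 20) for n in range(10)]
by_number = [f.frame_no for f in select_frames(frames, start_frame=3, end_frame=5)]
by_time = [f.frame_no for f in select_frames(frames, start_ms=100, end_ms=140)]
```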
After the parsed voice data is obtained, the source frame number information of each speech frame can be read, and a correspondence that includes source frame number information can be established for each speech frame. The source frame number information may be the source frame number of the speech frame, for example the frame number of the source pulse code modulation (PCM) data. The source frame number may be the frame number assigned when the source generates the voice signal; it corresponds to the source, so a source-side problem can be located quickly through it. A correspondence that includes source frame number information can be understood as any correspondence through which the source frame number information can be looked up, for example a correspondence between source frame number information and time, or between source frame number information and the frame number information of the channel protocol layer. Other correspondences through which the source frame number information can be looked up are also possible, and embodiments of the present invention do not limit this.
In embodiments of the present invention, problem frames in the parsed voice data can be found by analyzing each speech frame, for example by analyzing the power spectrum of each speech frame or by analyzing the voice quality of each speech frame; embodiments of the present invention do not limit the method. The voice problem may be choppy speech, dropped words, unclear speech, or a similar problem, and embodiments of the present invention do not limit this either. For example, Fig. 3 shows the sample data around the frame numbers at the reported problem time; the data may include the source frame number, the sample values, and the reported problem time point. The analysis of the data in Fig. 3 is shown in Fig. 4. From Fig. 4, the speech frames at the problem point can be located directly from the sample values, which greatly shortens the time needed to locate a problem among thousands to tens of thousands of frames of voice data and improves working efficiency.
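One simple way to flag suspicious frames from their sample values is sketched below. The patent names power spectrum and voice quality analysis without fixing a criterion; this sketch substitutes a plain RMS energy check as a stand-in, and the threshold and sample values are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence


def frame_rms(samples: Sequence[int]) -> float:
    """Root-mean-square amplitude of one frame of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def find_suspect_frames(frames: Dict[int, Sequence[int]], rms_floor: float = 1.0) -> List[int]:
    """Return the source frame numbers whose energy falls below the (assumed) floor."""
    return sorted(no for no, samples in frames.items() if frame_rms(samples) < rms_floor)


# Hypothetical data: source frame number -> sample values (shortened to 4 samples per frame).
frames = {
    100: [120, -80, 60, -40],
    101: [0, 0, 0, 0],        # a silent gap in the middle of speech
    102: [90, -70, 30, -10],
}
suspects = find_suspect_frames(frames)
```

A real analysis would use full frames (160 samples at 8000 Hz and 20 ms) and a criterion suited to the problem being investigated.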
After the problem frames are found, result information that includes the source frame number correspondence of the problem frames can be output. The output may contain only the result information of the problem frames, or the result information of all speech frames; this is not limited. The manner of output includes, but is not limited to, display and printing.
It should be noted that embodiments of the present invention do not limit the execution order of step 202 and step 203; for example, they may be performed simultaneously or one after the other.
Through the above steps, the correspondence can be used to quickly locate the frames in which a voice problem exists, which speeds up locating voice problems.
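Steps 201 to 204 can be sketched end to end as follows. This is a minimal sketch under stated assumptions: the record layout, the `Frame` fields, and the use of a caller-supplied check in step 203 are all hypothetical and stand in for whatever parsing and analysis an actual implementation performs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass
class Frame:
    source_frame_no: int        # frame number assigned at the source
    channel_timestamp_ms: int   # channel protocol layer timestamp (hypothetical unit)
    bfi: int                    # bad frame indication, 1 means bad


def parse(records: Iterable[Tuple[int, int, int]]) -> List[Frame]:
    """Step 201: turn raw records into parsed speech frames."""
    return [Frame(*record) for record in records]


def build_correspondence(frames: List[Frame]) -> Dict[int, Frame]:
    """Step 202: index every frame by its source frame number."""
    return {frame.source_frame_no: frame for frame in frames}


def find_problem_frames(frames: List[Frame], is_problem: Callable[[Frame], bool]) -> List[int]:
    """Step 203: collect the source frame numbers of frames flagged by the check."""
    return [frame.source_frame_no for frame in frames if is_problem(frame)]


def report(problem_nos: List[int], table: Dict[int, Frame]) -> List[str]:
    """Step 204: describe each problem frame together with its correspondence."""
    return [
        f"source frame {no} <-> channel timestamp {table[no].channel_timestamp_ms} ms"
        for no in problem_nos
    ]


frames = parse([(10, 200, 0), (11, 220, 1), (12, 240, 0)])
table = build_correspondence(frames)
lines = report(find_problem_frames(frames, lambda f: f.bfi == 1), table)
```

Because steps 202 and 203 only read the parsed frames, they can run in either order, as noted above.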
Optionally, establishing, for each speech frame in the parsed voice data, a correspondence that includes source frame number information includes:
reading the source frame number information of each speech frame in the parsed voice data and the timestamp information of the channel protocol layer, and establishing a correspondence between the source frame number information of each speech frame and the timestamp information of the channel protocol layer; and/or
reading the source frame number information of each speech frame in the parsed voice data and the frame number information of the channel protocol layer, and establishing a correspondence between the source frame number information of each speech frame and the frame number information of the channel protocol layer.
The timestamp information of the channel protocol layer may be the timestamp at which the voice signal undergoes channel coding or channel decoding. The correspondence here is one-to-one: each speech frame has one correspondence between its source frame number information and the timestamp information of the channel protocol layer. The reported time point of a problem can therefore be mapped, through the timestamp information, to the frame number of the corresponding source problem point, so that the problem point is located quickly.
The frame number information of the channel protocol layer may be the frame number at which the voice signal undergoes channel coding or channel decoding. This correspondence is also one-to-one. Through the correspondence between the source frame number information of a speech frame and the frame number information of the channel protocol layer, problem points in the channel protocol layer and problem points at the source can be aligned exactly. When a problem is first noticed at the source but the source analysis finds nothing wrong, the exact corresponding position can be handed to the channel side for further analysis. This links the source and the channel and quickly improves the accuracy of judging where the problem lies.
It should be noted that in embodiments of the present invention both correspondences can be established, that is, the "and" case of the "and/or" above, which makes locating problems more convenient and faster. For example, as shown in Fig. 5, a correspondence that includes the source frame number, the frame number of the channel protocol layer, and the timestamp information of the channel protocol layer can be established for each speech frame. The correspondence may also include the sample points, the codec mode, and the bad frame indication (BFI), where a BFI of 1 indicates a bad frame. Other information may also be included, and embodiments of the present invention do not limit this.
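A per-frame correspondence of this kind, and the lookup from a reported time to a source frame number, can be sketched as follows. The row layout mirrors the fields named in the text; the field names, the millisecond unit, and the sample values are assumptions for illustration. The lookup uses the standard library `bisect` module.

```python
import bisect
from typing import List, NamedTuple


class Row(NamedTuple):
    source_frame_no: int
    channel_frame_no: int
    channel_timestamp_ms: int   # hypothetical unit
    codec: str
    bfi: int                    # 1 means bad frame


def nearest_row_by_time(rows: List[Row], reported_ms: int) -> Row:
    """Find the row whose channel timestamp is closest to the reported time.

    rows must be non-empty and sorted by channel_timestamp_ms.
    """
    if not rows:
        raise ValueError("no rows to search")
    times = [row.channel_timestamp_ms for row in rows]
    i = bisect.bisect_left(times, reported_ms)
    candidates = rows[max(i - 1, 0): i + 1]   # the neighbours on either side of the insertion point
    return min(candidates, key=lambda row: abs(row.channel_timestamp_ms - reported_ms))


# Illustrative rows, one per speech frame.
rows = [
    Row(500, 1376846, 10000, "AMR", 0),
    Row(501, 1376850, 10020, "AMR", 0),
    Row(502, 1376854, 10040, "AMR", 1),
]
hit = nearest_row_by_time(rows, 10025)
```

Since each speech frame has exactly one row, the same table also answers the reverse question: given a source frame number, which channel frame number and timestamp should the channel-side analysis start from.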
Optionally, in this embodiment, searching the parsed voice data for problem frames in which a voice problem exists includes:
searching the parsed voice data for problem frames in which a voice problem exists during a cell handover or channel handover, and marking their positions with a label; and/or
searching for bad frames in the parsed voice data, and marking them with bad frame indication information.
A cell handover or channel handover can be identified by analyzing changes in the channel protocol layer frame number of the speech frames. For example, as shown in Fig. 5, the channel protocol layer frame number of one speech frame is 1376854 and that of the next speech frame is 217701. From the channel protocol layer frame numbers of these two frames, it can be determined that they are the speech frames at which a cell handover or channel handover took place. If these two frames have a voice problem, they are the problem frames; alternatively, the two frames can be treated as problem frames directly. When the terminal is moving, a cell handover or channel handover occurs, and it may cause choppy speech, dropped words, or unclear speech. Some information also changes when a handover occurs, so a label indicating the position can be provided automatically from that handover information, and the channel timestamp and source frame number can be clearly matched. The problem is thus analyzed and located quickly and automatically, and most problems of this kind can be quickly filtered out, which improves the efficiency of problem location and analysis.
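Detecting a handover from a discontinuity in the channel protocol layer frame number can be sketched as below. This is only an illustration: the expected step between consecutive speech frames (`max_step`) is an assumed parameter, legitimate frame number wraparound is ignored, and only the two frame numbers 1376854 and 217701 come from the text.

```python
from typing import List, Tuple


def find_frame_number_jumps(rows: List[Tuple[int, int]], max_step: int = 8) -> List[Tuple[int, int]]:
    """Return pairs of source frame numbers that sit on either side of a channel frame number jump.

    rows holds (source_frame_no, channel_frame_no) in transmission order.
    A jump is any step that goes backwards, stays put, or exceeds max_step (an assumed tolerance).
    """
    jumps = []
    for (src_a, ch_a), (src_b, ch_b) in zip(rows, rows[1:]):
        delta = ch_b - ch_a
        if delta <= 0 or delta > max_step:
            jumps.append((src_a, src_b))
    return jumps


# Illustrative sequence containing the discontinuity from the Fig. 5 example.
rows = [
    (700, 1376846),
    (701, 1376850),
    (702, 1376854),
    (703, 217701),
    (704, 217705),
]
jumps = find_frame_number_jumps(rows)
```

Each returned pair can then be labelled in the correspondence table so that the frames around the handover are easy to find.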
Bad frames in the parsed voice data can be found by performing bad frame analysis on each speech frame. In most cases, voice quality problems such as choppy speech, dropped words, and unclear speech are caused by bad frames; for example, frames with such voice quality problems may be defined as bad frames. Specifically, as shown in Fig. 5, a bad frame indication can be added to the correspondence of every frame, so that the bad frame indications of all codec modes are matched to the channel timestamp and the source frame number. This allows the quality of each frame of data to be checked when analyzing voice quality and channel conditions.
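Collecting bad frames from the bad frame indication can be sketched as follows. The grouping of consecutive bad frames into runs is an added convenience, not something the patent specifies, and the input layout is an assumption.

```python
from typing import List, Optional, Tuple


def bad_frame_runs(rows: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Group consecutive bad frames into (first, last) source frame number ranges.

    rows holds (source_frame_no, bfi) in order; bfi == 1 marks a bad frame.
    """
    runs: List[Tuple[int, int]] = []
    start: Optional[int] = None
    last: Optional[int] = None
    for source_frame_no, bfi in rows:
        if bfi == 1:
            if start is None:
                start = source_frame_no
            last = source_frame_no
        elif start is not None:
            runs.append((start, last))
            start = last = None
    if start is not None:   # close a run that reaches the end of the data
        runs.append((start, last))
    return runs


runs = bad_frame_runs([(1, 0), (2, 1), (3, 1), (4, 0), (5, 1)])
```

Long runs are a natural place to look first, since a burst of bad frames is more likely to be audible than an isolated one.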
In embodiments of the present invention, the information shown in Fig. 5 makes it possible to trace a problem further to its cause. In this example, the channel frame number changes abruptly during this period, which shows that the terminal performed a cell handover at that time, and there are many bad frames, which makes the speech at the problem point severely choppy. If a deeper understanding of the cause is wanted, the channel timestamp corresponding to the source frame number of the problem can be used to continue the analysis in the corresponding channel protocol layer. The problem position is exact, so the problem can be analyzed and located quickly.
In addition, in embodiments of the present invention, a cell handover or channel handover can also be identified by analyzing changes in the codec mode of the speech frames. Take the ring-back tone noise shown in Fig. 6 as an example: from the per-frame correspondence information shown in Fig. 7, it can be determined that the speech frames switched from the adaptive multi-rate codec (AMR) to the enhanced full rate codec (EFR), and therefore that a channel handover accompanied by bad frames caused the noise in the voice. Many voice problems of this kind can be analyzed and handled quickly. It can also be determined quickly whether a problem is caused by the source layer or by the channel network layer, which clarifies the origin of the problem and speeds up locating and resolving it.
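Spotting a change of codec mode between consecutive frames can be sketched as below. The input layout and the codec labels as plain strings are assumptions for illustration.

```python
from typing import List, Tuple


def find_codec_changes(rows: List[Tuple[int, str]]) -> List[Tuple[int, str, str]]:
    """Return (source_frame_no, old_codec, new_codec) for each frame where the codec mode changes.

    rows holds (source_frame_no, codec_mode) in order.
    """
    changes = []
    for (_, codec_a), (src_b, codec_b) in zip(rows, rows[1:]):
        if codec_a != codec_b:
            changes.append((src_b, codec_a, codec_b))
    return changes


changes = find_codec_changes([(40, "AMR"), (41, "AMR"), (42, "EFR"), (43, "EFR")])
```

Combining these change points with the bad frame indication of the surrounding frames gives the kind of evidence described for the Fig. 6 and Fig. 7 example.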
Optionally, the voice data to be analyzed is voice data of the terminal under test after channel decoding and before source decoding, and the method further includes:
selecting a target decoding mode;
decoding the parsed voice data using the target decoding mode to obtain offline pulse code modulation data;
comparing the offline pulse code modulation data with obtained online pulse code modulation data of the terminal under test to determine whether the terminal under test has a decoding problem, where the online pulse code modulation data is the data produced by source decoding on the terminal under test.
In this embodiment, the decoding mode used to decode the speech frames can be selected flexibly. The target decoding mode may be the full rate codec (FR), the half rate codec (HR), the enhanced full rate codec (EFR), the narrowband adaptive multi-rate codec (AMR-NB), the wideband adaptive multi-rate codec (AMR-WB), or another decoding mode. For example, when the coding mode of the speech frames is FR, the FR decoding mode can be selected to decode them.
After the offline PCM data is obtained, it can be compared with the online PCM data of the terminal under test to determine whether the terminal under test has a decoding problem. For example, when the offline PCM data does not match the online PCM data, it can be determined that the terminal under test has a decoding problem; otherwise, it is determined that the terminal under test has no decoding problem. When it is determined that the terminal under test has no decoding problem, the exact corresponding position can be provided through the above correspondence so that the analysis can continue on the channel side. This links the source and the channel and quickly improves the accuracy of judging where the problem lies.
It should be noted that the data produced by source decoding on the terminal under test may be the data decoded while the terminal is online, that is, the data source-decoded during transmission. When comparing, the offline PCM data and the online PCM data of the same frame can be compared to determine whether the terminal under test has a decoding problem. In this scenario, the terminal under test may be the sink.
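The frame-by-frame comparison of offline and online PCM data can be sketched as follows. The sketch assumes both streams are already available as per-frame sample lists keyed by frame number; the offline decoding itself requires a codec implementation and is not shown. The `tolerance` parameter is a hypothetical addition for decoders that are not bit-exact.

```python
from typing import Dict, List, Sequence


def mismatched_frames(
    offline: Dict[int, Sequence[int]],
    online: Dict[int, Sequence[int]],
    tolerance: int = 0,
) -> List[int]:
    """Return the frame numbers, present in both streams, whose PCM samples differ.

    A frame differs if the lengths differ or any sample pair is further apart than tolerance.
    """
    mismatches = []
    for frame_no in sorted(offline.keys() & online.keys()):
        a, b = offline[frame_no], online[frame_no]
        if len(a) != len(b) or any(abs(x - y) > tolerance for x, y in zip(a, b)):
            mismatches.append(frame_no)
    return mismatches


# Hypothetical data, shortened to two samples per frame.
offline_pcm = {1: [10, 20], 2: [30, 40], 3: [5, 5]}
online_pcm = {1: [10, 20], 2: [30, 41], 4: [0, 0]}
exact = mismatched_frames(offline_pcm, online_pcm)
loose = mismatched_frames(offline_pcm, online_pcm, tolerance=1)
```

An empty result suggests the terminal's decoder behaved as expected for the frames compared, in which case the analysis moves on to the channel side as described above.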
Optionally, in embodiments of the present invention, the voice data to be analyzed may include: voice data after source coding and before channel coding; or voice data after channel decoding and before source decoding.
With the voice data after source coding and before channel coding, problems in that data can be located quickly, and the data can be used to determine whether the terminal under test has an encoding problem, that is, whether the source has an encoding problem or whether the problem lies on the channel side. With the voice data after channel decoding and before source decoding, problems in that data can be located quickly, and the data can be used to determine whether the terminal under test has a decoding problem, that is, whether the sink has a decoding problem or whether the problem lies on the channel side.
Optionally, in the embodiment of the present invention, the above method can be realized by special software, such as:The software can To include:Input-output file module, code stream analyzing and case study module, decoding module and printing display module, wherein, this The realization process of four modules may refer to Fig. 8, and illustrating for module can be as follows:
Input-output file module, which is mainly responsible for, to be imported by control button or preserves corresponding file and type, and is shown Corresponding file path name;Corresponding export file name need not be filled in for code stream analyzing fuction output file, the present invention can be automatic The different files parsed after separating are stored in corresponding input file in the file under path according to numeric order.
The input file type of code stream analyzing analysis has before source decoding data, information source here after data and message sink coding Data can be understood as the voice data after the channel decoding introduced in earlier embodiments and before source decoding before decoding, and believe Data can be understood as the voice data after message sink coding and before channel coding after source code.Wherein, code stream analyzing step can be with Including:
Input file is imported, which can click on button and select corresponding input file;Due to output file meeting There is different type, so output file does not have to operation, the output file of generation can be automatically saved in the same path text of input file In part folder.
Select data type, selected data type can be number after data or message sink coding before source decoding According to;
After data type chooses, it is possible to data are parsed and case study, wherein, and the number that will have been parsed According to being stored in in input file identical file folder, code stream analyzing is completed.Wherein, it is real to may refer to front for parsing and case study The description of mode is applied, is not repeated herein.
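The automatic naming of output files can be sketched as follows. This is only an illustration: the function name `output_paths`, the three-digit numbering and the `.dat` suffix are assumptions made here, since the description only states that the separated files are saved in numerical order in the folder of the input file.

```python
from pathlib import PurePosixPath
from typing import List


def output_paths(input_path: str, stream_count: int, suffix: str = ".dat") -> List[str]:
    """Return numbered output paths located next to the input file.

    The naming pattern (<input stem>_<NNN><suffix>) is a hypothetical choice.
    """
    source = PurePosixPath(input_path)
    return [
        str(source.parent / f"{source.stem}_{index:03d}{suffix}")
        for index in range(1, stream_count + 1)
    ]
```

For an input file `/captures/call1.bin` that separates into two streams, the sketch yields `/captures/call1_001.dat` and `/captures/call1_002.dat`, so the user never has to type an output name.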
The decoding module may implement the following steps:
Select the type of data to be decoded and the name of the file to import;
Specify the name of the output file in which the decoded data is saved;
Click the corresponding button; decoding then completes automatically according to the codec type of the input file.
The basic codec types supported for offline decoding of communication terminals are AMR-WB, AMR-NB, EFR, HR and FR. Other required codec types can also be added to the interface tool of the software to meet the needs of routine work.
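One way to picture decoding "according to the codec type of the input file", with room for adding further types, is a small decoder registry. The sketch below is an assumption about structure only: `register_decoder` and `decode_offline` are hypothetical names, and no real codec is implemented here; an actual tool would register decoders built on the relevant codec libraries.

```python
from typing import Callable, Dict

# A decoder turns an encoded speech payload into raw PCM bytes.
Decoder = Callable[[bytes], bytes]

# Basic codec types named in the description.
BASIC_CODECS = ("AMR-WB", "AMR-NB", "EFR", "HR", "FR")

_decoders: Dict[str, Decoder] = {}


def register_decoder(codec: str, decoder: Decoder) -> None:
    """Add or replace the decoder used for one codec type."""
    _decoders[codec.upper()] = decoder


def decode_offline(codec: str, payload: bytes) -> bytes:
    """Decode a payload with the decoder registered for its codec type."""
    try:
        decoder = _decoders[codec.upper()]
    except KeyError:
        raise ValueError(f"no decoder registered for codec type {codec!r}") from None
    return decoder(payload)
```

Keeping the codec-to-decoder mapping in one table means a new codec type is a single `register_decoder` call rather than a change to the decoding flow.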
With the dedicated software, the present invention can decode an input file offline into PCM data quickly and automatically, and operation is simple. The offline data is then compared with the online PCM data to check whether the codec of the terminal itself is abnormal. If the source-side analysis finds no abnormality, the problem is handed over to the channel side for further analysis, where the parsed correspondence between channel timestamps and source frame numbers makes it easy to find the corresponding problem point on the channel side.
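The comparison of offline and online PCM data can be illustrated with a frame-by-frame byte comparison. The sketch assumes 16-bit samples and 160 samples per frame (20 ms at 8 kHz, as for narrowband codecs) and assumes that a bit-exact match is the expected outcome; the description does not specify the comparison criterion, and `mismatching_frames` is a hypothetical name.

```python
from typing import List


def mismatching_frames(
    offline_pcm: bytes,
    online_pcm: bytes,
    samples_per_frame: int = 160,
    bytes_per_sample: int = 2,
) -> List[int]:
    """Return zero-based indices of frames whose PCM bytes differ.

    Frames present in only one of the two streams count as mismatches.
    """
    frame_bytes = samples_per_frame * bytes_per_sample
    longest = max(len(offline_pcm), len(online_pcm))
    frame_count = -(-longest // frame_bytes)  # ceiling division
    mismatches = []
    for index in range(frame_count):
        span = slice(index * frame_bytes, (index + 1) * frame_bytes)
        if offline_pcm[span] != online_pcm[span]:
            mismatches.append(index)
    return mismatches
```

An empty result suggests that the terminal's own decoding matches the offline reference, in which case the analysis would move on to the channel side as described above.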
The print display module provides printed output for each step of tool operation as well as error prompts, so that the user can check them promptly.
Because it is implemented as dedicated software, the embodiment of the present invention can be used not only by voice specialists but also, for everyday work, by other personnel such as channel protocol layer engineers and testers, and it is simple and convenient to operate. A single installation package allows it to be installed and used on the operating system of the terminal, which also prevents the company's code from leaking and so meets confidentiality requirements.
It should be noted that, in the embodiment of the present invention, the embodiments described above may be implemented in combination with one another or separately; the embodiment of the present invention imposes no limitation on this.
It should be noted that, in the embodiment of the present invention, the above method can be applied to any terminal on which software can be installed, for example smart devices such as computers, laptops and tablet computers.
In the embodiment of the present invention, voice data to be analyzed is obtained and parsed to obtain parsed voice data; a correspondence that includes the source frame number information of each speech frame in the parsed voice data is established; problem frames that have voice problems are searched for in the parsed voice data; and result information that includes the source frame number relationship of the problem frames is output. Because a correspondence that includes the source frame number information of each speech frame is established, frames with voice problems can be located quickly through the correspondence, which increases the speed of locating voice problems.
Referring to Fig. 9, an embodiment of the present invention provides a device for locating a voice problem. As shown in Fig. 9, the voice problem locating device 900 includes the following modules:
Parsing module 901, configured to obtain voice data to be analyzed and to parse the voice data to be analyzed to obtain parsed voice data;
Establishing module 902, configured to establish a correspondence that includes the source frame number information of each speech frame in the parsed voice data;
Searching module 903, configured to search the parsed voice data for problem frames that have voice problems;
Output module 904, configured to output result information that includes the source frame number relationship of the problem frames.
Optionally, establishing module 902 is configured to read the source frame number information of each speech frame in the parsed voice data and the timestamp information of the channel protocol layer, and to establish a correspondence between the source frame number information of each speech frame and the timestamp information of the channel protocol layer; and/or
establishing module 902 is configured to read the source frame number information of each speech frame in the parsed voice data and the frame number information of the channel protocol layer, and to establish a correspondence between the source frame number information of each speech frame and the frame number information of the channel protocol layer.
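As a minimal sketch of such a correspondence, assume that each parsed speech frame already carries its source frame number together with whatever channel protocol layer information was found for it. The names `ParsedFrame`, `ChannelRef` and `build_correspondence` are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ParsedFrame:
    source_frame_no: int                     # frame number on the source (codec) side
    channel_timestamp: Optional[int] = None  # timestamp from the channel protocol layer
    channel_frame_no: Optional[int] = None   # frame number from the channel protocol layer


@dataclass(frozen=True)
class ChannelRef:
    timestamp: Optional[int]
    frame_no: Optional[int]


def build_correspondence(frames: Iterable[ParsedFrame]) -> Dict[int, ChannelRef]:
    """Map each source frame number to its channel-layer timestamp and/or frame number."""
    return {
        frame.source_frame_no: ChannelRef(frame.channel_timestamp, frame.channel_frame_no)
        for frame in frames
    }
```

With the table in place, a problem frame identified by its source frame number can be traced to the channel side with a single dictionary lookup.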
Optionally, searching module 903 is configured to search the parsed voice data for problem frames that have voice problems during cell handover or channel switching, and to mark their positions with labels; and/or
searching module 903 is configured to search for bad frames in the parsed voice data and to mark them with bad frame indication information.
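The search and marking step might look like the following sketch. It assumes that parsing has already attached two flags to every frame, one for "received during a cell handover or channel switch" and one for "bad frame", and it simplifies by labeling every frame that has either flag; the field names and the label strings `HANDOVER` and `BFI` are illustrative assumptions.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class FrameFlags:
    source_frame_no: int
    in_handover: bool = False  # frame falls within a cell handover or channel switch
    bad_frame: bool = False    # frame was indicated as bad


def locate_problem_frames(frames: Iterable[FrameFlags]) -> List[Tuple[int, Tuple[str, ...]]]:
    """Return (source frame number, labels) for every frame that has at least one label."""
    located = []
    for frame in frames:
        labels = []
        if frame.in_handover:
            labels.append("HANDOVER")
        if frame.bad_frame:
            labels.append("BFI")
        if labels:
            located.append((frame.source_frame_no, tuple(labels)))
    return located
```

Because the result is keyed by source frame number, it can be combined directly with the source-to-channel correspondence when the result information is output.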
Optionally, the voice data to be analyzed is the voice data of the terminal under test after channel decoding and before source decoding. As shown in Fig. 10, the device further includes:
Selecting module 905, configured to select a target decoding mode;
Decoding module 906, configured to decode the parsed voice data using the target decoding mode to obtain offline pulse code modulation (PCM) data;
Analysis module 907, configured to compare and analyze the offline PCM data against the obtained online PCM data of the terminal under test, so as to determine whether the terminal under test has a decoding problem, where the online PCM data is the data of the terminal under test after source decoding.
Optionally, output module 904 is further configured to output the analysis result of analysis module 907.
Optionally, the voice data to be analyzed includes:
voice data after source coding and before channel coding; or
voice data after channel decoding and before source decoding.
It should be noted that, in the embodiment of the present invention, the above device can be applied to any terminal on which software can be installed, for example smart devices such as computers, laptops and tablet computers.
It should be noted that, in this embodiment, the voice problem locating device 900 can implement any implementation of the method embodiment shown in Fig. 2 and achieve the same beneficial effects; details are not repeated here.
Referring to Fig. 11, the structure of a device for locating a voice problem is shown. The voice problem locating device includes a processor 1100, a transceiver 1110, a memory 1120, a user interface 1130 and a bus interface, where:
Processor 1100 is configured to read the program in memory 1120 and perform the following process:
obtaining voice data to be analyzed, and parsing the voice data to be analyzed to obtain parsed voice data;
establishing a correspondence that includes the source frame number information of each speech frame in the parsed voice data;
searching the parsed voice data for problem frames that have voice problems;
outputting result information that includes the source frame number relationship of the problem frames.
Transceiver 1110 is configured to send and receive data under the control of processor 1100.
In Fig. 11, the bus architecture may include any number of interconnected buses and bridges, which link together various circuits of one or more processors, represented by processor 1100, and of memory, represented by memory 1120. The bus architecture may also link together various other circuits such as peripheral devices, voltage regulators and power management circuits, all of which are well known in the art and are therefore not described further here. The bus interface provides an interface. Transceiver 1110 may be a plurality of elements, that is, it may include a transmitter and a receiver, and provides a unit for communicating with various other devices over a transmission medium. For different user equipment, user interface 1130 may also be an interface to which required devices can be connected externally or internally; the connected devices include but are not limited to a keypad, a display, a speaker, a microphone and a joystick.
Processor 1100 is responsible for managing the bus architecture and for general processing, and memory 1120 may store the data used by processor 1100 when performing operations.
Optionally, establishing the correspondence that includes the source frame number information of each speech frame in the parsed voice data includes:
reading the source frame number information of each speech frame in the parsed voice data and the timestamp information of the channel protocol layer, and establishing a correspondence between the source frame number information of each speech frame and the timestamp information of the channel protocol layer; and/or
reading the source frame number information of each speech frame in the parsed voice data and the frame number information of the channel protocol layer, and establishing a correspondence between the source frame number information of each speech frame and the frame number information of the channel protocol layer.
Optionally, searching the parsed voice data for problem frames that have voice problems includes:
searching the parsed voice data for problem frames that have voice problems during cell handover or channel switching, and marking their positions with labels; and/or
searching for bad frames in the parsed voice data, and marking them with bad frame indication information.
Optionally, the voice data to be analyzed is the voice data of the terminal under test after channel decoding and before source decoding, and processor 1100 is further configured to:
select a target decoding mode;
decode the parsed voice data using the target decoding mode to obtain offline pulse code modulation (PCM) data;
compare and analyze the offline PCM data against the obtained online PCM data of the terminal under test, so as to determine whether the terminal under test has a decoding problem, where the online PCM data is the data of the terminal under test after source decoding.
Optionally, the voice data to be analyzed includes:
voice data after source coding and before channel coding; or
voice data after channel decoding and before source decoding.
It should be noted that, in this embodiment, the above voice problem locating device can implement any implementation of the method embodiment shown in Fig. 2 and achieve the same beneficial effects; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed method and device may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
The integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform some of the steps of the transceiving method described in the embodiments of the present invention. The storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
The above are preferred embodiments of the present invention. It should be noted that those of ordinary skill in the art may make several improvements and modifications without departing from the principles of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A method for locating a voice problem, characterized by comprising:
obtaining voice data to be analyzed, and parsing the voice data to be analyzed to obtain parsed voice data;
establishing a correspondence that includes source frame number information of each speech frame in the parsed voice data;
searching the parsed voice data for problem frames that have voice problems;
outputting result information that includes the source frame number relationship of the problem frames.
2. The method according to claim 1, characterized in that establishing the correspondence that includes source frame number information of each speech frame in the parsed voice data comprises:
reading the source frame number information of each speech frame in the parsed voice data and timestamp information of a channel protocol layer, and establishing a correspondence between the source frame number information of each speech frame and the timestamp information of the channel protocol layer; and/or
reading the source frame number information of each speech frame in the parsed voice data and frame number information of the channel protocol layer, and establishing a correspondence between the source frame number information of each speech frame and the frame number information of the channel protocol layer.
3. The method according to claim 2, characterized in that searching the parsed voice data for problem frames that have voice problems comprises:
searching the parsed voice data for problem frames that have voice problems during cell handover or channel switching, and marking their positions with labels; and/or
searching for bad frames in the parsed voice data, and marking them with bad frame indication information.
4. The method according to any one of claims 1-3, characterized in that the voice data to be analyzed is voice data of a terminal under test after channel decoding and before source decoding, and the method further comprises:
selecting a target decoding mode;
decoding the parsed voice data using the target decoding mode to obtain offline pulse code modulation (PCM) data;
comparing and analyzing the offline PCM data against obtained online PCM data of the terminal under test, so as to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data of the terminal under test after source decoding.
5. The method according to any one of claims 1-3, characterized in that the voice data to be analyzed comprises:
voice data after source coding and before channel coding; or
voice data after channel decoding and before source decoding.
6. A device for locating a voice problem, characterized by comprising:
a parsing module, configured to obtain voice data to be analyzed and to parse the voice data to be analyzed to obtain parsed voice data;
an establishing module, configured to establish a correspondence that includes source frame number information of each speech frame in the parsed voice data;
a searching module, configured to search the parsed voice data for problem frames that have voice problems;
an output module, configured to output result information that includes the source frame number relationship of the problem frames.
7. The device according to claim 6, characterized in that the establishing module is configured to read the source frame number information of each speech frame in the parsed voice data and timestamp information of a channel protocol layer, and to establish a correspondence between the source frame number information of each speech frame and the timestamp information of the channel protocol layer; and/or
the establishing module is configured to read the source frame number information of each speech frame in the parsed voice data and frame number information of the channel protocol layer, and to establish a correspondence between the source frame number information of each speech frame and the frame number information of the channel protocol layer.
8. The device according to claim 7, characterized in that the searching module is configured to search the parsed voice data for problem frames that have voice problems during cell handover or channel switching, and to mark their positions with labels; and/or
the searching module is configured to search for bad frames in the parsed voice data and to mark them with bad frame indication information.
9. The device according to any one of claims 6-8, characterized in that the voice data to be analyzed is voice data of a terminal under test after channel decoding and before source decoding, and the device further comprises:
a selecting module, configured to select a target decoding mode;
a decoding module, configured to decode the parsed voice data using the target decoding mode to obtain offline pulse code modulation (PCM) data;
an analysis module, configured to compare and analyze the offline PCM data against obtained online PCM data of the terminal under test, so as to determine whether the terminal under test has a decoding problem, wherein the online PCM data is data of the terminal under test after source decoding.
10. The device according to any one of claims 6-8, characterized in that the voice data to be analyzed comprises:
voice data after source coding and before channel coding; or
voice data after channel decoding and before source decoding.
CN201611013656.2A 2016-11-15 2016-11-15 Method and device for positioning voice problem Active CN108074586B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611013656.2A CN108074586B (en) 2016-11-15 2016-11-15 Method and device for positioning voice problem

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611013656.2A CN108074586B (en) 2016-11-15 2016-11-15 Method and device for positioning voice problem

Publications (2)

Publication Number Publication Date
CN108074586A true CN108074586A (en) 2018-05-25
CN108074586B CN108074586B (en) 2021-02-12

Family

ID=62160222

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611013656.2A Active CN108074586B (en) 2016-11-15 2016-11-15 Method and device for positioning voice problem

Country Status (1)

Country Link
CN (1) CN108074586B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5928379A (en) * 1996-06-28 1999-07-27 Nec Corporation Voice-coded data error processing apparatus and method
US20040204935A1 (en) * 2001-02-21 2004-10-14 Krishnasamy Anandakumar Adaptive voice playout in VOP
US20040260542A1 (en) * 2000-04-24 2004-12-23 Ananthapadmanabhan Arasanipalai K. Method and apparatus for predictively quantizing voiced speech with substraction of weighted parameters of previous frames
CN102034476A (en) * 2009-09-30 2011-04-27 华为技术有限公司 Methods and devices for detecting and repairing error voice frame
CN102143519A (en) * 2010-02-01 2011-08-03 中兴通讯股份有限公司 Device and method for positioning voice transmission faults
CN102148665A (en) * 2011-05-25 2011-08-10 电子科技大学 Decoding method for LT codes
CN102595498A (en) * 2007-09-17 2012-07-18 华为技术有限公司 Methods and systems for processing uploaded and downloaded data in wireless communication network
CN104538041A (en) * 2014-12-11 2015-04-22 深圳市智美达科技有限公司 Method and system for detecting abnormal sounds
CN104685564A (en) * 2012-11-13 2015-06-03 华为技术有限公司 Voice problem detection method and network element device applied to voice communication network system
CN105338148A (en) * 2014-07-18 2016-02-17 华为技术有限公司 Method and device for detecting audio signal according to frequency domain energy
CN105374367A (en) * 2014-07-29 2016-03-02 华为技术有限公司 Abnormal frame detecting method and abnormal frame detecting device
CN105378831A (en) * 2013-06-21 2016-03-02 弗朗霍夫应用科学研究促进协会 Apparatus and method for improved signal fade out for switched audio coding systems during error concealment
US20160275964A1 (en) * 2015-03-20 2016-09-22 Electronics And Telecommunications Research Institute Feature compensation apparatus and method for speech recogntion in noisy environment

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LINFANG WANG: "Multi-Level Error Detection and Concealment Algorithm to Improve Speech Quality in GSM Full Rate Speech Codecs", 《TSINGHUA SCIENCE AND TECHNOLOGY》 *
俞兆强: "接入网语音故障分析处理", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Also Published As

Publication number Publication date
CN108074586B (en) 2021-02-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant