WO2021031505A1 - Error detection method and apparatus for audio annotation, computer device, and storage medium - Google Patents
- Publication number
- WO2021031505A1 (PCT/CN2019/130444)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- error detection
- word
- text
- detection information
- word sequence
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Definitions
- This application relates to the technical field of text processing, and in particular to an error detection method and apparatus for audio annotation, a computer device, and a storage medium.
- ASR (Automatic Speech Recognition)
- Annotators need to process large amounts of audio data every day and are prone to annotation errors in repetitive, tedious annotation tasks. Even when reviewers check the annotation results, erroneous training samples may still slip through, making the trained deep learning model insufficiently accurate.
- an embodiment of the present invention provides an error detection method for audio annotation, the method including:
- error detection information is generated based on the error words; an error word is a word that is not recorded in the correct vocabulary.
- the foregoing generation of error detection information based on the error words includes:
- generating error detection information when error detection determines that at least one of a word error in the annotated text and a sentence error in the annotated text has occurred includes:
- a first word sequence composed of multiple words included in the annotated text is input into the pre-trained neural network error detection model to obtain probability information corresponding to the first word sequence output by the model; the probability information indicates the probability that the word sequence is correct;
- error detection information including multiple reference words is generated.
- the method further includes:
- if the probability information corresponding to the first word sequence is not lower than the preset probability value, output of the error detection information is stopped and the error word is added to the correct vocabulary.
- generating the error detection information includes:
- an embodiment of the present invention provides an audio tagging error detection device, which includes:
- an annotated text acquisition module, used to acquire the annotated text obtained after an annotator annotates audio data;
- an error detection information output module, used to output the error detection information.
- the word segmentation sub-module is used to segment the labeled text to obtain multiple words included in the labeled text;
- the word search submodule is used to search for each word included in the labeled text in the correct word list established in advance;
- the above-mentioned first error detection information generating submodule is specifically used to look up multiple reference words in the correct vocabulary; the edit distance between a reference word and the error word is within a preset edit distance, and the edit distance includes at least one of a pinyin edit distance and a lexical edit distance; error detection information containing the multiple reference words is then generated.
- the above-mentioned error detection module includes:
- the probability information output sub-module is used to input the first word sequence composed of multiple words included in the annotated text into the pre-trained neural network error detection model to obtain the probability information corresponding to the first word sequence output by the model; the probability information indicates the probability that the word sequence is correct;
- the second error detection information generating sub-module is specifically configured to, when the probability information corresponding to the first word sequence is lower than the preset probability value, replace the error word with each of the multiple reference words to obtain multiple second word sequences; input the multiple second word sequences into the neural network error detection model to obtain the probability information corresponding to each second word sequence; and generate error detection information containing the multiple reference words according to the correspondence between the reference words and the second word sequences and the probability information corresponding to each second word sequence.
- the first stop output module is configured to stop outputting error detection information if the probability information corresponding to the first word sequence is not lower than the preset probability value, and add the wrong word to the correct word list.
- the third error detection information generation sub-module is configured to generate error detection information if the number of search results is less than the preset number.
- the second stop output module is used to stop outputting error detection information if the number of search results is not less than the preset number, and add the wrong words to the correct vocabulary.
- an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps in the above method are implemented.
- the above-mentioned audio annotation error detection method, apparatus, computer device, and storage medium acquire the annotated text obtained after an annotator annotates audio data; perform error detection on the annotated text; generate error detection information when error detection determines that at least one of a word error in the annotated text and a sentence error in the annotated text has occurred; and output the error detection information.
- the terminal detects errors in the annotated text while the annotator is annotating the audio data. If an error occurs, it generates error detection information and prompts the annotator so that the annotator can make corrections in time, thereby improving the quality of the annotation and, in turn, the quality of the training samples.
- FIG. 1 is an application environment diagram of an audio tagging error detection method in an embodiment
- FIG. 2 is a schematic flowchart of an error detection method for audio annotation in an embodiment
- FIG. 3 is the first schematic flowchart of the step of generating error detection information when error detection determines that an error exists in the annotated text, in an embodiment;
- FIG. 4 is the second schematic flowchart of the step of generating error detection information when error detection determines that an error exists in the annotated text, in an embodiment;
- FIG. 6 is a schematic flowchart of an error detection method for audio labeling in another embodiment
- Figure 7 is a structural block diagram of an audio tagging error detection device in an embodiment
- Figure 8 is an internal structure diagram of a computer device in an embodiment.
- an error detection method for audio annotation is provided. Taking the method applied to the terminal in FIG. 1 as an example for description, the method includes the following steps:
- when annotating the audio data, the annotator inputs the annotation text corresponding to the audio data into the terminal. Specifically, the terminal detects that the annotator has entered annotation text in the text box; if the annotation text does not change for more than a preset duration, it is determined that the audio data annotation is complete.
- for example, an annotator enters "its clothes are missing" in the text box; if the annotation text does not change for more than 500 milliseconds, "its clothes are missing" is obtained as the annotation text corresponding to the audio data.
- the embodiment of the present invention does not limit the preset duration in detail, and can be set according to actual conditions.
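The completion heuristic above — treat the annotation as done once the text box content has been stable for the preset duration — can be sketched as follows; the function name and millisecond timestamps are illustrative assumptions, not part of the embodiment.

```python
# Completion heuristic sketch: the annotation is treated as finished once
# the text box content has been stable for the preset duration.
# Function name and millisecond timestamps are illustrative assumptions.

def annotation_complete(last_change_ms, now_ms, preset_duration_ms=500):
    """True once the annotation text has not changed for the preset duration."""
    return now_ms - last_change_ms >= preset_duration_ms
```

With the 500 ms example above, text last edited 600 ms ago counts as complete, while text edited 450 ms ago does not.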
- error detection is performed on the annotated text. Specifically, the terminal checks whether there are errors in the words or sentences of the annotated text. If there are word errors, sentence errors, or both, error detection information is generated.
- the error detection information can be a prompt to replace "its" with "his" or "her".
- Step 103 Output the error detection information.
- the error detection information is output so as to remind the annotator in real time during the annotation process. For example, "his" and "her" are displayed on the terminal to prompt the annotator that "its" is an error.
- the embodiment of the present invention does not limit the display mode in detail, and can be set according to actual conditions.
- the annotated text obtained after the annotator annotates the audio data is acquired; error detection is performed on the annotated text, and error detection information is generated when error detection determines that at least one of a word error and a sentence error in the annotated text has occurred;
- the error detection information is output.
- the terminal detects errors in the annotated text while the annotator is annotating the audio data. If an error occurs, it generates error detection information and prompts the annotator so that the annotator can make corrections in time, thereby improving the quality of the annotation and, in turn, the quality of the training samples.
- this embodiment relates to generating error detection information when error detection determines that at least one of a word error in the annotated text and a sentence error in the annotated text has occurred;
- an optional process may specifically include the following steps:
- Step 201 Perform word segmentation on the labeled text to obtain multiple words included in the labeled text.
- Step 202 Search for each word included in the labeled text in the correct vocabulary established in advance.
- a corpus can be preset in the terminal, and a large number of sentences, words, phrases, etc. are stored in the corpus.
- before error detection, the terminal establishes a correct vocabulary according to the corpus. Then, during error detection, after the terminal finishes segmenting the annotated text, it searches the correct vocabulary for each word included in the annotated text. For example, it searches the correct vocabulary for "its", "clothes", "are", and "missing".
- Step 203 When it is determined by searching that there are wrong words among the multiple words included in the labeled text, error detection information is generated based on the wrong words; the wrong words are words that are not recorded in the correct vocabulary.
- if a word is not found in the correct vocabulary, it is determined to be an error word, and error detection information is generated based on it. For example, if "its" is not found in the correct vocabulary, then "its" is an error word, and error detection information is generated based on "its".
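The vocabulary-lookup stage can be sketched as follows. The whitespace split is a stand-in for real word segmentation, which the embodiment does not specify, and all names here are illustrative.

```python
# Vocabulary-lookup stage sketch: segment the annotated text and flag
# any word missing from the pre-established correct vocabulary.
# The whitespace split stands in for a real word segmenter, which the
# embodiment does not specify; all names here are illustrative.

def find_error_words(annotated_text, correct_vocabulary):
    """Return the words of annotated_text not recorded in the correct vocabulary."""
    words = annotated_text.split()  # stand-in for real word segmentation
    return [w for w in words if w not in correct_vocabulary]
```

Running it on the running example with a vocabulary that lacks "its" flags exactly that word.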
- the step of generating error detection information based on the error word may include: looking up multiple reference words in the correct vocabulary, where the edit distance between a reference word and the error word is within a preset edit distance, and the edit distance includes at least one of a pinyin edit distance and a lexical edit distance; and generating error detection information containing the multiple reference words.
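As an illustration of the reference-word lookup, the sketch below uses a plain character-level Levenshtein distance in place of the pinyin and lexical edit distances, which the text names but does not define; the function names and default threshold are assumptions.

```python
# Reference-word lookup sketch. A character-level Levenshtein distance
# stands in for the pinyin and lexical edit distances named in the text,
# which are not defined further; names and the default threshold are
# assumptions.

def edit_distance(a, b):
    """Levenshtein distance between strings a and b (dynamic programming)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete from a
                                     dp[j - 1] + 1,      # insert into a
                                     prev + (ca != cb))  # substitute
    return dp[-1]

def reference_words(error_word, vocabulary, max_distance=2):
    """Words of the correct vocabulary within the preset edit distance."""
    return [w for w in vocabulary if edit_distance(w, error_word) <= max_distance]
```

With a threshold of 2, "his" qualifies as a reference word for the error word "its", while unrelated vocabulary entries do not.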
- this embodiment also relates to generating error detection information when error detection determines that at least one of a word error in the annotated text and a sentence error in the annotated text has occurred;
- an optional process, based on the above embodiment shown in FIG. 3, may further include the following steps:
- Step 301 Input a first word sequence composed of multiple words included in the annotated text into a pre-trained neural network error detection model to obtain probability information corresponding to the first word sequence output by the model; the probability information indicates the probability that the word sequence is correct.
- the neural network error detection model may be a Bi-RNN (bidirectional recurrent neural network) model; the embodiment of the present invention does not limit this in detail, and it can be set according to actual conditions.
- if the probability information corresponding to the first word sequence is lower than the preset probability value, step 302 is executed; if it is not lower than the preset probability value, step 303 is executed.
- Step 302 If the probability information corresponding to the first word sequence is lower than the preset probability value, error detection information is generated.
- if the probability information corresponding to the first word sequence is lower than the preset probability value, the probability that the first word sequence is correct is low. For example, the probability information corresponding to the first word sequence "its, clothes, are, missing" is 0.93, which is lower than the preset probability value of 0.96, so the probability that the first word sequence is correct is determined to be low.
- in this case the annotator has not modified the annotated text, errors still exist among the multiple words included in the annotated text, and error detection information needs to be generated.
- the step of generating error detection information may include: when the probability information corresponding to the first word sequence is lower than a preset probability value, replacing the error word with each of the multiple reference words to obtain multiple second word sequences; inputting the multiple second word sequences into the neural network error detection model to obtain the probability information corresponding to each second word sequence; and generating error detection information containing the multiple reference words according to the correspondence between the reference words and the second word sequences and the probability information corresponding to each second word sequence.
- for example, the probability information corresponding to the first word sequence "its, clothes, are, missing" is 0.93, which is lower than the preset probability value of 0.96, and the reference words are "his" and "her". Replacing "its" with "his" yields one second word sequence, "his, clothes, are, missing"; replacing "its" with "her" yields another second word sequence, "her, clothes, are, missing".
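The replace-and-rescore step can be sketched as follows. `sequence_probability` is a hypothetical stand-in for the pre-trained neural network error detection model, hard-coded to reproduce the 0.93/0.98 scores of the example; a real implementation would query the trained model instead.

```python
# Replace-and-rescore sketch. sequence_probability is a hypothetical
# stand-in for the pre-trained neural network error detection model,
# hard-coded to reproduce the example scores; a real implementation
# would query the trained model instead.

def sequence_probability(word_sequence):
    good = ("his", "clothes", "are", "missing")
    return 0.98 if tuple(word_sequence) == good else 0.93

def rescore_candidates(first_sequence, error_word, reference_words,
                       threshold=0.96):
    """Score one second word sequence per reference word when the first
    sequence falls below the preset probability value; return None when
    the first sequence already passes (annotation accepted)."""
    if sequence_probability(first_sequence) >= threshold:
        return None
    scores = {}
    for ref in reference_words:
        second = [ref if w == error_word else w for w in first_sequence]
        scores[ref] = sequence_probability(second)
    return scores
```

The returned mapping from reference word to probability is what the error detection information can be generated from.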
- Step 303 If the probability information corresponding to the first word sequence is not lower than the preset probability value, stop outputting the error detection information, and add the wrong word to the correct word list.
- if the probability information corresponding to the first word sequence is not lower than the preset probability value, the probability that the first word sequence is correct is high; that is, the annotator's decision not to modify the annotated text was correct. In this case, output of the error detection information is stopped, and the error word is added to the correct vocabulary.
- for example, if the probability information corresponding to the first word sequence is 0.98, which is not lower than the preset probability value of 0.96, output of the error detection information "his" and "her" is stopped, and the error word "its" is added to the correct vocabulary so that "its" can be found there in later searches.
- this embodiment also relates to generating error detection information when error detection determines that at least one of a word error in the annotated text and a sentence error in the annotated text has occurred;
- an optional process, based on the above embodiment shown in FIG. 3, may further include the following steps:
- a search engine is used to search for a first word sequence composed of multiple words included in annotated text to obtain a search result matching the first word sequence.
- the error detection information including multiple reference words is generated. If the annotator does not modify the annotation text based on the error detection information, the first word sequence can be searched through a search engine, where the first word sequence is composed of multiple words included in the annotation text. The search engine searches out search results that exactly match the first word sequence.
- the embodiment of the present invention does not limit the search engine in detail, and can be set according to actual conditions.
- if the number of search results is less than the preset number, step 402 is executed; if the number of search results is not less than the preset number, step 403 is executed.
- Step 402 If the number of search results is less than the preset number, generate error detection information.
- the step of generating error detection information may include: when the number of search results is less than a preset number, deleting the wrong words from the first word sequence to obtain a third word sequence; searching for the third word sequence through a search engine to obtain Multiple co-occurring words that appear simultaneously with the third word sequence; generate error detection information containing multiple co-occurring words.
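A sketch of this search-engine fallback, with `search_count` and `co_occurring_words` as hypothetical callables standing in for real search engine queries (the embodiment does not limit which search engine is used):

```python
# Search-engine fallback sketch. search_count and co_occurring_words
# are hypothetical callables standing in for real search engine
# queries; the embodiment does not limit which search engine is used.

def search_stage(first_sequence, error_word, search_count,
                 co_occurring_words, preset_number=10):
    """Return co-occurrence suggestions when exact matches for the full
    sequence are scarce; return None to accept the annotation."""
    if search_count(first_sequence) >= preset_number:
        return None  # enough exact matches: stop outputting error info
    # delete the error word to form the third word sequence
    third_sequence = [w for w in first_sequence if w != error_word]
    return co_occurring_words(third_sequence)
```

Plugging in stub callables shows both branches: scarce results yield co-occurring-word suggestions, abundant results accept the annotation.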
- Step 403 If the number of search results is not less than the preset number, stop outputting error detection information, and add the wrong words to the correct vocabulary.
- when it is determined that there is an error in the annotated text, the search engine is used to search for the first word sequence composed of multiple words included in the annotated text to obtain search results matching the first word sequence; if the number of search results is less than the preset number, error detection information is generated; if the number of search results is not less than the preset number, output of the error detection information is stopped, and the error words are added to the correct vocabulary.
- the search engine is used to check the annotated text again; this two-level error detection can improve error detection accuracy, making the annotated text more accurate.
- Step 501 Obtain annotated text obtained by an annotator after annotating audio data.
- Step 502 Perform word segmentation on the labeled text to obtain multiple words included in the labeled text.
- Step 503 Search for each word included in the labeled text in the pre-established correct vocabulary.
- Step 504 When it is determined by searching that there are wrong words among the multiple words included in the labeled text, error detection information is generated based on the wrong words; the wrong words are words that are not recorded in the correct vocabulary.
- Step 505 Output the error detection information.
- Step 506 Input the first word sequence composed of multiple words included in the annotated text into the pre-trained neural network error detection model to obtain probability information corresponding to the first word sequence output by the neural network error detection model; the probability information is used for Indicates the probability that the word sequence is correct.
- if the probability information corresponding to the first word sequence is lower than the preset probability value, step 507 is executed; if the probability information corresponding to the first word sequence is not lower than the preset probability value, step 508 is executed.
- Step 507 If the probability information corresponding to the first word sequence is lower than the preset probability value, error detection information is generated.
- when the probability information corresponding to the first word sequence is lower than the preset probability value, multiple reference words are used to replace the error word, yielding multiple second word sequences; the multiple second word sequences are input into the neural network error detection model to obtain the probability information corresponding to each second word sequence; according to the correspondence between the reference words and the second word sequences and the probability information corresponding to each second word sequence, error detection information containing the multiple reference words is generated.
- Step 508 If the probability information corresponding to the first word sequence is not lower than the preset probability value, stop outputting the error detection information, and add the wrong word to the correct word list.
- Step 509 Search for a first word sequence composed of multiple words included in the annotation text through a search engine to obtain a search result that matches the first word sequence.
- Step 510 If the number of search results is less than the preset number, generate error detection information.
- generating error detection information includes: when the number of search results is less than the preset number, deleting the wrong word from the first word sequence to obtain the third word sequence;
- the search engine searches the third word sequence to obtain multiple co-occurring words that appear simultaneously with the third word sequence; and generates error detection information containing multiple co-occurring words.
- Step 511 If the number of search results is not less than the preset number, stop outputting error detection information, and add the wrong words to the correct vocabulary.
- the annotated text obtained after the annotator annotates the audio data is acquired; the annotated text is segmented to obtain the multiple words it includes; each word included in the annotated text is searched for in the pre-established correct vocabulary; when the search determines that an error word exists among the multiple words included in the annotated text, error detection information is generated based on the error word and output. If the annotator does not modify the annotated text, the first word sequence composed of the multiple words included in the annotated text is input into the pre-trained neural network error detection model, and the probability information corresponding to the first word sequence output by the model is obtained.
- the search engine is used to search for the first word sequence composed of the multiple words in the annotated text to obtain search results matching the first word sequence; if the number of search results is less than the preset number, error detection information is generated; if the number of search results is not less than the preset number, output of the error detection information is stopped, and the error words are added to the correct vocabulary.
- three-level error detection can be used to remind annotators multiple times to improve the accuracy of error detection, thereby making the annotated text more accurate, and thus making the deep learning model more accurate.
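The three-level flow can be sketched as a short driver that runs the vocabulary, neural network, and search-engine detectors in order; each detector here is a hypothetical callable returning error detection information or None, and the driver itself is illustrative, not part of the embodiment.

```python
# Driver sketch for the three-level flow: vocabulary lookup, neural
# network rescoring, then the search-engine check. Each detector is a
# hypothetical callable returning error detection information or None.
# In the embodiment a later stage runs only after the annotator leaves
# the text unchanged despite the earlier prompt; that re-entry is
# modeled simply by calling the driver again.

def three_level_check(words, vocab_check, neural_check, search_check):
    """Return the first error detection information produced, or None."""
    for stage in (vocab_check, neural_check, search_check):
        info = stage(words)
        if info is not None:
            return info
    return None
```

If every stage returns None, the annotation is accepted as correct.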
- an audio tagging error detection device including:
- an annotated text obtaining module 601, configured to obtain the annotated text obtained after an annotator annotates audio data;
- the error detection module 602 is configured to perform error detection on the labeled text, and generate error detection information when it is determined through the error detection that at least one of an error in a word in the labeled text and an error in a sentence in the labeled text occurs;
- the error detection information output module 603 is used to output error detection information.
- the above-mentioned error detection module 602 includes:
- the word search submodule is used to search for each word included in the labeled text in the correct word list established in advance;
- the first error detection information generation sub-module is used to generate error detection information based on the error words when it is determined that there are error words in the multiple words included in the labeled text through searching; the error words are words that are not recorded in the correct vocabulary.
- the above-mentioned first error detection information generating submodule is specifically used to look up multiple reference words in the correct vocabulary; the edit distance between a reference word and the error word is within a preset edit distance, and the edit distance includes at least one of a pinyin edit distance and a lexical edit distance; error detection information containing the multiple reference words is then generated.
- the above-mentioned error detection module 602 includes:
- the probability information output sub-module is used to input the first word sequence composed of multiple words included in the annotated text into the pre-trained neural network error detection model to obtain the probability information corresponding to the first word sequence output by the model; the probability information indicates the probability that the word sequence is correct;
- the second error detection information generating sub-module is configured to generate error detection information if the probability information corresponding to the first word sequence is lower than the preset probability value.
- the second error detection information generating sub-module is specifically configured to, when the probability information corresponding to the first word sequence is lower than the preset probability value, replace the error word with each of the multiple reference words to obtain multiple second word sequences; input the multiple second word sequences into the neural network error detection model to obtain the probability information corresponding to each second word sequence; and generate error detection information containing the multiple reference words according to the correspondence between the reference words and the second word sequences and the probability information corresponding to each second word sequence.
- the device further includes:
- the first stop output module is configured to stop outputting error detection information if the probability information corresponding to the first word sequence is not lower than the preset probability value, and add the wrong word to the correct word list.
- the above-mentioned error detection module 602 includes:
- the search sub-module is used to search for a first word sequence composed of multiple words included in the marked text through a search engine to obtain search results matching the first word sequence;
- the third error detection information generation sub-module is configured to generate error detection information if the number of search results is less than the preset number.
- the third error detection information generating submodule is specifically used to delete the wrong words from the first word sequence when the number of search results is less than the preset number to obtain the third word sequence; through the search engine Search for the third word sequence to obtain multiple co-occurring words that appear simultaneously with the third word sequence; generate error detection information containing multiple co-occurring words.
- the device further includes:
- Each module in the above-mentioned audio annotation error detection device can be implemented in whole or in part by software, hardware, or a combination thereof.
- the foregoing modules may be embedded in the form of hardware or independent of the processor in the computer device, or may be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the foregoing modules.
- a computer device is provided.
- the computer device may be a terminal, and its internal structure diagram may be as shown in FIG. 8.
- the computer equipment includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus.
- the processor of the computer device is used to provide calculation and control capabilities.
- the memory of the computer device includes a non-volatile storage medium and an internal memory.
- the non-volatile storage medium stores an operating system and a computer program.
- the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
- the network interface of the computer device is used to communicate with an external terminal through a network connection. When the computer program is executed by the processor, an error detection method for audio annotation is realized.
- error detection information including multiple reference words is generated.
- the probability information corresponding to the first word sequence is not lower than the preset probability value, stop outputting error detection information, and add the wrong word to the correct word list.
- the computer program further implements the following steps when being executed by the processor:
- Volatile memory may include random access memory (RAM) or external cache memory.
- RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.
Abstract
Description
Claims (12)
- An error detection method for audio annotation, characterized in that the method comprises: obtaining annotated text obtained after an annotator annotates audio data; performing error detection on the annotated text, and generating error detection information when the error detection determines at least one of an erroneous word in the annotated text and an erroneous sentence in the annotated text; and outputting the error detection information.
- The method according to claim 1, wherein generating the error detection information when the error detection determines at least one of an erroneous word in the annotated text and an erroneous sentence in the annotated text comprises: performing word segmentation on the annotated text to obtain a plurality of words included in the annotated text; looking up each of the words included in the annotated text in a pre-established correct word list; and when the lookup determines that a wrong word exists among the plurality of words included in the annotated text, generating the error detection information based on the wrong word, the wrong word being a word not recorded in the correct word list.
- The method according to claim 2, wherein generating the error detection information based on the wrong word comprises: looking up a plurality of reference words in the correct word list, wherein the edit distance between each reference word and the wrong word is within a preset edit distance, the edit distance including at least one of a pinyin edit distance and a lexical edit distance; and generating error detection information containing the plurality of reference words.
- The method according to claim 3, wherein generating the error detection information when the error detection determines at least one of an erroneous word in the annotated text and an erroneous sentence in the annotated text comprises: inputting a first word sequence composed of the plurality of words included in the annotated text into a pre-trained neural network error detection model to obtain probability information, output by the neural network error detection model, corresponding to the first word sequence, the probability information indicating the probability that a word sequence is correct; and generating the error detection information if the probability information corresponding to the first word sequence is lower than a preset probability value.
- The method according to claim 4, wherein generating the error detection information if the probability information corresponding to the first word sequence is lower than the preset probability value comprises: when the probability information corresponding to the first word sequence is lower than the preset probability value, replacing the wrong word with each of the plurality of reference words to obtain a plurality of second word sequences; inputting each of the second word sequences into the neural network error detection model to obtain probability information corresponding to each second word sequence; and generating error detection information containing the plurality of reference words according to the correspondence between the reference words and the second word sequences and the probability information corresponding to each second word sequence.
- The method according to claim 4, wherein after obtaining the probability information, output by the neural network error detection model, corresponding to the first word sequence, the method further comprises: if the probability information corresponding to the first word sequence is not lower than the preset probability value, stopping output of the error detection information and adding the wrong word to the correct word list.
- The method according to claim 3 or 5, wherein generating the error detection information when the error detection determines at least one of an erroneous word in the annotated text and an erroneous sentence in the annotated text comprises: searching, through a search engine, for a first word sequence composed of the plurality of words included in the annotated text to obtain search results matching the first word sequence; and generating the error detection information if the number of the search results is less than a preset number.
- The method according to claim 7, wherein generating the error detection information if the number of the search results is less than the preset number comprises: when the number of the search results is less than the preset number, deleting the wrong word from the first word sequence to obtain a third word sequence; searching for the third word sequence through the search engine to obtain a plurality of co-occurring words that appear together with the third word sequence; and generating error detection information containing the plurality of co-occurring words.
- The method according to claim 7, wherein after obtaining the search results matching the first word sequence, the method further comprises: if the number of the search results is not less than the preset number, stopping output of the error detection information and adding the wrong word to the correct word list.
- An error detection apparatus for audio annotation, characterized in that the apparatus comprises: an annotated text obtaining module, configured to obtain annotated text obtained after an annotator annotates audio data; an error detection module, configured to perform error detection on the annotated text and to generate error detection information when the error detection determines at least one of an erroneous word in the annotated text and an erroneous sentence in the annotated text; and an error detection information output module, configured to output the error detection information.
- A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 9.
- A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 9.
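The segmentation-and-lookup step of claim 2 can be sketched in a few lines. Everything below is illustrative rather than the claimed implementation: the whitespace `segment` is a stand-in for a real Chinese word segmenter (e.g. jieba), and `CORRECT_WORDS` is a toy stand-in for the pre-established correct word list.

```python
# Minimal sketch of claim 2: segment the annotated text, then flag any
# word absent from a pre-established correct word list as a "wrong word".
# CORRECT_WORDS and the toy segmenter below are illustrative assumptions.

CORRECT_WORDS = {"the", "quick", "brown", "fox", "jumps"}

def segment(text):
    # Stand-in segmenter: whitespace split. A real system would use a
    # Chinese word segmenter such as jieba.
    return text.split()

def find_wrong_words(annotated_text, correct_words=CORRECT_WORDS):
    """Return the words of the annotated text not recorded in the list."""
    return [w for w in segment(annotated_text) if w not in correct_words]

print(find_wrong_words("the quick brwon fox jumps"))  # ['brwon']
```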
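Claim 3's reference-word lookup can be illustrated with the character-level (lexical) edit distance; the claimed pinyin edit distance would apply the same routine to pinyin transliterations. The word list and the threshold of 2 are assumptions for the example.

```python
# Sketch of claim 3: collect reference words from the correct word list
# whose edit distance to the wrong word is within a preset threshold.

def edit_distance(a, b):
    """Classic Levenshtein distance, computed with a single-row DP table."""
    m, n = len(a), len(b)
    dp = list(range(n + 1))
    for i in range(1, m + 1):
        prev, dp[0] = dp[0], i
        for j in range(1, n + 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,                         # deletion
                        dp[j - 1] + 1,                     # insertion
                        prev + (a[i - 1] != b[j - 1]))     # substitution
            prev = cur
    return dp[n]

def reference_words(wrong_word, word_list, max_dist=2):
    """Words in the list within the preset edit distance of the wrong word."""
    return [w for w in word_list if edit_distance(wrong_word, w) <= max_dist]

print(reference_words("brwon", ["brown", "crown", "fox"]))  # ['brown']
```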
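The rescoring flow of claims 4 and 5 — score the first word sequence, and on a low score substitute each reference word and rescore the resulting second word sequences — can be sketched as follows. `sequence_probability` is only a placeholder for the pre-trained neural network error detection model; the vocabulary-overlap scoring and the 0.8 threshold are illustrative assumptions.

```python
# Sketch of claims 4-5: if the first word sequence scores below a preset
# probability, substitute each reference word for the wrong word and rank
# the candidates by the probability of the resulting second word sequence.

PRESET_PROBABILITY = 0.8  # assumed threshold for the example

def sequence_probability(words):
    # Placeholder scorer: a real system would query the trained neural
    # network model. Here we just measure overlap with a toy vocabulary.
    known = {"the", "quick", "brown", "fox", "jumps"}
    return sum(w in known for w in words) / len(words)

def rank_reference_words(first_sequence, wrong_word, reference_words):
    """Rank candidate replacements by the probability of the second word
    sequence obtained when each candidate replaces the wrong word."""
    if sequence_probability(first_sequence) >= PRESET_PROBABILITY:
        return []  # sequence deemed correct; no error detection info
    scored = []
    for ref in reference_words:
        second = [ref if w == wrong_word else w for w in first_sequence]
        scored.append((ref, sequence_probability(second)))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

seq = ["the", "quick", "brwon", "fox"]
print(rank_reference_words(seq, "brwon", ["brown", "crown"]))
```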
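Claims 7 and 8 fall back on a search engine: if the first word sequence yields too few results, the wrong word is dropped and words co-occurring with the remaining third word sequence are collected. In this sketch a tiny in-memory corpus stands in for the search engine; the corpus lines and the preset result count are assumptions.

```python
# Sketch of claims 7-8: search for the first word sequence; if too few
# results match, drop the wrong word to form a third word sequence,
# search again, and collect the words that co-occur with it.

CORPUS = [  # toy stand-in for the search engine's index
    "the quick brown fox jumps",
    "the quick brown dog sleeps",
    "a slow brown fox rests",
]
PRESET_COUNT = 1  # assumed minimum number of matching results

def search(words):
    """Return corpus lines containing every word of the query."""
    return [line for line in CORPUS if all(w in line.split() for w in words)]

def co_occurring_words(first_sequence, wrong_word):
    if len(search(first_sequence)) >= PRESET_COUNT:
        return []  # enough matches; the sequence is treated as correct
    third = [w for w in first_sequence if w != wrong_word]
    hits = search(third)
    query = set(third)
    # Words appearing alongside the third word sequence in the results.
    return sorted({w for line in hits for w in line.split()} - query)

print(co_occurring_words(["the", "quick", "brwon", "fox"], "brwon"))
```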
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910777343.1 | 2019-08-22 | ||
CN201910777343.1A CN110532522A (zh) | 2019-08-22 | 2019-08-22 | Error detection method and apparatus for audio annotation, computer device and storage medium
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021031505A1 true WO2021031505A1 (zh) | 2021-02-25 |
Family
ID=68662519
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/130444 WO2021031505A1 (zh) | Error detection method and apparatus for audio annotation, computer device and storage medium | 2019-08-22 | 2019-12-31 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN110532522A (zh) |
WO (1) | WO2021031505A1 (zh) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110532522A (zh) * | 2019-08-22 | 2019-12-03 | 深圳追一科技有限公司 | Error detection method and apparatus for audio annotation, computer device and storage medium |
CN110968730B (zh) * | 2019-12-16 | 2023-06-09 | Oppo(重庆)智能科技有限公司 | Audio tag processing method and apparatus, computer device and storage medium |
CN112417850B (zh) * | 2020-11-12 | 2024-07-02 | 北京晴数智慧科技有限公司 | Error detection method and apparatus for audio annotation |
CN112669814B (zh) * | 2020-12-17 | 2024-06-14 | 北京猎户星空科技有限公司 | Data processing method, apparatus, device and medium |
CN112700763B (zh) * | 2020-12-26 | 2024-04-16 | 中国科学技术大学 | Speech annotation quality evaluation method, apparatus, device and storage medium |
CN114441029A (zh) * | 2022-01-20 | 2022-05-06 | 深圳壹账通科技服务有限公司 | Recording noise detection method, apparatus, device and medium for a speech annotation *** |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180351884A1 (en) * | 2017-05-30 | 2018-12-06 | Taneshia Pawelczak | System and Method for Individualizing Messages |
CN109902957A (zh) * | 2019-02-28 | 2019-06-18 | 腾讯科技(深圳)有限公司 | Data processing method and apparatus |
CN109922371A (zh) * | 2019-03-11 | 2019-06-21 | 青岛海信电器股份有限公司 | Natural language processing method, device and storage medium |
CN110532522A (zh) * | 2019-08-22 | 2019-12-03 | 深圳追一科技有限公司 | Error detection method and apparatus for audio annotation, computer device and storage medium |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101655837B (zh) * | 2009-09-08 | 2010-10-13 | 北京邮电大学 | Method for detecting and correcting errors in text after speech recognition |
CN107977356B (zh) * | 2017-11-21 | 2019-10-25 | 新疆科大讯飞信息科技有限责任公司 | Method and device for correcting errors in recognized text |
CN109522558B (zh) * | 2018-11-21 | 2024-01-12 | 金现代信息产业股份有限公司 | Chinese typo correction method based on deep learning |
-
2019
- 2019-08-22 CN CN201910777343.1A patent/CN110532522A/zh active Pending
- 2019-12-31 WO PCT/CN2019/130444 patent/WO2021031505A1/zh active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN110532522A (zh) | 2019-12-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021031505A1 (zh) | Error detection method and apparatus for audio annotation, computer device and storage medium | |
CN110765763B (zh) | Error correction method and apparatus for speech recognition text, computer device and storage medium | |
US11586987B2 (en) | Dynamically updated text classifier | |
CN107908635B (zh) | Method and apparatus for establishing a text classification model and for text classification | |
WO2021000555A1 (zh) | Knowledge graph-based question answering method and apparatus, computer device and storage medium | |
WO2021068321A1 (zh) | Information push method and apparatus based on human-computer interaction, and computer device | |
US20200293616A1 (en) | Generating a meeting review document that includes links to the one or more documents reviewed | |
US11720741B2 (en) | Artificial intelligence assisted review of electronic documents | |
US9934220B2 (en) | Content revision using question and answer generation | |
US9058317B1 (en) | System and method for machine learning management | |
WO2021114810A1 (zh) | Graph structure-based official document recommendation method and apparatus, computer device and medium | |
EP4018353A1 (en) | Systems and methods for extracting information from a dialogue | |
JP2009515253A (ja) | Automatic detection and application of editing patterns in draft documents | |
WO2021121158A1 (zh) | Official document processing method and apparatus, computer device and storage medium | |
US20160085741A1 (en) | Entity extraction feedback | |
US20140075299A1 (en) | Systems and methods for generating extraction models | |
CN112632258A (zh) | Text data processing method and apparatus, computer device and storage medium | |
US7962324B2 (en) | Method for globalizing support operations | |
Skidmore | Incremental disfluency detection for spoken learner english | |
US20080091694A1 (en) | Transcriptional dictation | |
CN109670040B (zh) | Writing assistance method and apparatus, storage medium, and computer device | |
CN113705198B (zh) | Scene graph generation method and apparatus, electronic device and storage medium | |
CN113050933B (zh) | Mind map data processing method, apparatus, device and storage medium | |
CN114896382A (zh) | Artificial intelligence question answering model generation method, question answering method, apparatus and storage medium | |
US11954439B2 (en) | Data labeling method and device, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19942371 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19942371 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the addressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205 DATED 21/09/2022) |