WO2020034538A1 - Audio keyword quality inspection method and apparatus, computer device, and storage medium - Google Patents

Info

Publication number
WO2020034538A1
Authority
WO
WIPO (PCT)
Prior art keywords
time point
keyword
audio
recording
target
Application number
PCT/CN2018/123067
Other languages
French (fr)
Chinese (zh)
Inventor
岳鹏昱
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2020034538A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/635 Filtering based on additional data, e.g. user or group profiles
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually

Definitions

  • The present application relates to the field of communication technologies, and in particular to a method for quality inspection of audio keywords.
  • Customer service staff communicate with customers to answer their inquiries or to facilitate transactions.
  • The company records these customer service calls and assigns dedicated quality inspectors to check whether the recordings contain non-compliant language, which helps ensure the service quality of customer service staff and avoid non-compliant operations.
  • An audio keyword quality inspection method includes:
  • determining, according to a pre-established audio time correspondence, the target recording files that contain the target keyword among the recording files to be inspected, and determining the audio time point at which the target keyword is located;
  • the audio time correspondence records the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points;
  • a keyword file is a recording file to be inspected whose recognized text contains a keyword requiring quality inspection;
  • a keyword time point is the playback time point at which the keyword occurs in the audio of the recording file, and an audio time point is the playback time point at which the target keyword occurs in the target recording file.
  • An audio keyword quality inspection device includes:
  • a keyword determination module configured to determine a target keyword currently to be inspected;
  • a recording file determination module configured to determine, according to a pre-established audio time correspondence, the target recording files that contain the target keyword among the recording files to be inspected, and to determine the audio time point at which the target keyword is located;
  • the audio time correspondence records the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points;
  • a keyword file is a recording file to be inspected whose recognized text contains a keyword requiring quality inspection;
  • a keyword time point is the playback time point at which the keyword occurs in the audio of the recording file, and an audio time point is the playback time point at which the target keyword occurs in the target recording file;
  • a file information output module configured to output the file information of each determined target recording file and to mark the determined audio time points in each target recording file.
  • A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor; the processor implements the steps of the audio keyword quality inspection method when executing the computer-readable instructions.
  • One or more non-volatile readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the above audio keyword quality inspection method.
  • FIG. 1 is a schematic diagram of an application environment of an audio keyword quality inspection method according to an embodiment of the present application
  • FIG. 2 is a flowchart of an audio keyword quality inspection method according to an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of establishing the audio time correspondence in advance, in an application scenario of the audio keyword quality inspection method according to an embodiment of the present application;
  • FIG. 4 is a schematic flowchart of locating and playing a target recording file, in an application scenario of the audio keyword quality inspection method according to an embodiment of the present application;
  • FIG. 5 is a schematic flowchart of step 302 of the audio keyword quality inspection method in an application scenario according to an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of an audio keyword quality inspection device according to an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a computer device according to an embodiment of the present application.
  • the audio keyword quality inspection method provided in this application can be applied in the application environment as shown in FIG. 1, where a terminal communicates with a server through a network.
  • the terminal may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices.
  • the server can be implemented by an independent server or a server cluster composed of multiple servers.
  • An audio keyword quality inspection method is provided.
  • Taking its application to the server in FIG. 1 as an example, the method includes the following steps:
  • The quality inspector may first determine the keyword to be inspected in the current session, that is, the target keyword. Specifically, the server system can display all keywords requiring quality inspection on the interface, and the quality inspector selects one, two, or more of the displayed keywords as the target keywords for the current inspection.
  • The keywords requiring quality inspection can be preset in the system by an administrator according to actual needs. These keywords can be managed through a keyword library: the administrator can add keywords to and delete keywords from the library at the terminal, thereby managing the keywords requiring quality inspection.
  • The audio time correspondence records the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points.
  • A keyword file is a recording file to be inspected whose recognized text contains a keyword requiring quality inspection.
  • A keyword time point is the playback time point at which the keyword occurs in the audio of the recording file.
  • An audio time point is the playback time point at which the target keyword occurs in the target recording file.
  • In other words, the audio time correspondence records the relationship among keywords, recording files, and the playback times of the keywords within those recording files.
  • For example, suppose the content of a recording file A is "buy XXX insurance, you can get 200 yuan cashback", where the word "cashback" is a keyword requiring quality inspection and appears at 3 minutes 40 seconds in recording file A.
  • The audio time correspondence then associates the keyword "cashback", recording file A, and the time point of 3 minutes 40 seconds, establishing a correspondence among the three.
  • the audio time correspondence can be established in advance through the following steps:
  • For step 201, the recording files to be inspected first need to be obtained.
  • A large number of recording files are generated on the server every day. In this solution, the server can be set to obtain, in the early morning of each day, the recording files that have not yet undergone quality inspection, in order to pre-establish the audio time correspondence.
  • For step 202, after the recording files to be inspected are obtained, speech recognition can be performed on them to obtain the recognized text corresponding to each recording file.
  • The server can run this as a batch job in the early morning, so that the idle period of the server system is used to complete the speech recognition processing.
  • The server also needs to record the time point at which each piece of recognized text is played in the corresponding recording file.
  • For example, suppose the content of a recording file A is "buy XXX insurance, you can get 200 yuan cashback". The playback time of the word "buy" is 3 minutes 36 seconds, that of "insurance" is 3 minutes 38 seconds, that of "can" is 3 minutes 39 seconds, that of "cashback" is 3 minutes 40 seconds, that of "200 yuan" is 3 minutes 41 seconds, and so on.
  • For step 203, step 202 has already produced the recognized text corresponding to each recording file to be inspected and recorded the playback time point of each piece of text in it. Comparing each recognized text with the preset keywords requiring quality inspection then determines the keyword files and the keyword time points.
  • For step 204, step 203 has determined which recording files contain keywords requiring quality inspection in their recognized text, and the playback time points of those keywords in the audio of the recording files.
  • The audio time correspondence can therefore be established from the correspondence among "the keywords requiring quality inspection", "the recording files whose recognized text contains those keywords", and "the playback time points of those keywords in the audio of the recording files". For example, continuing the example above, suppose that the keywords requiring quality inspection include "buy" and "insurance".
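Steps 201 to 204 can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes speech recognition has already produced, for each recording file, a list of (word, playback time in seconds) pairs. The function name `build_audio_time_index`, the file name `A.wav`, and the nested-dictionary layout are hypothetical choices made for the example.

```python
from collections import defaultdict

def build_audio_time_index(recognized, keywords):
    """Map keyword -> recording file -> list of playback times (seconds).

    recognized: {file_name: [(word, seconds), ...]}, assumed to be the
    output of a speech recognition pass that also records word timings.
    keywords: set of words that require quality inspection.
    """
    index = defaultdict(lambda: defaultdict(list))
    for file_name, words in recognized.items():
        for word, seconds in words:
            if word in keywords:  # step 203: compare text with keywords
                index[word][file_name].append(seconds)  # step 204
    return index

# The running example: "buy" at 3 min 36 s, "insurance" at 3 min 38 s, etc.
recognized = {
    "A.wav": [("buy", 216), ("insurance", 218), ("can", 219),
              ("cashback", 220), ("200 yuan", 221)],
}
index = build_audio_time_index(recognized, {"buy", "insurance"})
```

With "buy" and "insurance" as the keywords requiring quality inspection, the index associates each of them with recording file A and its playback time, while words that are not keywords are left out.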
  • The file information of the target recording files also needs to be output, so that the quality inspector knows which target recording files are involved during quality inspection.
  • The file information may specifically include one or more of the file name, the file storage location, information about the two parties to the conversation in the recording file, and the recording duration.
  • The solution also marks the audio time points in each target recording file when outputting the file information of each determined target recording file.
  • As described above, the audio time point here is the playback time point at which the target keyword is located in the target recording file.
  • After determining the target recording files, this solution also plays them automatically, with the playback position automatically set just before the audio time point. The quality inspector therefore does not need to spend time playing the recording file from the beginning or manually seeking to a position before the audio time point, which further improves the efficiency of inspecting these target recording files.
  • the method may further include:
  • For step 301, since the quality inspector can listen to only one target recording file at a time, when there are multiple target recording files one of them needs to be selected as the current recording file to be played; if there is only one target recording file, it can be selected as the current recording file.
  • The audio time point is the playback time point at which the target keyword is located in the target recording file.
  • When listening to a target recording file, the quality inspector generally needs some lead-in time before the keyword is spoken.
  • The current recording file is therefore generally played from a position before the audio time point; that is, the start playback time point precedes the audio time point.
  • The start playback time point of the current recording file is determined according to the audio time point corresponding to the current recording file.
  • the step 302 may further specifically include:
  • For steps 401 and 402, the current recording file may contain multiple audio time points. In that case the quality inspector needs to start listening from the earliest of them, so the earliest audio time point may be determined as the first time point, and a playback time point before the first time point may be determined as the start playback time point of the current recording file. This ensures that when a current recording file contains two or more audio time points, playback starts before the earliest one, which meets the requirements of quality inspection and matches how people listen to audio.
  • Step 402 may specifically include:
  • The start playback time point can be determined in two ways. Method 1: the time point that precedes the first time point by a preset first duration is determined as the start playback time point of the current recording file. For example, the time point 3 seconds before the first time point may be used. Suppose the current recording file A contains two audio time points, one at 2 minutes 10 seconds and the other at 2 minutes 30 seconds. The first time point is then 2 minutes 10 seconds, and with a preset first duration of 3 seconds, the start playback time point of recording file A is 2 minutes 7 seconds.
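Method 1 reduces to a small calculation, sketched below under stated assumptions: times are expressed in seconds, the function name `start_time_fixed_offset` is hypothetical, and clamping the result at 0 (for keywords spoken in the first few seconds) is an addition of this sketch that the source does not discuss.

```python
def start_time_fixed_offset(audio_time_points, first_duration=3):
    """Return the start playback time (seconds) for Method 1.

    The earliest audio time point is taken as the first time point, and
    playback starts a preset first duration before it. Clamping at 0 is
    an assumption of this sketch, not something stated in the source.
    """
    first_time_point = min(audio_time_points)
    return max(0, first_time_point - first_duration)

# Recording file A: audio time points at 2 min 10 s and 2 min 30 s.
start = start_time_fixed_offset([130, 150], first_duration=3)
```

For the example above, `start` is 127 seconds, that is, 2 minutes 7 seconds.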
  • Method 2, step 501: perform audio analysis on the current recording file to obtain the voice pause point that is located before the first time point and is closest to the first time point;
  • Step 502: determine the time point corresponding to the obtained voice pause point as the start playback time point of the current recording file.
  • For steps 501 and 502, the second method takes into account both the efficiency with which the quality inspector listens to the audio and how easily the inspector can follow the content of the current recording file. The voice pause points in the current recording file are found, and the one located before the first time point and closest to it is used as the position where playback of the current recording file starts.
  • In this way, the start playback position is as close as possible to the first time point (that is, the earliest audio time point in the current recording file), and the audio content from the start playback position is coherent, which makes it easier for the quality inspector to understand the content of the current recording file.
  • Starting playback from a voice pause point, as in steps 501 and 502, matches the pattern of human conversation and the way quality inspectors listen to recorded content. Note that the position of a voice pause point can be determined by analyzing the volume of the audio in the current recording file.
  • A voice pause point lies at a low point of the audio volume, because when a person pauses while talking there is either no sound or the volume is low. Therefore, by analyzing the volume of the audio in the current recording file, each voice pause point in the file can be determined quickly.
  • To reduce the amount of audio to be analyzed, an audio segment of a second duration that immediately precedes the first time point may be extracted from the current recording file for analysis.
  • For example, if the second duration is 10 seconds, the 10-second audio segment before the first time point is extracted for audio analysis. This works because people ordinarily do not talk continuously for more than 10 seconds without a pause.
  • The second duration can be set according to actual conditions.
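A minimal sketch of steps 501 and 502 follows. It assumes the audio has already been reduced to a list of per-frame volume values (for example RMS amplitude), which the source does not specify; the function name `find_pause_before`, the `quiet_threshold` parameter, and returning `None` when no pause is found are all hypothetical choices of this sketch.

```python
def find_pause_before(volumes, frame_seconds, first_time_point,
                      second_duration=10.0, quiet_threshold=0.05):
    """Return the start time (seconds) of the quiet frame closest before
    first_time_point, searching only the preceding second_duration.

    volumes: per-frame volume values; frame i covers the interval
    [i * frame_seconds, (i + 1) * frame_seconds).
    Returns None if no frame in the window is at or below the threshold.
    """
    # Last frame that ends at or before the first time point.
    last = min(int(first_time_point / frame_seconds), len(volumes)) - 1
    # First frame inside the analysis window (never below frame 0).
    first = max(0, int((first_time_point - second_duration) / frame_seconds))
    # Scan backwards so the pause nearest the keyword is found first.
    for i in range(last, first - 1, -1):
        if volumes[i] <= quiet_threshold:
            return i * frame_seconds
    return None

# Eight half-second frames; quiet frames at 1.0 s and 2.5 s.
volumes = [0.4, 0.5, 0.02, 0.6, 0.7, 0.01, 0.5, 0.6]
pause = find_pause_before(volumes, frame_seconds=0.5, first_time_point=3.5)
```

Here the keyword is at 3.5 seconds and the nearest preceding quiet frame starts at 2.5 seconds, so playback would begin there. A caller could fall back to Method 1 when the function returns `None`.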
  • the method may further include:
  • Step 601: if the number of determined target recording files is greater than 1, obtain the recording time of each target recording file;
  • Step 602: determine the order of the target recording files according to the sequence of their recording times.
  • The step of "outputting the determined file information of each of the target recording files" in step 103 is specifically: outputting the file information of each determined target recording file to a designated terminal, so that the designated terminal displays the target recording files in that order.
  • For steps 601 and 602, when the file information of the target recording files is output, too many files presented without order would confuse the quality inspector and hinder orderly quality inspection of these target recording files. This solution can therefore also determine the order of the target recording files according to the sequence of their recording times.
  • The file information of each determined target recording file is then output to a designated terminal, so that the designated terminal displays the target recording files in that order.
  • In addition, step 303 above starts playing the current recording file from the start playback time point. When playback of the current recording file is complete and the quality inspector confirms on the system interface that inspection of the current recording file is finished, the server can automatically play the next recording file. The order of automatic playback can also be determined according to the sequence of the recording times.
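Steps 601 and 602 amount to sorting the target recording files by recording time. The sketch below is illustrative only: the dictionary keys `name` and `recorded_at`, the file names, and the dates are hypothetical, and sorting earliest-first is an assumption, since the source says only that the order follows the sequence of recording times.

```python
from datetime import datetime

def order_for_review(target_files):
    """Return target recording files ordered by recording time.

    Sorting is only needed when more than one file was found (step 601);
    earliest-first order is an assumption of this sketch (step 602).
    """
    if len(target_files) <= 1:
        return list(target_files)
    return sorted(target_files, key=lambda f: f["recorded_at"])

target_files = [
    {"name": "B.wav", "recorded_at": datetime(2018, 8, 14, 15, 30)},
    {"name": "A.wav", "recorded_at": datetime(2018, 8, 14, 9, 5)},
]
ordered = [f["name"] for f in order_for_review(target_files)]
```

The same ordered list could drive both the display on the designated terminal and the sequence of automatic playback.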
  • The above provides an audio keyword quality inspection method.
  • First, the current target keyword to be inspected is determined; then, according to the pre-established audio time correspondence, the target recording files containing the target keyword are determined among the recording files to be inspected, together with the audio time points at which the target keyword is located.
  • The audio time correspondence records the correspondence among the keywords requiring quality inspection, the recording files to be inspected whose recognized text contains those keywords, and the playback time points of the keywords in the audio of those files.
  • When a quality inspector needs to check whether these recording files contain non-compliant language involving a keyword requiring quality inspection, the recording files containing the keyword can be located directly through the audio time correspondence, and the audio time points of the keyword in those files are marked.
  • This enables fast positioning of the keyword within a recording file, helps the quality inspector quickly verify whether the recording really contains non-compliant language, and greatly improves the efficiency of quality inspection of recording files.
  • an audio keyword quality inspection device is provided.
  • the audio keyword quality inspection device is in one-to-one correspondence with the audio keyword quality inspection method in the above embodiment.
  • the audio keyword quality inspection device includes a keyword determination module 701, a recording file determination module 702, and a file information output module 703.
  • Each functional module is described in detail as follows:
  • a keyword determination module 701 configured to determine a target keyword currently to be inspected;
  • a recording file determination module 702 configured to determine, according to a pre-established audio time correspondence, the target recording files that contain the target keyword among the recording files to be inspected, and the audio time point at which the target keyword is located;
  • the audio time correspondence records the correspondence among the keywords requiring quality inspection, the recording files to be inspected whose recognized text contains those keywords, and the playback time points of the keywords in the audio of those files;
  • the audio time point is the playback time point at which the target keyword is located in the target recording file;
  • a file information output module 703 configured to output the file information of each determined target recording file and to mark the determined audio time points in each target recording file.
  • the audio time correspondence can be established in advance through the following modules:
  • a file acquisition module configured to acquire the recording files to be inspected;
  • a speech recognition module configured to perform speech recognition on each recording file to be inspected to obtain the corresponding recognized text, and to record the time point at which each piece of recognized text is played in the corresponding recording file;
  • a comparison module configured to compare each recognized text with the preset keywords requiring quality inspection to determine the keyword files and keyword time points;
  • a relationship establishing module configured to establish the audio time correspondence according to the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points.
  • the audio keyword quality inspection device may further include:
  • a recording file selection module configured to select one of the target recording files as the current recording file to be played;
  • a playback time point determination module configured to determine the start playback time point of the current recording file according to the audio time point corresponding to the current recording file;
  • a recording file playback module configured to start playing the current recording file from the start playback time point.
  • the playback time point determination module may include:
  • the earliest time determining unit configured to determine the earliest audio time point among the audio time points corresponding to the current recording file as the first time point;
  • the start playback time determination unit is configured to determine a playback time point in the current recording file that is located before the first time point as a start playback time point of the current recording file.
  • The device may further include:
  • a quantity determining unit configured to obtain the recording time of each target recording file if the number of determined target recording files is greater than 1;
  • a sorting unit configured to determine the order of the target recording files according to the sequence of their recording times;
  • the file information output module may be specifically configured to output the file information of each determined target recording file to a designated terminal, so that the designated terminal displays the target recording files in that order.
  • Each module in the audio keyword quality inspection device can be implemented in whole or in part by software, hardware, or a combination thereof.
  • Each of the above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 7.
  • the computer device includes a processor, a memory, a network interface, and a database connected through a system bus.
  • the processor of the computer device is used to provide computing and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer-readable instructions, and a database.
  • The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium.
  • The database of the computer device is used to store the data involved in the audio keyword quality inspection method.
  • The network interface of the computer device is used to communicate with an external terminal through a network connection.
  • The computer-readable instructions are executed by the processor to implement an audio keyword quality inspection method.
  • A computer device is provided, which includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor.
  • When the processor executes the computer-readable instructions, the steps of the audio keyword quality inspection method in the foregoing embodiment are implemented, for example, step 101 to step 103 shown in FIG. 2.
  • Alternatively, when the processor executes the computer-readable instructions, the functions of the modules/units of the audio keyword quality inspection apparatus in the foregoing embodiment are implemented, for example, the functions of modules 701 to 703 shown in FIG. 6. To avoid repetition, details are not repeated here.
  • A computer-readable storage medium is provided: one or more non-volatile readable storage media storing computer-readable instructions.
  • When the computer-readable instructions are executed by one or more processors, the one or more processors implement the steps of the audio keyword quality inspection method in the foregoing method embodiment, or implement the functions of the modules/units of the audio keyword quality inspection device in the foregoing device embodiment. To avoid repetition, details are not repeated here.
  • Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory can include random access memory (RAM) or external cache memory.
  • RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science
  • Theoretical Computer Science
  • Multimedia
  • Data Mining & Analysis
  • Databases & Information Systems
  • Physics & Mathematics
  • General Engineering & Computer Science
  • General Physics & Mathematics
  • Library & Information Science
  • Information Retrieval, Db Structures And Fs Structures Therefor
  • Signal Processing For Digital Recording And Reproducing

Abstract

An audio keyword quality inspection method and apparatus, a computer device, and a storage medium, for resolving the problem that quality inspection of audio recording files is inefficient. The method comprises: determining a current target keyword on which quality inspection is to be performed (101); determining, according to a pre-established audio-time correspondence and from among all audio recording files on which quality inspection is to be performed, a target audio recording file including the target keyword, and determining an audio time point at which the target keyword is located (102), wherein the audio-time correspondence is a correspondence among the keyword requiring quality inspection, the audio recording file to be inspected whose recognized text contains that keyword, and the playing time point at which the keyword is located in the audio recording file, and the audio time point refers to the playing time point of the target keyword in the target recording file; and outputting determined file information of each target audio recording file, and marking the determined audio time point in each target audio recording file (103).

Description

音频关键字质检方法、装置、计算机设备及存储介质Audio keyword quality inspection method, device, computer equipment and storage medium
本申请以2018年08月15日提交的申请号为201810927508.4,名称为“音频关键字质检方法、装置、计算机设备及存储介质”的中国发明专利申请为基础,并要求其优先权。This application is based on a Chinese invention patent application filed on August 15, 2018 with the application number 201810927508.4, entitled "Audio Keyword Quality Inspection Method, Device, Computer Equipment, and Storage Medium", and claims its priority.
技术领域Technical field
本申请涉及通信技术领域,尤其涉及到一种音频关键字质检的方法。The present application relates to the field of communication technologies, and in particular, to a method for quality inspection of audio keywords.
背景技术Background technique
很多企业都配置有客服人员,客服人员通过与客户进行沟通来解答客户的咨询问题或者促成交易的发生,企业会对这些客服电话录音,安排专门的质检人员检查录音中是否存在违规语言,从而可以确保客服人员的服务质量和避免违规操作。Many companies are equipped with customer service staff. Customer service staff communicates with customers to answer customer consultation questions or facilitate transactions. The company will record these customer service calls and arrange special quality inspection personnel to check whether there are illegal languages in the recordings. Can ensure the quality of customer service staff and avoid illegal operations.
但是,对于大型企业来说,客服电话的录音数量非常庞大,同时,质检人员检查录音也要耗费大量的时间,需要将一个录音文件全部听完才能知晓该录音中是否存在违规语言,费时费力且效率低下。However, for large enterprises, the number of recordings for customer service calls is very large. At the same time, it takes a lot of time for quality inspection personnel to check the recording. It is necessary to listen to all of a recording file to know whether there is an illegal language in the recording, which is time-consuming and labor-intensive. And inefficient.
Summary of the Invention
In view of the above technical problem, it is necessary to provide an audio keyword quality inspection method, apparatus, computer device, and storage medium that can improve the work efficiency of quality inspectors.
An audio keyword quality inspection method includes:
determining a current target keyword to be inspected;
determining, according to a pre-established audio-time correspondence, the target recording files that contain the target keyword from among the recording files to be inspected, and determining the audio time point at which the target keyword is located, wherein the audio-time correspondence records the correspondence among keywords requiring quality inspection, keyword files, and keyword time points, a keyword file is a recording file to be inspected whose recognized text contains a keyword requiring quality inspection, a keyword time point is the playback time point at which that keyword occurs in the audio of the recording file, and the audio time point is the playback time point at which the target keyword occurs in the target recording file; and
outputting file information of each determined target recording file, and marking the determined audio time point in each target recording file.
An audio keyword quality inspection apparatus includes:
a keyword determination module, configured to determine a current target keyword to be inspected;
a recording file determination module, configured to determine, according to a pre-established audio-time correspondence, the target recording files that contain the target keyword from among the recording files to be inspected, and to determine the audio time point at which the target keyword is located, wherein the audio-time correspondence records the correspondence among keywords requiring quality inspection, keyword files, and keyword time points, a keyword file is a recording file to be inspected whose recognized text contains a keyword requiring quality inspection, a keyword time point is the playback time point at which that keyword occurs in the audio of the recording file, and the audio time point is the playback time point at which the target keyword occurs in the target recording file; and
a file information output module, configured to output file information of each determined target recording file, and to mark the determined audio time point in each target recording file.
A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the steps of the above audio keyword quality inspection method when executing the computer-readable instructions.
One or more non-volatile readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the above audio keyword quality inspection method.
Details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features and advantages of the present application will become apparent from the description, the drawings, and the claims.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below are merely some embodiments of the present application; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment of an audio keyword quality inspection method according to an embodiment of the present application;
FIG. 2 is a flowchart of an audio keyword quality inspection method according to an embodiment of the present application;
FIG. 3 is a schematic flowchart of pre-establishing the audio-time correspondence in an application scenario of the audio keyword quality inspection method according to an embodiment of the present application;
FIG. 4 is a schematic flowchart of playing and positioning a target recording file in an application scenario of the audio keyword quality inspection method according to an embodiment of the present application;
FIG. 5 is a schematic flowchart of step 302 of the audio keyword quality inspection method in an application environment according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an audio keyword quality inspection apparatus according to an embodiment of the present application;
FIG. 7 is a schematic diagram of a computer device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
The audio keyword quality inspection method provided by the present application can be applied in the application environment shown in FIG. 1, in which a terminal communicates with a server over a network. The terminal may be, but is not limited to, a personal computer, a notebook computer, a smartphone, a tablet computer, or a portable wearable device. The server may be implemented as an independent server or as a server cluster composed of multiple servers.
In an embodiment, as shown in FIG. 2, an audio keyword quality inspection method is provided. Taking its application to the server in FIG. 1 as an example, the method includes the following steps:
101. Determine the current target keyword to be inspected.
In this solution, the quality inspector first determines the keyword that is about to be inspected, that is, the target keyword. Specifically, the server system may display all keywords requiring quality inspection on an interface, and the quality inspector selects one, two, or more of the displayed keywords as the current target keywords to be inspected.
The keywords requiring quality inspection may be preset by an administrator according to actual needs and managed in a keyword library. The administrator can add keywords to, or delete keywords from, the library at a terminal, thereby managing the keywords that require quality inspection.
102. Determine, according to the pre-established audio-time correspondence, the target recording files that contain the target keyword from among the recording files to be inspected, and determine the audio time point at which the target keyword is located, wherein the audio-time correspondence records the correspondence among keywords requiring quality inspection, keyword files, and keyword time points, a keyword file is a recording file to be inspected whose recognized text contains a keyword requiring quality inspection, a keyword time point is the playback time point at which that keyword occurs in the audio of the recording file, and the audio time point is the playback time point at which the target keyword occurs in the target recording file.
Put simply, the audio-time correspondence records the relationship among a keyword, a recording file, and the playback time point of the keyword in that recording file. For example, suppose the content of a recording file A is "Buy XXX insurance and get 200 yuan cash back...", where "cash back" is a keyword for quality inspection that occurs at 3 min 40 s in recording file A. The audio-time correspondence then stores the keyword "cash back", recording file A, and the time point 3 min 40 s in association with one another, establishing the correspondence among the three.
Further, as shown in FIG. 3, the audio-time correspondence may be pre-established through the following steps:
201. Obtain the recording files to be inspected.
202. Perform speech recognition on each recording file to be inspected to obtain the recognized text corresponding to each recording file, and record the playback time point of the recognized text in the corresponding recording file.
203. Compare each recognized text with the preset keywords requiring quality inspection to determine the keyword files and the keyword time points.
204. Establish the audio-time correspondence according to the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points.
For step 201, the recording files to be inspected must first be obtained. A large number of recording files are generated on the server every day. In this solution, the recording files that have not yet been inspected may be obtained in the early morning hours of each day in order to pre-establish the audio-time correspondence.
For step 202, after the recording files to be inspected are obtained, speech recognition is performed on them to obtain the recognized text corresponding to each file. Because the number of recording files is usually very large, the server may run speech recognition as a batch job in the early morning hours, using the idle period of the server system to complete the processing. During speech recognition, the server also records the playback time point of each piece of recognized text in the corresponding recording file. For example, suppose the content of recording file A is "Buy XXX insurance and get 200 yuan cash back...": the word "buy" is played at 3 min 36 s, "insurance" at 3 min 38 s, "can" at 3 min 39 s, "cash back" at 3 min 40 s, "200 yuan" at 3 min 41 s, and so on.
For step 203, step 202 has already produced the recognized text for each recording file to be inspected and recorded the playback time point of each piece of text. It is therefore only necessary to detect which pieces of each recognized text are keywords and to obtain their playback time points. This reveals which recording files each keyword requiring quality inspection occurs in and at which playback time points, from which the correspondence among the three is established to obtain the audio-time correspondence.
For step 204, step 203 has determined the recording files whose recognized text contains a keyword requiring quality inspection, together with the playback time points at which the keyword occurs in the audio of those files. Step 204 then establishes the audio-time correspondence from the correspondence among "the keyword requiring quality inspection", "the recording file whose recognized text contains the keyword", and "the playback time point at which the keyword occurs in the audio of the recording file". Continuing the above example, suppose the keywords requiring quality inspection include "buy" and "insurance". Checking recording file A shows that its recognized text contains both keywords, so two audio-time correspondences are established: "buy, recording file A, 3 min 36 s" and "insurance, recording file A, 3 min 38 s". Other audio-time correspondences can be established in the same way through steps 201 to 204.
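Steps 203 and 204 can be illustrated with a minimal Python sketch. It assumes the speech recognizer has already produced, for each recording file, a list of (word, playback time in seconds) pairs; the function name `build_audio_time_index` and the nested-dictionary layout are illustrative assumptions, not the data structure used by the application.

```python
from collections import defaultdict


def build_audio_time_index(recognized, keywords):
    """Map each keyword to the files and playback times where it occurs.

    recognized: {file_id: [(word, seconds), ...]}  (hypothetical recognizer output)
    keywords:   set of keywords requiring quality inspection
    """
    index = defaultdict(lambda: defaultdict(list))
    for file_id, words in recognized.items():
        for word, seconds in words:
            if word in keywords:  # step 203: compare recognized text with keywords
                index[word][file_id].append(seconds)
    # Step 204: store keyword -> file -> sorted time points as plain dicts.
    return {kw: {f: sorted(ts) for f, ts in files.items()}
            for kw, files in index.items()}


# Recording file A from the example above (3 min 36 s = 216 s, and so on).
recognized = {
    "A": [("buy", 216), ("insurance", 218), ("can", 219),
          ("cash back", 220), ("200 yuan", 221)],
}
index = build_audio_time_index(recognized, {"buy", "insurance"})
print(index)
```

Keying the index by keyword first means that looking up a target keyword in step 102 is a single dictionary access rather than a scan over all recognized texts.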
103. Output file information of each determined target recording file, and mark the determined audio time point in each target recording file.
Determining the target recording files reveals which recording files are likely to contain keywords that constitute "non-compliant language". Quality inspectors need this information, since it lets them perform a preliminary screening of a large number of recording files. The file information of the target recording files is therefore output so that the inspectors know which files to check. The file information may include one or more of the file name, the file storage location, information about the two parties to the conversation in the recording, the recording duration, and the like. In addition, to help quality inspectors quickly locate the position of the keyword within a recording file, this solution marks the determined audio time points in each target recording file when outputting the file information. As described above, an audio time point is the playback time point at which the target keyword occurs in the target recording file.
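Steps 102 and 103 can be sketched as a lookup in a previously built index followed by a formatted listing. This is a minimal sketch under assumptions: the index layout (keyword, then file, then time points in seconds), the `file_info` fields, and the function names are hypothetical and chosen only for illustration.

```python
def find_target_files(index, target_keywords):
    """Return {file_id: sorted audio time points} for files containing any target keyword."""
    hits = {}
    for keyword in target_keywords:
        for file_id, times in index.get(keyword, {}).items():
            hits.setdefault(file_id, set()).update(times)
    return {file_id: sorted(times) for file_id, times in hits.items()}


def format_time(seconds):
    """Render seconds as m:ss for display next to the file information."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}:{secs:02d}"


# Hypothetical index and file information, for illustration only.
index = {"cash back": {"A": [220], "B": [15, 95]}, "insurance": {"A": [218]}}
file_info = {"A": {"name": "A.wav", "duration": 300},
             "B": {"name": "B.wav", "duration": 120}}

targets = find_target_files(index, ["cash back"])
for file_id, times in targets.items():
    marks = ", ".join(format_time(t) for t in times)
    print(file_info[file_id]["name"], "keyword at", marks)
```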
Further, to make it easier for quality inspectors to listen to the audio of the target recording files, this solution automatically plays the target recording files after they are determined, with the playback position automatically set to a point shortly before the audio time point. Quality inspectors therefore neither waste time playing a recording file from the beginning nor need to manually move the playback position to just before the audio time point, which further improves the efficiency of inspecting the target recording files. As shown in FIG. 4, after the target recording files are determined, the method may further include:
301. Select one target recording file from the target recording files as the current recording file to be played.
302. Determine the start playback time point of the current recording file according to the audio time point corresponding to the current recording file.
303. Play the current recording file from the start playback time point.
For step 301, a quality inspector can listen to only one target recording file at a time. When there are multiple target recording files, one of them is selected as the current recording file to be played; when there is only one, that file is selected as the current recording file.
For step 302, the audio time point is the playback time point at which the target keyword occurs in the target recording file. The quality inspector generally needs some lead time before hearing the relevant content, so playback of the current recording file should start before the audio time point; that is, the start playback time point precedes the audio time point.
Further, as shown in FIG. 5, step 302 of determining the start playback time point of the current recording file according to the corresponding audio time point may specifically include:
401. Determine the earliest of the audio time points corresponding to the current recording file as the first time point.
402. Determine a playback time point in the current recording file that precedes the first time point as the start playback time point of the current recording file.
For steps 401 and 402, the current recording file may contain multiple audio time points, in which case the quality inspector needs to start listening from the earliest of them. The earliest audio time point is therefore determined as the first time point, and a playback time point preceding it is determined as the start playback time point of the current recording file. This ensures that, when a recording file contains two or more audio time points, playback starts before the earliest one, which matches both the requirements of quality inspection and normal listening habits.
Further, this solution may determine the start playback time point in either of the following two ways. Step 402 may specifically include:
Method 1: the time point that precedes the first time point by a preset first duration is determined as the start playback time point of the current recording file. For example, the time point 3 seconds before the first time point may be used. To illustrate, suppose the current recording file A contains two audio time points, one at 2 min 10 s and the other at 2 min 30 s. The first time point is then 2 min 10 s, and with a preset first duration of 3 seconds, the start playback time point of recording file A is 2 min 7 s.
Method 2: Step 501. Perform audio analysis on the current recording file to obtain the speech pause point that precedes the first time point and is closest to it.
Step 502. Determine the time point corresponding to the obtained speech pause point as the start playback time point of the current recording file.
For steps 501 and 502, Method 2 takes into account how efficiently the quality inspector can follow the audio. The speech pause points in the current recording file are located, and the pause point that precedes the first time point and is closest to it is used as the position from which playback starts. This keeps the start position as close as possible to the first time point (the earliest audio time point in the current recording file) while ensuring that the audio heard from the start position is coherent, which makes the content easier for the quality inspector to understand. The reason is that a recording file captures a conversation, and people speak in continuous phrases separated by pauses. If playback simply started at an arbitrary time point before the first time point, it would likely begin in the middle of a sentence or even in the middle of a word, making the recording harder to follow. By starting playback at a speech pause point, steps 501 and 502 respect both the rhythm of conversation and the way quality inspectors listen to recordings.
The position of a speech pause point can be determined by analyzing the volume of the audio in the current recording file. Within a stretch of audio, speech pause points lie where the volume is lowest, because a speaker who pauses produces no sound or only very quiet sound. Analyzing the volume therefore identifies the pause points quickly. To save server computing resources, only the audio before the first time point needs to be analyzed. More specifically, the audio segment of a second duration immediately preceding the first time point may be extracted. For example, if the second duration is 10 seconds, the 10 seconds of audio before the first time point are extracted for analysis, since people in conversation rarely speak for more than 10 seconds without pausing. The second duration can be set according to actual conditions.
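The two ways of choosing the start playback time point can be compared in a short Python sketch. It is a sketch under stated assumptions rather than the application's implementation: the audio is taken to be a list of float samples in the range [-1, 1], "volume" is approximated by the root-mean-square value of short frames, and the frame length, the silence threshold, and the fallback to the start of the analysis window when no pause is found are all illustrative choices.

```python
import math


def start_by_fixed_offset(audio_time_points, first_duration=3.0):
    """Method 1: start a preset first duration before the earliest audio time point."""
    first_time = min(audio_time_points)            # step 401
    return max(0.0, first_time - first_duration)   # step 402, clamped at file start


def start_by_pause(samples, sample_rate, first_time, second_duration=10.0,
                   frame_seconds=0.02, silence_rms=0.01):
    """Method 2: start at the nearest low-volume frame before the first time point."""
    frame_len = max(1, round(sample_rate * frame_seconds))
    end = min(len(samples), int(first_time * sample_rate))
    # Only the second-duration window before the first time point is analyzed.
    begin = max(0, end - int(second_duration * sample_rate))
    pos = end - frame_len
    while pos >= begin:                            # walk backwards, frame by frame
        frame = samples[pos:pos + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms < silence_rms:                      # quiet frame: treat as a pause
            return pos / sample_rate
        pos -= frame_len
    return begin / sample_rate                     # assumed fallback: window start


# Synthetic audio at 100 samples/s: speech, a pause from 2.0 s to 2.5 s, then speech.
samples = [0.5] * 200 + [0.0] * 50 + [0.5] * 150
print(start_by_fixed_offset([130, 150]))
print(start_by_pause(samples, 100, first_time=3.5, frame_seconds=0.1))
```

Scanning backwards from the first time point returns the closest pause first, so the loop can stop early instead of computing the volume of the whole window.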
Further, after the target recording files are determined, the method may further include:
Step 601. If the number of determined target recording files is greater than 1, obtain the recording time of each target recording file.
Step 602. Determine the order of the target recording files according to their recording times.
In this case, the operation of "outputting file information of each determined target recording file" in step 103 is specifically: outputting the file information of each determined target recording file to a designated terminal, so that the designated terminal displays the target recording files in the determined order.
For steps 601 and 602, if too many target recording files are output at once, quality inspectors may be overwhelmed and unable to inspect them in an orderly manner. This solution therefore orders the target recording files by recording time and, when outputting their file information, sends it to the designated terminal so that the terminal displays the target recording files in that order.
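Steps 601 and 602 amount to a sort on the recording time. The following minimal Python sketch assumes each target recording file is represented by a dictionary with a hypothetical `recorded_at` field holding a `datetime`; the field and function names are illustrative.

```python
from datetime import datetime


def order_by_recording_time(target_files):
    """Order target recording files from earliest to latest recording time."""
    if len(target_files) <= 1:   # step 601: ordering only matters for several files
        return list(target_files)
    return sorted(target_files, key=lambda f: f["recorded_at"])   # step 602


# Hypothetical target recording files, for illustration only.
targets = [
    {"name": "B.wav", "recorded_at": datetime(2018, 8, 15, 14, 30)},
    {"name": "A.wav", "recorded_at": datetime(2018, 8, 15, 9, 5)},
    {"name": "C.wav", "recorded_at": datetime(2018, 8, 14, 17, 45)},
]
ordered = order_by_recording_time(targets)
print([f["name"] for f in ordered])
```

The same ordered list can also drive automatic playback of the next recording file once inspection of the current one is finished.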
In addition, step 303 plays the current recording file from the start playback time point. Once playback of the current recording file has finished and the quality inspector has confirmed on the system interface that inspection of that file is complete, the server may automatically play the next recording file. The order of automatic playback may likewise follow the recording times.
In summary, the above provides an audio keyword quality inspection method. First, the current target keyword to be inspected is determined. Then, according to the pre-established audio-time correspondence, the target recording files that contain the target keyword are determined from among the recording files to be inspected, together with the audio time point at which the target keyword is located. The audio-time correspondence records the correspondence among the keywords requiring quality inspection, the recording files to be inspected whose recognized text contains those keywords, and the playback time points at which the keywords occur in the audio of the recording files; the audio time point is the playback time point at which the target keyword occurs in the target recording file. Finally, the file information of each determined target recording file is output, and the determined audio time point is marked in each target recording file. When a quality inspector needs to check whether the recording files contain non-compliant language involving a keyword requiring quality inspection, the audio-time correspondence leads directly to the recording files containing that keyword and marks the audio time points at which it occurs. The keyword can thus be located quickly within a recording file, which helps the quality inspector verify whether the file really contains non-compliant language and greatly improves the efficiency of inspecting recording files.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The execution order of each process is determined by its function and internal logic, and does not constitute any limitation on the implementation of the embodiments of the present application.
In one embodiment, an audio keyword quality inspection apparatus is provided, which corresponds one-to-one with the audio keyword quality inspection method of the above embodiments. As shown in FIG. 6, the apparatus includes a keyword determination module 701, a recording file determination module 702, and a file information output module 703. The functional modules are described in detail below:
a keyword determination module 701, configured to determine the target keyword currently to be inspected;
a recording file determination module 702, configured to determine, according to a pre-established audio-time correspondence, the target recording files that contain the target keyword among the recording files to be inspected, and the audio time points at which the target keyword occurs, where the audio-time correspondence records the relationship among the keywords requiring quality inspection, the recording files to be inspected whose recognized text contains those keywords, and the time points at which those keywords are played in the audio of the recording files, and an audio time point is the time point at which the target keyword is played in a target recording file;
a file information output module 703, configured to output the file information of each determined target recording file and to mark the audio time points determined in each target recording file.
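The division of labor among modules 701 to 703 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the nested-dictionary layout of the audio-time correspondence, the function names `find_target_recordings` and `format_report`, and the sample file names and keywords are all assumptions made for illustration.

```python
from typing import Dict, List

# Assumed layout of the audio-time correspondence:
# keyword -> {recording file name -> [time points in seconds]}
AudioTimeIndex = Dict[str, Dict[str, List[float]]]


def find_target_recordings(index: AudioTimeIndex, target_keyword: str) -> Dict[str, List[float]]:
    """Look up the files that contain the target keyword and their time points."""
    hits = index.get(target_keyword, {})
    # Sort the time points so the earliest occurrence comes first.
    return {name: sorted(points) for name, points in hits.items() if points}


def format_report(hits: Dict[str, List[float]]) -> List[str]:
    """Produce one output line per target file, marking each audio time point."""
    lines = []
    for name, points in hits.items():
        marks = ", ".join(f"{p:.1f}s" for p in points)
        lines.append(f"{name}: {marks}")
    return lines


# Hypothetical sample data.
index = {
    "guaranteed return": {"call_001.wav": [42.5, 12.0], "call_007.wav": [3.2]},
    "no risk": {"call_001.wav": [88.0]},
}
hits = find_target_recordings(index, "guaranteed return")
report = format_report(hits)
```

Because the correspondence is built ahead of time, the lookup at inspection time is a dictionary access and does not need to run speech recognition again.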
Further, the audio-time correspondence may be established in advance by the following modules:
a file acquisition module, configured to acquire the recording files to be inspected;
a speech recognition module, configured to perform speech recognition on each recording file to be inspected to obtain the recognized text corresponding to each recording file, while recording the time points at which the recognized text is played in the corresponding recording file;
a comparison module, configured to compare each recognized text with the preset keywords requiring quality inspection, so as to determine the keyword files and the keyword time points;
a relationship establishment module, configured to establish the audio-time correspondence according to the relationship among the keywords requiring quality inspection, the keyword files, and the keyword time points.
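The comparison and relationship establishment steps above can be sketched as follows. This is an illustrative sketch only: it assumes the speech recognizer has already produced, for each recording, a list of text segments with start times, and it matches keywords by plain substring search. Neither the segment format nor the matching rule is specified in the source, and `build_audio_time_index` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Assumed recognizer output: (start time in seconds, recognized text segment).
Transcript = List[Tuple[float, str]]


def build_audio_time_index(
    transcripts: Dict[str, Transcript], keywords: List[str]
) -> Dict[str, Dict[str, List[float]]]:
    """Map keyword -> keyword file -> keyword time points."""
    index = defaultdict(lambda: defaultdict(list))
    for file_name, segments in transcripts.items():
        for start, text in segments:
            for keyword in keywords:
                # Substring match is an assumption; a real system might tokenize.
                if keyword in text:
                    index[keyword][file_name].append(start)
    # Convert to plain dictionaries so missing keys are not created on lookup.
    return {kw: dict(files) for kw, files in index.items()}


# Hypothetical sample data.
transcripts = {
    "call_001.wav": [
        (0.0, "hello this is the sales desk"),
        (12.0, "this product has a guaranteed return"),
        (30.5, "there is no risk at all"),
    ],
    "call_002.wav": [
        (0.0, "good morning"),
        (5.0, "let me check your account"),
    ],
}
keywords = ["guaranteed return", "no risk"]
audio_time_index = build_audio_time_index(transcripts, keywords)
```

Recordings in which no keyword is found, such as `call_002.wav` here, never enter the correspondence, so they are not surfaced to the inspector for those keywords.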
Further, the audio keyword quality inspection apparatus may also include:
a recording file selection module, configured to select one of the target recording files as the current recording file to be played;
a playback time point determination module, configured to determine the playback start time point of the current recording file according to the audio time points corresponding to the current recording file;
a recording file playback module, configured to play the current recording file starting from the playback start time point.
Further, the playback time point determination module may include:
an earliest time determination unit, configured to determine the earliest of the audio time points corresponding to the current recording file as the first time point;
a playback start time determination unit, configured to determine a playback time point in the current recording file that precedes the first time point as the playback start time point of the current recording file.
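The two units above can be sketched in a few lines of Python. The source only requires that playback start at some time point before the earliest keyword occurrence; the fixed lead-in of 3 seconds, the clamping at zero, and the name `start_playback_point` are assumptions made for this sketch.

```python
from typing import List


def start_playback_point(time_points: List[float], lead_in: float = 3.0) -> float:
    """Return a playback start time shortly before the earliest keyword occurrence.

    `lead_in` (seconds) is a hypothetical parameter, not taken from the source.
    """
    if not time_points:
        raise ValueError("the current recording file has no audio time points")
    first_time_point = min(time_points)  # earliest audio time point
    # Step back by the lead-in, but never before the start of the recording.
    return max(0.0, first_time_point - lead_in)


start_a = start_playback_point([42.5, 12.0])  # earliest point is 12.0
start_b = start_playback_point([1.5])         # lead-in would go below zero
```

Starting slightly before the keyword lets the inspector hear the context in which it was spoken rather than joining mid-word.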
Further, for use after the target recording files have been determined, the apparatus may also include:
a quantity determination unit, configured to obtain the recording time of each target recording file if the number of determined target recording files is greater than 1;
a sorting unit, configured to determine the order of the target recording files according to their recording times;
wherein the file information output module may be specifically configured to output the file information of each determined target recording file to a designated terminal, so that the designated terminal displays the target recording files in that order.
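The quantity check and sorting can be sketched as below. The record layout (a dictionary with `name` and `recorded_at` fields), the function name `order_by_recording_time`, and the choice of earliest-first ordering are assumptions; the source states only that the files are ordered by recording time.

```python
from datetime import datetime
from typing import Dict, List


def order_by_recording_time(recordings: List[Dict]) -> List[Dict]:
    """Order target recording files by recording time, earliest first (assumed)."""
    # With zero or one file there is nothing to sort.
    if len(recordings) <= 1:
        return list(recordings)
    return sorted(recordings, key=lambda rec: rec["recorded_at"])


# Hypothetical sample data.
targets = [
    {"name": "call_007.wav", "recorded_at": datetime(2018, 8, 3, 14, 30)},
    {"name": "call_001.wav", "recorded_at": datetime(2018, 8, 1, 9, 15)},
    {"name": "call_004.wav", "recorded_at": datetime(2018, 8, 2, 16, 0)},
]
ordered_names = [rec["name"] for rec in order_by_recording_time(targets)]
```

The designated terminal would then display the files in the order of `ordered_names`.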
For specific limitations of the audio keyword quality inspection apparatus, refer to the limitations of the audio keyword quality inspection method above; they are not repeated here. Each module of the apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor of a computer device in hardware form, or stored in a memory of the computer device in software form, so that the processor can invoke them and perform the operations corresponding to each module.
In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 7. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The database stores the data involved in the audio keyword quality inspection method. The network interface is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement an audio keyword quality inspection method.
In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor. When the processor executes the computer-readable instructions, it implements the steps of the audio keyword quality inspection method in the above embodiments, for example steps 101 to 103 shown in FIG. 2. Alternatively, when the processor executes the computer-readable instructions, it implements the functions of the modules/units of the audio keyword quality inspection apparatus in the above embodiments, for example the functions of modules 701 to 703 shown in FIG. 6. To avoid repetition, details are not repeated here.
In one embodiment, one or more non-volatile readable storage media storing computer-readable instructions are provided. When executed by one or more processors, the computer-readable instructions cause the one or more processors to implement the steps of the audio keyword quality inspection method in the above method embodiments, or to implement the functions of the modules/units of the audio keyword quality inspection apparatus in the above apparatus embodiments. To avoid repetition, details are not repeated here.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be carried out by computer-readable instructions directing the relevant hardware. The computer-readable instructions may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, a database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Those skilled in the art will clearly understand that, for convenience and brevity of description, the above division into functional units and modules is given only as an example. In practical applications, the above functions may be allocated to different functional units or modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above.
The above embodiments are intended only to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the scope of protection of this application.

Claims (20)

  1. An audio keyword quality inspection method, comprising:
    determining a target keyword currently to be inspected;
    determining, according to a pre-established audio-time correspondence, target recording files that contain the target keyword among recording files to be inspected, and determining audio time points at which the target keyword occurs, wherein the audio-time correspondence records a correspondence among keywords requiring quality inspection, keyword files, and keyword time points, a keyword file being a recording file to be inspected whose recognized text contains a keyword requiring quality inspection, a keyword time point being a time point at which such a keyword is played in the audio of the recording file, and an audio time point being a time point at which the target keyword is played in a target recording file; and
    outputting file information of each determined target recording file, and marking the audio time points determined in each target recording file.
  2. The audio keyword quality inspection method according to claim 1, wherein the audio-time correspondence is established in advance by the following steps:
    acquiring the recording files to be inspected;
    performing speech recognition on each recording file to be inspected to obtain recognized text corresponding to each recording file, while recording the time points at which the recognized text is played in the corresponding recording file;
    comparing each recognized text with preset keywords requiring quality inspection to determine the keyword files and the keyword time points; and
    establishing the audio-time correspondence according to the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points.
  3. The audio keyword quality inspection method according to claim 1, further comprising, after the target recording files are determined:
    selecting one of the target recording files as the current recording file to be played;
    determining a playback start time point of the current recording file according to the audio time points corresponding to the current recording file; and
    playing the current recording file starting from the playback start time point.
  4. The audio keyword quality inspection method according to claim 3, wherein determining the playback start time point of the current recording file according to the audio time points corresponding to the current recording file comprises:
    determining the earliest of the audio time points corresponding to the current recording file as a first time point; and
    determining a playback time point in the current recording file that precedes the first time point as the playback start time point of the current recording file.
  5. The audio keyword quality inspection method according to any one of claims 1 to 4, further comprising, after the target recording files are determined:
    if the number of determined target recording files is greater than 1, obtaining a recording time of each target recording file; and
    determining an order of the target recording files according to their recording times;
    wherein outputting the file information of each determined target recording file specifically comprises: outputting the file information of each determined target recording file to a designated terminal, so that the designated terminal displays the target recording files in that order.
  6. An audio keyword quality inspection apparatus, comprising:
    a keyword determination module, configured to determine a target keyword currently to be inspected;
    a recording file determination module, configured to determine, according to a pre-established audio-time correspondence, target recording files that contain the target keyword among recording files to be inspected, and to determine audio time points at which the target keyword occurs, wherein the audio-time correspondence records a correspondence among keywords requiring quality inspection, keyword files, and keyword time points, a keyword file being a recording file to be inspected whose recognized text contains a keyword requiring quality inspection, a keyword time point being a time point at which such a keyword is played in the audio of the recording file, and an audio time point being a time point at which the target keyword is played in a target recording file; and
    a file information output module, configured to output file information of each determined target recording file and to mark the audio time points determined in each target recording file.
  7. The audio keyword quality inspection apparatus according to claim 6, wherein the audio-time correspondence is established in advance by the following modules:
    a file acquisition module, configured to acquire the recording files to be inspected;
    a speech recognition module, configured to perform speech recognition on each recording file to be inspected to obtain recognized text corresponding to each recording file, while recording the time points at which the recognized text is played in the corresponding recording file;
    a comparison module, configured to compare each recognized text with preset keywords requiring quality inspection to determine the keyword files and the keyword time points; and
    a relationship establishment module, configured to establish the audio-time correspondence according to the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points.
  8. The audio keyword quality inspection apparatus according to claim 6, further comprising:
    a recording file selection module, configured to select one of the target recording files as the current recording file to be played;
    a playback time point determination module, configured to determine a playback start time point of the current recording file according to the audio time points corresponding to the current recording file; and
    a recording file playback module, configured to play the current recording file starting from the playback start time point.
  9. The audio keyword quality inspection apparatus according to claim 8, wherein the playback time point determination module comprises:
    an earliest time determination unit, configured to determine the earliest of the audio time points corresponding to the current recording file as a first time point; and
    a playback start time determination unit, configured to determine a playback time point in the current recording file that precedes the first time point as the playback start time point of the current recording file.
  10. The audio keyword quality inspection apparatus according to any one of claims 6 to 9, further comprising:
    a quantity determination unit, configured to obtain a recording time of each target recording file if the number of determined target recording files is greater than 1; and
    a sorting unit, configured to determine an order of the target recording files according to their recording times;
    wherein the file information output module is specifically configured to output the file information of each determined target recording file to a designated terminal, so that the designated terminal displays the target recording files in that order.
  11. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor, when executing the computer-readable instructions, implements the following steps:
    determining a target keyword currently to be inspected;
    determining, according to a pre-established audio-time correspondence, target recording files that contain the target keyword among recording files to be inspected, and determining audio time points at which the target keyword occurs, wherein the audio-time correspondence records a correspondence among keywords requiring quality inspection, keyword files, and keyword time points, a keyword file being a recording file to be inspected whose recognized text contains a keyword requiring quality inspection, a keyword time point being a time point at which such a keyword is played in the audio of the recording file, and an audio time point being a time point at which the target keyword is played in a target recording file; and
    outputting file information of each determined target recording file, and marking the audio time points determined in each target recording file.
  12. The computer device according to claim 11, wherein the audio-time correspondence is established in advance by the following steps:
    acquiring the recording files to be inspected;
    performing speech recognition on each recording file to be inspected to obtain recognized text corresponding to each recording file, while recording the time points at which the recognized text is played in the corresponding recording file;
    comparing each recognized text with preset keywords requiring quality inspection to determine the keyword files and the keyword time points; and
    establishing the audio-time correspondence according to the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points.
  13. The computer device according to claim 11, wherein, after the target recording files are determined, the processor, when executing the computer-readable instructions, further implements the following steps:
    selecting one of the target recording files as the current recording file to be played;
    determining a playback start time point of the current recording file according to the audio time points corresponding to the current recording file; and
    playing the current recording file starting from the playback start time point.
  14. The computer device according to claim 13, wherein determining the playback start time point of the current recording file according to the audio time points corresponding to the current recording file comprises:
    determining the earliest of the audio time points corresponding to the current recording file as a first time point; and
    determining a playback time point in the current recording file that precedes the first time point as the playback start time point of the current recording file.
  15. The computer device according to any one of claims 11 to 14, wherein, after the target recording files are determined, the processor, when executing the computer-readable instructions, further implements the following steps:
    if the number of determined target recording files is greater than 1, obtaining a recording time of each target recording file; and
    determining an order of the target recording files according to their recording times;
    wherein outputting the file information of each determined target recording file specifically comprises: outputting the file information of each determined target recording file to a designated terminal, so that the designated terminal displays the target recording files in that order.
  16. One or more non-volatile readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
    determining a target keyword currently to be inspected;
    determining, according to a pre-established audio-time correspondence, target recording files that contain the target keyword among recording files to be inspected, and determining audio time points at which the target keyword occurs, wherein the audio-time correspondence records a correspondence among keywords requiring quality inspection, keyword files, and keyword time points, a keyword file being a recording file to be inspected whose recognized text contains a keyword requiring quality inspection, a keyword time point being a time point at which such a keyword is played in the audio of the recording file, and an audio time point being a time point at which the target keyword is played in a target recording file; and
    outputting file information of each determined target recording file, and marking the audio time points determined in each target recording file.
  17. The non-volatile readable storage medium according to claim 16, wherein the audio-time correspondence is established in advance by the following steps:
    acquiring the recording files to be inspected;
    performing speech recognition on each recording file to be inspected to obtain recognized text corresponding to each recording file, while recording the time points at which the recognized text is played in the corresponding recording file;
    comparing each recognized text with preset keywords requiring quality inspection to determine the keyword files and the keyword time points; and
    establishing the audio-time correspondence according to the correspondence among the keywords requiring quality inspection, the keyword files, and the keyword time points.
  18. The non-volatile readable storage medium according to claim 16, wherein, after the target recording files are determined, the computer-readable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    selecting one of the target recording files as the current recording file to be played;
    determining a playback start time point of the current recording file according to the audio time points corresponding to the current recording file; and
    playing the current recording file starting from the playback start time point.
  19. The non-volatile readable storage medium according to claim 18, wherein determining the playback start time point of the current recording file according to the audio time points corresponding to the current recording file comprises:
    determining the earliest of the audio time points corresponding to the current recording file as a first time point; and
    determining a playback time point in the current recording file that precedes the first time point as the playback start time point of the current recording file.
  20. The non-volatile readable storage medium according to any one of claims 16 to 19, wherein, after each of the target recording files is determined, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further perform the following steps:
    if the number of determined target recording files is greater than 1, obtaining the recording time of each target recording file;
    determining the ordering of the target recording files in chronological order of recording time;
    wherein outputting the file information of each determined target recording file specifically comprises: outputting the file information of each determined target recording file to a designated terminal, so that the designated terminal displays the target recording files according to the ordering.
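The ordering step can be sketched in Python as below. This is an illustrative sketch under stated assumptions: the `RecordingInfo` type, its field names, and `order_for_display` are hypothetical, and the recording time is assumed to be available as a `datetime`.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass(frozen=True)
class RecordingInfo:
    """Hypothetical file information for one target recording file."""
    file_name: str
    recorded_at: datetime


def order_for_display(targets: List[RecordingInfo]) -> List[RecordingInfo]:
    """Order target recording files chronologically by recording time."""
    # With zero or one file there is nothing to sort.
    if len(targets) <= 1:
        return list(targets)
    # sorted() is stable, so files with equal times keep their input order.
    return sorted(targets, key=lambda r: r.recorded_at)


targets = [
    RecordingInfo("call_b.wav", datetime(2018, 8, 15, 14, 30)),
    RecordingInfo("call_a.wav", datetime(2018, 8, 15, 9, 0)),
    RecordingInfo("call_c.wav", datetime(2018, 8, 16, 8, 15)),
]
ordered_names = [r.file_name for r in order_for_display(targets)]
```

The ordered list would then be sent to the designated terminal, which displays the files in that sequence.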
PCT/CN2018/123067 2018-08-15 2018-12-24 Audio keyword quality inspection method and apparatus, computer device, and storage medium WO2020034538A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810927508.4 2018-08-15
CN201810927508.4A CN109241334A (en) 2018-08-15 2018-08-15 Audio keyword quality detecting method, device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2020034538A1 (en) 2020-02-20

Family

ID=65070732

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/123067 WO2020034538A1 (en) 2018-08-15 2018-12-24 Audio keyword quality inspection method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN109241334A (en)
WO (1) WO2020034538A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10811618B2 (en) 2016-12-19 2020-10-20 Universal Display Corporation Organic electroluminescent materials and devices
CN111935552A (en) * 2020-07-30 2020-11-13 安徽鸿程光电有限公司 Information labeling method, device, equipment and medium
US10910570B2 (en) 2017-04-28 2021-02-02 Universal Display Corporation Organic electroluminescent materials and devices
US10934293B2 (en) 2017-05-18 2021-03-02 Universal Display Corporation Organic electroluminescent materials and devices

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112101311A (en) * 2020-11-16 2020-12-18 深圳壹账通智能科技有限公司 Double-recording quality inspection method and device based on artificial intelligence, computer equipment and medium
CN112509609B (en) * 2020-12-16 2022-06-10 北京乐学帮网络技术有限公司 Audio processing method and device, electronic equipment and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107424640A (en) * 2017-07-27 2017-12-01 上海与德科技有限公司 A kind of audio frequency playing method and device
CN107610718A (en) * 2017-08-29 2018-01-19 深圳市买买提乐购金融服务有限公司 A kind of method and device that voice document content is marked
CN108287930A (en) * 2018-03-08 2018-07-17 珠海格力电器股份有限公司 A kind of recording searching method, device and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5311348B2 (en) * 2009-09-03 2013-10-09 株式会社eVOICE Speech keyword collation system in speech data, method thereof, and speech keyword collation program in speech data
CN105141787A (en) * 2015-08-14 2015-12-09 上海银天下科技有限公司 Service record compliance checking method and device
GB2549117B (en) * 2016-04-05 2021-01-06 Intelligent Voice Ltd A searchable media player

Also Published As

Publication number Publication date
CN109241334A (en) 2019-01-18

Similar Documents

Publication Publication Date Title
WO2020034538A1 (en) Audio keyword quality inspection method and apparatus, computer device, and storage medium
US10824814B2 (en) Generalized phrases in automatic speech recognition systems
US10319366B2 (en) Predicting recognition quality of a phrase in automatic speech recognition systems
CN108962282B (en) Voice detection analysis method and device, computer equipment and storage medium
US10872068B2 (en) Systems and methods for providing searchable customer call indexes
US7995732B2 (en) Managing audio in a multi-source audio environment
US11289077B2 (en) Systems and methods for speech analytics and phrase spotting using phoneme sequences
WO2020077841A1 (en) Voiceprint recognition-based customer service method, device, computer device, and storage medium
CN105187674B (en) Compliance checking method and device for service recording
US9607615B2 (en) Classifying spoken content in a teleconference
US11157916B2 (en) Systems and methods for detecting complaint interactions
WO2022142031A1 (en) Invalid call determination method and apparatus, computer device, and storage medium
CN110598008A (en) Data quality inspection method and device for recorded data and storage medium
US10447848B2 (en) System and method for reliable call recording testing and proprietary customer information retrieval
CN111507698A (en) Processing method and device for transferring accounts, computing equipment and medium
CN110298543B (en) Service tracking method, device, computer equipment and storage medium
US20160072945A1 (en) Call recording test suite
CN113645357B (en) Call quality inspection method, device, computer equipment and computer readable storage medium
US20220309413A1 (en) Method and apparatus for automated workflow guidance to an agent in a call center environment
KR20220122355A (en) Contract management system and method for managing non-face-to-face contracts
US12033163B2 (en) Systems and methods for detecting complaint interactions
US20230186897A1 (en) Searching calls based on contextual similarity among calls
CN114157765A (en) Voice quality inspection method and device, electronic equipment and storage medium
CN117354422A (en) Service auxiliary processing method, device, computer equipment and storage medium
CN116701649A (en) Knowledge graph construction method, device, equipment and medium based on artificial intelligence

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18930234

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18930234

Country of ref document: EP

Kind code of ref document: A1