CN109637536A - Method and device for automatically identifying semantic accuracy

Info

Publication number: CN109637536A
Application number: CN201811611680.5A
Authority: CN
Inventors: 林婷; 吴有宝
Original assignee: AI Speech Ltd
Current assignee: Sipic Technology Co Ltd
Priority date: 2018-12-27
Filing date: 2018-12-27
Publication date: 2019-04-16
Anticipated expiration: 2038-12-27
Also published as: CN109637536B

Abstract

The present invention discloses a method for automatically identifying semantic accuracy, comprising the following steps: obtaining a voice instruction; recognizing and parsing the voice instruction to obtain a semantic parsing result; checking the semantic parsing result to determine its accuracy; and outputting a detection result according to the accuracy of the semantic parsing result. The invention also discloses a device for automatically identifying semantic accuracy. The disclosed method and device can recognize voice information in batches and output the corresponding semantic content. With a single one-key operation, the domain, slot, control value, and other fields corresponding to each corpus entry in the voice information can be seen at a glance, so the accuracy of semantic recognition can be tallied efficiently and the performance of semantic recognition analyzed, improving the working efficiency of voice integration clients and the accuracy of adaptation.

Description

Method and device for automatically identifying semantic accuracy

Technical field

The present invention relates to the field of voice processing technology, and in particular to a method and device for automatically identifying semantic accuracy.

Background technique

Speech recognition is a technology that identifies the corresponding text content from a speech waveform, and is one of the important technologies in the field of artificial intelligence. Current speech recognition methods generally rely on an acoustic model, a pronunciation dictionary, and a language model. The acoustic model is trained with a deep neural network, the language model is usually a statistical language model, and the pronunciation dictionary records the correspondence between words and phonemes, serving as the link between the acoustic model and the language model.

With the widespread use of voice interaction technology, voice-based interactive devices are increasingly popular with users. The key to such a device completing a voice interaction scenario is using speech recognition to recognize a voice instruction and respond correctly, so the accuracy of speech recognition is a critical factor in the performance and user experience of a voice interaction device.

To provide users with high-quality voice interaction products, a product undergoes voice testing before launch to measure how accurately it recognizes and responds to voice instructions. In the existing approach, testers read out voice instructions manually and judge the results one by one, which is very inefficient and costly.

Summary of the invention

To solve the above problem, one object of the present invention is to provide a tool capable of automatically identifying semantic accuracy, so as to meet the need to test large batches of speech and improve testing efficiency.

A further object of the present invention is, on top of automated batch processing of speech, to automatically judge the quality of that processing, such as its accuracy, thereby further improving testing efficiency and reducing testing cost.

In addition, the present invention aims to simplify the implementation of the tool so that it is easy to realize.

Based on this, according to a first aspect of the invention, a method for automatically identifying semantic accuracy is provided, comprising the following steps:

obtaining a voice instruction;

recognizing and parsing the voice instruction to obtain a semantic parsing result;

checking the semantic parsing result to determine the accuracy of the semantic parsing result;

outputting a detection result according to the accuracy of the semantic parsing result.

According to a second aspect of the invention, a device for automatically identifying semantic accuracy is provided, comprising:

a voice acquisition module, configured to obtain a voice instruction;

a voice parsing module, configured to recognize and parse the voice instruction to obtain a semantic parsing result;

a checking module, configured to check the semantic parsing result and determine the accuracy of the semantic parsing result;

a result presentation module, configured to output a detection result according to the accuracy of the semantic parsing result.

According to a third aspect of the invention, an electronic device is provided, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of the above method.

According to a fourth aspect of the invention, a storage medium is provided on which a computer program is stored, the program implementing the steps of the above method when executed by a processor.

The device and method provided by the invention can recognize voice information in batches and present the accuracy of the corresponding semantic content that was recognized. With a single one-key operation, the accuracy detection results for the semantic parsing results can be seen at a glance, so semantic recognition accuracy can be tallied efficiently and semantic recognition performance analyzed, improving the working efficiency of voice integration clients and the accuracy of adaptation.

Brief description of the drawings

Fig. 1 is a flowchart of the method for automatically identifying semantic accuracy according to an embodiment of the present invention;

Fig. 2 is a block diagram of the device for automatically identifying semantic accuracy according to an embodiment of the present invention;

Fig. 3 is a block diagram of an electronic device according to an embodiment of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently some, rather than all, of the embodiments of the invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the invention.

It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another.

The present invention may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments, where tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

In the present invention, terms such as "module", "device", and "system" refer to computer-related entities, such as hardware, a combination of hardware and software, software, or software in execution. Specifically, an element may be, but is not limited to, a process running on a processor, a processor, an object, an executable, a thread of execution, a program, and/or a computer. An application or script running on a server, or the server itself, may also be an element. One or more elements may reside within a process and/or thread of execution, and an element may be localized on one computer and/or distributed between two or more computers, and may be run from various computer-readable media. Elements may also communicate through local and/or remote processes according to a signal having one or more data packets, for example a signal carrying data from one element that interacts with another element in a local system, in a distributed system, and/or across a network such as the Internet with other systems.

Finally, it should be noted that relational terms such as "first" and "second" are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include" and "comprise" cover not only the listed elements but also other elements not explicitly listed, or elements inherent to the process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

The method for automatically identifying semantic accuracy of the embodiments of the present invention can be applied to any terminal device equipped with a voice function, for example a smartphone, tablet computer, or smart home device; the invention is not limited in this regard.

The invention will now be described in further detail with reference to the accompanying drawings.

Fig. 1 schematically shows a flowchart of a method for automatically identifying semantic accuracy according to an embodiment of the present invention. As shown in Fig. 1, the embodiment includes the following steps:

Step S101: obtain a voice instruction.

By way of example, the voice instruction may be synthesized from a configured corpus. Specifically, a corpus configuration file is obtained first. The corpus configuration file stores the specific corpus information and is kept at a specified path, so it can be obtained by reading the corresponding file at that path. Preferably, the corpus configuration file is an Excel file, and a Python script reads the Excel-format corpus configuration file to obtain the corpus content. The obtained corpus content is then synthesized into TTS speech by speech synthesis technology, that is, into voice instructions.
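The corpus-reading part of this step can be sketched as follows. This is only an illustration under stated assumptions: the patent reads an Excel file with a Python script but does not name a library, so the sketch reads CSV text with the standard library as a stand-in. The column name `corpus`, the `CorpusRow` type, and the `read_corpus` function are hypothetical names introduced here.

```python
import csv
import io
from dataclasses import dataclass


@dataclass
class CorpusRow:
    row_number: int  # 1-based row in the configuration file (row 1 is the header)
    text: str        # corpus sentence to be synthesized into a voice instruction


def read_corpus(handle, column="corpus"):
    """Read non-empty corpus entries from a CSV stand-in for the Excel file."""
    reader = csv.reader(handle)
    header = next(reader)
    index = header.index(column)  # raises ValueError if the column is missing
    rows = []
    # Data rows start at 2 because row 1 holds the column headers.
    for row_number, cells in enumerate(reader, start=2):
        text = cells[index].strip() if len(cells) > index else ""
        if text:  # blank rows are skipped but still counted
            rows.append(CorpusRow(row_number, text))
    return rows


# Hypothetical sample content, including one blank row.
sample = io.StringIO(
    "corpus,note\nturn on the radio,music\n\nnavigate home,navigation\n"
)
corpus_rows = read_corpus(sample)
```

Keeping the row number with each corpus entry is a design choice made for this sketch: it lets a later step write results or highlights back to the same row.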

In some embodiments, the voice instruction may instead be obtained directly as an audio file. In this case, the audio files corresponding to the voice instructions are stored under a specified path, and when processing starts, the audio files of all voice instructions are obtained directly from that path.

Step S102: recognize and parse the voice instruction to obtain a semantic parsing result.

The voice instruction may be recognized and parsed using existing speech recognition and semantic parsing techniques: each voice instruction is converted to text by speech recognition, and the recognized text is semantically parsed to obtain the final semantic parsing result. In a preferred embodiment, the semantic parsing result includes a domain, a category (slot), a control value, and the like. The domain identifies the business area the voice instruction belongs to, such as navigation or music; the category identifies the object the voice instruction refers to; and the control value identifies the action the voice instruction requests. For example, for the voice instruction "turn on the radio", the parsing result is "MUSIC (domain)/Radio (object)/PLAY (control value)".
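One way to hold such a result in code is a small record with the three fields named above. The sketch below is an assumption about representation only: the `SemanticResult` class and the slash-separated string format are hypothetical, modeled on the "MUSIC/Radio/PLAY" example, and say nothing about how the actual parser produces its output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticResult:
    domain: str         # business area, e.g. MUSIC or navigation
    slot: str           # category: the object the instruction refers to
    control_value: str  # action the instruction requests

    @classmethod
    def from_string(cls, text):
        """Parse a 'DOMAIN/Object/ACTION' string into its three fields."""
        parts = [part.strip() for part in text.split("/")]
        if len(parts) != 3 or not all(parts):
            raise ValueError(f"expected 'domain/slot/control', got {text!r}")
        return cls(*parts)


# The example from the text: the instruction "turn on the radio".
radio = SemanticResult.from_string("MUSIC/Radio/PLAY")
```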

Step S103: check the semantic parsing result and determine its accuracy. The accuracy of the semantic parsing result is judged mainly by comparison with a standard corpus. Depending on how the voice instruction was obtained, there are two targeted implementation examples:

One: for the semantic parsing result of a voice instruction synthesized from the corpus configuration file, the corpus content is set in the corpus configuration file when that file is configured. The corpus in the corpus configuration file can then serve as the standard column, that is, the standard corpus. The semantic parsing result is compared with the corresponding corpus in the corpus configuration file, and the accuracy is determined from the comparison result.

Two: for a voice instruction obtained directly as an audio file, the semantic parsing result is checked against a preconfigured standard corpus. Specifically, a standard corpus file is first configured for the audio files. In the standard corpus file, the identifier of each audio file (such as its name, path, or ID) is bound to the standard corpus content, forming a mapping file. After the semantic parsing result is obtained, it is compared with the standard corpus content in the standard corpus file, and the accuracy is determined from the comparison result.
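The comparison in the second example can be sketched as a lookup against the mapping from audio identifier to standard content. The text does not say how the comparison is performed, so exact field-by-field matching is an assumption here, as are the `check_results` function, the file names, and the `NAVI`, `Home`, and `GO` values.

```python
def check_results(parsed, standard):
    """Compare parsed results with the standard corpus, keyed by audio identifier.

    Both arguments map an identifier (name, path, or ID) to a
    (domain, slot, control value) tuple. Returns identifier -> bool.
    """
    verdicts = {}
    for identifier, expected in standard.items():
        actual = parsed.get(identifier)
        # A missing result counts as wrong; matching is exact, field by field.
        verdicts[identifier] = (
            actual is not None and tuple(actual) == tuple(expected)
        )
    return verdicts


# Hypothetical mapping file content and parser output.
standard = {
    "001.wav": ("MUSIC", "Radio", "PLAY"),
    "002.wav": ("NAVI", "Home", "GO"),
}
parsed = {
    "001.wav": ("MUSIC", "Radio", "PLAY"),
    "002.wav": ("MUSIC", "Home", "PLAY"),
}
verdicts = check_results(parsed, standard)
```

The same function would cover the first example if the row of the corpus configuration file is used as the identifier.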

Step S104: output a detection result according to the accuracy of the semantic parsing result.

By way of example, the detection result may be output by displaying the voice instructions whose detection result is wrong, or by highlighting the file row corresponding to each such voice instruction. Taking the latter as an example, the implementation differs with how the voice instruction was obtained:

Where the voice instruction was obtained from the corpus configuration file, the background color of the corpus row corresponding to the voice instruction (that is, the row containing the corresponding corpus) is highlighted; a Python script can modify the background color of that row in the corpus configuration file according to the accuracy of the semantic parsing result. For example, if the semantic result of a voice instruction has low accuracy, the Python script calls the corresponding interface function to change the background color of that low-accuracy corpus row in the Excel corpus configuration file to red. Reading the file and modifying its background color with a Python script is simple to implement and convenient to develop, with relatively high execution and response efficiency; compared with other implementations, it is easier to realize and simplifies development.

Where the voice instruction was obtained from an audio file, the background color of the corresponding row in its standard corpus file is modified instead.
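The decision of which rows to highlight can be separated from the spreadsheet write itself. The sketch below covers only that decision and uses the standard library; the patent does not name the "interface function" or library used to change the cell color, so applying the fill (for example with a spreadsheet library such as openpyxl) is left out. The `rows_to_highlight` function and the ARGB color string are assumptions made for illustration.

```python
RED = "FFFF0000"  # ARGB hex for red, the highlight color named in the text


def rows_to_highlight(verdicts_by_row, color=RED):
    """Map each failing row number to the background color it should receive."""
    return {
        row: color
        for row, correct in sorted(verdicts_by_row.items())
        if not correct
    }


# Hypothetical verdicts: row 2 parsed correctly, rows 3 and 4 did not.
highlight = rows_to_highlight({2: True, 3: False, 4: False})
```

Because the function takes row numbers rather than file handles, the same plan can be applied to either the corpus configuration file or the standard corpus file.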

With the above steps, voice instructions can be imported in batches for recognition testing with a single operation, the accuracy of the recognition results is analyzed automatically, and that accuracy is presented visually in a simple, intuitive way. The method of this embodiment can therefore automatically recognize large batches of voice instructions, compile semantic recognition accuracy statistics efficiently, and analyze semantic recognition performance, improving the working efficiency of voice integration clients and the accuracy of adaptation.
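The accuracy statistics mentioned here can be illustrated with a short tally over per-instruction verdicts. The patent does not define how the statistics are computed; the overall ratio and the per-domain breakdown below are assumptions, and `accuracy_report` and the sample records are hypothetical.

```python
from collections import defaultdict


def accuracy_report(records):
    """Summarize (domain, is_correct) records into overall and per-domain accuracy."""
    totals = defaultdict(lambda: [0, 0])  # domain -> [correct, total]
    for domain, is_correct in records:
        totals[domain][0] += int(is_correct)
        totals[domain][1] += 1
    correct = sum(c for c, _ in totals.values())
    total = sum(t for _, t in totals.values())
    return {
        # An empty batch reports 0.0 rather than dividing by zero.
        "overall": correct / total if total else 0.0,
        "by_domain": {d: c / t for d, (c, t) in totals.items()},
    }


# Hypothetical batch of four checked instructions.
report = accuracy_report(
    [("MUSIC", True), ("MUSIC", False), ("NAVI", True), ("NAVI", True)]
)
```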

In a preferred embodiment, in addition to outputting the detection result, the semantic parsing result itself may be output directly in the corresponding corpus row, so that the user can conveniently verify whether the speech processing result is reasonable. This can be combined with the background color change, or the semantic parsing result can be output alone without changing the background color and left for the user to compare and confirm; the implementation that also changes the background color is clearly preferable. Because the output semantic parsing result includes the domain, category, and control value (the action corresponding to the voice instruction) of the corpus corresponding to the voice instruction, this approach is also more efficient, intuitive, and quick than traditional testing.

Fig. 2 schematically shows a block diagram of a device for automatically identifying semantic accuracy according to an embodiment of the present invention. As shown in Fig. 2, the device includes a voice acquisition module 201, a voice parsing module 202, a checking module 203, and a result presentation module 204.

The voice acquisition module 201 obtains voice instructions. It may be implemented as an audio capture device with a pickup function, or as a file reading module that reads the corpus configuration file or audio files; the latter may be implemented with a Python script.

When voice instructions are obtained by reading the corpus configuration file, the voice acquisition module 201 includes a corpus acquisition unit 2011 and a speech synthesis unit 2012. The corpus acquisition unit 2011 obtains the corpus configuration file and reads the corpus. The speech synthesis unit 2012 synthesizes the corpus that was read into voice instructions and outputs them.

The voice parsing module 202 recognizes and parses the voice instruction to obtain a semantic parsing result; it may be implemented using existing speech recognition and semantic parsing techniques.

The checking module 203 checks the semantic parsing result and determines its accuracy; for its implementation, refer to the method section above.

The result presentation module 204 outputs a detection result according to the accuracy of the semantic parsing result. The detection result is a highlight applied to the background color of the corpus row corresponding to each voice instruction whose semantic parsing result has low accuracy, for example by changing it to red.

In other implementation examples, the device may omit the checking module, and the result presentation module 204 directly outputs and displays the semantic parsing result. The displayed semantic parsing result includes the domain, category, and control value (the action corresponding to the voice instruction) of the voice instruction, each shown in its own column of the row containing the standard corpus corresponding to the voice instruction, which makes checking and comparison convenient.

The device of this embodiment can automatically perform speech recognition on a large batch of acquired voice instructions and display parameters such as domain, object, control value, and accuracy, so semantic recognition accuracy can be tallied efficiently and semantic recognition performance analyzed, improving the working efficiency of voice integration clients and the accuracy of adaptation.

In the above embodiments, operations on the corpus configuration file (such as reading the corpus, writing the semantic parsing result, writing the corpus, and changing the background color) and the judgment of semantic parsing accuracy may all be implemented with Python scripts, while converting the input corpus into voice instructions and generating the semantic parsing result may be implemented in C, which simplifies development.

In some embodiments, an embodiment of the present invention provides a non-volatile computer-readable storage medium storing one or more programs that include executable instructions. The executable instructions can be read and executed by an electronic device (including but not limited to a computer, server, or network device) to perform any of the above methods for automatically identifying semantic accuracy.

In some embodiments, an embodiment of the present invention also provides a computer program product comprising a computer program stored on a non-volatile computer-readable storage medium. The computer program includes program instructions that, when executed by a computer, cause the computer to perform any of the above methods for automatically identifying semantic accuracy.

In some embodiments, an embodiment of the present invention also provides an electronic device, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the method for automatically identifying semantic accuracy.

In some embodiments, an embodiment of the present invention also provides a storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method for automatically identifying semantic accuracy.

The device for automatically identifying semantic accuracy of the above embodiments can be used to perform the method for automatically identifying semantic accuracy of the embodiments of the invention, and accordingly achieves the technical effects of that method, which are not repeated here. In the embodiments of the invention, the relevant functional modules may be implemented by a hardware processor.

Fig. 3 is a schematic diagram of the hardware structure of an electronic device that performs the method for automatically identifying semantic accuracy, provided by an embodiment of the invention. As shown in Fig. 3, the device includes:

one or more processors 410 and a memory 420; Fig. 3 takes one processor 410 as an example.

The device that performs the method for automatically identifying semantic accuracy may further include an input device 430 and an output device 440.

The processor 410, memory 420, input device 430, and output device 440 may be connected by a bus or in other ways; Fig. 3 takes a bus connection as an example.

As a non-volatile computer-readable storage medium, the memory 420 can store non-volatile software programs, non-volatile computer-executable programs, and modules, such as the program instructions/modules corresponding to the method for automatically identifying semantic accuracy in the embodiments of the present application. By running the non-volatile software programs, instructions, and modules stored in the memory 420, the processor 410 executes the various functional applications and data processing of the server, that is, implements the method for automatically identifying semantic accuracy of the above method embodiments.

The memory 420 may include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required by at least one function; the data storage area can store data created through use of the device for automatically identifying semantic accuracy, and the like. In addition, the memory 420 may include high-speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 420 optionally includes memory located remotely from the processor 410, and such remote memory can be connected over a network to the device for automatically identifying semantic accuracy. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The input device 430 can receive input numeric or character information and generate signals related to user settings and function control of the device for automatically identifying semantic accuracy. The output device 440 may include a display device such as a display screen.

The one or more modules are stored in the memory 420 and, when executed by the one or more processors 410, perform the method for automatically identifying semantic accuracy in any of the above method embodiments.

The above product can perform the method provided by the embodiments of the present application and has the functional modules and beneficial effects corresponding to that method. For technical details not described in this embodiment, refer to the method provided by the embodiments of the present application.

The electronic device of the embodiments of the present application exists in a variety of forms, including but not limited to:

(1) Mobile communication devices: these devices are characterized by mobile communication capability, with voice and data communication as their main purpose. Such terminals include smartphones (such as the iPhone), multimedia phones, feature phones, and low-end phones.

(2) Ultra-mobile personal computer devices: these devices fall within the scope of personal computers, have computing and processing capability, and generally also support mobile Internet access. Such terminals include PDA, MID, and UMPC devices, such as the iPad.

(3) In-vehicle devices: these devices are used in vehicle driving and can connect to other auxiliary systems of the vehicle.

(4) Servers: devices that provide computing services. A server consists of a processor, hard disk, memory, system bus, and so on; its architecture resembles that of a general-purpose computer, but because it must provide highly reliable service, it has higher requirements for processing power, stability, reliability, security, scalability, and manageability.

(5) Other electronic devices with data interaction capability.

The device embodiments described above are merely illustrative. Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software plus a general-purpose hardware platform, and of course also by hardware. On this understanding, the above technical solution, in essence or in the part that contributes over the related art, can be embodied as a software product. The computer software product can be stored in a computer-readable storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, server, or network device) to perform the methods described in the embodiments or in certain parts of the embodiments.

Finally, it should be noted that the above embodiments only illustrate the technical solutions of the present application and do not limit them. Although the application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, or some of their technical features replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the application.

Claims

1. A method for automatically identifying semantic accuracy, characterized by comprising the following steps:

obtaining a voice instruction;

outputting a detection result according to the accuracy of a semantic parsing result.

2. The method according to claim 1, wherein obtaining a voice instruction comprises:

obtaining a corpus configuration file and reading the corpus;

synthesizing the corpus that was read into a voice instruction and outputting it;

and outputting a detection result according to the accuracy of the semantic parsing result comprises:

modifying the background color of the corresponding corpus row in the corpus configuration file according to the accuracy of the semantic parsing result.

3. The method according to claim 1, wherein obtaining a voice instruction comprises:

obtaining an audio file and outputting it.

4. The method according to claim 2, characterized in that checking the semantic parsing result and determining the accuracy of the semantic parsing result comprises:

comparing the semantic parsing result with the corresponding corpus in the corpus configuration file, and determining the accuracy from the comparison result.

5. The method according to claim 3, characterized in that checking the semantic parsing result and determining the accuracy of the semantic parsing result comprises:

configuring a standard corpus file for the audio file;

comparing the semantic parsing result with the semantic content in the standard corpus file, and determining the accuracy from the comparison result.

6. The method according to claim 2, characterized in that the corpus configuration file is an Excel file, and both obtaining the corpus configuration file and reading the corpus, and modifying the background color of the corresponding corpus row in the corpus configuration file according to the accuracy of the semantic parsing result, are implemented by a Python script.

7. A device for automatically identifying semantic accuracy, characterized by comprising:

a voice acquisition module, configured to obtain a voice instruction;

a result presentation module, configured to output a detection result according to the accuracy of a semantic parsing result.

8. The device according to claim 7, characterized in that the voice acquisition module comprises:

a corpus acquisition unit, configured to obtain a corpus configuration file and read the corpus;

a speech synthesis unit, configured to synthesize the corpus that was read into a voice instruction and output it.

9. An electronic device, comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the steps of the method of any one of claims 1-6.

10. A storage medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the steps of the method of any one of claims 1-6.