WO2021051872A1 - Entity recognition method, apparatus, device, and computer-readable storage medium - Google Patents

Entity recognition method, apparatus, device, and computer-readable storage medium

Info

Publication number
WO2021051872A1
Authority
WO
WIPO (PCT)
Prior art keywords
entity
recognition result
recognized
entity recognition
sentence
Prior art date
Application number
PCT/CN2020/093481
Other languages
English (en)
French (fr)
Inventor
杨坤
许开河
王少军
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021051872A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Definitions

  • This application relates to the field of natural language processing technology, and in particular to an entity recognition method, device, equipment, and computer-readable storage medium.
  • Entity recognition refers to the identification of entities with specific meanings in the text, mainly including names of persons, places, organizations, proper nouns, etc.; for example, identifying named entities such as names of persons, places, cities, or car names from the text.
  • There are two approaches to entity recognition: one recognizes entities through regular matching (rule-based matching), and the other uses models.
  • Regular matching recognition relies on an entity library written in advance on the basis of rules.
  • The sentence to be recognized is compared with the entity library, and each target entity that is the same as an entity in the entity library is identified in the sentence to be recognized.
  • However, the entity library cannot exhaust all entities, so regular matching recognition cannot identify every entity; that is, it may fail to recognize entities contained in the sentence to be recognized.
  • The inventor realized that model-based entity recognition relies on an entity recognition model trained on a training corpus.
  • The sentence to be recognized is input into the entity recognition model, which recognizes and outputs the target entities included in the sentence to be recognized.
  • However, the entity recognition model is prone to entity recognition errors and inaccurate results.
  • The main purpose of this application is to provide an entity recognition method, device, equipment, and computer-readable storage medium, aiming to solve the technical problem that entity recognition results obtained with existing entity recognition technology are inaccurate.
  • This application provides an entity identification method, which includes the following steps:
  • The target entity recognition result of the sentence to be recognized is determined.
  • The present application also provides an entity identification device, the entity identification device comprising:
  • a sentence acquisition module, used to acquire the sentence to be recognized and input the sentence to be recognized into the preset entity recognition model and the preset matching recognition model respectively;
  • a result obtaining module, configured to obtain a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
  • an entity determination module, configured to determine the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
  • The present application also provides an entity identification device, the entity identification device including a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein, when the computer-readable instructions are executed by the processor, the following steps are implemented:
  • The target entity recognition result of the sentence to be recognized is determined.
  • The present application also provides a computer-readable storage medium having computer-readable instructions stored on it, and when the computer-readable instructions are executed by a processor, the following steps are implemented:
  • The target entity recognition result of the sentence to be recognized is determined.
  • The present application avoids the problems of inaccurate recognition results from the entity recognition model and incomplete recognition results from the matching recognition model, and improves the accuracy of entity recognition.
  • FIG. 1 is a schematic flowchart of the first embodiment of the entity identification method of this application.
  • FIG. 2 is a schematic flowchart of a second embodiment of an entity identification method of this application.
  • FIG. 3 is a schematic flowchart of a fourth embodiment of an entity identification method of this application.
  • FIG. 5 is a schematic diagram of the hardware structure of the entity recognition device involved in the solution of the embodiment of the present application.
  • FIG. 1 is a schematic flowchart of a first embodiment of the entity identification method of this application.
  • The embodiments of this application provide embodiments of the entity recognition method. It should be noted that although a logical sequence is shown in the flowchart, in some cases the steps shown or described can be executed in an order different from the one shown here.
  • The entity recognition method is used in entity recognition devices, servers, or terminals.
  • Terminals can include mobile terminals such as mobile phones, tablets, notebooks, palmtops, and personal digital assistants (PDAs), as well as fixed terminals such as digital TVs and desktop computers.
  • For ease of description, each embodiment of the entity recognition method is described with an entity recognition device as the execution subject, where the entity recognition device includes a preset entity recognition model and a preset matching recognition model.
  • The entity identification method includes:
  • Step S10 Obtain the sentence to be recognized, and input the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
  • Regular matching can guarantee the accuracy of entity recognition, but it succeeds only when the relevant rules are met. To avoid missed recognitions, a large number of rules must be written; yet because language can express the same thing in too many ways, rule coverage remains limited. Regular matching also has no ability to understand semantics, so some entities will still be missed. Recognizing entities with a model reduces missed recognitions, but it is limited by the quality of the training corpus and the recognition ability of the model, so some recognized entities will be wrong.
  • The embodiment of the present application therefore merges regular matching recognition with model entity recognition: the entity result obtained by regular matching recognition and the entity result obtained by model entity recognition are combined to determine the final entity recognition result.
  • When a sentence to be recognized is detected, it is obtained and input into a preset entity recognition model and a preset matching recognition model, respectively.
  • The sentence to be recognized refers to the text on which entity recognition must be performed to obtain the target named entities.
  • The entity recognition model is a model that obtains the sentence to be recognized, recognizes it, analyzes the entities it contains, and takes the named entities of the entity type to be recognized as the entity recognition result of the sentence. Before entity recognition is performed on the sentence to be recognized, the model to be trained is trained on the corpus data to be trained until it converges, generating the preset entity recognition model.
  • The matching recognition model is a model that obtains the sentence to be recognized, finds in it and outputs the entities that are the same as those in the entity library pre-written based on rules, and takes the named entities of the entity type to be recognized as the entity recognition result of the sentence.
  • Step S20 Obtain a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
  • After the entity recognition model obtains the sentence to be recognized, it first determines the entity type to be recognized in the sentence; for example, entities of the person-name, city, or place-name type may need to be recognized. The sentence is then recognized according to that entity type, and each entity found in the sentence whose type is the entity type to be recognized is taken as the first entity recognition result generated by the entity recognition model based on the sentence to be recognized.
  • After the matching recognition model obtains the sentence to be recognized, it compares the sentence with the entity library pre-written based on rules, and each entity found in the sentence that is the same as an entity in the entity library is taken as the second entity recognition result generated by the matching recognition model based on the sentence to be recognized.
  • The first entity recognition result refers to the entities that the entity recognition model finds when recognizing the sentence to be recognized, namely each entity in the sentence whose type is the entity type to be recognized.
  • The second entity recognition result refers to the entities that the matching recognition model finds by comparing the sentence to be recognized with the entity library pre-written based on rules, namely each entity in the sentence that is the same as an entity in the entity library.
  • Step S30 Determine the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
  • One implementation is to first obtain the entity type to be recognized in the sentence to be recognized, and then detect whether each entity in the first entity recognition result meets the entity rules of that entity type. For example, an entity of the person-name type is a surname plus a given name, where the surname is generally 1 to 2 characters and the given name is generally 1 to 2 characters.
  • If each entity in the first entity recognition result meets the entity rules of the entity type to be recognized, each entity in the first entity recognition result is added to the preset entity list template and output, thereby obtaining the target entity recognition result of the sentence to be recognized.
  • Each entity in the second entity recognition result is added to the preset entity list template, and the entities in the part of the first entity recognition result that does not intersect the second entity recognition result are added to the preset entity list template; each entity in the preset entity list template is then output as the target entity recognition result of the sentence to be recognized.
  • One implementation is to first detect whether the second entity recognition result includes or is equal to the first entity recognition result.
  • If it does not, the entities that meet the entity rules of the entity type to be recognized are obtained from the first entity recognition result and added to the preset entity list template. Specifically, the entities in the part of the first entity recognition result that does not intersect the second entity recognition result are obtained, it is detected whether each obtained entity meets the entity rules of the entity type to be recognized, and the entities that meet the rules are added to the preset entity list template. Each entity in the second entity recognition result is also added to the preset entity list template. Finally, each entity in the preset entity list template is output to obtain the target entity recognition result of the sentence to be recognized.
  • If it does, each entity in the second entity recognition result is added to the preset entity list template, and each entity in the preset entity list template is output to obtain the target entity recognition result of the sentence to be recognized.
  • The target entity recognition result refers to the entities contained in the sentence to be recognized that the entity recognition device obtains by recognizing the sentence with both the matching recognition model and the entity recognition model, according to the entity type to be recognized, and then fusing the recognition results of the two models.
  • The method further includes:
  • The model to be trained is trained until it converges, generating the preset entity recognition model.
  • Before the preset entity recognition model is used to perform entity recognition on the sentence to be recognized and detect the entities it contains, the model to be trained needs to be trained to generate the preset entity recognition model.
  • The corpus data to be trained, which is used for training the model to be trained, is collected; for example, multiple texts or multiple sentences are collected as the corpus data to be trained.
  • The preset entity recognition model is capable of recognizing the sentence to be recognized according to the entity type to be recognized and identifying each entity in the sentence whose type is that entity type.
  • The corpus data to be trained refers to data such as sentences, texts, or documents used to train the model to be trained.
  • On top of using the matching recognition model to identify entities accurately, an entity recognition model is added to recognize the sentence to be recognized, so as to further identify entities that may be targets but that the matching recognition model cannot recognize. This avoids the problems of inaccurate recognition results from the entity recognition model and incomplete recognition results from the matching recognition model, and improves the accuracy of entity recognition.
  • FIG. 2 is a schematic flowchart of a second embodiment of an entity identification method of this application. Based on the above-mentioned first embodiment, a second embodiment of the entity identification method of the present application is proposed, and step S30 includes:
  • Step S31 detecting whether the second entity recognition result includes or is equal to the first entity recognition result;
  • The entities in the first entity recognition result are compared one by one with the entities in the second entity recognition result to detect whether each entity in the first entity recognition result has the same entity in the second entity recognition result. If one or more entities in the first entity recognition result do not have the same entity in the second entity recognition result, it is determined that the second entity recognition result does not include and is not equal to the first entity recognition result.
  • If each entity in the first entity recognition result has the same entity in the second entity recognition result, it is further checked whether each entity in the second entity recognition result has the same entity in the first entity recognition result, and whether the number of entities in the second entity recognition result is equal to the number of entities in the first entity recognition result.
  • If each entity in the second entity recognition result has the same entity in the first entity recognition result, and the number of entities in the second entity recognition result is equal to the number of entities in the first entity recognition result, it is determined that the second entity recognition result is equal to the first entity recognition result. If one or more entities in the second entity recognition result do not have the same entity in the first entity recognition result, and the number of entities in the second entity recognition result is greater than the number of entities in the first entity recognition result, it is determined that the second entity recognition result includes the first entity recognition result.
  • For example, suppose the entities in the first entity recognition result are Li Ming, Zhang San, Li Si, and Zhao Xiaohong. If the entities in the second entity recognition result are Li Ming and Zhang San, the second entity recognition result does not include and is not equal to the first entity recognition result; if they are Li Ming, Zhang San, Li Si, Zhao Xiaohong, and Sun Xiaojie, the second entity recognition result includes the first entity recognition result; if they are Li Ming, Zhang San, Li Si, and Zhao Xiaohong, the second entity recognition result is equal to the first entity recognition result.
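  • The detection in step S31 can be sketched with set operations. This is a minimal illustration, not the application's implementation: the function name `compare_results` and the three return labels are hypothetical, and treating each result as a set (so duplicate entities collapse) is an assumption.

```python
from typing import Iterable


def compare_results(first: Iterable[str], second: Iterable[str]) -> str:
    """Classify the second result relative to the first (step S31).

    Returns "equal", "includes", or "neither" (hypothetical labels).
    """
    first_set, second_set = set(first), set(second)
    # Some entity of the first result has no counterpart in the second.
    if not first_set <= second_set:
        return "neither"
    # Every entity of the first result is in the second; compare the counts.
    if len(second_set) == len(first_set):
        return "equal"
    return "includes"


first = ["Li Ming", "Zhang San", "Li Si", "Zhao Xiaohong"]
print(compare_results(first, ["Li Ming", "Zhang San"]))   # neither
print(compare_results(first, first + ["Sun Xiaojie"]))    # includes
print(compare_results(first, list(first)))                # equal
```

  • The three calls reproduce the three cases of the Li Ming example above.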
  • Step S32 If it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, obtain the entities that meet the preset entity rules from the first entity recognition result and add them to the preset entity list template;
  • From the first entity recognition result, the first entities are obtained, that is, the entities in the part of the first entity recognition result that does not intersect the second entity recognition result.
  • The preset entity rule is acquired based on the entity type to be recognized, and it is detected whether each first entity meets the preset entity rule.
  • If a first entity complies with the preset entity rule, it is added to the preset entity list template. If a first entity does not meet the preset entity rule, it is regarded as a recognition error of the entity recognition model and is discarded.
  • For example, the first entities in the part of the first entity recognition result that does not intersect the second entity recognition result are "Li Si" and "Zhao Xiaohong".
  • If "Li Si" does not meet the preset entity rules and "Zhao Xiaohong" does, "Zhao Xiaohong" is added to the preset entity list template, and "Li Si" is regarded as a recognition error of the entity recognition model and discarded.
  • Step S33 Add each entity in the second entity recognition result to the entity list template;
  • After step S32 has added the first entities that meet the preset entity rule to the preset entity list template as entities of the target entity recognition result of the sentence to be recognized, each entity in the second entity recognition result, here "Li Ming" and "Zhang San", is also added to the preset entity list template as an entity of the target entity recognition result of the sentence to be recognized.
  • Step S34 Output each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  • If it is detected that the second entity recognition result includes or is equal to the first entity recognition result, each entity in the second entity recognition result can be added directly to the preset entity list template as an entity of the target entity recognition result of the sentence to be recognized, because the second entity recognition result is obtained by performing entity recognition on the sentence with the matching recognition model, whose recognition results have a higher accuracy rate. Each entity in the preset entity list template is then output to obtain the target entity recognition result of the sentence to be recognized (that is, the second entity recognition result is used as the target entity recognition result of the sentence to be recognized).
  • The recognition result of the entity recognition model is more comprehensive but may contain inaccurate entities, whereas the recognition result of the matching recognition model is accurate but may be incomplete.
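  • The fusion described in steps S31 to S34 can be sketched as follows. This is an illustrative sketch under stated assumptions: `fuse_results` and its `meets_rule` parameter are hypothetical names, a plain Python list stands in for the "preset entity list template", and the order in which entities are appended is a choice made here, not something the text specifies.

```python
from typing import Callable, List, Sequence


def fuse_results(
    first: Sequence[str],
    second: Sequence[str],
    meets_rule: Callable[[str], bool],
) -> List[str]:
    """Fuse the model result (first) with the matching result (second)."""
    entity_list: List[str] = []  # stands in for the "preset entity list template"
    second_set = set(second)

    # S31: does the second result include or equal the first?
    if not set(first) <= second_set:
        # S32: keep rule-conforming entities found only by the model.
        for entity in first:
            if entity not in second_set and meets_rule(entity) and entity not in entity_list:
                entity_list.append(entity)

    # S33: entities from the matching recognition model are always kept.
    for entity in second:
        if entity not in entity_list:
            entity_list.append(entity)

    # S34: output the entity list as the target entity recognition result.
    return entity_list


first = ["Li Ming", "Zhang San", "Li Si", "Zhao Xiaohong"]
second = ["Li Ming", "Zhang San"]
# Hypothetical rule that rejects "Li Si", mirroring the example in the text.
fused = fuse_results(first, second, lambda e: e != "Li Si")
print(fused)  # ['Zhao Xiaohong', 'Li Ming', 'Zhang San']
```

  • When the second result already includes or equals the first, the function simply returns the entities of the second result.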
  • The step of obtaining an entity that meets a preset entity rule from the first entity recognition result and adding it to a preset entity list template includes:
  • Step A1 Obtain, from the first entity recognition result, the first entities included in the part that does not intersect the second entity recognition result;
  • The entities in the intersection of the first entity recognition result and the second entity recognition result are detected, and the entities in the second entity recognition result are subtracted from the entities in the first entity recognition result. Each remaining entity is taken as a first entity, that is, an entity in the part of the first entity recognition result that does not intersect the second entity recognition result.
  • The first entity refers to an entity in the part of the first entity recognition result that does not intersect the second entity recognition result.
  • Step A2 detecting whether the first entity complies with preset entity rules;
  • Each entity type has a corresponding entity rule. Once the entity type to be recognized in the sentence to be recognized has been determined, the entity rule corresponding to that entity type is acquired, and it is determined whether the first entity conforms to it.
  • For example, an entity of the person-name type is a surname plus a given name, where the surname is generally 1 to 2 characters and the given name is generally 1 to 2 characters.
  • The preset entity rule refers to the entity rule corresponding to the entity type to be recognized in the sentence to be recognized.
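  • A rule check of the kind described in step A2 might look like the sketch below. It encodes only the person-name example (a 1 to 2 character surname followed by a 1 to 2 character given name). The names `ENTITY_RULES` and `meets_entity_rule`, the `"person_name"` key, and the reading of "characters" as CJK ideographs in the range U+4E00 to U+9FFF are all assumptions made for illustration.

```python
import re

# Assumption: "characters" means CJK ideographs; 1-2 for the surname plus
# 1-2 for the given name gives 2 to 4 ideographs in total.
_PERSON_NAME_RULE = re.compile(r"[\u4e00-\u9fff]{1,2}[\u4e00-\u9fff]{1,2}")

# Hypothetical mapping from entity type to its entity rule.
ENTITY_RULES = {"person_name": _PERSON_NAME_RULE}


def meets_entity_rule(entity: str, entity_type: str) -> bool:
    """Return True if the entity conforms to the rule of its entity type."""
    rule = ENTITY_RULES.get(entity_type)
    # Types without a registered rule are treated as non-conforming here.
    return rule is not None and rule.fullmatch(entity) is not None


print(meets_entity_rule("赵小红", "person_name"))        # True
print(meets_entity_rule("北京市朝阳区", "person_name"))  # False
```

  • A real rule set would likely need more than a length check, for example a surname list; the text does not specify one.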
  • Step A3 If the first entity complies with a preset entity rule, add the first entity to a preset entity list template.
  • If the first entity conforms to the preset entity rule, it is added to the preset entity list template as a target entity of the sentence to be recognized. If the first entity does not meet the preset entity rule, it is regarded as a recognition error of the entity recognition model and is discarded.
  • FIG. 3 is a schematic flowchart of a fourth embodiment of an entity identification method of this application. Based on the above-mentioned second embodiment, a fourth embodiment of the entity identification method of the present application is proposed. After step S31, the method further includes:
  • Step S35 If it is detected that the second entity recognition result includes or is equal to the first entity recognition result, each entity in the second entity recognition result is added to the entity list template;
  • If it is detected that the second entity recognition result includes or is equal to the first entity recognition result (that is, each entity in the first entity recognition result has the same entity in the second entity recognition result), each entity in the second entity recognition result is added directly to the preset entity list template as an entity of the target entity recognition result of the sentence to be recognized, because the second entity recognition result is obtained by performing entity recognition on the sentence with the matching recognition model, whose recognition results have a higher accuracy rate.
  • If the second entity recognition result does not include and is not equal to the first entity recognition result, the entities conforming to the preset entity rule are obtained from the first entity recognition result and added to the preset entity list template as entities of the target entity recognition result of the sentence to be recognized.
  • Step S36 Output each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  • The recognition result of the entity recognition model is more comprehensive but may contain inaccurate entities, whereas the recognition result of the matching recognition model is accurate but may be incomplete.
  • The step of obtaining the first entity recognition result generated by the entity recognition model based on the sentence to be recognized includes:
  • Step B1 Obtain the entity type to be recognized in the sentence to be recognized;
  • The entity type to be recognized refers to the type of entity that needs to be recognized in the sentence to be recognized. For example, if the entity type to be recognized is a person name, entities of the person-name type are identified from the sentence to be recognized; other entity types, such as city names and country names, are handled in the same way.
  • The entity type to be recognized in the sentence to be recognized is determined.
  • Step B2 Obtain each second entity, conforming to the entity type, that the entity recognition model finds by recognizing the sentence to be recognized;
  • The second entity refers to each entity, found by the entity recognition model when it recognizes the sentence to be recognized, whose type is the entity type to be recognized.
  • The entity recognition model obtains the sentence to be recognized, recognizes it, and finds each second entity in the sentence whose type is the entity type to be recognized.
  • For example, suppose the entity type to be recognized is a city name, and the sentence to be recognized includes the person-name entities "Zhang San" and "Li Si" and the city-name entities "Beijing" and "Shanghai". The entity recognition model obtains the sentence, recognizes it, and takes the entities whose type is city name, "Beijing" and "Shanghai", as the second entities.
  • Step B3 using the second entity as the first entity recognition result.
  • Each second entity that conforms to the entity type to be recognized is obtained and used as the first entity recognition result. This ensures that the first entity recognition result is obtained and provides an accurate data basis for the subsequent determination of the target entity recognition result of the sentence to be recognized based on the first entity recognition result.
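  • Steps B1 to B3 amount to filtering the model's output by the entity type to be recognized. The sketch below assumes, hypothetically, that the entity recognition model returns (entity text, entity type) pairs; the function name and the type labels `"person_name"` and `"city_name"` are illustrative and do not come from the application.

```python
from typing import List, Tuple

TaggedEntity = Tuple[str, str]  # (entity text, entity type), an assumed output format


def first_entity_recognition_result(
    model_output: List[TaggedEntity], entity_type: str
) -> List[str]:
    """Keep only the entities whose type is the entity type to be recognized."""
    return [text for text, label in model_output if label == entity_type]


# Stand-in for what a trained model might return for the example sentence.
model_output = [
    ("Zhang San", "person_name"),
    ("Li Si", "person_name"),
    ("Beijing", "city_name"),
    ("Shanghai", "city_name"),
]
result = first_entity_recognition_result(model_output, "city_name")
print(result)  # ['Beijing', 'Shanghai']
```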
  • the step of obtaining the second entity recognition result generated by the matching recognition model based on the sentence to be recognized includes:
  • Step C1 Obtain a pre-written entity library, where the entity library includes multiple entities;
  • the matching recognition model In order to ensure that the matching recognition model obtains the sentence to be recognized for entity recognition, it can directly compare the sentence to be recognized with the entities in the entity library to identify the entities contained in the sentence to be recognized.
  • the matching recognition model is used to perform the recognition of the sentence to be recognized.
  • entity recognition Before entity recognition, establish an entity database. Among them, the established entity library is pre-written based on rules and contains multiple entities; the entities in the entity library are compiled in an exhaustive manner. As a better implementation, when the entity library is pre-written, the named entities that are easy to identify incorrectly by the entity recognition model are written into the entity library as entities of the entity library, so as to further improve the accuracy of entity recognition.
  • a pre-written entity library corresponding to the entity type to be recognized in the sentence to be recognized is obtained; for example, if the sentence to be recognized is to be recognized If the entity type is a person name type entity, the pre-written entity type of the entity type is a person name; if the entity type to be recognized in the sentence to be recognized is a car name type entity, then the pre-written entity type of the entity type is a car name entity database.
  • Step C2 Obtaining the matching recognition model to find each third entity that is the same as the entity in the entity library from the sentence to be recognized;
  • the third entity refers to the matching recognition model to obtain the sentence to be recognized, and to compare the sentence to be recognized with the entity in the entity library, and to find each entity that is the same as the entity in the entity library from the sentence to be recognized.
  • the matching recognition model obtains the sentence to be recognized, and compares the sentence to be recognized with the entity in the entity library, and finds each third entity that is the same as the entity in the entity library from the sentence to be recognized.
  • the entities included in the sentence to be recognized are: “Zhang San”, “Li Si”, “Beijing” and “Shanghai”.
  • the pre-written entity database After comparing the sentence to be recognized with the entities in the entity database, it is found that the pre-written entity database If there are entities with “Zhang San” and “Li Si” and entities without “Beijing” and “Shanghai”, then "Zhang San” and "Li Si” will be regarded as the third entity.
  • Step C3: use the third entities as the second entity recognition result.
  • In this embodiment, the matching recognition model compares the sentence to be recognized with the entities in the entity library to obtain each third entity in the sentence that is the same as an entity in the entity library, and the third entities are taken as the second entity recognition result. This ensures that the second entity recognition result is obtained and provides an accurate data basis for subsequently determining the target entity recognition result of the sentence to be recognized based on the second entity recognition result.
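Steps C1 to C3 can be illustrated with a minimal sketch. It assumes that "the same as an entity in the entity library" means an exact substring match, which the text does not spell out; the function name `match_entities` and the library contents are hypothetical.

```python
from typing import Iterable, List


def match_entities(sentence: str, entity_library: Iterable[str]) -> List[str]:
    """Return the library entities that occur in the sentence, ordered by first appearance."""
    found = []
    for entity in entity_library:
        position = sentence.find(entity)
        # Skip empty strings, which would otherwise match every sentence.
        if entity and position != -1:
            found.append((position, entity))
    # Sort by position so the output follows the order of the sentence.
    return [entity for _, entity in sorted(found)]


sentence = "张三和李四从北京去上海"          # mentions Zhang San, Li Si, Beijing, Shanghai
person_library = {"张三", "李四", "王五"}    # hypothetical person-name library (step C1)
print(match_entities(sentence, person_library))  # ['张三', '李四']
```

Because the person-name library holds no city names, "Beijing" and "Shanghai" are not returned, which mirrors the example above.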
  • In addition, this application also provides an entity recognition device.
  • FIG. 4 is a schematic diagram of the functional modules of the first embodiment of the entity recognition device of this application.
  • In this embodiment, the entity recognition device includes:
  • the sentence acquisition module 10 is configured to acquire the sentence to be recognized, and input the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
  • the result obtaining module 20 is configured to obtain a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
  • the entity determination module 30 is configured to determine the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
  • Further, the entity determination module 30 includes:
  • a detecting unit, configured to detect whether the second entity recognition result includes or is equal to the first entity recognition result;
  • a first adding unit, configured to, if it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, obtain the entities that conform to the preset entity rule from the first entity recognition result and add them to the preset entity list template;
  • a second adding unit, configured to add each entity in the second entity recognition result to the entity list template;
  • a first entity output unit, configured to output each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  • Further, the first adding unit includes:
  • an obtaining subunit, configured to obtain, from the first entity recognition result, the first entities contained in the part that does not intersect the second entity recognition result;
  • a detection subunit, configured to detect whether the first entity conforms to the preset entity rule;
  • an adding subunit, configured to add the first entity to the preset entity list template if the first entity conforms to the preset entity rule.
  • Further, the entity determination module 30 also includes:
  • a third adding unit, configured to add each entity in the second entity recognition result to the entity list template if it is detected that the second entity recognition result includes or is equal to the first entity recognition result;
  • a second entity output unit, configured to output each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  • Further, the result obtaining module includes:
  • a first obtaining unit, configured to obtain the entity type to be recognized in the sentence to be recognized;
  • a second obtaining unit, configured to obtain each second entity that conforms to the entity type and is obtained by the entity recognition model recognizing the sentence to be recognized;
  • a first recognition result determining unit, configured to use the second entities as the first entity recognition result.
  • Further, the result obtaining module also includes:
  • a third obtaining unit, configured to obtain a pre-written entity library, where the entity library includes multiple entities;
  • a fourth obtaining unit, configured to obtain each third entity that the matching recognition model finds in the sentence to be recognized and that is the same as an entity in the entity library;
  • a second recognition result determining unit, configured to use the third entities as the second entity recognition result.
  • Further, the entity recognition device also includes:
  • a training data acquisition module, configured to obtain the corpus data to be trained;
  • a model training module, configured to train the model to be trained according to the corpus data to be trained until the model to be trained converges, so as to generate the preset entity recognition model.
  • The embodiments of the entity recognition device are basically the same as the embodiments of the entity recognition method described above, and are not described in detail here.
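The division into modules 10, 20 and 30 can be pictured as a small pipeline. The following is only a sketch under stated assumptions: the class name `EntityRecognitionDevice`, the callables that stand in for the two preset models, and the placeholder fusion function `union_fuse` are all hypothetical and are not taken from the application.

```python
from dataclasses import dataclass
from typing import Callable, List

Recognizer = Callable[[str], List[str]]
Fuser = Callable[[List[str], List[str]], List[str]]


@dataclass
class EntityRecognitionDevice:
    model_recognizer: Recognizer     # stands in for the preset entity recognition model
    matching_recognizer: Recognizer  # stands in for the preset matching recognition model
    fuse: Fuser                      # stands in for the entity determination module 30

    def recognize(self, sentence: str) -> List[str]:
        # Sentence acquisition module 10: the same sentence goes to both models.
        # Result obtaining module 20: collect the first and second recognition results.
        first_result = self.model_recognizer(sentence)
        second_result = self.matching_recognizer(sentence)
        # Entity determination module 30: derive the target recognition result.
        return self.fuse(first_result, second_result)


def union_fuse(first: List[str], second: List[str]) -> List[str]:
    """Placeholder fusion: the second result, then entities only the first result has."""
    return list(second) + [e for e in first if e not in second]


device = EntityRecognitionDevice(
    model_recognizer=lambda s: [w for w in ("张三", "赵小红") if w in s],
    matching_recognizer=lambda s: [w for w in ("张三",) if w in s],
    fuse=union_fuse,
)
print(device.recognize("张三认识赵小红"))  # ['张三', '赵小红']
```

Passing the models and the fusion step in as callables keeps the sketch independent of any particular model or rule set.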
  • In addition, this application also provides entity recognition equipment. As shown in FIG. 5, FIG. 5 is a schematic structural diagram of the hardware operating environment of the entity recognition equipment involved in the solutions of the embodiments of this application.
  • The entity recognition equipment in the embodiments of this application may be a terminal device such as a PC or a portable computer.
  • As shown in FIG. 5, the entity recognition equipment may include a processor 1001 (for example, a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005.
  • the communication bus 1002 is used to realize the connection and communication between these components;
  • the user interface 1003 may include a display and an input unit such as a keyboard;
  • the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface);
  • the memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory;
  • the memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.
  • Optionally, the entity recognition equipment may also include a camera, an RF (radio frequency) circuit, a sensor, an audio circuit, a Wi-Fi module, and so on.
  • Those skilled in the art will understand that the hardware structure shown in FIG. 5 does not limit the entity recognition equipment, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • the memory 1005 as a computer-readable storage medium in FIG. 5 may include an operating system, a network communication module, and computer-readable instructions.
  • the network communication module is mainly used to connect to the database and perform data communication with the database; and the processor 1001 can call the computer-readable instructions stored in the memory 1005 and execute the following steps:
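  • obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
  • obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
  • determining the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.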
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile, and computer-readable instructions are stored on the computer-readable storage medium. When the computer-readable instructions are executed by the processor, the following steps are implemented:
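  • obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
  • obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
  • determining the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.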
  • The technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disk) and includes several instructions to make a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the method described in each embodiment of this application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

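An entity recognition method, an entity recognition device, equipment, and a computer-readable storage medium. The method includes: obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively (S10); obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized (S20); and determining a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result (S30). The method avoids the problems that the recognition result of the entity recognition model is inaccurate and the recognition result of the matching recognition model is incomplete, and improves the accuracy of entity recognition.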

Description

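Entity recognition method, device, equipment, and computer-readable storage medium
This application claims priority to Chinese invention patent application No. 201910880672.9, entitled "Entity recognition method, device, equipment, and computer-readable storage medium", filed with the China Patent Office on September 18, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of natural language processing technology, and in particular to an entity recognition method, device, equipment, and computer-readable storage medium.
Background
In the field of natural language processing, entity recognition is an important research direction. Entity recognition refers to identifying entities with specific meanings in text, mainly including person names, place names, organization names, and proper nouns; for example, identifying named entities such as person names, place names, city names, or car names from text. At present, there are two approaches to entity recognition: one is recognition through regular matching, and the other is entity recognition by means of a model.
In regular matching recognition, an entity library is pre-written based on rules. When a sentence needs to be recognized, the sentence is compared with the entity library, and the target entities that are the same as entities in the entity library are identified from the sentence. However, because language can be expressed in many ways, the entity library cannot enumerate all entities. Regular matching recognition therefore cannot completely recognize all entities; that is, it may fail to recognize entities contained in the sentence to be recognized.
The inventor realized that model-based entity recognition relies on an entity recognition model trained on a training corpus. When a sentence needs to be recognized, it is input into the entity recognition model, which recognizes and outputs the target entities contained in the sentence. However, limited by the quality of the training corpus and the recognition capability of the model, the entity recognition model can produce erroneous or inaccurate recognition results.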
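Technical Problem
The main purpose of this application is to provide an entity recognition method, device, equipment, and computer-readable storage medium, aiming to solve the technical problem that entity recognition results are inaccurate when existing entity recognition technology is used.
Technical Solution
To achieve the above purpose, this application provides an entity recognition method, which includes the following steps:
obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
determining a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.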
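In addition, to achieve the above purpose, this application also provides an entity recognition device, which includes:
a sentence acquisition module, configured to obtain a sentence to be recognized, and input the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
a result obtaining module, configured to obtain a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
an entity determination module, configured to determine a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
In addition, to achieve the above purpose, this application also provides entity recognition equipment, which includes a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, where the computer-readable instructions, when executed by the processor, implement the following steps:
obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
determining a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
In addition, to achieve the above purpose, this application also provides a computer-readable storage medium storing computer-readable instructions, where the computer-readable instructions, when executed by a processor, implement the following steps:
obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
determining a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.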
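Beneficial Effects
This application avoids the problems that the recognition result of the entity recognition model is inaccurate and the recognition result of the matching recognition model is incomplete, and improves the accuracy of entity recognition.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of the first embodiment of the entity recognition method of this application;
FIG. 2 is a schematic flowchart of the second embodiment of the entity recognition method of this application;
FIG. 3 is a schematic flowchart of the fourth embodiment of the entity recognition method of this application;
FIG. 4 is a schematic diagram of the functional modules of the first embodiment of the entity recognition device of this application;
FIG. 5 is a schematic diagram of the hardware structure of the entity recognition equipment involved in the solutions of the embodiments of this application.
The realization of the purpose, functional characteristics, and advantages of this application will be further described with reference to the drawings in conjunction with the embodiments.
Embodiments of the Invention
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.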
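This application provides an entity recognition method. Referring to FIG. 1, FIG. 1 is a schematic flowchart of the first embodiment of the entity recognition method of this application.
The embodiments of this application provide embodiments of the entity recognition method. It should be noted that although a logical order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order different from the one given here.
The entity recognition method is applied to entity recognition equipment, a server, or a terminal. The terminal may include mobile terminals such as mobile phones, tablet computers, notebook computers, palmtop computers, and personal digital assistants (PDAs), as well as fixed terminals such as digital TVs and desktop computers. For ease of description, the embodiments of the entity recognition method are described with the entity recognition equipment as the executing entity, where the entity recognition equipment includes a preset entity recognition model and a preset matching recognition model. In the first embodiment of the entity recognition method of this application, the entity recognition method includes:
Step S10: obtain a sentence to be recognized, and input the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
Regular matching can guarantee the precision of entity recognition, but a match succeeds only when the relevant rules are satisfied. To avoid missed recognitions, a large number of rules must be written; yet because language can be expressed in so many ways, this kind of recognition remains limited, has no capacity for semantic understanding, and still misses entities. Model-based entity recognition can alleviate the problem of missed recognitions, but, limited by the quality of the training corpus and the recognition capability of the model, some of the recognized entities are wrong.
To solve the problems of erroneous, incomplete, and inaccurate entity recognition in existing entity recognition technology, the embodiments of this application fuse regular matching recognition with model-based entity recognition, combining the entity results obtained by regular matching recognition with those obtained by model-based entity recognition to determine the final entity recognition result.
Specifically, when a sentence to be recognized is detected, the sentence is obtained and input into the preset entity recognition model and the preset matching recognition model respectively.
The sentence to be recognized is the text on which entity recognition needs to be performed in order to obtain the target named entities.
The entity recognition model is a model that obtains the sentence to be recognized, recognizes it, analyzes the entities it contains, and produces the named entities of the entity type to be recognized as the entity recognition result of the sentence. Before entity recognition is performed on the sentence to be recognized, a model to be trained is trained with the corpus data to be trained until it converges, so as to generate the preset entity recognition model.
The matching recognition model is a model that obtains the sentence to be recognized, finds in it and outputs each entity that is the same as an entity in the entity library pre-written based on rules, and thereby produces the named entities of the entity type to be recognized as the entity recognition result of the sentence.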
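Step S20: obtain the first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and the second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
Specifically, after obtaining the sentence to be recognized, the entity recognition model first determines the entity type to be recognized in the sentence, for example whether entities of the person-name, city-name, or place-name type need to be recognized. It then recognizes the sentence according to that entity type and identifies each entity in the sentence whose type is the entity type to be recognized; these entities form the first entity recognition result generated by the entity recognition model based on the sentence to be recognized.
After obtaining the sentence to be recognized, the matching recognition model compares the sentence with the entity library pre-written based on rules and finds each entity in the sentence that is the same as an entity in the entity library; these entities form the second entity recognition result generated by the matching recognition model based on the sentence to be recognized.
The first entity recognition result refers to the entities, obtained by the entity recognition model recognizing and analyzing the sentence to be recognized, whose type is the entity type to be recognized.
The second entity recognition result refers to the entities that the matching recognition model finds in the sentence to be recognized, by comparing the sentence with the entity library pre-written based on rules, and that are the same as entities in the entity library.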
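Step S30: determine the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
Specifically, in one implementation, the entity type to be recognized in the sentence is obtained first; then each entity in the first entity recognition result is checked against the entity rule of the entity type to be recognized. For example, an entity of the person-name type consists of a surname followed by a given name, where the surname is generally 1 to 2 characters and the given name is generally 1 to 2 characters.
If every entity in the first entity recognition result conforms to the entity rule of the entity type to be recognized, the entities in the first entity recognition result are added to a preset entity list template and output, thereby obtaining the target entity recognition result of the sentence to be recognized.
If one or more entities in the first entity recognition result do not conform to the entity rule of the entity type to be recognized, the entities in the second entity recognition result are added to the preset entity list template, and the entities in the part of the first entity recognition result that does not intersect the second entity recognition result are also added to the preset entity list template; the entities in the preset entity list template are then output as the target entity recognition result of the sentence to be recognized.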
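The implementation just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `fuse_variant_one` is hypothetical, the entity rule is passed in as a predicate, and, following the text literally, the non-intersecting entities of the first result are added without a further rule check.

```python
from typing import Callable, List


def fuse_variant_one(
    first: List[str], second: List[str], rule: Callable[[str], bool]
) -> List[str]:
    """Fuse the model result (first) and the matching result (second)."""
    # Every entity of the first result conforms to the rule: output the first result.
    if all(rule(entity) for entity in first):
        return list(first)
    # Otherwise start from the second result ...
    template = list(second)
    # ... and append the entities of the first result that the second result lacks.
    template += [entity for entity in first if entity not in second]
    return template


# Hypothetical rule for illustration only: a name has 2 to 4 characters.
length_rule = lambda name: 2 <= len(name) <= 4
print(fuse_variant_one(["张三", "赵小红"], ["张三"], length_rule))
# ['张三', '赵小红']
print(fuse_variant_one(["张三", "北京市朝阳区"], ["张三", "李四"], length_rule))
# ['张三', '李四', '北京市朝阳区']
```

The second call shows that this variant keeps a non-conforming entity of the first result, which is the behavior the other implementation below refines with a per-entity rule check.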
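In another implementation, it is first detected whether the second entity recognition result includes or is equal to the first entity recognition result.
If it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, the entities that conform to the entity rule of the entity type to be recognized are obtained from the first entity recognition result and added to the preset entity list template (specifically, the entities in the part of the first entity recognition result that does not intersect the second entity recognition result are obtained; each obtained entity is checked against the entity rule of the entity type to be recognized; and if it conforms, it is added to the preset entity list template). The entities in the second entity recognition result are then added to the preset entity list template. Finally, the entities in the preset entity list template are output to obtain the target entity recognition result of the sentence to be recognized.
If it is detected that the second entity recognition result includes or is equal to the first entity recognition result, the entities in the second entity recognition result are added to the preset entity list template, and the entities in the template are output to obtain the target entity recognition result of the sentence to be recognized.
The target entity recognition result refers to the entities contained in the sentence to be recognized that the entity recognition equipment obtains by recognizing the sentence with both the matching recognition model and the entity recognition model according to the entity type to be recognized, and then fusing the recognition results of the two models.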
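Further, before the step of obtaining the first entity recognition result generated by the entity recognition model based on the sentence to be recognized, the method further includes:
obtaining corpus data to be trained;
training a model to be trained according to the corpus data to be trained until the model to be trained converges, so as to generate the preset entity recognition model.
Before the preset entity recognition model is used to perform entity recognition on the sentence to be recognized and detect the entities it contains, a model to be trained must be trained to generate the preset entity recognition model. First, the corpus data used to train the model is collected; for example, multiple texts or multiple sentences are collected as the corpus data to be trained.
The corpus data to be trained is then input into the model to be trained for training until the model converges, so as to generate the preset entity recognition model. At this point, the preset entity recognition model is able to recognize the sentence to be recognized according to the entity type to be recognized and identify each entity in the sentence whose type is the entity type to be recognized.
The corpus data to be trained refers to data such as sentences, texts, or documents used to train the model to be trained.
In this embodiment, the sentence to be recognized is input into the preset entity recognition model and the preset matching recognition model respectively, and the first entity recognition result generated by the entity recognition model and the second entity recognition result generated by the matching recognition model are obtained. Finally, the first and second entity recognition results are fused to form the final target entity recognition result of the sentence to be recognized. While the matching recognition model recognizes entities accurately, the entity recognition model additionally recognizes the sentence so as to identify entities that may be targets but that the matching recognition model cannot recognize. This avoids the problems that the recognition result of the entity recognition model is inaccurate and the recognition result of the matching recognition model is incomplete, and improves the accuracy of entity recognition.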
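Further, referring to FIG. 2, FIG. 2 is a schematic flowchart of the second embodiment of the entity recognition method of this application. Based on the first embodiment above, the second embodiment of the entity recognition method of this application is proposed, in which step S30 includes:
Step S31: detect whether the second entity recognition result includes or is equal to the first entity recognition result;
Specifically, the entities in the first entity recognition result are compared with the entities in the second entity recognition result to detect whether every entity in the first entity recognition result has an identical entity in the second entity recognition result. If one or more entities in the first entity recognition result have no identical entity in the second entity recognition result, it is determined that the second entity recognition result does not include and is not equal to the first entity recognition result.
If every entity in the first entity recognition result has an identical entity in the second entity recognition result, it is further detected whether every entity in the second entity recognition result has an identical entity in the first entity recognition result, and whether the number of entities in the second entity recognition result equals the number of entities in the first entity recognition result.
If every entity in the second entity recognition result has an identical entity in the first entity recognition result, and the number of entities in the second entity recognition result equals the number of entities in the first entity recognition result, it is determined that the second entity recognition result is equal to the first entity recognition result. If one or more entities in the second entity recognition result have no identical entity in the first entity recognition result, and the number of entities in the second entity recognition result is greater than that in the first entity recognition result, it is determined that the second entity recognition result includes the first entity recognition result.
For ease of understanding, a specific example is given. Suppose the first entity recognition result includes the entities Li Ming (李明), Zhang San (张三), Li Si (李四), and Zhao Xiaohong (赵小红). If the second entity recognition result includes Li Ming and Zhang San, the second entity recognition result does not include and is not equal to the first entity recognition result; if the second entity recognition result includes Li Ming, Zhang San, Li Si, Zhao Xiaohong, and Sun Xiaojie (孙小杰), the second entity recognition result includes the first entity recognition result; if the second entity recognition result includes Li Ming, Zhang San, Li Si, and Zhao Xiaohong, the second entity recognition result is equal to the first entity recognition result.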
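The check in step S31 maps naturally onto set comparison. The sketch below assumes that each recognition result can be treated as a set of distinct entity strings; the function name `relation_of_second_to_first` and its return labels are hypothetical.

```python
from typing import Iterable


def relation_of_second_to_first(first: Iterable[str], second: Iterable[str]) -> str:
    """Classify the second recognition result relative to the first."""
    first_set, second_set = set(first), set(second)
    if second_set == first_set:
        return "equal"
    # Proper superset: everything in the first result, plus at least one more entity.
    if second_set > first_set:
        return "includes"
    return "neither"


first = ["李明", "张三", "李四", "赵小红"]
print(relation_of_second_to_first(first, ["李明", "张三"]))                          # neither
print(relation_of_second_to_first(first, ["李明", "张三", "李四", "赵小红", "孙小杰"]))  # includes
print(relation_of_second_to_first(first, ["李明", "张三", "李四", "赵小红"]))          # equal
```

The three calls reproduce the three cases of the example above.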
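Step S32: if it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, obtain the entities that conform to a preset entity rule from the first entity recognition result and add them to a preset entity list template;
Specifically, if it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, the first entities, that is, the entities in the part of the first entity recognition result that does not intersect the second entity recognition result, are obtained from the first entity recognition result. Then the entity rule preset for the entity type to be recognized is obtained, and each first entity is checked against it.
If a first entity conforms to the preset entity rule, it is added to the preset entity list template. If a first entity does not conform to the preset entity rule, it is regarded as a recognition error of the entity recognition model and is discarded.
For ease of understanding, the example from step S31 is continued. In the case where the second entity recognition result does not include and is not equal to the first entity recognition result, the first entities in the part of the first entity recognition result that does not intersect the second entity recognition result are "Li Si" and "Zhao Xiaohong". If "Li Si" does not conform to the preset entity rule and "Zhao Xiaohong" does, then "Zhao Xiaohong" is added to the preset entity list template, while "Li Si" is regarded as a recognition error of the entity recognition model and discarded.
Step S33: add each entity in the second entity recognition result to the entity list template;
For ease of understanding, the example from step S32 is continued. Among the first entities in the part of the first entity recognition result that does not intersect the second entity recognition result, those that conform to the preset entity rule are added to the preset entity list template as entities of the target entity recognition result of the sentence to be recognized. The entities "Li Ming" and "Zhang San" in the second entity recognition result are likewise added to the preset entity list template as entities of the target entity recognition result of the sentence to be recognized.
Step S34: output each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
Finally, all the entities added to the preset entity list template are output. The output entities are the target entities that match the entity type to be recognized in the sentence, and the target entity recognition result of the sentence to be recognized is thus obtained.
If it is detected that the second entity recognition result includes or is equal to the first entity recognition result, then, because the second entity recognition result consists of entities obtained by the matching recognition model performing entity recognition on the sentence to be recognized, and the recognition results of the matching recognition model have a high accuracy, the entities in the second entity recognition result can be added directly to the preset entity list template as entities of the target entity recognition result of the sentence to be recognized. The entities in the preset entity list template are then output to obtain the target entity recognition result of the sentence to be recognized (that is, the second entity recognition result is taken as the target entity recognition result of the sentence to be recognized).
In this embodiment, the recognition result of the entity recognition model is relatively complete but contains inaccurate entities, whereas the recognition result of the matching recognition model is accurate but incomplete. To address this, it is detected whether the second entity recognition result includes or is equal to the first entity recognition result; when the second entity recognition result does not include and is not equal to the first entity recognition result, the entities that conform to the preset entity rule are obtained from the first entity recognition result and added to the preset entity list template; the entities in the second entity recognition result are also added to the preset entity list template; and the entities added to the template are taken as the target entities of the sentence to be recognized. This avoids the problems that the recognition result of the entity recognition model is inaccurate and the recognition result of the matching recognition model is incomplete, and improves the accuracy of entity recognition.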
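Steps S31 to S34, together with the "includes or is equal to" branch, can be combined into one small function. This is a sketch under stated assumptions: the name `fuse_results` is hypothetical, the entity list template is modeled as a plain list, and the rule used in the example (a three-character name) is invented only so that "Li Si" fails and "Zhao Xiaohong" passes, as in the example above; the application does not state the actual rule.

```python
from typing import Callable, List


def fuse_results(
    first: List[str], second: List[str], conforms_to_rule: Callable[[str], bool]
) -> List[str]:
    """Build the target entity recognition result from the two recognition results."""
    template: List[str] = []  # stands in for the preset entity list template
    second_set = set(second)
    # S31: does the second result include or equal the first result?
    if not second_set >= set(first):
        # S32: keep only rule-conforming entities that the second result lacks.
        for entity in first:
            if entity not in second_set and entity not in template and conforms_to_rule(entity):
                template.append(entity)
    # S33 (or the include/equal branch): add every entity of the second result.
    for entity in second:
        if entity not in template:
            template.append(entity)
    # S34: output the template.
    return template


first = ["李明", "张三", "李四", "赵小红"]
second = ["李明", "张三"]
three_chars = lambda name: len(name) == 3  # hypothetical rule, for illustration only
print(fuse_results(first, second, three_chars))  # ['赵小红', '李明', '张三']
```

When the second result already includes or equals the first, the loop for S32 is skipped and the function simply returns the second result.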
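Further, based on the second embodiment above, the third embodiment of the entity recognition method of this application is proposed.
The step of obtaining the entities that conform to the preset entity rule from the first entity recognition result and adding them to the preset entity list template includes:
Step A1: obtain, from the first entity recognition result, the first entities contained in the part that does not intersect the second entity recognition result;
Specifically, the entities in the intersection of the first entity recognition result and the second entity recognition result are determined and removed from the entities of the first entity recognition result. The remaining entities are the first entities, that is, the entities in the part of the first entity recognition result that does not intersect the second entity recognition result, and these first entities are obtained.
A first entity is an entity in the part of the first entity recognition result that does not intersect the second entity recognition result.
Step A2: detect whether the first entity conforms to the preset entity rule;
Each entity type has a corresponding entity rule. Once the entity type to be recognized in the sentence is determined, it can be used to determine whether a first entity conforms to the preset entity rule. Specifically, the entity type to be recognized is obtained, and it is determined whether the first entity conforms to the entity rule corresponding to that entity type.
For example, an entity of the person-name type consists of a surname followed by a given name, where the surname is generally 1 to 2 characters and the given name is generally 1 to 2 characters; the first entity is then checked against the rule "a surname followed by a given name, with a surname of 1 or 2 characters and a given name of 1 or 2 characters".
The preset entity rule is the entity rule corresponding to the entity type to be recognized in the sentence to be recognized.
Step A3: if the first entity conforms to the preset entity rule, add the first entity to the preset entity list template.
If a first entity conforms to the preset entity rule, that is, it conforms to the entity rule corresponding to the entity type to be recognized in the sentence, it is added to the preset entity list template as a target entity of the sentence to be recognized. If it does not conform to the preset entity rule, it is regarded as a recognition error of the entity recognition model and discarded.
In this embodiment, it is detected whether the entities in the part of the first entity recognition result that does not intersect the second entity recognition result conform to the preset entity rule; the conforming entities are added to the preset entity list, and the non-conforming ones are regarded as recognition errors of the entity recognition model and discarded. The entities added to the preset entity list therefore conform to the preset entity rule, which avoids the problem of entity recognition errors by the entity recognition model.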
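The person-name rule in step A2 (a surname of 1 or 2 characters followed by a given name of 1 or 2 characters) might be checked as in the sketch below. The surname sets are small hypothetical samples, not a real surname dictionary, and the function name `is_person_name` is likewise an assumption.

```python
SINGLE_SURNAMES = {"李", "张", "赵", "孙", "王"}  # hypothetical sample, far from complete
COMPOUND_SURNAMES = {"欧阳", "司马", "诸葛"}      # hypothetical sample


def is_person_name(text: str) -> bool:
    """Check 'surname (1 or 2 characters) + given name (1 or 2 characters)'."""
    # Try the two-character surnames first, then the one-character surnames.
    for surnames, length in ((COMPOUND_SURNAMES, 2), (SINGLE_SURNAMES, 1)):
        if text[:length] in surnames:
            given_name = text[length:]
            if 1 <= len(given_name) <= 2:
                return True
    return False


print(is_person_name("赵小红"))  # True: surname 赵, given name 小红
print(is_person_name("欧阳修"))  # True: compound surname 欧阳, given name 修
print(is_person_name("北京"))    # False: 北 is not in the sample surname sets
```

A production rule would need a far larger surname list; the point here is only the shape of the check.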
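Further, referring to FIG. 3, FIG. 3 is a schematic flowchart of the fourth embodiment of the entity recognition method of this application. Based on the second embodiment above, the fourth embodiment of the entity recognition method of this application is proposed, in which the following steps are further included after step S31:
Step S35: if it is detected that the second entity recognition result includes or is equal to the first entity recognition result, add each entity in the second entity recognition result to the entity list template;
If it is detected that the second entity recognition result includes or is equal to the first entity recognition result, then every entity in the first entity recognition result has an identical entity in the second entity recognition result. Because the second entity recognition result consists of entities obtained by the matching recognition model performing entity recognition on the sentence to be recognized, and the recognition results of the matching recognition model have a high accuracy, the entities in the second entity recognition result can be added directly to the preset entity list template as entities of the target entity recognition result of the sentence to be recognized.
If it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, the entities that conform to the preset entity rule are obtained from the first entity recognition result and added to the preset entity list template as entities of the target entity recognition result of the sentence to be recognized.
Step S36: output each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
The entities in the preset entity list template are output to obtain the target entity recognition result of the sentence to be recognized (that is, the second entity recognition result is taken as the target entity recognition result of the sentence to be recognized).
In this embodiment, the recognition result of the entity recognition model is relatively complete but contains inaccurate entities, whereas the recognition result of the matching recognition model is accurate but incomplete. To address this, it is detected whether the second entity recognition result includes or is equal to the first entity recognition result, and when it does, the entities in the second entity recognition result are added to the preset entity list template as the target entities of the sentence to be recognized. This avoids the problems that the recognition result of the entity recognition model is inaccurate and the recognition result of the matching recognition model is incomplete, and improves the accuracy of entity recognition.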
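Further, based on the first embodiment above, the fifth embodiment of the entity recognition method of this application is proposed, in which the step of obtaining the first entity recognition result generated by the entity recognition model based on the sentence to be recognized includes:
Step B1: obtain the entity type to be recognized in the sentence to be recognized;
The entity type to be recognized is the type of entity that needs to be recognized in the sentence, such as person-name entities, city-name entities, or country-name entities. If the entity type to be recognized is person name, entities of the person-name type are recognized from the sentence.
Specifically, the entity type to be recognized is determined according to the recognition requirements for the sentence to be recognized.
Step B2: obtain each second entity that conforms to the entity type and is obtained by the entity recognition model recognizing the sentence to be recognized;
A second entity is an entity, obtained by the entity recognition model obtaining, recognizing, and analyzing the sentence to be recognized, whose type is the entity type to be recognized.
Specifically, the entity recognition model obtains the sentence to be recognized, recognizes it, and identifies each second entity in the sentence whose type is the entity type to be recognized.
For example, suppose the entity type to be recognized is city name, and the sentence to be recognized contains the person-name entities "Zhang San" and "Li Si" and the city-name entities "Beijing" and "Shanghai". The entity recognition model obtains and recognizes the sentence, and identifies the city-name entities "Beijing" and "Shanghai" as the second entities.
Step B3: use the second entities as the first entity recognition result.
Finally, all the second entities are taken as the first entity recognition result, for subsequently determining the target entity recognition result of the sentence to be recognized according to the first entity recognition result.
In this embodiment, each second entity that conforms to the entity type to be recognized and is obtained by the entity recognition model recognizing the sentence is obtained and taken as the first entity recognition result. This ensures that the first entity recognition result is obtained and provides an accurate data basis for subsequently determining the target entity recognition result of the sentence to be recognized according to the first entity recognition result.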
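Steps B1 to B3 amount to filtering the model output by entity type. The sketch below assumes the entity recognition model returns (text, type) pairs, which the application does not specify; the type labels and the function name `first_recognition_result` are hypothetical.

```python
from typing import List, Tuple


def first_recognition_result(
    tagged_entities: List[Tuple[str, str]], target_type: str
) -> List[str]:
    """Keep the entities whose type equals the entity type to be recognized."""
    return [text for text, entity_type in tagged_entities if entity_type == target_type]


# Hypothetical model output for a sentence mentioning two people and two cities.
model_output = [("张三", "person"), ("李四", "person"), ("北京", "city"), ("上海", "city")]
print(first_recognition_result(model_output, "city"))    # ['北京', '上海']
print(first_recognition_result(model_output, "person"))  # ['张三', '李四']
```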
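Further, based on the first embodiment above, the sixth embodiment of the entity recognition method of this application is proposed, in which the step of obtaining the second entity recognition result generated by the matching recognition model based on the sentence to be recognized includes:
Step C1: obtain a pre-written entity library, where the entity library includes multiple entities;
To ensure that the matching recognition model, on obtaining a sentence to be recognized, can directly compare the sentence with the entities in an entity library to recognize the entities the sentence contains, the entity library is established before the matching recognition model is used for entity recognition. The entity library is pre-written based on rules and contains multiple entities, which are compiled by enumeration. As a preferred implementation, named entities that the entity recognition model is prone to misrecognizing are also written into the entity library, so as to further improve the accuracy of entity recognition.
When entity recognition needs to be performed on a sentence to be recognized, the pre-written entity library is obtained so that the matching recognition model can compare the sentence with the entities in the library and find each entity in the sentence that is the same as an entity in the library. Further, when entity recognition is performed on the sentence, the pre-written entity library corresponding to the entity type to be recognized is obtained. For example, if the entity type to be recognized is person name, the pre-written person-name entity library is obtained; if the entity type to be recognized is car name, the pre-written car-name entity library is obtained.
Step C2: obtain each third entity that the matching recognition model finds in the sentence to be recognized and that is the same as an entity in the entity library;
A third entity is an entity that the matching recognition model finds in the sentence to be recognized, by comparing the sentence with the entities in the entity library, and that is the same as an entity in the entity library.
Specifically, the matching recognition model obtains the sentence to be recognized, compares it with the entities in the entity library, and finds each third entity in the sentence that is the same as an entity in the entity library.
For example, the entities included in the sentence to be recognized are "Zhang San", "Li Si", "Beijing" and "Shanghai". After the sentence is compared with the entities in the entity library, it is found that the pre-written entity library contains "Zhang San" and "Li Si" but not "Beijing" or "Shanghai", so "Zhang San" and "Li Si" are taken as the third entities.
Step C3: use the third entities as the second entity recognition result.
Finally, all the third entities are taken as the second entity recognition result, for subsequently determining the target entity recognition result of the sentence to be recognized according to the second entity recognition result.
In this embodiment, the matching recognition model compares the sentence to be recognized with the entities in the entity library to obtain each third entity in the sentence that is the same as an entity in the entity library, and the third entities are taken as the second entity recognition result. This ensures that the second entity recognition result is obtained and provides an accurate data basis for subsequently determining the target entity recognition result of the sentence to be recognized according to the second entity recognition result.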
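In addition, this application also provides an entity recognition device.
Referring to FIG. 4, FIG. 4 is a schematic diagram of the functional modules of the first embodiment of the entity recognition device of this application.
In this embodiment, the entity recognition device includes:
a sentence acquisition module 10, configured to obtain a sentence to be recognized, and input the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
a result obtaining module 20, configured to obtain a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
an entity determination module 30, configured to determine the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
Further, the entity determination module 30 includes:
a detecting unit, configured to detect whether the second entity recognition result includes or is equal to the first entity recognition result;
a first adding unit, configured to, if it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, obtain the entities that conform to a preset entity rule from the first entity recognition result and add them to a preset entity list template;
a second adding unit, configured to add each entity in the second entity recognition result to the entity list template;
a first entity output unit, configured to output each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
Further, the first adding unit includes:
an obtaining subunit, configured to obtain, from the first entity recognition result, the first entities contained in the part that does not intersect the second entity recognition result;
a detection subunit, configured to detect whether the first entity conforms to the preset entity rule;
an adding subunit, configured to add the first entity to the preset entity list template if the first entity conforms to the preset entity rule.
Further, the entity determination module 30 also includes:
a third adding unit, configured to add each entity in the second entity recognition result to the entity list template if it is detected that the second entity recognition result includes or is equal to the first entity recognition result;
a second entity output unit, configured to output each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
Further, the result obtaining module includes:
a first obtaining unit, configured to obtain the entity type to be recognized in the sentence to be recognized;
a second obtaining unit, configured to obtain each second entity that conforms to the entity type and is obtained by the entity recognition model recognizing the sentence to be recognized;
a first recognition result determining unit, configured to use the second entities as the first entity recognition result.
Further, the result obtaining module also includes:
a third obtaining unit, configured to obtain a pre-written entity library, where the entity library includes multiple entities;
a fourth obtaining unit, configured to obtain each third entity that the matching recognition model finds in the sentence to be recognized and that is the same as an entity in the entity library;
a second recognition result determining unit, configured to use the third entities as the second entity recognition result.
Further, the entity recognition device also includes:
a training data acquisition module, configured to obtain corpus data to be trained;
a model training module, configured to train a model to be trained according to the corpus data to be trained until the model to be trained converges, so as to generate the preset entity recognition model.
The embodiments of the entity recognition device are basically the same as the embodiments of the entity recognition method described above, and are not described in detail here.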
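In addition, this application also provides entity recognition equipment. As shown in FIG. 5, FIG. 5 is a schematic structural diagram of the hardware operating environment of the entity recognition equipment involved in the solutions of the embodiments of this application.
It should be noted that FIG. 5 may serve as the schematic structural diagram of the hardware operating environment of the entity recognition equipment. The entity recognition equipment in the embodiments of this application may be a terminal device such as a PC or a portable computer.
As shown in FIG. 5, the entity recognition equipment may include a processor 1001 (for example, a CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to realize the connection and communication between these components; the user interface 1003 may include a display and an input unit such as a keyboard; the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface); the memory 1005 may be a high-speed RAM or a non-volatile memory, such as a disk memory, and may optionally also be a storage device independent of the aforementioned processor 1001.
Optionally, the entity recognition equipment may also include a camera, an RF (radio frequency) circuit, a sensor, an audio circuit, a Wi-Fi module, and so on.
Those skilled in the art will understand that the hardware structure shown in FIG. 5 does not limit the entity recognition equipment, which may include more or fewer components than shown, combine certain components, or arrange the components differently.
Continuing to refer to FIG. 5, the memory 1005, as a computer-readable storage medium, may include an operating system, a network communication module, and computer-readable instructions.
In FIG. 5, the network communication module is mainly used to connect to a database and perform data communication with the database; the processor 1001 can call the computer-readable instructions stored in the memory 1005 and execute the following steps:
obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
determining the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
The specific implementation of the entity recognition equipment of this application is basically the same as the embodiments of the entity recognition method described above, and is not repeated here.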
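In addition, this application also provides a computer-readable storage medium. The computer-readable storage medium may be non-volatile or volatile and stores computer-readable instructions which, when executed by a processor, implement the following steps:
obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
determining the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
The specific implementation of the computer-readable storage medium of this application is basically the same as the embodiments of the entity recognition method described above, and is not repeated here.
It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or system that includes that element.
The serial numbers of the above embodiments of this application are for description only and do not represent the relative merits of the embodiments.
Through the description of the above implementations, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in essence or in the part that contributes to the existing technology, can be embodied in the form of a software product. The computer software product is stored in a storage medium as described above (such as a ROM/RAM, magnetic disk, or optical disk) and includes several instructions to make a terminal device (which can be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the method described in each embodiment of this application.
The above are only preferred embodiments of this application and do not thereby limit the patent scope of this application. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included in the scope of patent protection of this application.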

Claims (20)

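  1. An entity recognition method, comprising the following steps:
    obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
    obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
    determining a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
  2. The entity recognition method according to claim 1, wherein the step of determining the target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result comprises:
    detecting whether the second entity recognition result includes or is equal to the first entity recognition result;
    if it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, obtaining entities that conform to a preset entity rule from the first entity recognition result and adding them to a preset entity list template;
    adding each entity in the second entity recognition result to the entity list template;
    outputting each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  3. The entity recognition method according to claim 2, wherein the step of obtaining entities that conform to a preset entity rule from the first entity recognition result and adding them to a preset entity list template comprises:
    obtaining, from the first entity recognition result, a first entity contained in the part that does not intersect the second entity recognition result;
    detecting whether the first entity conforms to the preset entity rule;
    if the first entity conforms to the preset entity rule, adding the first entity to the preset entity list template.
  4. The entity recognition method according to claim 2, wherein after the step of detecting whether the second entity recognition result includes or is equal to the first entity recognition result, the method further comprises:
    if it is detected that the second entity recognition result includes or is equal to the first entity recognition result, adding each entity in the second entity recognition result to the entity list template;
    outputting each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  5. The entity recognition method according to claim 1, wherein the step of obtaining the first entity recognition result generated by the entity recognition model based on the sentence to be recognized comprises:
    obtaining the entity type to be recognized in the sentence to be recognized;
    obtaining each second entity that conforms to the entity type and is obtained by the entity recognition model recognizing the sentence to be recognized;
    using the second entity as the first entity recognition result.
  6. The entity recognition method according to claim 1, wherein the step of obtaining the second entity recognition result generated by the matching recognition model based on the sentence to be recognized comprises:
    obtaining a pre-written entity library, wherein the entity library includes multiple entities;
    obtaining each third entity that the matching recognition model finds in the sentence to be recognized and that is the same as an entity in the entity library;
    using the third entity as the second entity recognition result.
  7. The entity recognition method according to claim 1, wherein before the step of obtaining the first entity recognition result generated by the entity recognition model based on the sentence to be recognized, the method further comprises:
    obtaining corpus data to be trained;
    training a model to be trained according to the corpus data to be trained until the model to be trained converges, so as to generate the preset entity recognition model.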
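  8. An entity recognition device, comprising:
    a sentence acquisition module, configured to obtain a sentence to be recognized, and input the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
    a result obtaining module, configured to obtain a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
    an entity determination module, configured to determine a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
  9. Entity recognition equipment, comprising a processor, a memory, and computer-readable instructions stored on the memory and executable by the processor, wherein the computer-readable instructions, when executed by the processor, implement the following steps:
    obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
    obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
    determining a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.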
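  10. The entity recognition equipment according to claim 9, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    detecting whether the second entity recognition result includes or is equal to the first entity recognition result;
    if it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, obtaining entities that conform to a preset entity rule from the first entity recognition result and adding them to a preset entity list template;
    adding each entity in the second entity recognition result to the entity list template;
    outputting each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  11. The entity recognition equipment according to claim 10, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining, from the first entity recognition result, a first entity contained in the part that does not intersect the second entity recognition result;
    detecting whether the first entity conforms to the preset entity rule;
    if the first entity conforms to the preset entity rule, adding the first entity to the preset entity list template.
  12. The entity recognition equipment according to claim 10, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    if it is detected that the second entity recognition result includes or is equal to the first entity recognition result, adding each entity in the second entity recognition result to the entity list template;
    outputting each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  13. The entity recognition equipment according to claim 9, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining the entity type to be recognized in the sentence to be recognized;
    obtaining each second entity that conforms to the entity type and is obtained by the entity recognition model recognizing the sentence to be recognized;
    using the second entity as the first entity recognition result.
  14. The entity recognition equipment according to claim 9, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining a pre-written entity library, wherein the entity library includes multiple entities;
    obtaining each third entity that the matching recognition model finds in the sentence to be recognized and that is the same as an entity in the entity library;
    using the third entity as the second entity recognition result.
  15. The entity recognition equipment according to claim 9, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining corpus data to be trained;
    training a model to be trained according to the corpus data to be trained until the model to be trained converges, so as to generate the preset entity recognition model.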
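  16. A computer-readable storage medium storing computer-readable instructions, wherein the computer-readable instructions are executable by a processor to cause the processor to perform the following steps:
    obtaining a sentence to be recognized, and inputting the sentence to be recognized into a preset entity recognition model and a preset matching recognition model respectively;
    obtaining a first entity recognition result generated by the entity recognition model based on the sentence to be recognized, and a second entity recognition result generated by the matching recognition model based on the sentence to be recognized;
    determining a target entity recognition result of the sentence to be recognized according to the first entity recognition result and the second entity recognition result.
  17. The computer-readable storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    detecting whether the second entity recognition result includes or is equal to the first entity recognition result;
    if it is detected that the second entity recognition result does not include and is not equal to the first entity recognition result, obtaining entities that conform to a preset entity rule from the first entity recognition result and adding them to a preset entity list template;
    adding each entity in the second entity recognition result to the entity list template;
    outputting each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  18. The computer-readable storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining, from the first entity recognition result, a first entity contained in the part that does not intersect the second entity recognition result;
    detecting whether the first entity conforms to the preset entity rule;
    if the first entity conforms to the preset entity rule, adding the first entity to the preset entity list template.
  19. The computer-readable storage medium according to claim 17, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    if it is detected that the second entity recognition result includes or is equal to the first entity recognition result, adding each entity in the second entity recognition result to the entity list template;
    outputting each entity in the entity list template to obtain the target entity recognition result of the sentence to be recognized.
  20. The computer-readable storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, further implement the following steps:
    obtaining the entity type to be recognized in the sentence to be recognized;
    obtaining each second entity that conforms to the entity type and is obtained by the entity recognition model recognizing the sentence to be recognized;
    using the second entity as the first entity recognition result.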
PCT/CN2020/093481 2019-09-18 2020-05-29 实体识别方法、装置、设备及计算机可读存储介质 WO2021051872A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910880672.9 2019-09-18
CN201910880672.9A CN110750991B (zh) 2019-09-18 2019-09-18 实体识别方法、装置、设备及计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2021051872A1 true WO2021051872A1 (zh) 2021-03-25

Family

ID=69276621

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093481 WO2021051872A1 (zh) 2019-09-18 2020-05-29 实体识别方法、装置、设备及计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN110750991B (zh)
WO (1) WO2021051872A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110750991B (zh) * 2019-09-18 2022-04-15 平安科技(深圳)有限公司 实体识别方法、装置、设备及计算机可读存储介质
CN111651990B (zh) * 2020-04-14 2024-03-15 车智互联(北京)科技有限公司 一种实体识别方法、计算设备及可读存储介质
CN112307766A (zh) * 2020-09-22 2021-02-02 北京京东世纪贸易有限公司 用于识别预设类别实体的方法、装置、电子设备和介质
CN112818675A (zh) * 2021-02-01 2021-05-18 北京金山数字娱乐科技有限公司 一种基于知识库问答的实体抽取方法及装置
CN114997171A (zh) * 2022-06-17 2022-09-02 平安科技(深圳)有限公司 实体识别方法、装置、设备及存储介质

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120303557A1 (en) * 2011-05-28 2012-11-29 Microsoft Corporation Interactive framework for name disambiguation
CN107943786A (zh) * 2017-11-16 2018-04-20 广州市万隆证券咨询顾问有限公司 一种中文命名实体识别方法及系统
CN108304375A (zh) * 2017-11-13 2018-07-20 广州腾讯科技有限公司 一种信息识别方法及其设备、存储介质、终端
CN109684631A (zh) * 2018-12-12 2019-04-26 北京神州泰岳软件股份有限公司 命名实体抽取方法、装置及介质
CN109918680A (zh) * 2019-03-28 2019-06-21 腾讯科技(上海)有限公司 实体识别方法、装置及计算机设备
CN110750991A (zh) * 2019-09-18 2020-02-04 平安科技(深圳)有限公司 实体识别方法、装置、设备及计算机可读存储介质

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140163951A1 (en) * 2012-12-07 2014-06-12 Xerox Corporation Hybrid adaptation of named entity recognition
CN103995885B (zh) * 2014-05-29 2017-11-17 百度在线网络技术(北京)有限公司 实体名的识别方法和装置
CN107330011B (zh) * 2017-06-14 2019-03-26 北京神州泰岳软件股份有限公司 多策略融合的命名实体的识别方法及装置
CN108491373B (zh) * 2018-02-01 2022-05-27 北京百度网讯科技有限公司 一种实体识别方法及系统
CN110147551B (zh) * 2019-05-14 2023-07-11 腾讯科技(深圳)有限公司 多类别实体识别模型训练、实体识别方法、服务器及终端


Also Published As

Publication number Publication date
CN110750991A (zh) 2020-02-04
CN110750991B (zh) 2022-04-15


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20866662

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20866662

Country of ref document: EP

Kind code of ref document: A1