WO2023213166A1 - Text processing method, device, and computer-readable storage medium - Google Patents

Text processing method, device, and computer-readable storage medium

Info

Publication number
WO2023213166A1
Authority
WO
WIPO (PCT)
Prior art keywords
entity
node
current
type
text
Prior art date
Application number
PCT/CN2023/086629
Other languages
English (en)
French (fr)
Inventor
杨帅
张亚
吴元清
周谦
Original Assignee
北京京东拓先科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京京东拓先科技有限公司
Publication of WO2023213166A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Definitions

  • the present disclosure relates to the technical fields of medicine and natural language processing, and in particular to a text processing method, a device, and a computer-readable storage medium.
  • methods for extracting entity relationships include: dependency syntax analysis.
  • Dependency syntax analysis makes use of grammatical relationships, usually starting from verbs to construct rules, and limiting parts of speech and dependency relationships.
  • Text in the pharmaceutical field is a special kind of text that people use in their daily lives, such as drug instructions, clinical guidelines and other drug usage guidance texts. Texts in the field of pharmacy have a unique way of expressing themselves.
  • a text processing method including: identifying multiple entities in the text to be processed and the type of each entity, wherein the text to be processed includes a drug usage guidance text; determining the grouping of each entity according to its type, wherein the groupings include a condition entity grouping and a result entity grouping, the entities in the condition entity grouping serve as condition entities related to the usage conditions of the drug, and the entities in the result entity grouping serve as result entities related to the usage mode or usage results of the drug; and extracting entity relationships based on the order of the entities in the text to be processed and the type and grouping of each entity.
  • extracting entity relationships according to the order of the entities in the text to be processed and the type and grouping of each entity includes: generating a tree structure with each entity as a node, according to that order, type, and grouping, to obtain an entity tree; and extracting entity relationships based on the entity tree.
  • generating the tree structure with each entity as a node includes: taking each entity in turn as the current entity, in the order in which the entities appear in the text to be processed; for each current entity, when the current entity is a condition entity, determining the relationship between the current entity and the current node according to the type of the current entity and the type of the current node; and adding the current entity to the tree structure according to that relationship, then updating the current node to be the node of the current entity.
  • generating the tree structure with each entity as a node further includes: when the current entity is a result entity, adding the current entity to the tree structure as a leaf node of the current node.
  • determining the relationship between the current entity and the current node according to their types includes: when the types of the current entity and the current node are different, determining whether a parent node of the same type as the current entity exists among all parent nodes of the current node; if such a parent node exists, taking the current entity as a sibling node of that parent node; if no such parent node exists, taking the current entity as a child node of the current node.
  • determining the relationship between the current entity and the current node according to their types includes: when the types of the current entity and the current node are the same, determining whether the current entity and the current node are in an inclusion relationship; if they are, taking the current entity as a child node of the current node; if they are not, taking the current entity as a sibling node of the current node.
  • extracting entity relationships based on the entity tree includes: performing a depth-first search from the root node of the entity tree to each leaf node; taking each node in turn as the current search node, in order from the leaf nodes to the root node; and, for each current search node, extracting the entity relationship based on the node type of the current search node, the entity types of its child nodes, and the entity types of the leaf nodes among its sibling nodes.
  • extracting the entity relationship for each current search node includes: when the current search node is a leaf node, extracting the entity of the current search node and returning it to the parent node of the current search node as the extraction result; when the current search node is a non-leaf node, extracting the entity relationship based on the entity types of the child nodes of the current search node and the entity types of the leaf nodes among its sibling nodes.
  • extracting the entity relationship for a non-leaf current search node includes: when there is no leaf node among the sibling nodes of the current search node, taking the child nodes whose entity type differs from that of the current search node as first child nodes, combining the extraction result returned by each first child node with the entity of the current search node to form an extraction result, and returning it to the parent node of the current search node; and taking the child nodes whose entity type is the same as that of the current search node as second child nodes, and returning the extraction results returned by each second child node directly to the parent node of the current search node.
  • extracting the entity relationship according to the entity types of the child nodes of the current search node and the entity types of the leaf nodes among its sibling nodes includes: when there is a leaf node among the sibling nodes of the current search node, taking the leaf nodes among the sibling nodes as candidate nodes; selecting, from the candidate nodes, those whose entity type differs from the entity types of the child nodes of the current search node; and extracting the entities of the selected candidate nodes, adding them to the extraction results corresponding to the child nodes of the current search node, combining these with the entity of the current search node to form an extraction result, and returning it to the parent node of the current search node.
  • determining the grouping of each entity according to the type of each entity includes: performing keyword recognition on the text to be processed to determine its text type; looking up the entity grouping table corresponding to the text type, where the entity grouping table contains the correspondence between each entity type and its grouping; and determining the grouping of each entity according to the entity grouping table.
  • the method further includes: constructing a knowledge graph based on the extracted entity relationships; and generating answers to drug usage questions based on the knowledge graph.
  • the method further includes: constructing a knowledge graph based on the extracted entity relationships; reviewing the drug prescription according to the knowledge graph to determine whether the drug prescription is correct.
  • a text processing device including: an identification module for identifying multiple entities in the text to be processed and the type of each entity, wherein the text to be processed includes a drug usage guidance text; a grouping module for determining the grouping of each entity according to its type, wherein the groupings include a condition entity grouping and a result entity grouping, the entities in the condition entity grouping serve as condition entities related to the usage conditions of the drug, and the entities in the result entity grouping serve as result entities related to the usage mode or usage results of the drug; and an extraction module for extracting entity relationships based on the order of the entities in the text to be processed and the type and grouping of each entity.
  • a text processing device including: a processor; and a memory coupled to the processor and used to store instructions, wherein when the instructions are executed by the processor, the processor performs the text processing method of any of the foregoing embodiments.
  • a non-transitory computer-readable storage medium on which a computer program is stored, wherein when the program is executed by a processor, the text processing method of any of the foregoing embodiments is implemented.
  • a computer program including: instructions, which when executed by the processor, cause the processor to perform the text processing method as in any of the foregoing embodiments.
  • Figure 1 shows a schematic flowchart of a text processing method according to some embodiments of the present disclosure.
  • FIG. 2 shows a schematic flowchart of a text processing method according to other embodiments of the present disclosure.
  • Figure 3 shows a schematic diagram of an entity tree according to some embodiments of the present disclosure.
  • FIG. 4 shows a schematic flowchart of a text processing method according to further embodiments of the present disclosure.
  • Figure 5 shows a schematic structural diagram of a text processing device according to some embodiments of the present disclosure.
  • FIG. 6 shows a schematic structural diagram of a text processing device according to other embodiments of the present disclosure.
  • FIG. 7 shows a schematic structural diagram of a text processing device according to further embodiments of the present disclosure.
  • the dependency syntax analysis method in the related art relies on common grammatical structures. For example, in the sentence "The frequency of administration for pediatric patients is three times a day", the keywords it focuses on are "is" and "of". Texts in the pharmaceutical field have their own special grammatical structures. Analyzing them with common grammar not only increases processing complexity but also reduces accuracy and recall. For example, if the above text is changed to "Pediatric patients three times a day", removing the keywords "is", "frequency of administration", and "of", the meaning of the sentence is unaffected, but the analysis results are seriously affected.
  • a technical problem to be solved by this disclosure is: how to improve the accuracy of entity relationship extraction for texts in the pharmaceutical field.
  • the present disclosure proposes a text processing method, which will be described below with reference to Figures 1 to 4 .
  • Figure 1 is a flowchart of some embodiments of the processing method of the present disclosure. As shown in Figure 1, the method in this embodiment includes: steps S102 to S106.
  • step S102 multiple entities in the text to be processed and the types of each entity are identified.
  • the text to be processed can be a text in the pharmaceutical field that includes a drug usage guidance text. It can be a complete text, such as drug instructions or clinical guidelines, or a text describing the issues to be aware of when the drug is used to treat diseases, such as a text describing the drug's indications, usage and dosage, contraindications, precautions, or medication for special groups.
  • Existing technologies can be used for entity recognition, for example, deep learning and other methods can be used for entity recognition, which will not be described again here.
  • the types of entities include one or more of: frequency entity (for example, 1 to 2 times a day), dosage entity (for example, 20 to 60 mg at a time), treatment course entity (for example, three days), administration route entity, timing of administration entity (for example, postprandial), population entity (for example, children), disease entity (for example, reflux esophagitis), coadministration entity (for example, omeprazole combined with clarithromycin), and biochemical indicator entity (for example, glomerular filtration rate).
  • step S104 the grouping of each entity is determined according to the type of each entity.
  • Text 1: "Reflux esophagitis: 20 to 60 mg at a time (1 to 3 pills at a time), 1 to 2 times a day."
  • Text 2: "When omeprazole is combined with clarithromycin or erythromycin, their blood concentrations will increase."
  • In Text 1, under the precondition of reflux esophagitis, the medication operation that should be performed is 20 to 60 mg at a time (dose operation), 1 to 2 times a day (frequency operation).
  • In Text 2, under the precondition of combined medication, the effect of the drug is an increased blood concentration. Therefore, two groups are constructed: the condition entity grouping and the result entity grouping.
  • the entities in the condition entity group serve as condition entities related to the use conditions of the drug
  • the entities in the result entity group serve as result entities related to the use mode or results of the drug.
  • For example, for usage and dosage text, population entities, disease entities, combined medication entities, administration route entities, and biochemical indicator entities are assigned to the condition entity grouping, and the remaining entities are assigned to the result entity grouping. For contraindication text, population entities, disease entities, past medical history entities, etiology entities, combined medication entities, and the like are assigned to the condition entity grouping, and usage level entities (applicable, prohibited, use with caution, as directed by a physician, etc.) are assigned to the result entity grouping. For example, in "Metformin is contraindicated in pregnant and lactating women", "pregnant and lactating women" is the condition entity and "contraindicated" is the result entity.
  • keyword recognition is performed on the text to be processed to determine its text type; the corresponding entity grouping table is looked up according to the text type, where the entity grouping table contains the correspondence between each entity type and its grouping; and the grouping of each entity is determined according to the entity grouping table.
  • Text types include, for example, one or more of indications, usage and dosage, contraindications, precautions, and medication for special groups, and are not limited to the examples given.
  • the title of a text in the field of pharmacy contains keywords corresponding to the text type, and keyword identification can be performed to determine the text type of the text to be processed.
  • Other existing text classification methods can also be used to classify the text to be processed to determine the text type of the text to be processed.
  • a classification model is used to classify the text to be processed, which is not limited to the examples given.
  • the correspondence between each entity type and its grouping can be configured in advance to form an entity grouping table, with one entity grouping table per text type. Once the text type of the text to be processed is determined, the corresponding entity grouping table is looked up to determine the grouping of each entity.
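  • The table lookup described above might be sketched as follows. This is a minimal illustration, not the disclosed implementation: the keyword map, the type labels such as `"dose"`, the names `TITLE_KEYWORDS`, `CONDITION_TYPES`, `detect_text_type`, and `group_entities`, and the rule that any unlisted type counts as a result entity are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical title keywords used to recognise the text type.
TITLE_KEYWORDS: Dict[str, str] = {
    "usage and dosage": "usage_and_dosage",
    "contraindications": "contraindications",
}

# One grouping table per text type. Only condition types are listed;
# treating every unlisted type as a result entity is an assumption.
CONDITION_TYPES: Dict[str, Set[str]] = {
    "usage_and_dosage": {"population", "disease", "coadministration",
                         "route", "biochemical_indicator"},
    "contraindications": {"population", "disease", "medical_history",
                          "etiology", "coadministration"},
}


def detect_text_type(title: str) -> str:
    """Return the text type whose keyword appears in the section title."""
    lowered = title.lower()
    for keyword, text_type in TITLE_KEYWORDS.items():
        if keyword in lowered:
            return text_type
    raise ValueError(f"no known text-type keyword in title: {title!r}")


def group_entities(title: str,
                   entities: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Attach 'condition' or 'result' to each (text, entity_type) pair."""
    condition_types = CONDITION_TYPES[detect_text_type(title)]
    return [(text, etype, "condition" if etype in condition_types else "result")
            for text, etype in entities]


grouped = group_entities("Usage and dosage", [
    ("reflux esophagitis", "disease"),
    ("20 to 60 mg at a time", "dose"),
    ("1 to 2 times a day", "frequency"),
])
```

  • Keeping one table per text type means the same entity type can be grouped differently in different sections without changing the lookup code.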
  • step S106 entity relationships are extracted based on the order of each entity in the text to be processed, the type and grouping of each entity.
  • Entities can be stored as a linear list in memory or in the database in the order they appear in the text to be processed.
  • each entity is used as a node to generate a tree structure to obtain an entity tree; the entity relationship is extracted according to the entity tree.
  • each condition entity in the condition entity grouping serves as a non-leaf node, and each result entity in the result entity grouping serves as a leaf node. In this way, a depth-first search from the root node to the leaf nodes of the entity tree yields entity relationships of the form "under which conditions, which results are obtained", which matches the characteristics of pharmaceutical-field text.
  • each entity is taken in turn as the current entity, in the order in which the entities appear in the text to be processed; for each current entity, when the current entity is a condition entity, the relationship between the current entity and the current node is determined according to the type of the current entity and the type of the current node; the current entity is added to the tree structure according to that relationship, and the current node is updated to be the node of the current entity. When the current entity is a result entity, the current entity is added to the tree structure as a leaf node of the current node.
  • when the type of the current entity is different from that of the current node, it is determined whether a parent node of the same type as the current entity exists among all parent nodes of the current node; if such a parent node exists, the current entity is taken as a sibling node of that parent node; if no such parent node exists, the current entity is taken as a child node of the current node.
  • when the current entity and the current node are of the same type, it is determined whether they are in an inclusion relationship; if they are, the current entity is taken as a child node of the current node; if they are not, the current entity is taken as a sibling node of the current node.
  • In step S202, a root node is established and used as the current node, where the first entity, in the order of the entities in the text to be processed, is used as the root node. A pointer can be set to maintain (point to) the current node.
  • In step S204, the next entity, in the order of the entities in the text to be processed, is obtained as the current entity.
  • In step S206, the grouping of the current entity is determined. If the current entity is a result entity, step S207 is executed; otherwise, step S208 is executed.
  • In step S207, the current entity is added to the tree structure as a leaf node of the current node, and the process returns to step S204. The current node is not updated at this point, that is, the pointer is not changed.
  • In step S208, it is determined whether the types of the current entity and the current node are the same. If they are the same, step S210 is executed; otherwise, step S214 is executed.
  • In step S210, it is determined whether the current entity and the current node are in an inclusion relationship. If so, step S211 is executed; otherwise, step S212 is executed.
  • In step S211, the current entity is added to the tree structure as a child node of the current node.
  • In step S212, the current entity is added to the tree structure as a sibling node of the current node.
  • In step S214, it is determined whether a parent node of the same type as the current entity exists among all parent nodes of the current node. If so, step S215 is executed; otherwise, step S216 is executed. All parent nodes of the current node include the parent node of the current node, the parent node of that parent node, and so on up to the root node.
  • In step S215, the current entity is added to the tree structure as a sibling node of the parent node that has the same type as the current entity.
  • In step S216, the current entity is added to the tree structure as a child node of the current node.
  • In step S218, the current node is updated to be the node of the current entity, that is, the pointer now points to the current entity, and the process returns to step S204.
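  • Steps S202 to S218 might be sketched as follows. This is an illustrative reading rather than the disclosed implementation: the `Node` class, `build_entity_tree`, the `includes` predicate (standing in for the inclusion-relationship check, whose realisation the text does not specify), and the sample entity list are all hypothetical. Two further assumptions are that the nearest matching parent is used in step S214, and that adding a sibling to the root, which the text does not cover, is rejected.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass(eq=False)
class Node:
    text: str
    etype: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add_child(self, child: "Node") -> None:
        child.parent = self
        self.children.append(child)


def add_sibling(node: Node, new: Node) -> None:
    # The source does not say what a sibling of the root would be (assumption: reject).
    if node.parent is None:
        raise ValueError("cannot add a sibling to the root node")
    node.parent.add_child(new)


def build_entity_tree(entities: List[Tuple[str, str]],
                      condition_types: Set[str],
                      includes: Callable[[str, str], bool]) -> Node:
    """entities: (text, entity_type) pairs in the order they appear in the text."""
    root = Node(*entities[0])                    # S202: first entity is the root
    current = root
    for text, etype in entities[1:]:             # S204: next entity
        new = Node(text, etype)
        if etype not in condition_types:         # S206/S207: result entity -> leaf
            current.add_child(new)
            continue                             # current node is not updated
        if etype == current.etype:               # S208: same type
            if includes(current.text, text):     # S210: inclusion relationship?
                current.add_child(new)           # S211
            else:
                add_sibling(current, new)        # S212
        else:
            ancestor = current.parent            # S214: search all parent nodes
            while ancestor is not None and ancestor.etype != etype:
                ancestor = ancestor.parent
            if ancestor is not None:
                add_sibling(ancestor, new)       # S215
            else:
                current.add_child(new)           # S216
        current = new                            # S218: pointer moves to new node
    return root


# Hypothetical inclusion lookup and entity list (not the text behind Figure 3).
INCLUDED = {("peptic ulcer", "gastric ulcer"), ("peptic ulcer", "duodenal ulcer")}
entities = [
    ("oral", "route"), ("children", "population"),
    ("peptic ulcer", "disease"), ("one tablet at a time", "dose"), ("three times a day", "frequency"),
    ("reflux esophagitis", "disease"), ("one tablet at a time", "dose"), ("twice a day", "frequency"),
    ("adults", "population"),
    ("peptic ulcer", "disease"), ("two tablets at a time", "dose"), ("three times a day", "frequency"),
    ("gastric ulcer", "disease"), ("4 to 8 weeks", "course"),
]
root = build_entity_tree(entities, {"route", "population", "disease"},
                         lambda outer, inner: (outer, inner) in INCLUDED)
```

  • In this example "adults" becomes a sibling of "children" because a population node already exists among the parent nodes of the current node, and "gastric ulcer" becomes a child of the adult "peptic ulcer" node because of the assumed inclusion relationship.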
  • the original text for the usage and dosage text is: Oral.
  • the entity tree generated according to the method of the above embodiment is shown in Figure 3.
  • a depth-first search is performed from the root node of the entity tree to each leaf node; each node is then taken as the current search node, in order from the leaf nodes to the root node; for each current search node, the entity relationship is extracted according to the node type of the current search node, the entity types of its child nodes, and the entity types of the leaf nodes among its sibling nodes.
  • when the current search node is a leaf node, its entity is extracted and returned to its parent node as the extraction result; when the current search node is a non-leaf node, the entity relationship is extracted based on the entity types of its child nodes and the entity types of the leaf nodes among its sibling nodes.
  • when there is no leaf node among the sibling nodes of the current search node, the child nodes whose entity type differs from that of the current search node are taken as first child nodes, and the extraction result returned by each first child node is combined with the entity of the current search node to form an extraction result, which is returned to the parent node of the current search node; the child nodes whose entity type is the same as that of the current search node are taken as second child nodes, and the extraction results returned by each second child node are returned directly to the parent node of the current search node.
  • when there is a leaf node among the sibling nodes of the current search node, the leaf nodes among the sibling nodes are taken as candidate nodes; the candidate nodes whose entity type differs from the entity types of the child nodes of the current search node are selected; the entities of the selected candidate nodes are extracted, added to the extraction results corresponding to the child nodes of the current search node, combined with the entity of the current search node to form an extraction result, and returned to the parent node of the current search node.
  • step S402 a depth-first search is performed starting from the root node of the entity tree and reaching each leaf node.
  • step S404 the current search node is obtained in order from leaf nodes to root nodes.
  • step S406 it is determined whether the current search node is a leaf node. If so, step S407 is executed. Otherwise, step S408 is executed.
  • step S407 the entity of the current search node is extracted, and the extraction result is returned to the parent node of the current search node, and the process returns to step S404 to start again.
  • step S408 it is determined whether there is a leaf node among the sibling nodes of the current search node. If so, step S410 is executed; otherwise, step S414 is executed.
  • In step S410, the leaf nodes among the sibling nodes of the current search node are taken as candidate nodes, and the candidate nodes whose entity type differs from the entity types of the child nodes of the current search node are selected.
  • In step S412, the entities of the selected candidate nodes are extracted, added to the extraction results corresponding to the child nodes of the current search node, combined with the entity of the current search node to form an extraction result, and returned to the parent node of the current search node; the process then returns to step S404.
  • In step S414, among the child nodes of the current search node, those whose entity type differs from that of the current search node are taken as first child nodes, and the extraction result returned by each first child node is combined with the entity of the current search node to form an extraction result, which is returned to the parent node of the current search node; those whose entity type is the same as that of the current search node are taken as second child nodes, and the extraction results returned by each second child node are returned directly to the parent node of the current search node; the process then returns to step S404.
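  • One possible reading of steps S402 to S414 is sketched below as a post-order recursion, which visits nodes from the leaves toward the root. It is a simplification under stated assumptions, not the disclosed implementation: `Node`, `extract`, and the hand-built tree are hypothetical, borrowed sibling entities are placed before a node's own leaf entities, and borrowed entities are attached only to a node's own leaf record, so a node that has leaf siblings but no leaf children of its own is not handled.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Node:
    text: str
    etype: str
    parent: Optional["Node"] = field(default=None, repr=False)
    children: List["Node"] = field(default_factory=list)

    def add(self, text: str, etype: str) -> "Node":
        child = Node(text, etype, parent=self)
        self.children.append(child)
        return child


def extract(node: Node) -> List[List[str]]:
    """Return the extraction results (entity lists) for the subtree at node."""
    if not node.children:                                  # S406/S407: leaf node
        return [[node.text]]
    siblings = node.parent.children if node.parent else []
    child_types = {c.etype for c in node.children}
    # S410/S412: candidate leaf siblings whose type none of this node's children has.
    borrowed = [s.text for s in siblings
                if s is not node and not s.children and s.etype not in child_types]
    own_leaves = [c.text for c in node.children if not c.children]
    records: List[List[str]] = []
    if own_leaves:
        # Assumption: borrowed entities are only attached to this leaf record.
        records.append([node.text] + borrowed + own_leaves)
    for child in node.children:
        if not child.children:
            continue
        sub = extract(child)
        if child.etype == node.etype:                      # second child node: pass through
            records.extend(sub)
        else:                                              # first child node: prefix own entity
            records.extend([node.text] + r for r in sub)
    return records


# Hypothetical tree, built by hand for the example.
root = Node("oral", "route")
kids = root.add("children", "population")
pu = kids.add("peptic ulcer", "disease")
pu.add("one tablet at a time", "dose")
pu.add("three times a day", "frequency")
re_ = kids.add("reflux esophagitis", "disease")
re_.add("one tablet at a time", "dose")
re_.add("twice a day", "frequency")
adults = root.add("adults", "population")
apu = adults.add("peptic ulcer", "disease")
apu.add("two tablets at a time", "dose")
apu.add("three times a day", "frequency")
gu = apu.add("gastric ulcer", "disease")
gu.add("4 to 8 weeks", "course")

records = extract(root)
```

  • Here the "gastric ulcer" node has only a treatment-course leaf of its own, so it borrows the dose and frequency leaves from its siblings, and its result is passed through the same-type "peptic ulcer" node unchanged before "adults" and "oral" are prefixed.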
  • For example, the current search node is updated to peptic ulcer.
  • "Two tablets at a time", "three times a day", and "1 to 2 weeks" are first child nodes; gastric ulcer and duodenal ulcer are second child nodes.
  • "Two tablets at a time", "three times a day", and "1 to 2 weeks" are combined with peptic ulcer to form an extraction result.
  • The results returned by the second child nodes, ((gastric ulcer, two tablets at a time, three times a day, 4 to 8 weeks), (duodenal ulcer, two tablets at a time, three times a day, 2 to 4 weeks)), are returned directly to the adults node.
  • the entity relationship extraction results for the entity tree in Figure 3 are: [oral, children, peptic ulcer, one tablet at a time, three times a day], [oral, children, reflux esophagitis, one tablet at a time, twice a day], [oral, adults, peptic ulcer, two tablets at a time, three times a day, 1 to 2 weeks], [oral, adults, gastric ulcer, two tablets at a time, three times a day, 4 to 8 weeks], [oral, adults, duodenal ulcer, two tablets at a time, three times a day, 2 to 4 weeks], [oral, adults, reflux esophagitis, two tablets at a time, twice a day].
  • the above entity relationship can be converted into the form of a triplet.
  • For example, [oral, children, peptic ulcer, one tablet at a time, three times a day] is used as a node, and [the route of administration of this node is oral] is used as a triplet; in the corresponding knowledge graph, the edge between this node and the oral node is "route of administration".
  • Other methods can also be used to convert the above entity relationship into the form of triples, for example, [children's dosage is one tablet at a time], which is not limited to the examples given.
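  • The first conversion described above, where the whole extraction result becomes a node and each of its entities is attached by an edge named after the entity type, might be sketched as follows. The function name `record_to_triples`, the string used as the record node's identifier, and the edge labels other than "route of administration" are assumptions made for this illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def record_to_triples(record: List[Tuple[str, str]]) -> List[Triple]:
    """record: (entity_type, entity_text) pairs of one extraction result.

    Returns (record_node, relation, entity) triples, where the relation
    is the entity type and the record node is identified by its full text.
    """
    record_node = "[" + ", ".join(text for _, text in record) + "]"
    return [(record_node, etype, text) for etype, text in record]


record = [
    ("route of administration", "oral"),
    ("population", "children"),
    ("disease", "peptic ulcer"),
    ("dose", "one tablet at a time"),
    ("frequency", "three times a day"),
]
triples = record_to_triples(record)
```

  • Each triple can then be loaded into a graph store as an edge from the record node to the entity node.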
  • a knowledge graph is constructed based on the extracted entity relationships, for example using the method in the above embodiments, but not limited to the examples given. Further, answers to drug usage questions are generated based on the knowledge graph, or a drug prescription is reviewed against the knowledge graph to determine whether it is correct. For example, if a user asks what the dosage for children is for the medicine in the above application example, the constructed knowledge graph gives one tablet at a time. As another example, a prescription states that the patient is 30 years old, the disease is reflux esophagitis, and the medication frequency is three times a day. Reviewing the prescription against the constructed knowledge graph shows that the medication frequency should be twice a day, so the prescription is wrong.
  • the method of the above embodiment first identifies each entity in the text to be processed and the type of each entity, and then divides each entity into a condition entity related to the usage conditions of the drug and a result entity related to the usage mode or result of the drug according to the type of the entity. , and then extract entity relationships based on the order of each entity in the text to be processed, the type and grouping of each entity.
  • the method of the above embodiment is designed around the way text in the pharmaceutical field is expressed. The present disclosure proposes an entity division method and an entity relationship extraction method for texts in the pharmaceutical field, which can reduce the complexity of entity relationship extraction and improve accuracy and recall.
  • the method in the above embodiment can be applied to the scenario of constructing a knowledge graph based on texts in the pharmaceutical field. Since the entity relationship extraction method can ensure interpretability and improve accuracy and recall, it can reduce the cost of manual intervention to construct knowledge graphs, improve the efficiency and accuracy of knowledge graph construction, and facilitate the expansion of the scale of pharmaceutical knowledge graphs. Furthermore, the knowledge graph can be used in fields such as drug question and answer, prescription review, etc. to realize automatic online prescription of drugs and ensure accuracy.
  • The present disclosure also provides a text processing device, which is described below with reference to FIG. 5.
  • Figure 5 is a structural diagram of some embodiments of a processing device of the present disclosure. As shown in FIG. 5, the device 50 of this embodiment includes: an identification module 510, a grouping module 520, and an extraction module 530.
  • The identification module 510 is used to identify multiple entities in the text to be processed and the type of each entity, where the text to be processed includes usage instruction text for a medicine.
  • The grouping module 520 is used to determine the group of each entity according to its type, where the groups include a condition entity group and a result entity group.
  • The entities in the condition entity group are condition entities related to the conditions of use of the medicine, and the entities in the result entity group are result entities related to the manner or result of use of the medicine.
  • In some embodiments, the grouping module 520 is used to perform keyword identification on the text to be processed to determine its text type; to look up the corresponding entity grouping table according to the text type, where the entity grouping table contains the correspondence between each entity type and its group; and to determine the group of each entity according to the entity grouping table.
  • The extraction module 530 is used to extract entity relationships based on the order of the entities in the text to be processed and the type and group of each entity.
  • In some embodiments, the extraction module 530 is used to generate a tree structure with each entity as a node, according to the order of the entities in the text to be processed and the type and group of each entity, to obtain an entity tree, and to extract entity relationships according to the entity tree.
  • In some embodiments, the extraction module 530 is used to take each entity in turn as the current entity, following the order of the entities in the text to be processed; for each current entity, if the current entity is a condition entity, to determine the relationship between the current entity and the current node according to the type of the current entity and the type of the current node; and to add the current entity to the tree structure according to that relationship and update the current node to the node of the current entity.
  • In some embodiments, the extraction module 530 is used to determine, when the current entity is a result entity, that the current entity is a leaf node of the current node, and to add it to the tree structure.
  • In some embodiments, the extraction module 530 is used to determine, when the type of the current entity differs from that of the current node, whether any ancestor node of the current node has the same type as the current entity; if such an ancestor node exists, to make the current entity a sibling node of that ancestor node; and if no such ancestor node exists, to make the current entity a child node of the current node.
  • In some embodiments, the extraction module 530 is used to determine, when the current entity and the current node are of the same type, whether the current entity is contained in the current node (a containment relationship); if it is, to make the current entity a child node of the current node; and if it is not, to make the current entity a sibling node of the current node.
  • In some embodiments, a depth-first search is performed starting from the root node of the entity tree until each leaf node is reached; each node is then taken as the current search node in order from the leaf nodes to the root node; and for each current search node, entity relationships are extracted according to the node type of the current search node, the entity types of its child nodes, and the entity types of the leaf nodes among its sibling nodes.
  • In some embodiments, the extraction module 530 is used, for each current search node, to extract the entity of the current search node and return it to the parent node as the extraction result when the current search node is a leaf node; and, when the current search node is a non-leaf node, to extract entity relationships according to the entity types of its child nodes and the entity types of the leaf nodes among its sibling nodes.
  • In some embodiments, the extraction module 530 is used, when there is no leaf node among the sibling nodes of the current search node, to take the child nodes of the current search node whose entity type differs from that of the current search node as first child nodes, combine the extraction results returned by each first child node with the entity of the current search node to form an extraction result, and return it to the parent node of the current search node; and to take the child nodes whose entity type is the same as that of the current search node as second child nodes and return the extraction results returned by each second child node directly to the parent node of the current search node.
  • In some embodiments, the extraction module 530 is used, when there are leaf nodes among the sibling nodes of the current search node, to take those leaf nodes as candidate nodes; to select the candidate nodes whose entity type differs from the entity types of the child nodes of the current search node; and to extract the entities of the selected candidate nodes, assign them to the extraction results corresponding to the child nodes of the current search node, combine them with the entity of the current search node to form an extraction result, and return it to the parent node of the current search node.
  • In some embodiments, the device 50 also includes: a building module 540, used to build a knowledge graph from the extracted entity relationships; a question and answer module 550, used to generate answers to drug usage questions based on the knowledge graph; and an audit module 560, used to review drug prescriptions against the knowledge graph to determine whether a prescription is correct.
  • The text processing devices in the embodiments of the present disclosure can each be implemented by various computing devices or computer systems, as described below with reference to FIG. 6 and FIG. 7.
  • Figure 6 is a structural diagram of some embodiments of a processing device of the present disclosure. As shown in FIG. 6, the device 60 of this embodiment includes: a memory 610 and a processor 620 coupled to the memory 610. The processor 620 is configured to execute the text processing method of any embodiment of the present disclosure based on instructions stored in the memory 610.
  • the memory 610 may include, for example, system memory, fixed non-volatile storage media, etc.
  • System memory stores, for example, operating systems, applications, boot loaders, databases, and other programs.
  • Figure 7 is a structural diagram of other embodiments of the processing device of the present disclosure.
  • the device 70 of this embodiment includes: a memory 710 and a processor 720, which are similar to the memory 610 and the processor 620 respectively. It may also include an input/output interface 730, a network interface 740, a storage interface 750, etc. These interfaces 730, 740, 750, the memory 710 and the processor 720 may be connected through a bus 760, for example.
  • the input and output interface 730 provides a connection interface for input and output devices such as a monitor, mouse, keyboard, and touch screen.
  • the network interface 740 provides a connection interface for various networked devices, such as a database server or a cloud storage server.
  • the storage interface 750 provides a connection interface for external storage devices such as SD cards and USB disks.
  • The present disclosure also provides a computer program, including instructions which, when executed by a processor, cause the processor to execute the text processing method of any of the foregoing embodiments.
  • Those skilled in the art will appreciate that embodiments of the present disclosure may be provided as methods, systems, or computer program products. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable non-transitory storage media (including, but not limited to, disk memory, CD-ROM, and optical storage) having computer-usable program code embodied therein.
  • The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present disclosure. Each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor produce means for implementing the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
  • These computer program instructions may also be loaded onto a computer or other programmable data processing device, causing a series of operating steps to be performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

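The present disclosure relates to a text processing method, apparatus, and computer-readable storage medium, and to the field of computer technology. The method of the present disclosure includes: identifying multiple entities in a text to be processed and the type of each entity, where the text to be processed includes usage instruction text for a medicine; determining the group of each entity according to its type, where the groups include a condition entity group and a result entity group, the entities in the condition entity group being condition entities related to the conditions of use of the medicine and the entities in the result entity group being result entities related to the manner or result of use of the medicine; and extracting entity relationships according to the order of the entities in the text to be processed and the type and group of each entity.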

Description

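Text processing method, apparatus, and computer-readable storage medium
Cross-Reference to Related Applications
This application is based on and claims priority to CN application No. 202210479767.1, filed on May 5, 2022, the disclosure of which is incorporated herein in its entirety.
Technical Field
The present disclosure relates to the fields of medical and pharmaceutical technology and natural language processing, and in particular to a text processing method, apparatus, and computer-readable storage medium.
Background
In natural language processing, a text fragment carrying a certain class of features is generally called an entity. Mining the connections between entities from text is called entity relationship extraction.
In the related art, entity relationship extraction methods include dependency parsing. Dependency parsing exploits grammatical relations, typically building rules that start from verbs and constraining parts of speech and dependency relations.
Pharmaceutical texts are a special kind of text that people use in daily life, for example usage instruction texts for medicines such as package inserts and clinical guidelines. Such texts have their own characteristic ways of expression.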
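Summary
According to some embodiments of the present disclosure, a text processing method is provided, including: identifying multiple entities in a text to be processed and the type of each entity, where the text to be processed includes usage instruction text for a medicine; determining the group of each entity according to its type, where the groups include a condition entity group and a result entity group, the entities in the condition entity group being condition entities related to the conditions of use of the medicine and the entities in the result entity group being result entities related to the manner or result of use of the medicine; and extracting entity relationships according to the order of the entities in the text to be processed and the type and group of each entity.
In some embodiments, extracting entity relationships according to the order of the entities in the text to be processed and the type and group of each entity includes: generating a tree structure with each entity as a node, according to the order of the entities in the text to be processed and the type and group of each entity, to obtain an entity tree; and extracting entity relationships according to the entity tree.
In some embodiments, generating the tree structure includes: taking each entity in turn as the current entity, following the order of the entities in the text to be processed; for each current entity, when the current entity is a condition entity, determining the relationship between the current entity and the current node according to the type of the current entity and the type of the current node; and adding the current entity to the tree structure according to that relationship and updating the current node to the node of the current entity.
In some embodiments, generating the tree structure further includes: when the current entity is a result entity, determining that the current entity is a leaf node of the current node and adding it to the tree structure.
In some embodiments, determining the relationship between the current entity and the current node according to their types includes: when the type of the current entity differs from that of the current node, determining whether any ancestor node of the current node has the same type as the current entity; if such an ancestor node exists, making the current entity a sibling node of that ancestor node; and if no such ancestor node exists, making the current entity a child node of the current node.
In some embodiments, determining the relationship between the current entity and the current node according to their types includes: when the current entity and the current node are of the same type, determining whether the current entity is contained in the current node (a containment relationship); if it is, making the current entity a child node of the current node; and if it is not, making the current entity a sibling node of the current node.
In some embodiments, extracting entity relationships according to the entity tree includes: performing a depth-first search starting from the root node of the entity tree until each leaf node is reached; taking each node in turn as the current search node, in order from the leaf nodes to the root node; and, for each current search node, extracting entity relationships according to the node type of the current search node, the entity types of its child nodes, and the entity types of the leaf nodes among its sibling nodes.
In some embodiments, that extraction for each current search node includes: when the current search node is a leaf node, extracting the entity of the current search node and returning it to the parent node of the current search node as the extraction result; and when the current search node is a non-leaf node, extracting entity relationships according to the entity types of its child nodes and the entity types of the leaf nodes among its sibling nodes.
In some embodiments, extracting entity relationships according to the entity types of the child nodes of the current search node and the entity types of the leaf nodes among its sibling nodes includes: when there is no leaf node among the sibling nodes of the current search node, taking the child nodes whose entity type differs from that of the current search node as first child nodes, combining the extraction results returned by each first child node with the entity of the current search node to form an extraction result, and returning it to the parent node of the current search node; and taking the child nodes whose entity type is the same as that of the current search node as second child nodes and returning the extraction results returned by each second child node directly to the parent node of the current search node.
In some embodiments, extracting entity relationships according to the entity types of the child nodes of the current search node and the entity types of the leaf nodes among its sibling nodes includes: when there are leaf nodes among the sibling nodes of the current search node, taking those leaf nodes as candidate nodes; selecting the candidate nodes whose entity type differs from the entity types of the child nodes of the current search node; and extracting the entities of the selected candidate nodes, assigning them to the extraction results corresponding to the child nodes of the current search node, combining them with the entity of the current search node to form an extraction result, and returning it to the parent node of the current search node.
In some embodiments, determining the group of each entity according to its type includes: performing keyword identification on the text to be processed to determine its text type; looking up the corresponding entity grouping table according to the text type, where the entity grouping table contains the correspondence between each entity type and its group; and determining the group of each entity according to the entity grouping table.
In some embodiments, the method further includes: constructing a knowledge graph from the extracted entity relationships; and generating answers to usage questions about the medicine based on the knowledge graph.
In some embodiments, the method further includes: constructing a knowledge graph from the extracted entity relationships; and reviewing a prescription of the medicine against the knowledge graph to determine whether the prescription is correct.
According to other embodiments of the present disclosure, a text processing apparatus is provided, including: an identification module, used to identify multiple entities in a text to be processed and the type of each entity, where the text to be processed includes usage instruction text for a medicine; a grouping module, used to determine the group of each entity according to its type, where the groups include a condition entity group and a result entity group, the entities in the condition entity group being condition entities related to the conditions of use of the medicine and the entities in the result entity group being result entities related to the manner or result of use of the medicine; and an extraction module, used to extract entity relationships according to the order of the entities in the text to be processed and the type and group of each entity.
According to still other embodiments of the present disclosure, a text processing apparatus is provided, including: a processor; and a memory coupled to the processor and used to store instructions which, when executed by the processor, cause the processor to execute the text processing method of any of the foregoing embodiments.
According to further embodiments of the present disclosure, a non-transitory computer-readable storage medium is provided, on which a computer program is stored, where the program, when executed by a processor, implements the text processing method of any of the foregoing embodiments.
According to still other embodiments of the present disclosure, a computer program is provided, including instructions which, when executed by a processor, cause the processor to execute the text processing method of any of the foregoing embodiments.
Other features and advantages of the present disclosure will become clear from the following detailed description of exemplary embodiments with reference to the accompanying drawings.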
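Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present disclosure or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some embodiments of the present disclosure; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Figure 1 shows a schematic flowchart of a text processing method according to some embodiments of the present disclosure.
Figure 2 shows a schematic flowchart of a text processing method according to other embodiments of the present disclosure.
Figure 3 shows a schematic diagram of an entity tree according to some embodiments of the present disclosure.
Figure 4 shows a schematic flowchart of a text processing method according to still other embodiments of the present disclosure.
Figure 5 shows a schematic structural diagram of a text processing apparatus according to some embodiments of the present disclosure.
Figure 6 shows a schematic structural diagram of a text processing apparatus according to other embodiments of the present disclosure.
Figure 7 shows a schematic structural diagram of a text processing apparatus according to still other embodiments of the present disclosure.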
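Detailed Description
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present disclosure. The following description of at least one exemplary embodiment is merely illustrative and in no way limits the present disclosure or its application or use. All other embodiments obtained by a person of ordinary skill in the art from the embodiments of the present disclosure without creative effort fall within the scope of protection of the present disclosure.
The inventors found that dependency parsing in the related art relies on general-purpose grammatical structure. For example, in the sentence "儿童患者的给药频次是一日三次" ("the dosing frequency for pediatric patients is three times a day"), the keywords it attends to are "是" ("is") and "的" ("of"). Pharmaceutical texts have their own grammatical structure; analyzing them with general-purpose grammar not only increases processing complexity but also lowers accuracy and recall. For example, if the above text is changed to "儿童患者一日三次" ("pediatric patients three times a day"), dropping "是", "给药频次" ("dosing frequency"), and "的", the meaning of the sentence is unchanged, yet the analysis result is severely affected.
One technical problem to be solved by the present disclosure is how to improve the accuracy of entity relationship extraction for pharmaceutical texts.
The present disclosure proposes a text processing method, described below with reference to Figures 1 to 4.
Figure 1 is a flowchart of some embodiments of the text processing method of the present disclosure. As shown in Figure 1, the method of this embodiment includes steps S102 to S106.
In step S102, multiple entities in the text to be processed and the type of each entity are identified.
The text to be processed may be a pharmaceutical text, including usage instruction text for a medicine, for example a complete text such as a package insert or clinical guideline, or a text describing matters to note when the medicine is used to treat a disease, such as its indications, usage and dosage, contraindications, precautions, or use in special populations. Entity recognition can be performed with existing techniques, for example deep learning, and is not described further here. For the text to be processed, the entity types include, for example, one or more of: frequency entity (e.g. 1~2 times a day), dose entity (e.g. 20~60 mg per dose), course-of-treatment entity (e.g. three days), administration route entity (e.g. oral), administration timing entity (e.g. after meals), population entity (e.g. children), disease entity (e.g. reflux esophagitis), combination medication entity (e.g. omeprazole used together with clarithromycin), and biochemical index entity (e.g. glomerular filtration rate), without being limited to these examples.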
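In step S104, the group of each entity is determined according to its type.
The inventors found that the salient feature of pharmaceutical text is that, under certain preconditions, a certain operation should be performed or a certain effect is produced. For example, Text 1: "Reflux esophagitis: 20~60 mg per dose (1~3 capsules per dose), 1~2 times a day"; Text 2: "When omeprazole is used together with clarithromycin or erythromycin, their plasma concentrations rise." In Text 1, with the indication reflux esophagitis as the precondition, the medication operations to perform are 20~60 mg per dose (a dose operation) and 1~2 times a day (a frequency operation). In Text 2, under the precondition of combination medication, the effect produced is a rise in plasma concentration. Two groups are therefore constructed: a condition entity group and a result entity group. Entities in the condition entity group are condition entities related to the conditions of use of the medicine, and entities in the result entity group are result entities related to the manner or result of use of the medicine.
For example, for usage and dosage text, population entities, disease entities, combination medication entities, administration route entities, and biochemical index entities are placed in the condition entity group, and the remaining entities are placed in the result entity group. For contraindication text, population entities, disease entities, medical history entities, etiology entities, combination medication entities, and the like are placed in the condition entity group, and usage level entities (suitable, contraindicated, use with caution, as directed by a physician, and so on) are placed in the result entity group. For example, in "Metformin: contraindicated in pregnant and lactating women", "pregnant and lactating women" is a condition entity and "contraindicated" is a result entity.
In some embodiments, keyword identification is performed on the text to be processed to determine its text type; the corresponding entity grouping table is looked up according to the text type, where the entity grouping table contains the correspondence between each entity type and its group; and the group of each entity is determined according to the entity grouping table.
Text types include, for example, one or more of indications, usage and dosage, contraindications, precautions, and use in special populations, without being limited to these examples. The title of a pharmaceutical text generally contains a keyword corresponding to its text type, so keyword identification can be used to determine the text type. Other existing text classification methods can also be used, for example classifying the text with a classification model, without being limited to these examples.
The correspondence between each entity type and its group can be configured in advance as an entity grouping table, with one entity grouping table per text type. Once the text type of the text to be processed has been determined, the corresponding entity grouping table is looked up to determine the group of each entity.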
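The lookup described above can be sketched as two small tables and two functions. This is a minimal illustration, not the disclosed implementation: the table contents follow the usage-and-dosage and contraindication examples in the text, while the identifier names, the keyword lists, and the English type labels are assumptions.

```python
from typing import Dict, List, Tuple

CONDITION, RESULT = "condition", "result"

# Hypothetical title keywords per text type.
TITLE_KEYWORDS: Dict[str, List[str]] = {
    "usage_and_dosage": ["用法用量", "usage and dosage"],
    "contraindications": ["禁忌", "contraindication"],
}

# One entity grouping table per text type: entity type -> group.
GROUP_TABLES: Dict[str, Dict[str, str]] = {
    "usage_and_dosage": {
        "population": CONDITION, "disease": CONDITION, "combination": CONDITION,
        "route": CONDITION, "biochemical_index": CONDITION,
        "dose": RESULT, "frequency": RESULT, "course": RESULT, "timing": RESULT,
    },
    "contraindications": {
        "population": CONDITION, "disease": CONDITION, "medical_history": CONDITION,
        "etiology": CONDITION, "combination": CONDITION,
        "usage_level": RESULT,
    },
}


def detect_text_type(title: str) -> str:
    """Pick the text type whose keyword occurs in the title."""
    lowered = title.lower()
    for text_type, keywords in TITLE_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return text_type
    raise ValueError(f"no known keyword in title: {title!r}")


def group_entities(text_type: str,
                   entities: List[Tuple[str, str]]) -> List[Tuple[str, str, str]]:
    """Attach the group from the matching table to each (entity, type) pair."""
    table = GROUP_TABLES[text_type]
    return [(entity, etype, table[etype]) for entity, etype in entities]


text_type = detect_text_type("Contraindications")
grouped = group_entities(text_type, [("pregnant and lactating women", "population"),
                                     ("contraindicated", "usage_level")])
print(grouped)
```

Keeping one table per text type lets the same entity type fall into different groups in different sections, as with the administration route, which is a condition in usage and dosage text.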
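In step S106, entity relationships are extracted according to the order of the entities in the text to be processed and the type and group of each entity.
The entities can be stored in memory or in a database as a linear list, in the order in which they appear in the text to be processed. In some embodiments, a tree structure is generated with each entity as a node, according to the order of the entities in the text and the type and group of each entity, to obtain an entity tree, and entity relationships are extracted according to the entity tree. In some embodiments, the condition entities in the condition entity group are non-leaf nodes of the entity tree and the result entities in the result entity group are leaf nodes. A depth-first search from the root node to the leaf nodes then yields entity relationships of the form "under which conditions, which results are obtained", which matches the characteristics of pharmaceutical text.
How the entity tree is generated is described in detail below.
In some embodiments, each entity is taken in turn as the current entity, following the order of the entities in the text to be processed. For each current entity, when the current entity is a condition entity, the relationship between the current entity and the current node is determined according to the type of the current entity and the type of the current node; the current entity is added to the tree structure according to that relationship, and the current node is updated to the node of the current entity. When the current entity is a result entity, the current entity is determined to be a leaf node of the current node and is added to the tree structure.
Further, in some embodiments, when the type of the current entity differs from that of the current node, it is determined whether any ancestor node of the current node has the same type as the current entity. If such an ancestor node exists, the current entity is made a sibling node of that ancestor node; if no such ancestor node exists, the current entity is made a child node of the current node.
In other embodiments, when the current entity and the current node are of the same type, it is determined whether the current entity is contained in the current node (a containment relationship). If it is, the current entity is made a child node of the current node; if it is not, the current entity is made a sibling node of the current node.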
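As shown in Figure 2, in step S202 a root node is created and taken as the current node, where the first entity, in the order in which the entities appear in the text to be processed, serves as the root node. A pointer can be used to track (point to) the current node.
In step S204, the next entity, in the order in which the entities appear in the text to be processed, is taken as the current entity.
In step S206, the group of the current entity is checked. If the current entity is a result entity, step S207 is executed; otherwise step S208 is executed.
In step S207, the current entity is made a leaf node of the current node and added to the tree structure, and the process returns to step S204. The current node is not updated in this case, that is, the pointer is not moved.
In step S208, it is determined whether the current entity and the current node have the same type. If so, step S210 is executed; otherwise step S214 is executed.
In step S210, it is determined whether the current entity is contained in the current node. If so, step S211 is executed; otherwise step S212 is executed.
In step S211, the current entity is added to the tree structure as a child node of the current node.
In step S212, the current entity is added to the tree structure as a sibling node of the current node.
In step S214, it is determined whether any ancestor node of the current node has the same type as the current entity. If so, step S215 is executed; otherwise step S216 is executed. The ancestor nodes of the current node include its parent node, the parent's parent node, and so on up to the root node.
In step S215, the current entity is added to the tree structure as a sibling node of the ancestor node that has the same type as the current entity.
In step S216, the current entity is added to the tree structure as a child node of the current node.
In step S218, the current node is updated to the node of the current entity, that is, the pointer is moved to the current entity, and the process returns to step S204.
An application example of the above method is described below with reference to Figure 3. Suppose the original usage and dosage text reads: "Oral administration. Pediatric patients: peptic ulcer, one tablet per dose, three times a day; reflux esophagitis, one tablet per dose, twice a day. Adult patients: peptic ulcer, two tablets per dose, three times a day, for a course of 1~2 weeks; the course for gastric ulcer is usually 4~8 weeks and for duodenal ulcer usually 2~4 weeks; reflux esophagitis, two tablets per dose, twice a day."
Performing entity recognition on this text and determining the entity types gives the following result: (oral - administration route entity), (children - population entity), (peptic ulcer - disease entity), (one tablet per dose - dose entity), (three times a day - frequency entity), (reflux esophagitis - disease entity), (one tablet per dose - dose entity), (twice a day - frequency entity), (adults - population entity), (peptic ulcer - disease entity), (two tablets per dose - dose entity), (three times a day - frequency entity), (1~2 weeks - course-of-treatment entity), (gastric ulcer - disease entity), (4~8 weeks - course-of-treatment entity), (duodenal ulcer - disease entity), (2~4 weeks - course-of-treatment entity), (reflux esophagitis - disease entity), (two tablets per dose - dose entity), (twice a day - frequency entity).
The entity tree generated by the method of the above embodiment is shown in Figure 3. Take the case where the current entity is gastric ulcer and the current node is peptic ulcer: the types of the current entity and the current node are compared and found to be the same, so it is determined whether gastric ulcer is contained in peptic ulcer. The answer is yes, so gastric ulcer is added to the tree structure as a child node of peptic ulcer, the pointer is moved to gastric ulcer, and "4~8 weeks" becomes the next current entity as generation of the entity tree continues.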
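Steps S202 to S218 can be sketched in Python as follows. This is an illustrative reading of the steps, not the disclosed implementation: the `Node` class, the `contained_in` lookup table used for the containment test, and the fallback for a sibling of the root are all assumptions. The steps do not say how far up a non-contained entity of the same type should climb when the current node is itself nested under a node of that type, so the demo uses the application example only up to the duodenal ulcer course.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class Node:
    entity: str
    etype: str
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)
    children: List["Node"] = field(default_factory=list)


def attach(parent: Node, entity: str, etype: str) -> Node:
    node = Node(entity, etype, parent)
    parent.children.append(node)
    return node


def sibling_parent(node: Node) -> Node:
    # A sibling hangs under the same parent. Falling back to the node itself
    # when it is the root is an assumption; the text does not cover that case.
    return node.parent if node.parent is not None else node


def build_tree(entities: List[Tuple[str, str]], result_types: Set[str],
               contained_in: Dict[str, str]) -> Node:
    first_entity, first_type = entities[0]
    root = Node(first_entity, first_type)           # S202: first entity is the root
    current = root
    for entity, etype in entities[1:]:              # S204: next entity in text order
        if etype in result_types:                   # S206/S207: result entity -> leaf,
            attach(current, entity, etype)          # the pointer is not moved
            continue
        if etype == current.etype:                  # S208: same type as current node
            if contained_in.get(entity) == current.entity:
                current = attach(current, entity, etype)                  # S211
            else:
                current = attach(sibling_parent(current), entity, etype)  # S212
            continue
        ancestor = current.parent                   # S214: look for a same-type ancestor
        while ancestor is not None and ancestor.etype != etype:
            ancestor = ancestor.parent
        if ancestor is not None:
            current = attach(sibling_parent(ancestor), entity, etype)     # S215
        else:
            current = attach(current, entity, etype)                      # S216
    return root


ENTITIES = [
    ("oral", "route"), ("children", "population"),
    ("peptic ulcer", "disease"), ("one tablet per dose", "dose"),
    ("three times a day", "frequency"),
    ("reflux esophagitis", "disease"), ("one tablet per dose", "dose"),
    ("twice a day", "frequency"),
    ("adults", "population"), ("peptic ulcer", "disease"),
    ("two tablets per dose", "dose"), ("three times a day", "frequency"),
    ("1~2 weeks", "course"),
    ("gastric ulcer", "disease"), ("4~8 weeks", "course"),
    ("duodenal ulcer", "disease"), ("2~4 weeks", "course"),
]
# Hypothetical containment table: sub-disease -> containing disease.
CONTAINED_IN = {"gastric ulcer": "peptic ulcer", "duodenal ulcer": "peptic ulcer"}

root = build_tree(ENTITIES, {"dose", "frequency", "course"}, CONTAINED_IN)
adult_ulcer = root.children[1].children[0]
print([child.entity for child in adult_ulcer.children])
```

In the demo, the adult peptic ulcer node ends up with its three result leaves followed by the gastric ulcer and duodenal ulcer nodes, each of which carries its own course leaf.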
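How entity relationships are extracted from the entity tree is described in detail below.
In some embodiments, a depth-first search is performed starting from the root node of the entity tree until each leaf node is reached; each node is taken in turn as the current search node, in order from the leaf nodes to the root node; and, for each current search node, entity relationships are extracted according to the node type of the current search node, the entity types of its child nodes, and the entity types of the leaf nodes among its sibling nodes.
Further, in some embodiments, for each current search node, when the current search node is a leaf node, its entity is extracted and returned to its parent node as the extraction result; when the current search node is a non-leaf node, entity relationships are extracted according to the entity types of its child nodes and the entity types of the leaf nodes among its sibling nodes.
Further, in some embodiments, when there is no leaf node among the sibling nodes of the current search node, the child nodes whose entity type differs from that of the current search node are taken as first child nodes, and the extraction results returned by each first child node are combined with the entity of the current search node to form an extraction result, which is returned to the parent node of the current search node; the child nodes whose entity type is the same as that of the current search node are taken as second child nodes, and the extraction results returned by each second child node are returned directly to the parent node of the current search node.
In other embodiments, when there are leaf nodes among the sibling nodes of the current search node, those leaf nodes are taken as candidate nodes; the candidate nodes whose entity type differs from the entity types of the child nodes of the current search node are selected; and the entities of the selected candidate nodes are extracted, assigned to the extraction results corresponding to the child nodes of the current search node, combined with the entity of the current search node to form an extraction result, and returned to the parent node of the current search node.
As shown in Figure 4, in step S402 a depth-first search is performed starting from the root node of the entity tree until each leaf node is reached.
In step S404, the current search node is obtained, in order from the leaf nodes to the root node.
The traversal can proceed upward starting from the deepest leaf node.
In step S406, it is determined whether the current search node is a leaf node. If so, step S407 is executed; otherwise step S408 is executed.
In step S407, the entity of the current search node is extracted and returned to the parent node of the current search node as the extraction result, and the process returns to step S404.
In step S408, it is determined whether there is a leaf node among the sibling nodes of the current search node. If so, step S410 is executed; otherwise step S414 is executed.
In step S410, the leaf nodes among the sibling nodes of the current search node are taken as candidate nodes, and the candidate nodes whose entity type differs from the entity types of the child nodes of the current search node are selected.
In step S412, the entities of the selected candidate nodes are extracted, assigned to the extraction results corresponding to the child nodes of the current search node, combined with the entity of the current search node to form an extraction result, and returned to the parent node of the current search node; the process then returns to step S404.
In step S414, the child nodes of the current search node whose entity type differs from that of the current search node are taken as first child nodes, and the extraction results returned by each first child node are combined with the entity of the current search node to form an extraction result, which is returned to the parent node of the current search node; the child nodes whose entity type is the same as that of the current search node are taken as second child nodes, and the extraction results returned by each second child node are returned directly to the parent node of the current search node; the process then returns to step S404.
Take the entity tree of Figure 3 with the gastric ulcer node as the current search node. The sibling nodes of the gastric ulcer node include three leaf nodes: "two tablets per dose", "three times a day", and "1~2 weeks". These three leaf nodes are taken as candidate nodes, and those whose type differs from the entity type of the gastric ulcer node's child node ("4~8 weeks") are selected, namely "two tablets per dose" and "three times a day". These are assigned to the extraction result corresponding to the gastric ulcer node's child node and combined with gastric ulcer to form the extraction result (gastric ulcer, two tablets per dose, three times a day, 4~8 weeks), which is returned to the parent node, peptic ulcer. The current search node is then updated to peptic ulcer. There is no leaf node among the sibling nodes of peptic ulcer. Among its child nodes, "two tablets per dose", "three times a day", and "1~2 weeks" are first child nodes, and gastric ulcer and duodenal ulcer are second child nodes. "Two tablets per dose", "three times a day", and "1~2 weeks" are combined with peptic ulcer to form an extraction result, which is returned to the adult node, and the extraction results returned by gastric ulcer and duodenal ulcer, ((gastric ulcer, two tablets per dose, three times a day, 4~8 weeks), (duodenal ulcer, two tablets per dose, three times a day, 2~4 weeks)), are returned directly to the adult node.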
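The bottom-up extraction can be sketched as a recursive post-order traversal. This is one possible reading, not the disclosed implementation: the sketch folds steps S410 to S414 into a single function, treats any node without children as a leaf, and its handling of a node that has both leaf siblings and non-leaf children is an assumption. The tree is written out by hand so that the output can be compared with the six extraction results given for Figure 3; the `Node` class and the helper `n` are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    entity: str
    etype: str
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def extract(node: Node, parent: Optional[Node] = None) -> List[List[str]]:
    if node.is_leaf:                                   # S407: return the entity itself
        return [[node.entity]]
    child_types = {child.etype for child in node.children}
    inherited: List[str] = []
    if parent is not None:                             # S410: leaf siblings as candidates,
        inherited = [s.entity for s in parent.children  # keep types the node lacks
                     if s is not node and s.is_leaf and s.etype not in child_types]
    results: List[List[str]] = []
    own_leaves = [c.entity for c in node.children if c.is_leaf]
    if own_leaves:                                     # node + inherited + own leaves
        results.append([node.entity] + inherited + own_leaves)
    for child in node.children:
        if child.is_leaf:
            continue
        sub = extract(child, node)
        if child.etype == node.etype:                  # second child: pass through
            results.extend(sub)
        else:                                          # first child: prefix with node
            results.extend([node.entity] + r for r in sub)
    return results


def n(entity: str, etype: str, *children: Node) -> Node:
    return Node(entity, etype, list(children))


tree = n("oral", "route",
    n("children", "population",
        n("peptic ulcer", "disease",
            n("one tablet per dose", "dose"), n("three times a day", "frequency")),
        n("reflux esophagitis", "disease",
            n("one tablet per dose", "dose"), n("twice a day", "frequency"))),
    n("adults", "population",
        n("peptic ulcer", "disease",
            n("two tablets per dose", "dose"), n("three times a day", "frequency"),
            n("1~2 weeks", "course"),
            n("gastric ulcer", "disease", n("4~8 weeks", "course")),
            n("duodenal ulcer", "disease", n("2~4 weeks", "course"))),
        n("reflux esophagitis", "disease",
            n("two tablets per dose", "dose"), n("twice a day", "frequency"))))

records = extract(tree)
for record in records:
    print(record)
```

For this tree the sketch yields six records, one per route, population, and disease combination, with the gastric ulcer and duodenal ulcer records inheriting dose and frequency from their leaf siblings while keeping their own course.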
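The entity relationship extraction results for the entity tree in Figure 3 are: [oral, children, peptic ulcer, one tablet per dose, three times a day], [oral, children, reflux esophagitis, one tablet per dose, twice a day], [oral, adults, peptic ulcer, two tablets per dose, three times a day, 1~2 weeks], [oral, adults, gastric ulcer, two tablets per dose, three times a day, 4~8 weeks], [oral, adults, duodenal ulcer, two tablets per dose, three times a day, 2~4 weeks], [oral, adults, reflux esophagitis, two tablets per dose, twice a day]. These entity relationships can be converted into triples. For example, [oral, children, peptic ulcer, one tablet per dose, three times a day] is taken as a node and [this node, administration route, oral] as a triple, so that in the resulting knowledge graph the edge between this node and the "oral" node is "administration route". The entity relationships can also be converted into triples in other ways, for example [children, dose, one tablet per dose], without being limited to these examples.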
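The first conversion described above, where a whole extraction result becomes a node and each entity hangs off it by an edge named after its type, might look like this. The `record-1` identifier and the English type labels are placeholders chosen for illustration.

```python
from typing import List, Tuple

Triple = Tuple[str, str, str]


def to_triples(record_id: str, record: List[Tuple[str, str]]) -> List[Triple]:
    """One (record node, entity type as edge label, entity) triple per entity."""
    return [(record_id, etype, entity) for entity, etype in record]


record = [("oral", "route"), ("children", "population"), ("peptic ulcer", "disease"),
          ("one tablet per dose", "dose"), ("three times a day", "frequency")]
triples = to_triples("record-1", record)
print(triples[0])  # ('record-1', 'route', 'oral')
```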
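In some embodiments, a knowledge graph is constructed from the extracted entity relationships, for example by the method of the above embodiments, without being limited to the examples given. Further, answers to usage questions about the medicine are generated based on the knowledge graph, or a prescription of the medicine is reviewed against the knowledge graph to determine whether it is correct. For example, if a user asks what the dosage for children is for the medicine in the above application example, the constructed knowledge graph gives a dose of one tablet per dose. As another example, a prescription states that the patient is 30 years old, the disease is reflux esophagitis, and the medication frequency is three times a day. Reviewing the prescription against the constructed knowledge graph, which gives twice a day for adults with reflux esophagitis, shows that the prescribed frequency of three times a day is wrong.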
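A question such as "what is the dose for children?" can be answered by filtering typed extraction records, as in the minimal sketch below. The record format and the `answer` helper are assumptions for illustration; a production system would query an actual graph store.

```python
from typing import List, Tuple

Record = List[Tuple[str, str]]  # (entity, entity type) pairs

RECORDS: List[Record] = [
    [("oral", "route"), ("children", "population"), ("peptic ulcer", "disease"),
     ("one tablet per dose", "dose"), ("three times a day", "frequency")],
    [("oral", "route"), ("children", "population"), ("reflux esophagitis", "disease"),
     ("one tablet per dose", "dose"), ("twice a day", "frequency")],
    [("oral", "route"), ("adults", "population"), ("reflux esophagitis", "disease"),
     ("two tablets per dose", "dose"), ("twice a day", "frequency")],
]


def answer(records: List[Record], wanted_type: str, **conditions: str) -> List[str]:
    """Collect distinct values of wanted_type from records matching all conditions."""
    found: List[str] = []
    for record in records:
        by_type = {etype: entity for entity, etype in record}
        if all(by_type.get(key) == value for key, value in conditions.items()):
            value = by_type.get(wanted_type)
            if value is not None and value not in found:
                found.append(value)
    return found


print(answer(RECORDS, "dose", population="children"))  # ['one tablet per dose']
```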
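The method of the above embodiment first identifies each entity in the text to be processed and the type of each entity, then divides the entities, according to their types, into condition entities related to the conditions of use of the medicine and result entities related to the manner or result of use of the medicine, and then extracts entity relationships according to the order of the entities in the text to be processed and the type and group of each entity. The method is designed around the way pharmaceutical texts are expressed: it provides an entity division scheme and an entity relationship extraction method tailored to such texts, which can reduce the complexity of entity relationship extraction and improve accuracy and recall.
The method of the above embodiment can be applied to constructing a knowledge graph from pharmaceutical texts. Because the entity relationship extraction method is interpretable and improves accuracy and recall, it can reduce the cost of manual intervention in knowledge graph construction, improve the efficiency and accuracy of construction, and make it easier to expand the scale of pharmaceutical knowledge graphs. Further, the knowledge graph can be used in fields such as drug question answering and prescription review, to support automatic online prescribing while ensuring accuracy.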
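The present disclosure also provides a text processing apparatus, described below with reference to Figure 5.
Figure 5 is a structural diagram of some embodiments of the text processing apparatus of the present disclosure. As shown in Figure 5, the apparatus 50 of this embodiment includes: an identification module 510, a grouping module 520, and an extraction module 530.
The identification module 510 is used to identify multiple entities in the text to be processed and the type of each entity, where the text to be processed includes usage instruction text for a medicine.
The grouping module 520 is used to determine the group of each entity according to its type, where the groups include a condition entity group and a result entity group, the entities in the condition entity group being condition entities related to the conditions of use of the medicine and the entities in the result entity group being result entities related to the manner or result of use of the medicine.
In some embodiments, the grouping module 520 is used to perform keyword identification on the text to be processed to determine its text type; to look up the corresponding entity grouping table according to the text type, where the entity grouping table contains the correspondence between each entity type and its group; and to determine the group of each entity according to the entity grouping table.
The extraction module 530 is used to extract entity relationships according to the order of the entities in the text to be processed and the type and group of each entity.
In some embodiments, the extraction module 530 is used to generate a tree structure with each entity as a node, according to the order of the entities in the text to be processed and the type and group of each entity, to obtain an entity tree, and to extract entity relationships according to the entity tree.
In some embodiments, the extraction module 530 is used to take each entity in turn as the current entity, following the order of the entities in the text to be processed; for each current entity, when the current entity is a condition entity, to determine the relationship between the current entity and the current node according to the type of the current entity and the type of the current node; and to add the current entity to the tree structure according to that relationship and update the current node to the node of the current entity.
In some embodiments, the extraction module 530 is used, when the current entity is a result entity, to determine that the current entity is a leaf node of the current node and add it to the tree structure.
In some embodiments, the extraction module 530 is used, when the type of the current entity differs from that of the current node, to determine whether any ancestor node of the current node has the same type as the current entity; if such an ancestor node exists, to make the current entity a sibling node of that ancestor node; and if no such ancestor node exists, to make the current entity a child node of the current node.
In some embodiments, the extraction module 530 is used, when the current entity and the current node are of the same type, to determine whether the current entity is contained in the current node (a containment relationship); if it is, to make the current entity a child node of the current node; and if it is not, to make the current entity a sibling node of the current node.
In some embodiments, a depth-first search is performed starting from the root node of the entity tree until each leaf node is reached; each node is taken in turn as the current search node, in order from the leaf nodes to the root node; and, for each current search node, entity relationships are extracted according to the node type of the current search node, the entity types of its child nodes, and the entity types of the leaf nodes among its sibling nodes.
In some embodiments, the extraction module 530 is used, for each current search node, to extract the entity of the current search node and return it to its parent node as the extraction result when the current search node is a leaf node; and, when the current search node is a non-leaf node, to extract entity relationships according to the entity types of its child nodes and the entity types of the leaf nodes among its sibling nodes.
In some embodiments, the extraction module 530 is used, when there is no leaf node among the sibling nodes of the current search node, to take the child nodes whose entity type differs from that of the current search node as first child nodes, combine the extraction results returned by each first child node with the entity of the current search node to form an extraction result, and return it to the parent node of the current search node; and to take the child nodes whose entity type is the same as that of the current search node as second child nodes and return the extraction results returned by each second child node directly to the parent node of the current search node.
In some embodiments, the extraction module 530 is used, when there are leaf nodes among the sibling nodes of the current search node, to take those leaf nodes as candidate nodes; to select the candidate nodes whose entity type differs from the entity types of the child nodes of the current search node; and to extract the entities of the selected candidate nodes, assign them to the extraction results corresponding to the child nodes of the current search node, combine them with the entity of the current search node to form an extraction result, and return it to the parent node of the current search node.
In some embodiments, the apparatus 50 further includes: a building module 540, used to construct a knowledge graph from the extracted entity relationships; a question and answer module 550, used to generate answers to usage questions about the medicine based on the knowledge graph; and an audit module 560, used to review a prescription of the medicine against the knowledge graph to determine whether the prescription is correct.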
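The text processing apparatuses in the embodiments of the present disclosure can each be implemented by various computing devices or computer systems, as described below with reference to Figures 6 and 7.
Figure 6 is a structural diagram of some embodiments of the text processing apparatus of the present disclosure. As shown in Figure 6, the apparatus 60 of this embodiment includes: a memory 610 and a processor 620 coupled to the memory 610, the processor 620 being configured to execute the text processing method of any embodiment of the present disclosure based on instructions stored in the memory 610.
The memory 610 may include, for example, system memory and fixed non-volatile storage media. The system memory stores, for example, an operating system, application programs, a boot loader, a database, and other programs.
Figure 7 is a structural diagram of other embodiments of the text processing apparatus of the present disclosure. As shown in Figure 7, the apparatus 70 of this embodiment includes: a memory 710 and a processor 720, which are similar to the memory 610 and the processor 620 respectively. It may also include an input/output interface 730, a network interface 740, a storage interface 750, and so on. These interfaces 730, 740, 750, the memory 710, and the processor 720 may be connected, for example, through a bus 760. The input/output interface 730 provides a connection interface for input and output devices such as a display, mouse, keyboard, and touch screen. The network interface 740 provides a connection interface for various networked devices, for example a database server or a cloud storage server. The storage interface 750 provides a connection interface for external storage devices such as SD cards and USB drives.
The present disclosure also provides a computer program, including instructions which, when executed by a processor, cause the processor to execute the text processing method of any of the foregoing embodiments.
Those skilled in the art will appreciate that embodiments of the present disclosure may be provided as methods, systems, or computer program products. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product implemented on one or more computer-usable non-transitory storage media (including but not limited to disk memory, CD-ROM, and optical storage) containing computer-usable program code.
The present disclosure is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present disclosure. Each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor produce means for implementing the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operating steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of a flowchart and/or one or more blocks of a block diagram.
The above are only preferred embodiments of the present disclosure and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within its scope of protection.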

Claims (17)

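  1. A text processing method, comprising:
    identifying multiple entities in a text to be processed and the type of each entity, wherein the text to be processed includes usage instruction text for a medicine;
    determining the group of each entity according to its type, wherein the groups include a condition entity group and a result entity group, the entities in the condition entity group being condition entities related to the conditions of use of the medicine, and the entities in the result entity group being result entities related to the manner or result of use of the medicine;
    extracting entity relationships according to the order of the entities in the text to be processed and the type and group of each entity.
  2. The processing method according to claim 1, wherein extracting entity relationships according to the order of the entities in the text to be processed and the type and group of each entity comprises:
    generating a tree structure with each entity as a node, according to the order of the entities in the text to be processed and the type and group of each entity, to obtain an entity tree;
    extracting entity relationships according to the entity tree.
  3. The processing method according to claim 2, wherein generating a tree structure with each entity as a node, according to the order of the entities in the text to be processed and the type and group of each entity, comprises:
    taking each entity in turn as the current entity, following the order of the entities in the text to be processed;
    for each current entity, when the current entity is a condition entity, determining the relationship between the current entity and the current node according to the type of the current entity and the type of the current node;
    adding the current entity to the tree structure according to the relationship between the current entity and the current node, and updating the current node to the node of the current entity.
  4. The processing method according to claim 3, wherein generating a tree structure with each entity as a node, according to the order of the entities in the text to be processed and the type and group of each entity, further comprises:
    when the current entity is a result entity, determining that the current entity is a leaf node of the current node and adding it to the tree structure.
  5. The processing method according to claim 3, wherein determining the relationship between the current entity and the current node according to the type of the current entity and the type of the current node comprises:
    when the type of the current entity differs from that of the current node, determining whether any ancestor node of the current node has the same type as the current entity;
    when an ancestor node of the current node has the same type as the current entity, making the current entity a sibling node of that ancestor node;
    when no ancestor node of the current node has the same type as the current entity, making the current entity a child node of the current node.
  6. The processing method according to claim 3, wherein determining the relationship between the current entity and the current node according to the type of the current entity and the type of the current node comprises:
    when the current entity and the current node are of the same type, determining whether the current entity is contained in the current node;
    when the current entity is contained in the current node, making the current entity a child node of the current node;
    when the current entity is not contained in the current node, making the current entity a sibling node of the current node.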
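  7. The processing method according to claim 2, wherein extracting entity relationships according to the entity tree comprises:
    performing a depth-first search starting from the root node of the entity tree until each leaf node is reached;
    taking each node in turn as the current search node, in order from the leaf nodes to the root node;
    for each current search node, extracting entity relationships according to the node type of the current search node, the entity types of the child nodes of the current search node, and the entity types of the leaf nodes among the sibling nodes of the current search node.
  8. The processing method according to claim 7, wherein, for each current search node, extracting entity relationships according to the node type of the current search node, the entity types of the child nodes of the current search node, and the entity types of the leaf nodes among the sibling nodes of the current search node comprises:
    for each current search node, when the current search node is a leaf node, extracting the entity of the current search node and returning it to the parent node of the current search node as the extraction result;
    when the current search node is a non-leaf node, extracting entity relationships according to the entity types of the child nodes of the current search node and the entity types of the leaf nodes among the sibling nodes of the current search node.
  9. The processing method according to claim 8, wherein extracting entity relationships according to the entity types of the child nodes of the current search node and the entity types of the leaf nodes among the sibling nodes of the current search node comprises:
    when there is no leaf node among the sibling nodes of the current search node, taking the child nodes of the current search node whose entity type differs from that of the current search node as first child nodes, combining the extraction results returned by each first child node with the entity of the current search node to form an extraction result, and returning it to the parent node of the current search node;
    taking the child nodes of the current search node whose entity type is the same as that of the current search node as second child nodes, and returning the extraction results returned by each second child node directly to the parent node of the current search node.
  10. The processing method according to claim 8, wherein extracting entity relationships according to the entity types of the child nodes of the current search node and the entity types of the leaf nodes among the sibling nodes of the current search node comprises:
    when there are leaf nodes among the sibling nodes of the current search node, taking the leaf nodes among the sibling nodes of the current search node as candidate nodes;
    selecting, from the candidate nodes, the candidate nodes whose entity type differs from the entity types of the child nodes of the current search node;
    extracting the entities of the selected candidate nodes, assigning them to the extraction results corresponding to the child nodes of the current search node, combining them with the entity of the current search node to form an extraction result, and returning it to the parent node of the current search node.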
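  11. The processing method according to any one of claims 1-10, wherein determining the group of each entity according to its type comprises:
    performing keyword identification on the text to be processed to determine the text type of the text to be processed;
    looking up the corresponding entity grouping table according to the text type, wherein the entity grouping table contains the correspondence between each entity type and its group;
    determining the group of each entity according to the entity grouping table.
  12. The processing method according to any one of claims 1-11, further comprising:
    constructing a knowledge graph from the extracted entity relationships;
    generating answers to usage questions about the medicine based on the knowledge graph.
  13. The processing method according to any one of claims 1-11, further comprising:
    constructing a knowledge graph from the extracted entity relationships;
    reviewing a prescription of the medicine against the knowledge graph to determine whether the prescription of the medicine is correct.
  14. A text processing apparatus, comprising:
    an identification module, used to identify multiple entities in a text to be processed and the type of each entity, wherein the text to be processed includes usage instruction text for a medicine;
    a grouping module, used to determine the group of each entity according to its type, wherein the groups include a condition entity group and a result entity group, the entities in the condition entity group being condition entities related to the conditions of use of the medicine, and the entities in the result entity group being result entities related to the manner or result of use of the medicine;
    an extraction module, used to extract entity relationships according to the order of the entities in the text to be processed and the type and group of each entity.
  15. A text processing apparatus, comprising:
    a processor; and
    a memory coupled to the processor and used to store instructions which, when executed by the processor, cause the processor to execute the text processing method according to any one of claims 1-13.
  16. A non-transitory computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the steps of the method according to any one of claims 1-13.
  17. A computer program, comprising instructions which, when executed by a processor, cause the processor to execute the text processing method according to any one of claims 1-13.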
PCT/CN2023/086629 2022-05-05 2023-04-06 文本的处理方法、装置和计算机可读存储介质 WO2023213166A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210479767.1 2022-05-05
CN202210479767.1A CN117057348A (zh) 2022-05-05 2022-05-05 文本的处理方法、装置和计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2023213166A1 true WO2023213166A1 (zh) 2023-11-09

Family

ID=88646232

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/086629 WO2023213166A1 (zh) 2022-05-05 2023-04-06 文本的处理方法、装置和计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN117057348A (zh)
WO (1) WO2023213166A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190318022A1 (en) * 2018-04-11 2019-10-17 Intel Corporation Technologies for flexible tree-based lookups for network devices
CN111986770A (zh) * 2020-08-31 2020-11-24 平安医疗健康管理股份有限公司 药方用药审核方法、装置、设备及存储介质
CN112148851A (zh) * 2020-09-09 2020-12-29 常州大学 一种基于知识图谱的医药知识问答系统的构建方法
CN112307216A (zh) * 2020-07-30 2021-02-02 北京沃东天骏信息技术有限公司 药品知识图谱的构建方法和装置

Also Published As

Publication number Publication date
CN117057348A (zh) 2023-11-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23799153

Country of ref document: EP

Kind code of ref document: A1