CN112926315A - Automatic medical term standardization method and device - Google Patents

Publication number
CN112926315A
Authority
CN
China
Prior art keywords
words
alternative
vocabulary
medical
same class
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110511800.XA
Other languages
Chinese (zh)
Other versions
CN112926315B (en
Inventor
王硕
胡可云
陈联忠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jiahesen Health Technology Co ltd
Original Assignee
Beijing Jiahesen Health Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jiahesen Health Technology Co ltd
Priority to CN202110511800.XA
Publication of CN112926315A
Application granted
Publication of CN112926315B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The application provides a method and a device for automatically standardizing medical terms. The method comprises: obtaining a basic data vocabulary that contains basic medical vocabulary; classifying the candidate words to be standardized based on the basic data vocabulary; judging whether antonyms exist among the candidate words belonging to the same class and, when antonyms exist among different candidate words of the same class, further subdividing that class based on the antonyms; calculating the similarity between the candidate words belonging to the same class; establishing a synonym relationship among the candidate words whose similarity is greater than a preset value; and determining a standard term corresponding to the candidate words having a synonym relationship and establishing a mapping relationship between the standard term and its corresponding candidate words, thereby improving the efficiency of medical term standardization.

Description

Automatic medical term standardization method and device
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a device for automatically standardizing medical terms.
Background
At present, medical institutions in China do not write medical terms in a uniform way, which fragments information and hinders full use of medical data. Existing international terminology standards cannot cover the different written forms that the same term takes across medical institutions, and their classification schemes are not fully suited to clinical product applications built for specific scenarios. A clinical standard medical terminology system is therefore needed to consolidate the cases in medical data where multiple words share one meaning. Because the volume of medical terms is huge, standardizing vocabulary purely by hand is time-consuming, costly, and prone to omissions. A solution that can standardize medical terms quickly is therefore urgently needed.
Disclosure of Invention
In view of this, embodiments of the present invention provide an automatic medical term standardization method and apparatus, so as to implement automatic medical term standardization processing.
In order to achieve the above purpose, the embodiments of the present invention provide the following technical solutions:
a method for automatically standardizing medical terms, comprising:
acquiring a basic data vocabulary, wherein the basic data vocabulary contains basic medical vocabulary;
classifying the candidate words to be standardized based on the basic data vocabulary;
judging whether antonyms exist among the candidate words belonging to the same class and, when antonyms exist among different candidate words belonging to the same class, further subdividing the candidate words of that class based on the antonyms;
calculating the similarity between the candidate words belonging to the same class;
establishing a synonym relationship among the candidate words whose similarity is greater than a preset value;
and determining a standard term corresponding to the candidate words having the synonym relationship, and establishing a mapping relationship between the standard term and its corresponding candidate words.
Optionally, the above automatic medical term standardization method further includes:
removing meaningless words from each candidate word, recording the result as a corrected candidate word, and determining the mapping relationship between the candidate words and the corrected candidate words;
wherein calculating the similarity among the candidate words belonging to the same class and establishing a synonym relationship among the candidate words whose similarity is greater than a preset value comprises:
calculating the similarity among the corrected candidate words belonging to the same class, and establishing a synonym relationship among the corrected candidate words whose similarity is greater than the preset value.
Optionally, in the above method for automatically standardizing medical terms, determining a standard term corresponding to the candidate words having a synonym relationship includes:
judging whether any of the candidate words having the synonym relationship has a corresponding standard term; if so, taking that standard term as the standard term for the candidate words having the synonym relationship; if not, selecting one of the candidate words having the synonym relationship as the standard term.
Optionally, in the above method for automatically standardizing medical terms, the basic medical vocabulary includes one or more of body site, laterality, and type, and the basic data vocabulary also stores the synonym relationships between basic medical words.
Optionally, in the above method for automatically standardizing medical terms, classifying the candidate words to be standardized based on the basic data vocabulary includes:
extracting, based on the basic data vocabulary, the basic medical words contained in each candidate word to be standardized;
and judging whether there are candidate words whose basic medical words are all identical or synonymous and, if so, placing those candidate words in the same class.
An automatic medical term standardization apparatus, comprising:
a basic data vocabulary acquisition unit, configured to acquire a basic data vocabulary that contains basic medical vocabulary;
a classification unit, configured to classify the candidate words to be standardized based on the basic data vocabulary, to judge whether antonyms exist among the candidate words belonging to the same class and, when antonyms exist among different candidate words belonging to the same class, to further subdivide the candidate words of that class based on the antonyms;
a similarity calculation unit, configured to calculate the similarity among the candidate words belonging to the same class and to establish a synonym relationship among the candidate words whose similarity is greater than a preset value;
and a standard vocabulary construction unit, configured to determine a standard term corresponding to the candidate words having the synonym relationship and to establish a mapping relationship between the standard term and its corresponding candidate words.
Optionally, the automatic medical term standardization apparatus further comprises a candidate word optimization unit, configured to remove meaningless words from each candidate word, record the results as corrected candidate words, and determine the mapping relationship between the candidate words and the corrected candidate words;
when calculating the similarity between the candidate words belonging to the same class and establishing a synonym relationship between the candidate words whose similarity is greater than a preset value, the similarity calculation unit is specifically configured to:
calculate the similarity among the corrected candidate words belonging to the same class, and establish a synonym relationship among the corrected candidate words whose similarity is greater than the preset value.
Optionally, in the automatic medical term standardization apparatus, when determining a standard term corresponding to the candidate words having a synonym relationship, the standard vocabulary construction unit is specifically configured to:
judge whether any of the candidate words having the synonym relationship has a corresponding standard term; if so, take that standard term as the standard term for the candidate words having the synonym relationship; if not, select one of the candidate words having the synonym relationship as the standard term.
Optionally, in the above automatic medical term standardization apparatus, the basic medical vocabulary includes one or more of body site, laterality, and type, and the basic data vocabulary also stores the synonym relationships between basic medical words.
Optionally, in the automatic medical term standardization apparatus, when classifying the candidate words to be standardized based on the basic data vocabulary, the classification unit is specifically configured to:
extract, based on the basic data vocabulary, the basic medical words contained in each candidate word to be standardized;
and judge whether there are candidate words whose basic medical words are all identical or synonymous and, if so, place those candidate words in the same class.
Based on the above technical solution, the scheme provided by the embodiments of the invention obtains a basic data vocabulary that contains basic medical vocabulary; classifies the candidate words to be standardized based on the basic data vocabulary; judges whether antonyms exist among the candidate words belonging to the same class and, when antonyms exist among different candidate words of the same class, further subdivides that class based on the antonyms; calculates the similarity between the candidate words belonging to the same class; establishes a synonym relationship among the candidate words whose similarity is greater than a preset value; and determines a standard term corresponding to the candidate words having a synonym relationship and establishes a mapping relationship between the standard term and its corresponding candidate words, thereby improving the efficiency of medical term standardization.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. The drawings described below are only embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a method for automatically standardizing medical terms disclosed in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of an automatic medical term standardization apparatus disclosed in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Step S101: acquiring a basic data vocabulary;
In this scheme, the words and terms to be standardized are analyzed in advance, and the required basic term types are sorted out. For diseases, the basic term types include body site (e.g., liver, kidney), laterality (e.g., left, right), and type (e.g., type 1, type 2). The data for each basic term type are drawn from a standard dictionary or a word-segmentation dictionary; synonym relationships are established among them, and the basic terms with their synonym relationships are added to the basic data vocabulary.
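The basic data vocabulary of step S101 can be pictured as a lookup table from each basic medical word to its term type and synonym group. The sketch below is only an illustration: the class name `BasicDataVocabulary`, its methods, and the English sample entries are assumptions, not structures given in the application.

```python
from dataclasses import dataclass, field


@dataclass
class BasicDataVocabulary:
    """Hypothetical container for basic medical words and their synonym groups."""

    # surface form -> (term type, canonical form of its synonym group)
    entries: dict = field(default_factory=dict)

    def add_synonym_group(self, term_type, forms):
        # Assumption: the first form listed stands for the whole group.
        canonical = forms[0]
        for form in forms:
            self.entries[form] = (term_type, canonical)

    def canonical(self, form):
        return self.entries[form][1]

    def term_type(self, form):
        return self.entries[form][0]


vocab = BasicDataVocabulary()
vocab.add_synonym_group("site", ["liver"])
vocab.add_synonym_group("site", ["kidney"])
vocab.add_synonym_group("laterality", ["bilateral", "double"])
vocab.add_synonym_group("type", ["type 1"])
vocab.add_synonym_group("type", ["type 2"])
```

Storing a canonical form per entry makes the later "same or synonymous" check a plain equality test on canonical forms.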
Step S102: classifying the candidate words based on the basic data vocabulary;
In this scheme, the candidate words are classified first, according to the basic medical vocabulary in the basic data vocabulary, which ensures that the candidate words in a class belong to a consistent category. If classification were done directly by text-similarity matching rather than by basic medical vocabulary, terms with very high text similarity but different actual meanings could be wrongly treated as synonyms. For example, among surgical terms, "lung laceration repair" and "liver laceration repair" have very similar text but concern different organs, so they cannot be treated as synonyms.
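The risk of relying on text similarity alone can be seen with a small experiment. The application does not name a similarity measure, so this sketch assumes Python's standard `difflib.SequenceMatcher` as a stand-in and uses the English renderings of the two surgical terms.

```python
from difflib import SequenceMatcher


def text_similarity(a, b):
    # Character-level ratio in [0, 1]; an assumed stand-in measure.
    return SequenceMatcher(None, a, b).ratio()


score = text_similarity("lung laceration repair", "liver laceration repair")
print(f"similarity = {score:.3f}")
```

Under this measure the two terms score well above a typical cut-off such as 0.8 even though they name different organs, which is why the class is fixed from the basic medical words before any similarity is computed.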
In this scheme, the basic data vocabulary is used to find candidate words that are related to each other through the meaning of the basic medical words they contain. Specifically, the candidate words are classified according to the basic medical vocabulary in the basic data vocabulary, and candidate words with the same basic medical words are placed in one group. A single candidate word may contain several basic medical words; only candidate words whose basic medical words are all the same are placed in one group.
Example 1:
The candidate word "bilateral breast fibroadenoma" contains the basic medical words "bilateral" and "breast", and the candidate word "double breast fibroadenoma" contains the basic medical words "double" and "breast". Since "bilateral" and "double" are synonyms, and the two words for "breast" are also synonyms, "bilateral breast fibroadenoma" and "double breast fibroadenoma" are placed in the same group.
Example 2:
The candidate word "type 1 diabetic neurogenic edema" contains the basic medical word "type 1", and the candidate word "type 2 diabetic neurogenic edema" contains the basic medical word "type 2". Since "type 1" and "type 2" are different words within the type category, "type 1 diabetic neurogenic edema" and "type 2 diabetic neurogenic edema" cannot be placed in the same group.
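The grouping in the two examples above can be sketched as follows. This is a minimal illustration, assuming a hypothetical `BASIC_WORDS` table that maps each basic medical word to a canonical form and naive substring matching for extraction; the application does not specify how basic medical words are extracted.

```python
from collections import defaultdict

# Hypothetical basic data vocabulary: surface form -> canonical synonym form.
BASIC_WORDS = {
    "bilateral": "bilateral",
    "double": "bilateral",
    "breast": "breast",
    "type 1": "type 1",
    "type 2": "type 2",
}


def extract_key(candidate):
    # Naive substring lookup; a real system would use word segmentation.
    return frozenset(
        canon for form, canon in BASIC_WORDS.items() if form in candidate
    )


def classify(candidates):
    # Candidates whose canonical basic words all coincide share one class.
    groups = defaultdict(list)
    for candidate in candidates:
        groups[extract_key(candidate)].append(candidate)
    return dict(groups)


groups = classify([
    "bilateral breast fibroadenoma",
    "double breast fibroadenoma",
    "type 1 diabetic neurogenic edema",
    "type 2 diabetic neurogenic edema",
])
```

Here the two fibroadenoma terms share the key `{"bilateral", "breast"}`, while the two diabetic terms fall under different keys and stay apart.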
Step S103: further dividing candidate words of the same class according to whether antonyms exist among them;
In this step, it is judged whether antonyms exist among the candidate words belonging to the same class and, if so, the candidate words are further divided according to the antonyms.
That is, the classification result is refined: each group is checked for basic medical words that stand in an antonym relationship, and a group in which such words are found is subdivided.
Example: the candidate words "secondary pulmonary tuberculosis (retreatment, single drug resistance), smear-negative and culture-positive" and "secondary pulmonary tuberculosis (retreatment, multiple drug resistance), smear-negative and culture-positive" contain "single drug resistance" and "multiple drug resistance" respectively. These two are defined as antonyms in the basic data vocabulary, so the two candidate words must be placed in different subgroups.
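The subdivision in step S103 can be sketched as below. The antonym table, the function name `split_by_antonyms`, and the shortened English term strings are assumptions for illustration. Keeping candidate words that contain neither antonym in a separate subgroup is also an assumption, since the application does not say how they are handled.

```python
# Hypothetical antonym pairs taken from the basic data vocabulary.
ANTONYM_PAIRS = [("single drug resistance", "multiple drug resistance")]


def split_by_antonyms(group, antonym_pairs=ANTONYM_PAIRS):
    subgroups = [list(group)]
    for left, right in antonym_pairs:
        refined = []
        for sub in subgroups:
            with_left = [w for w in sub if left in w]
            with_right = [w for w in sub if right in w and w not in with_left]
            if with_left and with_right:
                # Both sides of the pair occur: split the subgroup.
                rest = [w for w in sub
                        if w not in with_left and w not in with_right]
                refined.extend(p for p in (with_left, with_right, rest) if p)
            else:
                refined.append(sub)
        subgroups = refined
    return subgroups


a = "secondary pulmonary tuberculosis (retreatment, single drug resistance)"
b = "secondary pulmonary tuberculosis (retreatment, multiple drug resistance)"
result = split_by_antonyms([a, b])
```

A group is split only when both members of an antonym pair occur in it; otherwise it is passed through unchanged.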
Step S104: removing the meaningless words from the candidate words, and establishing a mapping relationship between the original words and the new words;
Some words within a candidate word carry no additional meaning in the specific context. In disease names, for example, the word "disease" does not change the meaning of the term whether it is kept or not, but keeping it interferes considerably with the final similarity calculation, especially when the term is short. Such high-frequency, meaningless words therefore need to be removed, and a mapping relationship between the original words and the new words established. Example: after the high-frequency meaningless word "disease" is removed, "hypertension disease" is mapped to "hypertension".
Meaningless words can be identified and removed by keyword recognition, or they can be removed manually.
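Keyword-based removal with a retained mapping can be sketched as follows. The stop list `MEANINGLESS_WORDS`, the whitespace tokenization, and the English rendering "hypertension disease" are assumptions for illustration; the original terms are Chinese, where the meaningless word would be removed as a substring rather than as a token.

```python
# Hypothetical stop list of high-frequency, meaningless words.
MEANINGLESS_WORDS = {"disease"}


def strip_meaningless(candidate, stop_words=MEANINGLESS_WORDS):
    kept = [tok for tok in candidate.split() if tok not in stop_words]
    # Fall back to the original if nothing would remain.
    return " ".join(kept) if kept else candidate


def build_mapping(candidates):
    # original candidate word -> corrected candidate word
    return {c: strip_meaningless(c) for c in candidates}


mapping = build_mapping(["hypertension disease", "hypertension"])
```

The mapping is kept so that synonym relationships found on the corrected words can later be traced back to the original candidate words.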
Step S105: determining the synonym relationships among the candidate words of the same class according to word similarity;
For each group of mapped candidate words, word similarity is calculated pairwise. When the similarity between two candidate words exceeds a specified threshold, the two are considered synonyms, are kept in the same group, and a mutual synonym relationship is established between them.
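The pairwise comparison of step S105 can be sketched as below. The application fixes neither the similarity measure nor the threshold, so `difflib.SequenceMatcher` and the value 0.8 are assumptions, as are the sample terms.

```python
from difflib import SequenceMatcher
from itertools import combinations


def synonym_pairs(words, threshold=0.8):
    # Compare every unordered pair once; keep pairs above the threshold.
    pairs = []
    for a, b in combinations(words, 2):
        if SequenceMatcher(None, a, b).ratio() > threshold:
            pairs.append((a, b))
    return pairs


group = [
    "bilateral breast fibroadenoma",
    "bilateral breast fibroadenomas",
    "bilateral breast cyst",
]
pairs = synonym_pairs(group)
```

Because the comparison runs only inside one class, the number of pairs grows with the square of the class size rather than of the whole candidate list.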
Step S106: determining a standard term corresponding to the candidate words having a synonym relationship, and establishing a mapping relationship between the standard term and its corresponding candidate words.
After the synonym relationships among the candidate words are established, if the similarity between the candidate words and an existing standard term is greater than the specified threshold, the candidate words are taken as synonyms of that standard term, and it becomes their standard term; if no standard term has a similarity greater than the specified threshold, one of the candidate words can be selected as the standard term for the group and added to the standard vocabulary library.
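Step S106 can be sketched as follows. The function name, the `difflib` similarity measure, the 0.8 threshold, and the rule of promoting the first candidate word when no standard term matches are all assumptions; the application only says that one of the candidate words can be selected.

```python
from difflib import SequenceMatcher


def choose_standard_term(synonym_group, standard_terms, threshold=0.8):
    best_term, best_score = None, threshold
    for std in standard_terms:
        for cand in synonym_group:
            score = SequenceMatcher(None, cand, std).ratio()
            if score > best_score:
                best_term, best_score = std, score
    if best_term is None:
        # Assumption: promote the first candidate and extend the library.
        best_term = synonym_group[0]
        standard_terms.append(best_term)
    # candidate word -> standard term
    return {cand: best_term for cand in synonym_group}


library = ["hypertension"]
known = choose_standard_term(["hypertension disease", "hypertension"], library)
new = choose_standard_term(
    ["bilateral breast fibroadenoma", "bilateral breast fibroadenomas"], library
)
```

The first call reuses the existing standard term, while the second finds no match and adds a new entry to the library.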
The automatic medical term standardization method provided by the invention can process the candidate words to be standardized in batches and obtain the standard term corresponding to each of them. Data processing is fast and the method is widely applicable, which improves the efficiency of medical term standardization.
Corresponding to the above method, and referring to FIG. 2, the present application also discloses an automatic medical term standardization apparatus, comprising:
a basic data vocabulary acquisition unit 100, configured to acquire a basic data vocabulary that contains basic medical vocabulary;
a classification unit 200, configured to classify the candidate words to be standardized based on the basic data vocabulary, to judge whether antonyms exist among the candidate words belonging to the same class and, when antonyms exist among different candidate words belonging to the same class, to further subdivide the candidate words of that class based on the antonyms;
a similarity calculation unit 300, configured to calculate the similarity between the candidate words belonging to the same class and to establish a synonym relationship between the candidate words whose similarity is greater than a preset value;
and a standard vocabulary construction unit 400, configured to determine a standard term corresponding to the candidate words having the synonym relationship and to establish a mapping relationship between the standard term and its corresponding candidate words.
Corresponding to the method, the apparatus further comprises a candidate word optimization unit, configured to remove meaningless words from each candidate word, record the results as corrected candidate words, and determine the mapping relationship between the candidate words and the corrected candidate words;
when calculating the similarity between the candidate words belonging to the same class and establishing a synonym relationship between the candidate words whose similarity is greater than a preset value, the similarity calculation unit is specifically configured to:
calculate the similarity among the corrected candidate words belonging to the same class, and establish a synonym relationship among the corrected candidate words whose similarity is greater than the preset value.
Corresponding to the method, when determining the standard term corresponding to the candidate words having a synonym relationship, the standard vocabulary construction unit in the apparatus is specifically configured to:
judge whether any of the candidate words having the synonym relationship has a corresponding standard term; if so, take that standard term as the standard term for the candidate words having the synonym relationship; if not, select one of the candidate words having the synonym relationship as the standard term.
Corresponding to the method, the basic medical vocabulary in the apparatus includes one or more of body site, laterality, and type, and the basic data vocabulary also stores the synonym relationships between basic medical words.
Corresponding to the method, when classifying the candidate words to be standardized based on the basic data vocabulary, the classification unit in the apparatus is specifically configured to:
extract, based on the basic data vocabulary, the basic medical words contained in each candidate word to be standardized;
and judge whether there are candidate words whose basic medical words are all identical or synonymous and, if so, place those candidate words in the same class.
For convenience of description, the above apparatus is described with its functions divided into separate modules. Of course, when the present application is implemented, the functions of the modules may be realized in the same one or more pieces of software and/or hardware.
The embodiments in this specification are described in a progressive manner; for parts that are the same or similar among the embodiments, reference may be made from one to another, and each embodiment focuses on its differences from the others. In particular, the apparatus embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for related points, refer to the corresponding description of the method embodiments. The apparatus embodiments described above are only illustrative: units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), memory, Read Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
It is further noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A method for automatically standardizing medical terms, comprising:
acquiring a basic data vocabulary, wherein the basic data vocabulary contains basic medical vocabulary;
classifying the candidate words to be standardized based on the basic data vocabulary;
judging whether antonyms exist among the candidate words belonging to the same class and, when antonyms exist among different candidate words belonging to the same class, further subdividing the candidate words of that class based on the antonyms;
calculating the similarity between the candidate words belonging to the same class;
establishing a synonym relationship among the candidate words whose similarity is greater than a preset value;
and determining a standard term corresponding to the candidate words having the synonym relationship, and establishing a mapping relationship between the standard term and its corresponding candidate words.
2. The method for automatically standardizing medical terms according to claim 1, further comprising:
removing meaningless words from each candidate word, recording the result as a corrected candidate word, and determining the mapping relationship between the candidate words and the corrected candidate words;
wherein calculating the similarity among the candidate words belonging to the same class and establishing a synonym relationship among the candidate words whose similarity is greater than a preset value comprises:
calculating the similarity among the corrected candidate words belonging to the same class, and establishing a synonym relationship among the corrected candidate words whose similarity is greater than the preset value.
3. The method for automatically standardizing medical terms according to claim 1, wherein determining the standard term corresponding to the candidate words having a synonym relationship comprises:
judging whether any of the candidate words having the synonym relationship has a corresponding standard term; if so, taking that standard term as the standard term for the candidate words having the synonym relationship; if not, selecting one of the candidate words having the synonym relationship as the standard term.
4. The method for automatically standardizing medical terms according to claim 1, wherein the basic medical vocabulary comprises one or more of body site, laterality, and type, and the basic data vocabulary also stores the synonym relationships between basic medical words.
5. The method for automatically standardizing medical terms according to claim 1, wherein classifying the candidate words to be standardized based on the basic data vocabulary comprises:
extracting, based on the basic data vocabulary, the basic medical words contained in each candidate word to be standardized;
and judging whether there are candidate words whose basic medical words are all identical or synonymous and, if so, placing those candidate words in the same class.
6. An automatic medical term standardizing apparatus, comprising:
a base data vocabulary acquisition unit, configured to acquire a base data vocabulary containing basic medical vocabularies;
a classification unit, configured to classify the alternative words to be standardized based on the base data vocabulary, to determine whether antonyms exist among the alternative words belonging to the same class, and, when antonyms exist among different alternative words belonging to the same class, to further reclassify the alternative words of that class based on the antonyms;
a similarity calculation unit, configured to calculate the similarity among the alternative words belonging to the same class and to establish a synonym relationship among the alternative words whose similarity is greater than a preset value;
and a standard vocabulary constructing unit, configured to determine the standard vocabulary corresponding to the alternative words having the synonym relationship and to establish the mapping relationship between the standard vocabulary and its corresponding alternative words.
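The antonym-based reclassification performed by the classification unit can be sketched as below. The list `ANTONYM_PAIRS`, the whitespace tokenization, and the choice to place words containing neither antonym into their own subclass are assumptions; the claim only states that a class is split further when antonyms occur among its alternative words.

```python
# Hypothetical antonym pairs; the claim does not list them.
ANTONYM_PAIRS = [("acute", "chronic"), ("benign", "malignant")]


def split_by_antonyms(words_in_class, antonym_pairs=ANTONYM_PAIRS):
    """Split one class wherever two of its words contain opposite terms."""
    classes = [list(words_in_class)]
    for left, right in antonym_pairs:
        next_classes = []
        for group in classes:
            has_left = [w for w in group if left in w.lower().split()]
            has_right = [
                w for w in group
                if right in w.lower().split() and w not in has_left
            ]
            if has_left and has_right:
                # Antonyms coexist in this class, so it is divided further.
                rest = [w for w in group if w not in has_left and w not in has_right]
                next_classes.extend(g for g in (has_left, has_right, rest) if g)
            else:
                next_classes.append(group)
        classes = next_classes
    return classes


result = split_by_antonyms(
    ["acute gastritis", "chronic gastritis", "acute gastric inflammation"]
)
```

Splitting before the similarity step keeps terms such as "acute" and "chronic" variants from being linked as synonyms merely because their strings are close.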
7. The automatic medical term standardization device according to claim 6, further comprising an alternative word optimization unit configured to remove meaningless words from each of the alternative words, record the results as corrected alternative words, and determine the mapping relationship between the alternative words and the corrected alternative words;
wherein the similarity calculation unit, when calculating the similarity among the alternative words belonging to the same class and establishing a synonym relationship among the alternative words whose similarity is greater than a preset value, is specifically configured to:
calculate the similarity among the corrected alternative words belonging to the same class, and establish a synonym relationship among the corrected alternative words whose similarity is greater than the preset value.
8. The automatic normalizing device for medical terms according to claim 6, wherein the standard vocabulary constructing unit, when determining the standard vocabulary corresponding to the alternative words having the synonym relationship, is specifically configured to:
determine whether the alternative words having the synonym relationship have a corresponding standard vocabulary; if so, take that standard vocabulary as the standard vocabulary corresponding to the alternative words having the synonym relationship; and if not, select one of the alternative words having the synonym relationship as the standard vocabulary.
9. The automatic normalization apparatus for medical terms according to claim 6, wherein the basic medical vocabulary comprises one or more of location, profile, and typing, and the base data vocabulary also stores synonym relationships between basic medical vocabularies.
10. The automatic normalization apparatus for medical terms according to claim 6, wherein the classification unit, when classifying the alternative words to be normalized based on the base data vocabulary, is specifically configured to:
extract, based on the base data vocabulary, the basic medical vocabularies contained in each alternative word to be standardized;
and determine whether there exist alternative words whose basic medical vocabularies are all identical or synonymous, and if so, assign those alternative words to the same class.
CN202110511800.XA 2021-05-11 2021-05-11 Automatic medical term standardization method and device Active CN112926315B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110511800.XA CN112926315B (en) 2021-05-11 2021-05-11 Automatic medical term standardization method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110511800.XA CN112926315B (en) 2021-05-11 2021-05-11 Automatic medical term standardization method and device

Publications (2)

Publication Number Publication Date
CN112926315A (en) 2021-06-08
CN112926315B (en) 2021-08-03

Family

ID=76174825

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110511800.XA Active CN112926315B (en) 2021-05-11 2021-05-11 Automatic medical term standardization method and device

Country Status (1)

Country Link
CN (1) CN112926315B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642327A (en) * 2021-10-14 2021-11-12 中国光大银行股份有限公司 Method and device for constructing standard knowledge base

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120296669A1 (en) * 2009-09-17 2012-11-22 General Electric Company Systems, methods, and apparatus for automated mapping and integrated workflow of a controlled medical vocabulary
CN111368555A (en) * 2020-05-27 2020-07-03 腾讯科技(深圳)有限公司 Data identification method and device, storage medium and electronic equipment
CN111445968A (en) * 2020-03-16 2020-07-24 平安国际智慧城市科技股份有限公司 Electronic medical record query method and device, computer equipment and storage medium
CN111506673A (en) * 2020-03-27 2020-08-07 泰康保险集团股份有限公司 Medical record classification code determination method and device
CN112015866A (en) * 2020-08-28 2020-12-01 北京百度网讯科技有限公司 Method, device, electronic equipment and storage medium for generating synonymous text
CN112214995A (en) * 2019-07-09 2021-01-12 百度(美国)有限责任公司 Hierarchical multitask term embedding learning for synonym prediction

Also Published As

Publication number Publication date
CN112926315B (en) 2021-08-03

Similar Documents

Publication Publication Date Title
WO2021139262A1 (en) Document mesh term aggregation method and apparatus, computer device, and readable storage medium
TWI653542B (en) Method, system and device for discovering and tracking hot topics based on network media data flow
WO2022105115A1 (en) Question and answer pair matching method and apparatus, electronic device and storage medium
CN109815487B (en) Text quality inspection method, electronic device, computer equipment and storage medium
CN111104526A (en) Financial label extraction method and system based on keyword semantics
US20140214835A1 (en) System and method for automatically classifying documents
WO2018086401A1 (en) Cluster processing method and device for questions in automatic question and answering system
CN110347701B (en) Target type identification method for entity retrieval query
CN110543592A (en) Information searching method and device and computer equipment
CN107491447B (en) Method for establishing query rewrite judging model, method for judging query rewrite and corresponding device
CN112559684A (en) Keyword extraction and information retrieval method
CN112434194A (en) Similar user identification method, device, equipment and medium based on knowledge graph
WO2022222942A1 (en) Method and apparatus for generating question and answer record, electronic device, and storage medium
CN112270178B (en) Medical literature cluster theme determination method and device, electronic equipment and storage medium
WO2019080428A1 (en) Method for obtaining target document and application server
CN112926315B (en) Automatic medical term standardization method and device
CN108121721A (en) Intension recognizing method and device
CN114969387A (en) Document author information disambiguation method and device and electronic equipment
KR102371505B1 (en) A program for labeling news articles using big data
CN116882414B (en) Automatic comment generation method and related device based on large-scale language model
CN111782970B (en) Data analysis method and device
CN116049376B (en) Method, device and system for retrieving and replying information and creating knowledge
CN109144999B (en) Data positioning method, device, storage medium and program product
CN110941713B (en) Self-optimizing financial information block classification method based on topic model
CN110929526A (en) Sample generation method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant