WO2021184729A1

WO2021184729A1 - Drug classification method and apparatus, storage medium, and intelligent device

Info

Publication number: WO2021184729A1
Application number: PCT/CN2020/119301
Authority: WO
Inventors: 蒋雪涵; 孙行智; 胡岗; 赵惟; 左磊; 徐卓扬
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-03-17
Filing date: 2020-09-30
Publication date: 2021-09-23
Also published as: CN111475686A

Abstract

A drug classification method and apparatus, a storage medium, and an intelligent device. The method comprises: obtaining an original medication record table, and extracting medication information from the original medication record table (S101); cleaning the medication information according to a preset cleaning rule to obtain drug names (S102); matching the drug names with drug generic names in a drug information standard library to determine standard drug generic names corresponding to the drug names (S103); obtaining a drug classification demand input by a user (S104); and according to the standard drug generic names obtained by matching and the drug classification demand, determining classification labels corresponding to the drug names, the classification labels being labels predefined in advance according to the drug classification demand and the standard drug generic names (S105). By means of the method, the efficiency of obtaining and utilizing drug information can be improved, and effective management of the drug information can be realized.

Description

一种药品分类方法、装置、存储介质和智能设备Medicine classification method, device, storage medium and intelligent equipment

本申请要求于2020年03月17日提交中国专利局、申请号为202010185346.9，发明名称为“一种药品分类方法、装置、存储介质和智能设备”的中国专利申请的优先权，其全部内容通过引用结合在本申请中。This application claims the priority of the Chinese patent application filed with the Chinese Patent Office on March 17, 2020 with the application number 202010185346.9 and the invention title "A method, device, storage medium and smart device for drug classification", and the entire content of which is approved The reference is incorporated in this application.

技术领域Technical field

本申请属于信息处理技术领域，尤其涉及一种药品分类方法、装置、存储介质和智能设备。This application belongs to the field of information processing technology, and in particular relates to a method, device, storage medium, and smart device for classifying medicines.

背景技术Background technique

随着信息技术的快速发展，我国医药行业正加速医疗信息化建设。医疗信息化建设有助于提升医疗处理效率，给患者提供很好的体验，为提高医疗服务质量提供很大帮助。药品信息是医疗保险结算的重要依据，也是医疗信息化建设的重要组成部分。对不同的药品进行分类，可大大提高利用和管理药品信息的效率，对发展医疗信息化建设具有重要的意义。With the rapid development of information technology, my country's pharmaceutical industry is accelerating the construction of medical information. The construction of medical information will help improve the efficiency of medical treatment, provide patients with a good experience, and provide great help to improve the quality of medical services. Drug information is an important basis for medical insurance settlement and an important part of medical information construction. The classification of different drugs can greatly improve the efficiency of using and managing drug information, and is of great significance to the development of medical information construction.

发明人发现，在对药品进行药品分类的过程中，通常需要根据药品的特征和分类规则对药品进行分类，然而，各个医院根据自身的情况设有特定的药品分类标准，在进行医疗信息整合时，由于药品分类标准不一致，不同医院的药品信息无法自动关联，不利于提高获取药品信息的效率。The inventor found that in the process of classifying drugs, it is usually necessary to classify drugs according to the characteristics and classification rules of drugs. However, each hospital has specific drug classification standards according to its own situation. When performing medical information integration Due to the inconsistent drug classification standards, the drug information of different hospitals cannot be automatically correlated, which is not conducive to improving the efficiency of obtaining drug information.

技术问题technical problem

有鉴于此，本申请实施例提供了一种药品分类方法、装置、存储介质和智能设备，以解决现有技术中，由于药品分类标准不一致，不同医院的药品信息无法自动关联，不利于提高获取利用药品信息的效率的问题。In view of this, the embodiments of the present application provide a drug classification method, device, storage medium, and smart device to solve the problem that in the prior art, due to inconsistent drug classification standards, the drug information of different hospitals cannot be automatically correlated, which is not conducive to improving the acquisition. The issue of the efficiency of using drug information.

技术解决方案Technical solutions

第一方面，本申请实施例提供了一种药品分类方法，包括：In the first aspect, the embodiments of the present application provide a method for classifying drugs, including:

获取原始用药记录表，并从所述原始用药记录表中提取用药信息；Obtain an original medication record sheet, and extract medication information from the original medication record sheet;

按预设清洗规则对所述用药信息进行清洗，得到药品名称；Clean the medication information according to the preset cleaning rules to obtain the name of the medication;

将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名；Match the drug name with the drug generic name in the drug information standard library, and determine the standard drug generic name corresponding to the drug name;

获取用户输入的药品分类需求；Obtain the drug classification requirements entered by the user;

根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签。According to the matched standard drug generic name and the drug classification requirement, a classification label corresponding to the drug name is determined, and the classification label is a label predefined according to the drug classification requirement and the standard drug generic name.

第二方面，本申请实施例提供了一种药品分类装置，包括：In the second aspect, an embodiment of the present application provides a medicine classification device, including:

用药信息提取单元，用于获取原始用药记录表，并从所述原始用药记录表中提取用药信息；The medication information extraction unit is used to obtain the original medication record sheet, and extract medication information from the original medication record sheet;

信息清洗单元，用于按预设清洗规则对所述用药信息进行清洗，得到药品名称；The information cleaning unit is used to clean the medication information according to the preset cleaning rules to obtain the name of the medicine;

通用名匹配单元，用于将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名；The generic name matching unit is used to match the drug name with the drug generic name in the drug information standard library, and determine the standard drug generic name corresponding to the drug name;

分类需求获取单元，用于获取用户输入的药品分类需求；The classification requirement obtaining unit is used to obtain the pharmaceutical classification requirements input by the user;

分类标签确定单元，用于根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签；The classification label determining unit is configured to determine the classification label corresponding to the drug name according to the matched standard drug generic name and the drug classification requirement, and the classification label is predefined according to the drug classification requirement and the standard drug generic name Label;

药品分类单元，根据所述分类标签，将所述药品名称对应的药品分类。The medicine classification unit classifies the medicine corresponding to the medicine name according to the classification label.

第三方面，本申请实施例提供了一种计算机可读存储介质，所述计算机可读存储介质存储有计算机可读指令，所述计算机可读指令被处理器执行时实现如下步骤：In a third aspect, an embodiment of the present application provides a computer-readable storage medium, where the computer-readable storage medium stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, the following steps are implemented:

第四方面，本申请实施例提供了一种智能设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令，所述处理器执行所述计算机可读指令时实现如下步骤：In a fourth aspect, an embodiment of the present application provides a smart device, including a memory, a processor, and computer-readable instructions stored in the memory and running on the processor, and the processor executes the computer The following steps are implemented when the instruction is readable:

第五方面，本申请实施例提供了一种计算机可读指令产品，当计算机可读指令产品在终端设备上运行时，使得终端设备执行上述第一方面所述的药品分类方法。In a fifth aspect, embodiments of the present application provide a computer-readable instruction product, which when the computer-readable instruction product runs on a terminal device, causes the terminal device to execute the drug classification method described in the first aspect.

有益效果Beneficial effect

本申请实施例中提供的药品分类方法，自动按预设清洗规则对所述用药信息进行清洗，得到药品名称，使得用于匹配标准药品通用名的药品名称简洁，提高匹配准确性，无需关联其他医院的药品信息，同时可按需求对药品分类，大大提高了获取利用药品信息的效率，并且实现对药品信息进行有效的管理。The drug classification method provided in the embodiments of this application automatically cleans the medication information according to the preset cleaning rules to obtain the drug name, so that the drug name used to match the standard drug generic name is concise, and the matching accuracy is improved, without the need for other associations. The hospital's drug information can also be classified according to needs, which greatly improves the efficiency of obtaining and using drug information, and realizes effective management of drug information.

附图说明Description of the drawings

为了更清楚地说明本申请实施例中的技术方案，下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍，显而易见地，下面描述中的附图仅仅是本申请的一些实施例，对于本领域普通技术人员来讲，在不付出创造性劳动性的前提下，还可以根据这些附图获得其他的附图。In order to more clearly describe the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings that need to be used in the description of the embodiments or the prior art. Obviously, the drawings in the following description are only of the present application. For some embodiments, those of ordinary skill in the art can obtain other drawings based on these drawings without creative labor.

图1是本申请实施例提供的药品分类方法的实现流程图；Fig. 1 is a flow chart of the implementation of the drug classification method provided by the embodiment of the present application;

图2是本申请实施例提供的药品分类方法S103的具体实现流程图；FIG. 2 is a specific implementation flowchart of the drug classification method S103 provided by an embodiment of the present application;

图3是本申请另一实施例提供的药品分类方法S103的具体实现流程图；FIG. 3 is a specific implementation flowchart of a drug classification method S103 provided by another embodiment of the present application;

图4是本申请再一实施例提供的药品分类方法S103的具体实现流程图；FIG. 4 is a specific implementation flowchart of a drug classification method S103 provided by still another embodiment of the present application;

图5是本申请实施例提供的当所述药品名称与药品信息标准库中的药品通用名无法匹配时药品分类方法的实现流程图；FIG. 5 is a flowchart of the implementation of the method for classifying drugs when the drug name cannot match the generic name of the drug in the drug information standard library provided by an embodiment of the present application;

图6是本申请实施例提供的药品分类装置的结构框图；Figure 6 is a structural block diagram of a drug classification device provided by an embodiment of the present application;

图7是本申请实施例提供的智能设备的示意图。Fig. 7 is a schematic diagram of a smart device provided by an embodiment of the present application.

本发明的实施方式Embodiments of the present invention

以下描述中，为了说明而不是为了限定，提出了诸如特定***结构、技术之类的具体细节，以便透彻理解本申请实施例。然而，本领域的技术人员应当清楚，在没有这些具体细节的其它实施例中也可以实现本申请。在其它情况中，省略对众所周知的***、装置、电路以及方法的详细说明，以免不必要的细节妨碍本申请的描述。另外，在本申请说明书和所附权利要求书的描述中，术语“第一”、“第二”、“第三”等仅用于区分描述，而不能理解为指示或暗示相对重要性。In the following description, for the purpose of illustration rather than limitation, specific details such as a specific system structure and technology are proposed for a thorough understanding of the embodiments of the present application. However, it should be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted to avoid unnecessary details from obstructing the description of this application. In addition, in the description of the specification of this application and the appended claims, the terms "first", "second", "third", etc. are only used to distinguish the description, and cannot be understood as indicating or implying relative importance.

以下描述中，为了说明而不是为了限定，提出了诸如特定***结构、技术之类的具体细节，以便透彻理解本申请实施例。然而，本领域的技术人员应当清楚，在没有这些具体细节的其它实施例中也可以实现本申请。在其它情况中，省略对众所周知的***、装置、电路以及方法的详细说明，以免不必要的细节妨碍本申请的描述。In the following description, for the purpose of illustration rather than limitation, specific details such as a specific system structure and technology are proposed for a thorough understanding of the embodiments of the present application. However, it should be clear to those skilled in the art that the present application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted to avoid unnecessary details from obstructing the description of this application.

应当理解，当在本申请说明书和所附权利要求书中使用时，术语“包括”指示所描述特征、整体、步骤、操作、元素和/或组件的存在，但并不排除一个或多个其它特征、整体、步骤、操作、元素、组件和/或其集合的存在或添加。It should be understood that when used in the specification and appended claims of this application, the term "comprising" indicates the existence of the described features, wholes, steps, operations, elements and/or components, but does not exclude one or more other The existence or addition of features, wholes, steps, operations, elements, components, and/or collections thereof.

还应当理解，在本申请说明书和所附权利要求书中使用的术语“和/或”是指相关联列出的项中的一个或多个的任何组合以及所有可能组合，并且包括这些组合。It should also be understood that the term "and/or" used in the specification and appended claims of this application refers to any combination of one or more of the associated listed items and all possible combinations, and includes these combinations.

如在本申请说明书和所附权利要求书中所使用的那样，术语“如果”可以依据上下文被解释为“当...时”或“一旦”或“响应于确定”或“响应于检测到”。类似地，短语“如果确定”或“如果检测到[所描述条件或事件]”可以依据上下文被解释为意指“一旦确定”或“响应于确定”或“一旦检测到[所描述条件或事件]”或“响应于检测到[所描述条件或事件]”。As used in the description of this application and the appended claims, the term "if" can be construed as "when" or "once" or "in response to determination" or "in response to detecting ". Similarly, the phrase "if determined" or "if detected [described condition or event]" can be interpreted as meaning "once determined" or "in response to determination" or "once detected [described condition or event]" depending on the context ]" or "in response to detection of [condition or event described]".

另外，在本申请说明书和所附权利要求书的描述中，术语“第一”、“第二”、“第三”等仅用于区分描述，而不能理解为指示或暗示相对重要性。In addition, in the description of the specification of this application and the appended claims, the terms "first", "second", "third", etc. are only used to distinguish the description, and cannot be understood as indicating or implying relative importance.

在本申请说明书中描述的参考“一个实施例”或“一些实施例”等意味着在本申请的一个或多个实施例中包括结合该实施例描述的特定特征、结构或特点。由此，在本说明书中的不同之处出现的语句“在一个实施例中”、“在一些实施例中”、“在其他一些实施例中”、“在另外一些实施例中”等不是必然都参考相同的实施例，而是意味着“一个或多个但不是所有的实施例”，除非是以其他方式另外特别强调。术语“包括”、“包含”、“具有”及它们的变形都意味着“包括但不限于”，除非是以其他方式另外特别强调。Reference to "one embodiment" or "some embodiments" described in the specification of this application means that one or more embodiments of this application include a specific feature, structure, or characteristic described in combination with the embodiment. Therefore, the sentences "in one embodiment", "in some embodiments", "in some other embodiments", "in some other embodiments", etc. appearing in different places in this specification are not necessarily All refer to the same embodiment, but mean "one or more but not all embodiments" unless it is specifically emphasized otherwise. The terms "including", "including", "having" and their variations all mean "including but not limited to", unless otherwise specifically emphasized.

本申请实施例提供的一种药品分类方法可以应用于服务器、平板电脑、笔记本电脑、超级移动个人计算机(ultra-mobile personal computer，UMPC)、上网本、等智能设备上，本申请实施例对终端设备的具体类型不作任何限制。The drug classification method provided by the embodiment of this application can be applied to smart devices such as servers, tablet computers, laptop computers, ultra-mobile personal computers (UMPC), netbooks, and so on. There are no restrictions on the specific types.

本申请实施例提出一种药品分类方法，具体涉及聚类分析技术，使得药品分类信息具有通用性，可适用范围更为广泛，大大提高利用和管理药品信息的效率。The embodiment of the application proposes a drug classification method, which specifically involves cluster analysis technology, which makes the drug classification information universal, has a wider applicable range, and greatly improves the efficiency of using and managing drug information.

图1示出了本申请实施例提供的药品分类方法的实现流程，该方法流程包括步骤S101至S105。各步骤的具体实现原理如下：FIG. 1 shows the implementation process of the drug classification method provided by the embodiment of the present application, and the method process includes steps S101 to S105. The specific implementation principles of each step are as follows:

S101：获取原始用药记录表，并从所述原始用药记录表中提取用药信息。S101: Obtain an original medication record table, and extract medication information from the original medication record table.

在本申请实施例中，获取目标医院的原始用药记录表，并从所述原始用药记录表中提取用药信息。智能设备通过连接并访问目标医院的服务器收集原始用药记录表。其中，所述目标医院指的是待进行医疗信息整合的医院，所述原始用药记录表指医院用于记录用药信息的记录表。在一些可能的实施方式中，为提高数据清洗和分类的效率，可获取指定时间段的原始用药记录表，避免对已经过清洗和分类的药品信息再次重复清洗和分类。In this embodiment of the application, the original medication record form of the target hospital is obtained, and medication information is extracted from the original medication record form. The smart device collects the original medication record form by connecting and accessing the server of the target hospital. Wherein, the target hospital refers to a hospital to be integrated with medical information, and the original medication record sheet refers to a record sheet used by the hospital to record medication information. In some possible implementation manners, in order to improve the efficiency of data cleaning and classification, the original medication record table for a specified time period can be obtained to avoid repeated cleaning and classification of the drug information that has been cleaned and classified.

所述原始用药记录表中由多条用药信息组成。对所述原始用药记录表进行分割，获取组成所述原始用药记录表的一条一条的用药信息。具体地，获取所述原始用药记录表的表结构，根据所述表结构进行分割，得到所述原始用药记录表中的各条用药信息，所述用药信息中包括但不限于用药时间、剂量以及药品信息。The original medication record table consists of multiple pieces of medication information. The original medication record table is divided to obtain the medication information one by one constituting the original medication record table. Specifically, the table structure of the original medication record table is obtained, and the table structure is segmented to obtain each piece of medication information in the original medication record table. The medication information includes, but is not limited to, medication time, dosage, and Drug information.

在一些可能的实施方式中，再从所述用药信息中提取药品信息，统计相同药品信息在所述原始用药记录表中出现的频次。统计频次的目的在于确定药品清洗的优先级。如果资源有限，在清洗的过程中，先清洗出现频率较高的药品名。In some possible implementation manners, the drug information is then extracted from the medication information, and the frequency of the same medication information in the original medication record table is counted. The purpose of statistical frequency is to determine the priority of medicine cleaning. If resources are limited, in the cleaning process, first clean the names of drugs that appear frequently.

S102：按预设清洗规则对所述用药信息进行清洗，得到药品名称。S102: Clean the medication information according to the preset cleaning rule to obtain the name of the medication.

具体地，所述用药信息中包括但不限于用药时间、剂量、药品信息、给药途径、生产厂商等，所述药品信息包括药品名称，所述按预设清洗规则对所述用药信息进行清洗是指将所述用药信息中的除药品名称以外的其他用药信息去除。Specifically, the medication information includes but is not limited to medication time, dosage, medication information, route of administration, manufacturer, etc., the medication information includes the name of the medication, and the medication information is cleaned according to a preset cleaning rule It refers to removing other medication information except the name of the medication from the medication information.

在一些可能的实施方式中，通过药品名称正则化实现这一步骤，所述药品名称正则化是指减少药品信息的特征维度。例如，用法用量经常包含数字，生产厂商一般是在括号内，通过信息特征去掉除药品名称以外的其他用药信息。或者，通过建立标准词库实现该过程，例如，将经常出现的给药途径整合为标准库，删除用药信息中包含给药途径标准库的词，类似的，将给药途径、生产厂商等信息设定为目标词，将所述用药信息中的目标词去除。在本实施例中，去掉用药信息中的非药品名称的信息，如给药途径(静脉注射、静注、口服、皮下等)、用法用量(X.XXmg、每日X片等)、生产厂商(哈六厂、赛诺菲等)，从而得到原始用药记录表中记录的药品的药品名称。In some possible implementation manners, this step is implemented through drug name regularization, which refers to reducing the feature dimension of drug information. For example, usage and dosage often include numbers, and manufacturers generally use parentheses to remove other medication information except the name of the drug through information features. Or, the process can be realized by establishing a standard vocabulary, for example, integrating frequently occurring routes of administration into a standard database, deleting the words in the drug information containing the standard database of administration routes, and similarly adding information such as route of administration, manufacturer, etc. Set as the target word, and remove the target word in the medication information. In this embodiment, the non-drug name information in the medication information is removed, such as the route of administration (intravenous injection, intravenous injection, oral, subcutaneous, etc.), usage and dosage (X.XXmg, daily X tablets, etc.), manufacturer (Harbin No. 6 Factory, Sanofi, etc.) to obtain the drug name of the drug recorded in the original drug record sheet.

在本申请实施例中，经过清洗得到的药品名称为药品中主要成分的名字。比如原始记录中为“口服盐酸二甲双胍缓释片(赛诺菲)”，去掉药物成分等与名称不相关的词语后，仅剩下“二甲双胍”。这使得在进行后续的药品名称匹配和基于词语相似聚类中，更有可能与相同成分的药物匹配成功或聚为一类。In the embodiment of the present application, the name of the medicine obtained after cleaning is the name of the main ingredient in the medicine. For example, the original record is "Oral Metformin Hydrochloride Sustained-Release Tablets (Sanofi)". After removing words that are not related to the name such as drug ingredients, only "metformin" is left. This makes it more likely that in subsequent drug name matching and word similarity clustering, it is more likely that drugs of the same composition will be successfully matched or clustered together.

S103：将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名。S103: Match the drug name with the drug generic name in the drug information standard library, and determine the standard drug generic name corresponding to the drug name.

在本实施例中，药品信息的标准化依赖于一个已存在的药品信息标准库，该药品信息标准库中包含了某个标准的药品通用名及其对应的所有关于该药品的信息，所述药品信息标准库中包括标准的药品通用名，将获取的原始用药记录表中的药品名称与药品信息标准库中的药品通用名进行模糊匹配，确定所述药品名称对应的标准药品通用名。In this embodiment, the standardization of drug information relies on an existing drug information standard database. The drug information standard database contains the generic name of a certain standard drug and all the corresponding information about the drug. The information standard database includes standard drug generic names, and the obtained drug names in the original medication record form are fuzzy matched with the drug generic names in the drug information standard database to determine the standard drug generic names corresponding to the drug names.

作为本申请的一个实施例，图2示出了本申请实施例提供的药品分类方法步骤S103的具体实现流程，详述如下：As an embodiment of the present application, FIG. 2 shows the specific implementation process of step S103 of the drug classification method provided by the embodiment of the present application, which is described in detail as follows:

A1：计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度。具体地，上述步骤A1具体包括：A1: Calculate the text similarity between the name of the drug and the generic name of the drug in the drug information standard library. Specifically, the above step A1 specifically includes:

A11：获取所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度。所述药品名称和药品通用名均由字符串表示，分别获取所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度。A11: Obtain the character string and character string length of the drug name and the character string and character string length of the drug generic name. The drug name and the general name of the drug are both represented by character strings, and the character string and the length of the character string of the drug name and the character string and the length of the character string of the drug general name are obtained respectively.

A12：根据所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度，计算所述药品名称与所述药品通用名的编辑距离，所述编辑距离是指所述药品名称变换到与所述药品通用名相同时需要经历的最小变换次数。变换包括***、删除与替换。A12: Calculate the edit distance between the drug name and the drug generic name according to the character string and the length of the drug name and the character string and the length of the drug generic name. The edit distance refers to the total The minimum number of times the drug name is changed to be the same as the general name of the drug. Transformation includes insertion, deletion and replacement.

A13：根据所述编辑距离确定所述药品名称与所述药品通用名的文本相似度。具体地，根据如下公式(1)计算所述药品名称与所述药品通用名的相似度值Sim(a，b)：A13: Determine the text similarity between the drug name and the generic name of the drug according to the edit distance. Specifically, the similarity value Sim(a, b) between the name of the drug and the generic name of the drug is calculated according to the following formula (1):

其中，lev(a,b)表示所述药品名称的字符串a与所述药品通用名的字符串b之间的编辑距离，len(a)表示所述字符串a的字符串长度，len(b)表示所述字符串b的字符串长度，max(len(a),len(b))表示所述字符串a与所述字符串b中的较长字符串长度。Wherein, lev(a,b) represents the edit distance between the string a of the drug name and the string b of the generic name of the drug, len(a) represents the string length of the string a, len( b) represents the length of the character string of the character string b, and max(len(a), len(b)) represents the length of the longer character string of the character string a and the character string b.

在一些可能的实施方式中，根据下式(2)确定所述药品名称的字符串a与所述药品通用名的字符串b之间的编辑距离lev(a,b)：In some possible implementation manners, the edit distance lev(a, b) between the character string a of the drug name and the character string b of the generic name of the drug is determined according to the following formula (2):

其中，leva,b(i,j)指的是字符串a中前i个字符和字符串b中前j个字符之间的距离。为了方便理解，这里的可以看作是长度。这里的字符串的第一个字符index索引从1开始，因此最后的编辑距离便是i＝|a|，j＝|b|时的距离leva,b(|a|,|b|)；Among them, leva,b(i,j) refers to the distance between the first i characters in the string a and the first j characters in the string b. In order to facilitate understanding, here can be regarded as length. The index index of the first character of the string here starts from 1, so the final edit distance is the distance leva,b(|a|,|b|) when i=|a|, j=|b|;

当min(i,j)＝0的时候，对应着字符串a中前i个字符和字符串b中前j个字符，此时的i,j有一个值为0，表示字符串a和b中有一个为空串，那么从a转换到b只需要进行次单字符编辑操作即可，所以它们之间的编辑距离为max(i,j)，即i,j中的最大者。When min(i,j)=0, it corresponds to the first i characters in the string a and the first j characters in the string b. At this time, i, j has a value of 0, which means the strings a and b If one of them is an empty string, then only a single-character editing operation is required to switch from a to b, so the editing distance between them is max(i,j), which is the largest of i,j.

当min(i,j)≠0的时候，leva,b(|a|,|b|)为如下三种情况的最小值：When min(i,j)≠0, leva,b(|a|,|b|) is the minimum of the following three cases:

1.leva,b(i-1,j)+1表示删除ai；1.leva,b(i-1,j)+1 means delete ai;

2.leva,b(i,j-1)+1表示***bj；2.leva,b(i,j-1)+1 means insert bj;

3.leva,b(i-1,j-1)+1(ai≠bj)表示替换bj，1(ai≠bj)为一个指示函数，表示当ai＝bj的时候取0；当ai≠bj的时候，其值为1。3.leva,b(i-1,j-1)+1(ai≠bj) means to replace bj, 1(ai≠bj) is an indicator function, which means that when ai=bj, take 0; when ai≠bj When, its value is 1.

A2：若所述文本相似度达到预设相似度阈值，则将所述文本相似度对应的药品通用名确定为所述药品名称的标准药品通用名。A2: If the text similarity reaches the preset similarity threshold, the drug generic name corresponding to the text similarity is determined as the standard drug generic name of the drug name.

在本申请实施例中，将进行相似度比较的药品名称与药品通用名的编辑距离与所述药品名称与所述药品通用名中的较长字符串长度的比值与1的差值来确定相似度值，可提高相似度计算的准确性。示例性地，“盐酸二甲双胍”和“二甲双胍片”编辑距离是3，用编辑距离除以进行比较的字符串中较长的字符串——即“盐酸二甲双胍”，这样计算得到相似度分值较大，匹配率更高。In the embodiment of the present application, the edit distance between the drug name and the generic name of the drug to be compared for similarity and the ratio of the longer character string length between the drug name and the generic name of the drug and the difference of 1 are used to determine the similarity The degree value can improve the accuracy of similarity calculation. Exemplarily, the edit distance of "metformin hydrochloride" and "metformin tablets" is 3. Divide the edit distance by the longer character string in the string for comparison-that is, "metformin hydrochloride", so that the similarity score is calculated Larger, higher matching rate.

作为本申请一种可能的实施方式，由于药品的特殊性，药品中存在名称相似但功效不同的药品。不同功效的药品对应的药品通用名可能相似但绝不相同。如图3所示，上述S103还包括：As a possible implementation of this application, due to the particularity of the medicine, there are medicines with similar names but different efficacies. The generic names of drugs with different functions may be similar but never the same. As shown in Figure 3, the above S103 further includes:

B1：若与所述药品名称的文本相似度达到预设相似度阈值的药品通用名不止一个，则将与所述药品名称的文本相似度达到预设相似度阈值所对应的药品通用名确定为待定通用名。B1: If there is more than one generic drug name whose text similarity to the drug name reaches the preset similarity threshold, the generic name of the drug corresponding to the text similarity to the drug name that reaches the preset similarity threshold is determined as The common name is to be determined.

B2：从所述原始用药记录表中获取所述药品名称对应的待分类药品的适用症状信息。具体地，所述原始用药记录表中不仅包括用药信息，还包括药品的适用症状信息。B2: Obtain applicable symptom information of the drug to be classified corresponding to the drug name from the original drug record table. Specifically, the original medication record table includes not only medication information, but also information on applicable symptoms of the medication.

B3：从所述药品信息标准库中获取所述待定通用名对应的标准药品的适用症状信息。B3: Obtain applicable symptom information of the standard drug corresponding to the undetermined generic name from the drug information standard database.

B4：将与所述待分类药品的适用症状信息相同的标准药品对应的待定通用名确定为所述待分类药品的标准药品通用名。B4: Determine the undetermined generic name corresponding to the standard drug with the same applicable symptom information as the drug to be classified as the standard drug generic name of the drug to be classified.

需要说明的是，本申请实施例中所述适用症状信息相同并不是指绝对相同。具体地，将所述待分类药品的适用症状信息与所述标准药品的适用症状信息进行相似度比较，相似度达到指定相似度值即认定适用症状信息相同。在一种可能的实施方式中，所述标准药品的适用症状信息包含所述待分类药品的适用症状信息，即认定所述标准药品的适用症状信息与所述待分类药品的适用症状信息相同。It should be noted that the same applicable symptom information in the embodiments of the present application does not mean that they are absolutely the same. Specifically, the similarity is compared between the applicable symptom information of the drug to be classified and the applicable symptom information of the standard drug, and the similarity reaches a specified similarity value, that is, it is determined that the applicable symptom information is the same. In a possible implementation, the applicable symptom information of the standard drug includes applicable symptom information of the drug to be classified, that is, it is determined that the applicable symptom information of the standard drug is the same as the applicable symptom information of the drug to be classified.

在本申请实施例中，为使得匹配的准确度更高，在计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度之后，若所述药品信息标准库中与所述药品名称的文本相似度达到所述预设相似度阈值的药品通用名不止一个时，通过进一步比较适用症状信息来确定所述药品名称对应的标准药品通用名，从而使得标准药品通用名的确定更为准确有效。In the embodiment of this application, in order to make the matching accuracy higher, after calculating the text similarity between the drug name and the drug generic name in the drug information standard library, if the drug information standard library is When the text similarity of the drug name reaches the preset similarity threshold, there is more than one generic drug name, the standard drug generic name corresponding to the drug name is determined by further comparing the applicable symptom information, so that the standard drug generic name is determined More accurate and effective.

在一些可能的实施方式中，对于未能与药品信息标准库中的药品通用名成功匹配的药物名称，可通过人工清洗的方式确定标准药品通用名，通过人工检索，得到某个药物别名对应的标准药品通用名。例如，在原始标准库中，没有“速尿”这种说法，经过一次人工的查找后，确定“速尿”对应的药品通用名是“呋塞米”。In some possible implementations, for the drug name that fails to successfully match the drug generic name in the drug information standard library, the standard drug generic name can be determined by manual cleaning, and the corresponding drug alias can be obtained through manual search. Generic name of standard drug. For example, in the original standard library, there was no such word as "furosemide". After a manual search, it was determined that the generic name of the drug corresponding to "furosemide" was "furosemide".

作为本申请一种可能的实施方式，如图4所示，上述步骤S103包括：As a possible implementation manner of the present application, as shown in FIG. 4, the foregoing step S103 includes:

C1：将所述药品名称与药品信息标准库中的药品通用名进行匹配。C1: Match the name of the drug with the generic name of the drug in the drug information standard library.

C2：将匹配成功的药品名称归入第一类药品名称集，并将匹配到的药品通用名确定为所述第一类药品名称集中各药品名称对应的标准药品通用名。所述第一类药品名称集用于存放匹配确定了标准药品通用名的药品名称。C2: Classify the matched drug names into the first-category drug name set, and determine the matched drug generic names as the standard drug generic names corresponding to each drug name in the first-category drug name set. The first type of drug name set is used to store drug names matching the common names of standard drugs.

C3：将未能匹配成功的药品名称归入第二类药品名称集。所述第二类药品名称集用于存放未匹配确定标准药品通用名的药品名称。C3: Classify the drug names that failed to match into the second category of drug name sets. The second type of drug name set is used to store drug names that do not match the common name of the determined standard drug.

C4：将所述第一类药品名称集中的药品名称按指定聚类算法进行聚类，得到第一聚类名称子集，所述第一聚类名称子集中包括聚类得到的第一聚类名称。C4: Cluster the drug names in the first type of drug name set according to a specified clustering algorithm to obtain a first cluster name subset, and the first cluster name subset includes the first cluster obtained by clustering name.

C5：将所述第二类药品名称集中的药品名称按指定聚类算法进行聚类，得到第二聚类名称子集，所述第二聚类名称子集中包括聚类得到的第二聚类名称。C5: Cluster the drug names in the second type of drug name set according to the designated clustering algorithm to obtain a second cluster name subset, and the second cluster name subset includes the second cluster obtained by clustering name.

C6：将所述第一聚类名称与所述第二聚类名称进行匹配，根据匹配结果确定所述第二类药品名称集中的药品名称对应的标准药品通用名。C6: Match the first cluster name with the second cluster name, and determine the standard drug generic name corresponding to the drug name in the second type of drug name set according to the matching result.

在本申请实施例中，将上述与药品信息标准库中的标准通用名匹配的药品名称确定为第一类药品名称，将未能与药品信息标准库中的标准通用名匹配的药品名称确定为第二类药品名称，这两类药品名称分别各自聚类，再将第二类药品名称的聚类结果中的第二聚类名称与第一类药品名称的聚类结果中的第一聚类名称分别进行匹配，进而确定未能与药品信息标准库中的标准通用名匹配的药品名称对应的标准药品通用名。通过聚类之后再进行名称匹配可使得匹配效率提高。In the embodiment of the application, the drug name that matches the standard generic name in the drug information standard library is determined to be the first class drug name, and the drug name that fails to match the standard generic name in the drug information standard library is determined to be Names of the second category of drugs, the two categories of drug names are clustered separately, and then the second cluster name in the clustering results of the second category of drug names and the first cluster in the clustering results of the first category of drug names The names are matched respectively, and then the standard drug generic name corresponding to the drug name that fails to match the standard generic name in the drug information standard library is determined. Name matching after clustering can improve matching efficiency.

比如，第一类原来有1000个词，聚类得到了100个簇，表示第一类有100个唯一药物(每个簇中药品化学成分一样，但具体的通用名可能不一样，例如二甲双胍和盐酸二甲双胍缓释片在一个簇里)；类似地，第二类原来有1300个词，聚类得到150个簇，这时候就可以用这150个词与第一类中的100个词比较。由于得到的簇都是化学名称，词语比较短，所以有一部分属于第二类的词，由于简化，更有可能与第一类中简化过的词匹配上(举例：”注射用人工合成胰岛素(赛诺菲)”与“胰岛素”的编辑距离大，相似度小，所以并未匹配上，但是在简化了药品名后，前者简化为“胰岛素”，就更有可能匹配上标准的名称)。而且，聚类前，是第一类中1000个词与第二类中1300个词之间的两两比较，需要计算1000*1300/2次的相似距离；但是聚类之后，仅仅需要计算簇与簇之间的相似度，即100*150/2次的相似度计算，减少了计算量，进而可提高匹配效率。For example, the first category originally had 1000 words, and the clustering resulted in 100 clusters, which means that the first category has 100 unique drugs (the chemical composition of the drugs in each cluster is the same, but the specific generic names may be different, such as metformin and Metformin hydrochloride sustained-release tablets are in one cluster); similarly, the second category originally had 1300 words, and 150 clusters were obtained by clustering. At this time, these 150 words can be compared with the 100 words in the first category. Since the obtained clusters are all chemical names and the words are relatively short, some words belong to the second category. Due to the simplification, they are more likely to match the simplified words in the first category (for example: "Injection synthetic insulin ( "Sanofi)" and "Insulin" have a large edit distance and a small similarity, so they did not match, but after simplifying the name of the drug, the former is simplified to "Insulin", which is more likely to match the standard name). Moreover, before clustering, it is a pairwise comparison between 1000 words in the first category and 1300 words in the second category. It is necessary to calculate the similarity distance of 1000*1300/2 times; but after the clustering, only the clusters need to be calculated. The similarity between the clusters, that is, the similarity calculation of 100*150/2 times, reduces the amount of calculation, thereby improving the matching efficiency.

对于某些没有用通用名表示的药物名称(如商品名)导致未能成功匹配的药物，一般需要人工检索该药物的通用名。如果通过聚类，我们仅需要对每个簇的代表词进行人工查找即可，不需要对属于该簇的所有词进行查找，节省人工清洗的成本(例如，呋塞米注射液的别名是“速尿”，原始记录中“速尿10mg”、“速尿静注”这些词简化并聚类为“速尿”后，我们仅仅需要确定一次“速尿”是“呋塞米”，而不用每次出现“速尿”都进行一次人工检索)。For some drugs that are not represented by a generic name (such as a trade name), which results in unsuccessful matching of drugs, it is generally necessary to manually retrieve the generic name of the drug. If clustering is used, we only need to manually search for the representative words of each cluster, instead of searching for all words belonging to the cluster, saving the cost of manual cleaning (for example, the alias of furosemide injection is " Furosemide. After the words "furosemide 10mg" and "furosemide intravenous injection" in the original record are simplified and clustered into "furosemide", we only need to determine once "furosemide" is "furosemide" instead of A manual search is performed every time "furosemide" appears).

在一些可能的实施方式中，智能设备将聚类后得到的标准通用名及其对应的药品信息补充到药品标准库中，这样随着清洗经验的积累，药品标准库越来越丰富，人工清洗的工作越来越少，最终可能实现全自动化。In some possible implementations, the smart device adds the standard generic names obtained after clustering and their corresponding drug information to the drug standard library, so that with the accumulation of cleaning experience, the drug standard library becomes more and more abundant, and manual cleaning The work is getting less and less, and it may eventually be fully automated.

作为本申请一种可能的实施方式，如图5所示，当所述药品名称与药品信息标准库中的药品通用名无法匹配时，所述药品分类方法还包括：As a possible implementation manner of this application, as shown in FIG. 5, when the drug name cannot match the drug generic name in the drug information standard library, the drug classification method further includes:

D1：将所述药品名称按预设简化规则进行简化，得到待匹配的第一简化药品名称。例如，将药品名称中包含的药物剂型和药物媒介的词删掉，使得药品名称中尽可能只包含表示化学成分的词。D1: Simplify the drug name according to the preset simplified rule to obtain the first simplified drug name to be matched. For example, delete the words of drug dosage form and drug vehicle contained in the name of the drug, so that the name of the drug contains only words representing chemical ingredients as much as possible.

D2：获取药品信息标准库中药品的药品名称，并按所述预设简化规则进行简化，得到第二简化药品名称。同上，将药品名称中包含的药物剂型和药物媒介的词删掉，使得药品名称中尽可能只包含表示化学成分的词。D2: Obtain the drug name of the drug in the drug information standard library, and simplify it according to the preset simplified rules to obtain the second simplified drug name. Same as above, delete the words of drug dosage form and drug vehicle contained in the name of the drug, so that the name of the drug contains only words that indicate chemical ingredients as much as possible.

D3：将所述第二简化药品名称进行聚类，形成设定数量的簇。具体地，根据所述药品信息标准库中简化后得到的所有第二简化药品名称建立初始集合，从所述初始集合中随机选取设定数量的第二简化药品名称作为初始聚类中心，根据初始集合中第二简化药品名称与初始聚类中心的相似度，行程设定数量的簇。D3: Cluster the second simplified drug name to form a set number of clusters. Specifically, an initial set is established based on all the second simplified drug names obtained after simplification in the drug information standard database, a set number of second simplified drug names are randomly selected from the initial set as the initial cluster center, and The similarity between the second simplified drug name in the set and the initial cluster center, and the number of clusters for the stroke set.

D4：根据各簇的中心药品名称，生成匹配列表。所述匹配列表中包括各簇的中心药品名称。D4: Generate a matching list based on the name of the central drug in each cluster. The matching list includes the name of the central drug of each cluster.

D5：将所述待匹配的简化药品名称与所述匹配列表中的中心药品名称进行匹配，将匹配的中心药品名称确定为所述药品名称的标准药品通用名。D5: Match the simplified drug name to be matched with the central drug name in the matching list, and determine the matched central drug name as the standard drug generic name of the drug name.

具体地，首先将药品名中包含的药物剂型和药物媒介的词删掉，使得药品名中尽可能只包含表示化学成分的词，其次，将药品信息标准库中简化的第二简化药品名称进行聚类，即简化的药名进行两两之间的相似度计算，相似度高的词语归为一类，并在某一类中选择一个药名代表这一类，选择的方法是与该类中其他药名相似度高于某一阈值最多的那个药名，由此，相似的药品名被聚为一簇，最后根据每一簇的中心药品名称生成匹配列表，根据匹配列表去匹配待匹配的第一简化药品名称，匹配方法仍然如前所述，通过简化和聚类减少药品名称中与药品化学成分不相关的信息，提高匹配率，同时，可减少人工清洗的工作量，从而可提高药品名称的匹配的有效性，使得确定标准通用药品名称的效率提高。Specifically, first delete the words of the drug dosage form and drug medium contained in the drug name, so that the drug name contains only the words representing the chemical composition as much as possible, and secondly, the simplified second simplified drug name in the drug information standard library is carried out. Clustering, that is, the similarity calculation between two simplified drug names, the words with high similarity are classified into one category, and a drug name in a certain category is selected to represent this category, and the method of selection is the same as that of the category The name of the medicine with the most similarity of other Chinese medicine names higher than a certain threshold. As a result, similar medicine names are clustered into a cluster, and finally a matching list is generated according to the central medicine name of each cluster, and the matching list is matched according to the matching list. The first simplified drug name, the matching method is still as described above. Through simplification and clustering, the information that is not related to the chemical composition of the drug in the drug name is reduced, and the matching rate is improved. At the same time, the workload of manual cleaning can be reduced, thereby increasing The effectiveness of drug name matching improves the efficiency of determining standard generic drug names.

S104：获取用户输入的药品分类需求。S104: Obtain the drug classification requirements input by the user.

在本申请实施例中，所述药品分类需求是指用户的分类目的。所述药品分类需求包括药品用途。在医疗大数据分析时，根据研究的问题不同，对药品的分类尺度和标准不一样，例如在研究推荐糖尿病治疗药品时，关注的尺度可能是口服药还是胰岛素，这时要对降糖药分别标记为口服药或胰岛素；而在研究合并多种疾病的心衰病人治疗时，可能考虑的尺度是是否要对患者推荐降糖药，这时候口服降糖药和胰岛素都应该被标记为降糖药。因此，在本申请实施例中，根据分类需求预设分类标签，通过获取用户输入的药品分类需求以便进一步确定分类标签。In the embodiment of the present application, the drug classification requirement refers to the classification purpose of the user. The drug classification requirements include drug use. In the analysis of medical big data, the classification scale and standard of drugs are different according to the different research problems. For example, when researching and recommending diabetes treatment drugs, the scale of concern may be oral drugs or insulin. Labeled as oral drugs or insulin; when studying the treatment of heart failure patients with multiple diseases, the possible measure is whether to recommend hypoglycemic drugs to the patients. At this time, oral hypoglycemic drugs and insulin should be labeled as hypoglycemic drugs. medicine. Therefore, in the embodiment of the present application, the classification label is preset according to the classification requirement, and the classification label is further determined by obtaining the drug classification requirement input by the user.

S105：根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签。S105: Determine a classification label corresponding to the drug name according to the matched standard drug generic name and the drug classification requirement, where the classification label is a label predefined according to the drug classification requirement and the standard drug generic name.

所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签。需要说明的是，由于同一药品在不同的药品分类需求下所属的类别可能不同，当用户输入的分类需求有多个时，同一药品可同时对应多个分类标签。The classification label is a label predefined according to the classification requirements of the medicine and the common name of the standard medicine. It should be noted that since the same drug may belong to different categories under different drug classification requirements, when there are multiple classification requirements input by the user, the same drug can correspond to multiple classification labels at the same time.

在一些可能的实施方式中，在上述步骤S105之后，还包括按分类标签将药品名称对应的药品进行分类，从而使得药品分类信息更具通用性，适用范围更为广泛，大大提高利用和管理药品信息的效率。In some possible implementation manners, after the above step S105, it further includes classifying the drugs corresponding to the drug names according to the classification labels, so that the drug classification information is more versatile, the scope of application is wider, and the utilization and management of drugs are greatly improved. Information efficiency.

S106：根据所述分类标签，将所述药品名称对应的药品分类。S106: According to the classification label, classify the drug corresponding to the drug name.

在本申请实施例中，将所述药品名称对应的药品按所述分类标签进行分类。In the embodiment of the present application, the drugs corresponding to the drug names are classified according to the classification labels.

本申请实施例中，通过获取原始用药记录表，并从所述原始用药记录表中提取用药信息，自动按预设清洗规则对所述用药信息进行清洗，得到药品名称，使得用于匹配标准药品通用名的药品名称简洁，提高匹配准确性，然后将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名，获取用户输入的药品分类需求，再根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签，最后根据所述分类标签，将所述药品名称对应的药品分类，本申请无需关联其他医院的药品信息，同时可按需求对药品分类，大大提高了获取利用药品信息的效率，并且实现对药品信息进行有效的管理。In the embodiment of this application, by obtaining the original medication record table and extracting medication information from the original medication record table, the medication information is automatically cleaned according to the preset cleaning rules to obtain the drug name, so that it can be used to match standard drugs The drug name of the generic name is concise to improve the matching accuracy, and then the drug name is matched with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name, and obtain the drug classification requirements entered by the user , And then determine the classification label corresponding to the drug name according to the matched standard drug generic name and the drug classification requirements. The classification label is a label predefined according to the drug classification requirements and the standard drug generic name, and finally according to the The classification label is used to classify the drugs corresponding to the drug names. This application does not need to be associated with drug information from other hospitals. At the same time, drugs can be classified according to needs, which greatly improves the efficiency of obtaining and using drug information, and realizes effective drug information manage.

应理解，上述实施例中各步骤的序号的大小并不意味着执行顺序的先后，各过程的执行顺序应以其功能和内在逻辑确定，而不应对本申请实施例的实施过程构成任何限定。It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.

对应于上文实施例所述的药品分类方法，图6示出了本申请实施例提供的药品分类装置的结构框图，为了便于说明，仅示出了与本申请实施例相关的部分。Corresponding to the drug classification method described in the above embodiment, FIG. 6 shows a structural block diagram of a drug classification device provided in an embodiment of the present application. For ease of description, only the parts related to the embodiment of the present application are shown.

参照图6，该药品分类装置包括：用药信息提取单元61，信息清洗单元62，通用名匹配单元63，分类需求获取单元64，分类标签确定单元65，药品分类单元66，其中：6, the medicine classification device includes: a medicine information extraction unit 61, an information cleaning unit 62, a common name matching unit 63, a classification requirement acquisition unit 64, a classification label determination unit 65, and a medicine classification unit 66, in which:

用药信息提取单元61，用于获取原始用药记录表，并从所述原始用药记录表中提取用药信息；The medication information extraction unit 61 is configured to obtain an original medication record sheet, and extract medication information from the original medication record sheet;

信息清洗单元62，用于按预设清洗规则对所述用药信息进行清洗，得到药品名称；The information cleaning unit 62 is configured to clean the medication information according to a preset cleaning rule to obtain the name of the medication;

通用名匹配单元63，用于将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名；The generic name matching unit 63 is configured to match the drug name with the drug generic name in the drug information standard library, and determine the standard drug generic name corresponding to the drug name;

分类需求获取单元64，用于获取用户输入的药品分类需求；The classification requirement obtaining unit 64 is used to obtain the drug classification requirements input by the user;

分类标签确定单元65，用于根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签；The classification label determining unit 65 is configured to determine the classification label corresponding to the drug name according to the matched standard drug generic name and the drug classification requirement, and the classification label is predefined according to the drug classification requirement and the standard drug generic name Tag of;

药品分类单元66，根据所述分类标签，将所述药品名称对应的药品分类。The medicine classification unit 66 classifies the medicine corresponding to the medicine name according to the classification label.

在一种可能的实施方式中，所述通用名匹配单元63包括：In a possible implementation manner, the common name matching unit 63 includes:

文本相似度计算模块，用于计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度；The text similarity calculation module is used to calculate the text similarity between the drug name and the drug generic name in the drug information standard library;

第一通用名确定模块，用于若所述文本相似度达到预设相似度阈值，则将所述文本相似度对应的药品通用名确定为所述药品名称的标准药品通用名。The first generic name determining module is configured to determine the generic drug name corresponding to the text similarity as the standard generic drug name of the drug name if the text similarity reaches a preset similarity threshold.

在一种可能的实施方式中，所述文本相似度计算模块具体包括：In a possible implementation manner, the text similarity calculation module specifically includes:

字符串信息获取子模块，用于获取所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度；The string information acquisition sub-module is used to acquire the string and string length of the drug name and the string and string length of the drug generic name;

编辑距离确定子模块，用于根据所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度，计算所述药品名称与所述药品通用名的编辑距离，所述编辑距离是指所述药品名称变换到与所述药品通用名相同时需要经历的最小变换次数；The edit distance determination submodule is used to calculate the edit distance between the drug name and the drug generic name according to the character string and character string length of the drug name and the character string and character string length of the drug generic name. The edit distance refers to the minimum number of times the drug name is changed to the same as the general name of the drug;

文本相似度确定子模块，用于根据所述编辑距离确定所述药品名称与所述药品通用名的文本相似度。The text similarity determination sub-module is used to determine the text similarity between the drug name and the generic name of the drug according to the edit distance.

在一种可能的实施方式中，所述文本相似度确定子模块具体包括：In a possible implementation manner, the text similarity determination submodule specifically includes:

计算子模块，用于根据如下公式计算所述药品名称与所述药品通用名的相似度值Sim(a，b)：The calculation sub-module is used to calculate the similarity value Sim(a, b) between the drug name and the drug generic name according to the following formula:

在一种可能的实施方式中，所述通用名匹配单元63还包括：In a possible implementation manner, the common name matching unit 63 further includes:

待定通用名确定模块，用于若与所述药品名称的文本相似度达到预设相似度阈值的药品通用名不止一个，则将与所述药品名称的文本相似度达到预设相似度阈值所对应的药品通用名确定为待定通用名；The undetermined generic name determination module is configured to correspond to the text similarity of the drug name reaching the preset similarity threshold if there is more than one generic name of the drug with the text similarity to the drug name reaching the preset similarity threshold The generic name of the drug is determined as the pending generic name;

第一适用信息确定模块，用于从所述原始用药记录表中获取所述药品名称对应的待分类药品的适用症状信息；The first applicable information determining module is configured to obtain applicable symptom information of the drug to be classified corresponding to the drug name from the original drug record table;

第二适用信息确定模块，用于从所述药品信息标准库中获取所述待定通用名对应的标准药品的适用症状信息；The second applicable information determining module is configured to obtain applicable symptom information of the standard drug corresponding to the pending generic name from the drug information standard database;

第二通用名确定模块，用于将与所述待分类药品的适用症状信息相同的标准药品对应的待定通用名确定为所述待分类药品的标准药品通用名。The second generic name determination module is used to determine the pending generic name corresponding to the standard drug with the same applicable symptom information of the drug to be classified as the standard drug generic name of the drug to be classified.

信息匹配模块，用于将所述药品名称与药品信息标准库中的药品通用名进行匹配；The information matching module is used to match the name of the drug with the generic name of the drug in the drug information standard library;

第一药品名称集确定模块，用于将匹配成功的药品名称归入第一类药品名称集，并将匹配到的药品通用名确定为所述第一类药品名称集中各药品名称对应的标准药品通用名；The first drug name set determination module is used to classify the matched drug names into the first class drug name set, and determine the matched drug generic name as the standard drug corresponding to each drug name in the first class drug name set common name;

第二药品名称集确定模块，用于将未能匹配成功的药品名称归入第二类药品名称集；The second drug name set determining module is used to classify the drug names that failed to match into the second drug name set;

第一聚类模块，用于将所述第一类药品名称集中的药品名称按指定聚类算法进行聚类，得到第一聚类名称子集，所述第一聚类名称子集中包括聚类得到的第一聚类名称；The first clustering module is configured to cluster the drug names in the first type of drug name set according to a specified clustering algorithm to obtain a first cluster name subset, and the first cluster name subset includes clusters The first cluster name obtained;

第二聚类模块，用于将所述第二类药品名称集中的药品名称按指定聚类算法进行聚类，得到第二聚类名称子集，所述第二聚类名称子集中包括聚类得到的第二聚类名称；The second clustering module is used to cluster the drug names in the second type of drug name set according to a specified clustering algorithm to obtain a second cluster name subset, and the second cluster name subset includes clusters The second cluster name obtained;

第一聚类匹配模块，用于将所述第一聚类名称与所述第二聚类名称进行匹配，根据匹配结果确定所述第二类药品名称集中的药品名称对应的标准药品通用名。The first cluster matching module is configured to match the first cluster name with the second cluster name, and determine the standard drug generic name corresponding to the drug name in the second type of drug name set according to the matching result.

在一种可能的实施方式中，当所述药品名称与药品信息标准库中的药品通用名无法匹配时，所述药品分类装置还包括：In a possible implementation, when the name of the drug cannot be matched with the generic name of the drug in the drug information standard library, the drug classification device further includes:

第一简化单元，用于将所述药品名称按预设简化规则进行简化，得到待匹配的第一简化药品名称。The first simplification unit is used to simplify the drug name according to a preset simplification rule to obtain the first simplified drug name to be matched.

第二简化单元，用于获取药品信息标准库中药品的药品名称，并按所述预设简化规则进行简化，得到第二简化药品名称。The second simplification unit is used to obtain the drug name of the drug in the drug information standard library and simplify it according to the preset simplification rule to obtain the second simplified drug name.

第三聚类单元，用于将所述第二简化药品名称进行聚类，形成设定数量的簇。The third clustering unit is used to cluster the second simplified drug names to form a set number of clusters.

匹配列表生成单元，用于根据各簇的中心药品名称，生成匹配列表。The matching list generating unit is used to generate a matching list according to the name of the central medicine of each cluster.

名称匹配单元，用于将所述待匹配的简化药品名称与所述匹配列表中的中心药品名称进行匹配，将匹配的中心药品名称确定为所述药品名称的标准药品通用名。The name matching unit is used to match the simplified drug name to be matched with the central drug name in the matching list, and determine the matched central drug name as the standard drug generic name of the drug name.

本申请实施例还提供一种计算机可读存储介质，所述计算机可读存储介质存储有计算机可读指令，所述计算机可读指令被处理器执行时实现如图1至图5表示的任意一种药品分类方法的步骤。The embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium stores computer-readable instructions, and when the computer-readable instructions are executed by a processor, any one of those shown in FIGS. 1 to 5 is implemented. The steps of a drug classification method.

本申请实施例还提供一种计算机可读指令产品，当该计算机可读指令产品在智能设备上运行时，使得智能设备执行实现如图1至图5表示的任意一种药品分类方法的步骤。The embodiment of the present application also provides a computer-readable instruction product. When the computer-readable instruction product runs on a smart device, the smart device executes the steps of any one of the drug classification methods shown in Figs. 1 to 5.

本申请实施例还提供一种智能设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令，所述处理器执行所述计算机可读指令时实现如图1至图5表示的任意一种药品分类方法的步骤。An embodiment of the present application also provides a smart device including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor. When the processor executes the computer-readable instructions, Realize the steps of any medicine classification method as shown in Fig. 1 to Fig. 5.

图7是本申请一实施例提供的智能设备的示意图。如图7所示，该实施例的智能设备7包括：处理器70、存储器71以及存储在所述存储器71中并可在所述处理器70上运行的计算机可读指令72。所述处理器70执行所述计算机可读指令72时实现上述各个药品分类方法实施例中的步骤，例如图1所示的步骤101至106。或者，所述处理器70执行所述计算机可读指令72时实现上述各装置实施例中各模块/单元的功能，例如图6所示单元61至66的功能。Fig. 7 is a schematic diagram of a smart device provided by an embodiment of the present application. As shown in FIG. 7, the smart device 7 of this embodiment includes a processor 70, a memory 71, and computer-readable instructions 72 that are stored in the memory 71 and can run on the processor 70. When the processor 70 executes the computer-readable instructions 72, the steps in the foregoing drug classification method embodiments, such as steps 101 to 106 shown in FIG. 1, are implemented. Alternatively, when the processor 70 executes the computer-readable instructions 72, the functions of the modules/units in the foregoing device embodiments are implemented, for example, the functions of the units 61 to 66 shown in FIG. 6.

示例性的，所述计算机可读指令72可以被分割成一个或多个模块/单元，所述一个或者多个模块/单元被存储在所述存储器71中，并由所述处理器70执行，以完成本申请。所述一个或多个模块/单元可以是能够完成特定功能的一系列计算机可读指令段，该指令段用于描述所述计算机可读指令72在所述智能设备7中的执行过程。Exemplarily, the computer-readable instructions 72 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 71 and executed by the processor 70, To complete this application. The one or more modules/units may be a series of computer-readable instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions 72 in the smart device 7.

所述智能设备7可以是智能手机、笔记本、服务器、掌上电脑及云端智能设备等计算设备。所述智能设备7可包括，但不仅限于，处理器70、存储器71。本领域技术人员可以理解，图7仅仅是智能设备7的示例，并不构成对智能设备7的限定，可以包括比图示更多或更少的部件，或者组合某些部件，或者不同的部件，例如所述智能设备7还可以包括输入输出设备、网络接入设备、总线等。The smart device 7 may be a computing device such as a smart phone, a notebook, a server, a palmtop computer, and a cloud smart device. The smart device 7 may include, but is not limited to, a processor 70 and a memory 71. Those skilled in the art can understand that FIG. 7 is only an example of the smart device 7 and does not constitute a limitation on the smart device 7. It may include more or less components than those shown in the figure, or a combination of certain components, or different components. For example, the smart device 7 may also include input and output devices, network access devices, buses, and the like.

所述处理器70可以是中央处理单元(Central Processing Unit，CPU)，还可以是其他通用处理器、数字信号处理器(Digital Signal Processor，DSP)、专用集成电路(Application Specific Integrated Circuit，ASIC)、现成可编程门阵列(Field-Programmable Gate Array，FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。The processor 70 may be a central processing unit (Central Processing Unit, CPU), or other general-purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), Ready-made programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gates or transistor logic devices, discrete hardware components, etc. The general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.

所述存储器71可以是所述智能设备7的内部存储单元，例如智能设备7的硬盘或内存。所述存储器71也可以是所述智能设备7的外部存储设备，例如所述智能设备7上配备的插接式硬盘，智能存储卡(Smart Media Card,SMC)，安全数字(Secure Digital,SD)卡，闪存卡(Flash Card)等。进一步地，所述存储器71还可以既包括所述智能设备7的内部存储单元也包括外部存储设备。所述存储器71用于存储所述计算机可读指令以及所述智能设备所需的其他程序和数据。所述存储器71还可以用于暂时地存储已经输出或者将要输出的数据。The memory 71 may be an internal storage unit of the smart device 7, for example, a hard disk or a memory of the smart device 7. The memory 71 may also be an external storage device of the smart device 7, such as a plug-in hard disk equipped on the smart device 7, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc. Further, the memory 71 may also include both an internal storage unit of the smart device 7 and an external storage device. The memory 71 is used to store the computer readable instructions and other programs and data required by the smart device. The memory 71 can also be used to temporarily store data that has been output or will be output.

需要说明的是，上述装置/单元之间的信息交互、执行过程等内容，由于与本申请方法实施例基于同一构思，其具体功能及带来的技术效果，具体可参见方法实施例部分，此处不再赘述。It should be noted that the information interaction and execution process between the above-mentioned devices/units are based on the same concept as the method embodiment of this application, and its specific functions and technical effects can be found in the method embodiment section. I won't repeat it here.

所属领域的技术人员可以清楚地了解到，为了描述的方便和简洁，仅以上述各功能单元、模块的划分进行举例说明，实际应用中，可以根据需要而将上述功能分配由不同的功能单元、模块完成，即将所述装置的内部结构划分成不同的功能单元或模块，以完成以上描述的全部或者部分功能。实施例中的各功能单元、模块可以集成在一个处理单元中，也可以是各个单元单独物理存在，也可以两个或两个以上单元集成在一个单元中，上述集成的单元既可以采用硬件的形式实现，也可以采用软件功能单元的形式实现。另外，各功能单元、模块的具体名称也只是为了便于相互区分，并不用于限制本申请的保护范围。上述***中单元、模块的具体工作过程，可以参考前述方法实施例中的对应过程，在此不再赘述。Those skilled in the art can clearly understand that, for the convenience and conciseness of description, only the division of the above functional units and modules is used as an example. In practical applications, the above functions can be allocated to different functional units and modules as needed. Module completion, that is, the internal structure of the device is divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments can be integrated into one processing unit, or each unit can exist alone physically, or two or more units can be integrated into one unit. The above-mentioned integrated units can be hardware-based Formal realization can also be realized in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing each other, and are not used to limit the protection scope of the present application. For the specific working process of the units and modules in the foregoing system, reference may be made to the corresponding process in the foregoing method embodiment, which will not be repeated here.

所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中，所述计算机可读存储介质可以是非易失性，也可以是易失性。基于这样的理解，本申请实现上述实施例方法中的全部或部分流程，可以通过计算机可读指令来指令相关的硬件来完成，所述的计算机可读指令可存储于一计算机可读存储介质中，该计算机可读指令在被处理器执行时，可实现上述各个方法实施例的步骤。其中，所述计算机可读指令包括计算机可读指令代码，所述计算机可读指令代码可以为源代码形式、对象代码形式、可执行文件或某些中间形式等。所述计算机可读介质至少可以包括：能够将计算机可读指令代码携带到装置/终端设备的任何实体或装置、记录介质、计算机存储器、只读存储器(ROM，Read-Only Memory)、随机存取存储器(RAM，Random Access Memory)、电载波信号、电信信号以及软件分发介质。例如U盘、移动硬盘、磁碟或者光盘等。在某些司法管辖区，根据立法和专利实践，计算机可读介质不可以是电载波信号和电信信号。If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. The computer-readable storage medium may be non-volatile or It is volatile. Based on this understanding, the implementation of all or part of the processes in the above-mentioned embodiments and methods in this application can be accomplished by instructing relevant hardware through computer-readable instructions, and the computer-readable instructions can be stored in a computer-readable storage medium. When the computer-readable instructions are executed by the processor, they can implement the steps of the foregoing method embodiments. Wherein, the computer-readable instruction includes computer-readable instruction code, and the computer-readable instruction code may be in the form of source code, object code, executable file, or some intermediate form. The computer-readable medium may at least include: any entity or device capable of carrying computer-readable instruction codes to the device/terminal device, recording medium, computer memory, read-only memory (ROM, Read-Only Memory), random access Memory (RAM, Random Access Memory), electric carrier signal, telecommunications signal, and software distribution medium. For example, U disk, mobile hard disk, floppy disk or CD-ROM, etc. In some jurisdictions, according to legislation and patent practices, computer-readable media cannot be electrical carrier signals and telecommunication signals.

在上述实施例中，对各个实施例的描述都各有侧重，某个实施例中没有详述或记载的部分，可以参见其它实施例的相关描述。In the above-mentioned embodiments, the description of each embodiment has its own focus. For parts that are not described in detail or recorded in an embodiment, reference may be made to related descriptions of other embodiments.

以上所述实施例仅用以说明本申请的技术方案，而非对其限制；尽管参照前述实施例对本申请进行了详细的说明，本领域的普通技术人员应当理解：其依然可以对前述各实施例所记载的技术方案进行修改，或者对其中部分技术特征进行等同替换；而这些修改或者替换，并不使相应技术方案的本质脱离本申请各实施例技术方案的精神和范围，均应包含在本申请的保护范围之内。The above-mentioned embodiments are only used to illustrate the technical solutions of the present application, not to limit them; although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that it can still implement the foregoing The technical solutions recorded in the examples are modified, or some of the technical features are equivalently replaced; these modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the application, and should be included in Within the scope of protection of this application.

Claims

一种药品分类方法，其中，包括：A method for classifying medicines, which includes:

获取原始用药记录表，并从所述原始用药记录表中提取用药信息；Obtain an original medication record sheet, and extract medication information from the original medication record sheet;

按预设清洗规则对所述用药信息进行清洗，得到药品名称；Clean the medication information according to the preset cleaning rules to obtain the name of the medication;

将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名；Match the drug name with the drug generic name in the drug information standard library, and determine the standard drug generic name corresponding to the drug name;

获取用户输入的药品分类需求；Obtain the drug classification requirements entered by the user;

根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签；Determine the classification label corresponding to the drug name according to the matched standard drug generic name and the drug classification requirement, where the classification label is a label predefined according to the drug classification requirement and the standard drug generic name;

根据所述分类标签，将所述药品名称对应的药品分类。According to the classification label, the drug corresponding to the drug name is classified.
根据权利要求1所述的药品分类方法，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，包括：The method for classifying drugs according to claim 1, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name comprises:

计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度；Calculate the text similarity between the drug name and the drug generic name in the drug information standard library;

若所述文本相似度达到预设相似度阈值，则将所述文本相似度对应的药品通用名确定为所述药品名称的标准药品通用名。If the text similarity reaches the preset similarity threshold, the drug generic name corresponding to the text similarity is determined as the standard drug generic name of the drug name.
根据权利要求2所述的药品分类方法，其中，所述计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度的步骤，具体包括：The method for classifying medicines according to claim 2, wherein the step of calculating the text similarity between the name of the medicine and the generic name of the medicine in the medicine information standard library specifically comprises:

获取所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度；Acquiring the character string and character string length of the drug name and the character string and character string length of the drug generic name;

根据所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度，计算所述药品名称与所述药品通用名的编辑距离，所述编辑距离是指所述药品名称变换到与所述药品通用名相同时需要经历的最小变换次数；According to the character string and character string length of the drug name and the character string and character string length of the drug generic name, the edit distance between the drug name and the drug generic name is calculated, and the edit distance refers to the drug The minimum number of times the name is changed to be the same as the generic name of the drug;

根据所述编辑距离确定所述药品名称与所述药品通用名的文本相似度。The text similarity between the drug name and the generic name of the drug is determined according to the edit distance.
根据权利要求3所述的药品分类方法，其中，所述根据所述编辑距离确定所述药品名称与所述药品通用名的文本相似度的步骤，包括：The method for classifying drugs according to claim 3, wherein the step of determining the text similarity between the name of the drug and the generic name of the drug according to the edit distance comprises:

根据如下公式计算所述药品名称与所述药品通用名的相似度值Sim(a，b)：Calculate the similarity value Sim(a, b) between the name of the drug and the generic name of the drug according to the following formula:

其中，lev(a,b)表示所述药品名称的字符串a与所述药品通用名的字符串b之间的编辑距离，len(a)表示所述字符串a的字符串长度，len(b)表示所述字符串b的字符串长度，max(len(a),len(b))表示所述字符串a与所述字符串b中的较长字符串长度。Wherein, lev(a,b) represents the edit distance between the string a of the drug name and the string b of the generic name of the drug, len(a) represents the string length of the string a, len( b) represents the length of the character string of the character string b, and max(len(a), len(b)) represents the length of the longer character string of the character string a and the character string b.
根据权利要求2所述的药品分类方法，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，还包括：The method for classifying drugs according to claim 2, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name further comprises:

若与所述药品名称的文本相似度达到预设相似度阈值的药品通用名不止一个，则将与所述药品名称的文本相似度达到预设相似度阈值所对应的药品通用名确定为待定通用名；If there is more than one generic drug name whose text similarity with the drug name reaches the preset similarity threshold, the generic name of the drug corresponding to the text similarity with the drug name reaching the preset similarity threshold is determined as the pending generic name;

从所述原始用药记录表中获取所述药品名称对应的待分类药品的适用症状信息；Obtain the applicable symptom information of the drug to be classified corresponding to the drug name from the original drug record table;

从所述药品信息标准库中获取所述待定通用名对应的标准药品的适用症状信息；Obtain applicable symptom information of the standard drug corresponding to the undetermined generic name from the drug information standard database;

将与所述待分类药品的适用症状信息相同的标准药品对应的待定通用名确定为所述待分类药品的标准药品通用名。The undetermined generic name corresponding to the standard drug with the same applicable symptom information as the drug to be classified is determined as the standard drug generic name of the drug to be classified.
根据权利要求1所述的药品分类方法，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，包括：The method for classifying drugs according to claim 1, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name comprises:

将所述药品名称与药品信息标准库中的药品通用名进行匹配；Match the name of the drug with the generic name of the drug in the drug information standard library;

将匹配成功的药品名称归入第一类药品名称集，并将匹配到的药品通用名确定为所述第一类药品名称集中各药品名称对应的标准药品通用名；Classify the matched drug names into the first-category drug name set, and determine the matched drug generic names as the standard drug generic names corresponding to each drug name in the first-category drug name set;

将未能匹配成功的药品名称归入第二类药品名称集；Classify the drug names that failed to match into the second category of drug name set;

将所述第一类药品名称集中的药品名称按指定聚类算法进行聚类，得到第一聚类名称子集，所述第一聚类名称子集中包括聚类得到的第一聚类名称；Clustering the drug names in the first-type drug name set according to a designated clustering algorithm to obtain a first cluster name subset, and the first cluster name subset includes the first cluster name obtained by clustering;

将所述第二类药品名称集中的药品名称按指定聚类算法进行聚类，得到第二聚类名称子集，所述第二聚类名称子集中包括聚类得到的第二聚类名称；Clustering the drug names in the second type of drug name set according to a designated clustering algorithm to obtain a second cluster name subset, and the second cluster name subset includes the second cluster name obtained by clustering;

将所述第一聚类名称与所述第二聚类名称进行匹配，根据匹配结果确定所述第二类药品名称集中的药品名称对应的标准药品通用名。The first cluster name is matched with the second cluster name, and the standard drug generic name corresponding to the drug name in the second type of drug name set is determined according to the matching result.
一种智能设备，包括存储器、处理器以及存储在所述存储器中并可在所述处理器上运行的计算机可读指令，其中，所述处理器执行所述计算机可读指令时实现如下步骤：An intelligent device includes a memory, a processor, and computer-readable instructions that are stored in the memory and can run on the processor, wherein the processor implements the following steps when the processor executes the computer-readable instructions:

获取原始用药记录表，并从所述原始用药记录表中提取用药信息；Obtain an original medication record sheet, and extract medication information from the original medication record sheet;

按预设清洗规则对所述用药信息进行清洗，得到药品名称；Clean the medication information according to the preset cleaning rules to obtain the name of the medication;

将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名；Match the drug name with the drug generic name in the drug information standard library, and determine the standard drug generic name corresponding to the drug name;

获取用户输入的药品分类需求；Obtain the drug classification requirements entered by the user;

根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签；Determine the classification label corresponding to the drug name according to the matched standard drug generic name and the drug classification requirement, where the classification label is a label predefined according to the drug classification requirement and the standard drug generic name;

根据所述分类标签，将所述药品名称对应的药品分类。According to the classification label, the drug corresponding to the drug name is classified.
根据权利要求7所述的智能设备，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，包括：The smart device according to claim 7, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name comprises:

计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度；Calculate the text similarity between the drug name and the drug generic name in the drug information standard library;

若所述文本相似度达到预设相似度阈值，则将所述文本相似度对应的药品通用名确定为所述药品名称的标准药品通用名。If the text similarity reaches the preset similarity threshold, the drug generic name corresponding to the text similarity is determined as the standard drug generic name of the drug name.
根据权利要求8所述的智能设备，其中，所述计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度的步骤，具体包括：The smart device according to claim 8, wherein the step of calculating the text similarity between the drug name and the drug generic name in the drug information standard library specifically includes:

获取所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度；Acquiring the character string and character string length of the drug name and the character string and character string length of the drug generic name;

根据所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度，计算所述药品名称与所述药品通用名的编辑距离，所述编辑距离是指所述药品名称变换到与所述药品通用名相同时需要经历的最小变换次数；According to the character string and character string length of the drug name and the character string and character string length of the drug generic name, the edit distance between the drug name and the drug generic name is calculated, and the edit distance refers to the drug The minimum number of times the name is changed to be the same as the generic name of the drug;

根据所述编辑距离确定所述药品名称与所述药品通用名的文本相似度。The text similarity between the drug name and the generic name of the drug is determined according to the edit distance.
根据权利要求9所述的智能设备，其中，所述根据所述编辑距离确定所述药品名称与所述药品通用名的文本相似度的步骤，包括：The smart device according to claim 9, wherein the step of determining the text similarity between the drug name and the generic name of the drug according to the edit distance comprises:

根据如下公式计算所述药品名称与所述药品通用名的相似度值Sim(a，b)：Calculate the similarity value Sim(a, b) between the name of the drug and the generic name of the drug according to the following formula:

其中，lev(a,b)表示所述药品名称的字符串a与所述药品通用名的字符串b之间的编辑距离，len(a)表示所述字符串a的字符串长度，len(b)表示所述字符串b的字符串长度，max(len(a),len(b))表示所述字符串a与所述字符串b中的较长字符串长度。Wherein, lev(a,b) represents the edit distance between the string a of the drug name and the string b of the generic name of the drug, len(a) represents the string length of the string a, len( b) represents the length of the character string of the character string b, and max(len(a), len(b)) represents the length of the longer character string of the character string a and the character string b.
根据权利要求8所述的智能设备，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，还包括：The smart device according to claim 8, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name further comprises:

若与所述药品名称的文本相似度达到预设相似度阈值的药品通用名不止一个，则将与所述药品名称的文本相似度达到预设相似度阈值所对应的药品通用名确定为待定通用名；If there is more than one generic drug name whose text similarity with the drug name reaches the preset similarity threshold, the generic name of the drug corresponding to the text similarity with the drug name reaching the preset similarity threshold is determined as the pending generic name;

从所述原始用药记录表中获取所述药品名称对应的待分类药品的适用症状信息；Obtain the applicable symptom information of the drug to be classified corresponding to the drug name from the original drug record table;

从所述药品信息标准库中获取所述待定通用名对应的标准药品的适用症状信息；Obtain applicable symptom information of the standard drug corresponding to the undetermined generic name from the drug information standard database;

将与所述待分类药品的适用症状信息相同的标准药品对应的待定通用名确定为所述待分类药品的标准药品通用名。The undetermined generic name corresponding to the standard drug with the same applicable symptom information as the drug to be classified is determined as the standard drug generic name of the drug to be classified.
根据权利要求7所述的智能设备，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，包括：The smart device according to claim 7, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name comprises:

将所述药品名称与药品信息标准库中的药品通用名进行匹配；Match the name of the drug with the generic name of the drug in the drug information standard library;

将匹配成功的药品名称归入第一类药品名称集，并将匹配到的药品通用名确定为所述第一类药品名称集中各药品名称对应的标准药品通用名；Classify the matched drug names into the first-category drug name set, and determine the matched drug generic names as the standard drug generic names corresponding to each drug name in the first-category drug name set;

将未能匹配成功的药品名称归入第二类药品名称集；Classify the drug names that failed to match into the second category of drug name set;

将所述第一类药品名称集中的药品名称按指定聚类算法进行聚类，得到第一聚类名称子集，所述第一聚类名称子集中包括聚类得到的第一聚类名称；Clustering the drug names in the first type of drug name set according to a designated clustering algorithm to obtain a first cluster name subset, and the first cluster name subset includes the first cluster name obtained by clustering;

将所述第二类药品名称集中的药品名称按指定聚类算法进行聚类，得到第二聚类名称子集，所述第二聚类名称子集中包括聚类得到的第二聚类名称；Clustering the drug names in the second type of drug name set according to a designated clustering algorithm to obtain a second cluster name subset, and the second cluster name subset includes the second cluster name obtained by clustering;

将所述第一聚类名称与所述第二聚类名称进行匹配，根据匹配结果确定所述第二类药品名称集中的药品名称对应的标准药品通用名。The first cluster name is matched with the second cluster name, and the standard drug generic name corresponding to the drug name in the second type of drug name set is determined according to the matching result.
一种计算机可读存储介质，所述计算机可读存储介质存储有计算机可读指令，其中，所述计算机可读指令被处理器执行时实现如下步骤：A computer-readable storage medium, the computer-readable storage medium stores computer-readable instructions, wherein, when the computer-readable instructions are executed by a processor, the following steps are implemented:

获取原始用药记录表，并从所述原始用药记录表中提取用药信息；Obtain an original medication record sheet, and extract medication information from the original medication record sheet;

按预设清洗规则对所述用药信息进行清洗，得到药品名称；Clean the medication information according to the preset cleaning rules to obtain the name of the medication;

将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名；Match the drug name with the drug generic name in the drug information standard library, and determine the standard drug generic name corresponding to the drug name;

获取用户输入的药品分类需求；Obtain the drug classification requirements entered by the user;

根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签；Determine the classification label corresponding to the drug name according to the matched standard drug generic name and the drug classification requirement, where the classification label is a label predefined according to the drug classification requirement and the standard drug generic name;

根据所述分类标签，将所述药品名称对应的药品分类。According to the classification label, the drug corresponding to the drug name is classified.
根据权利要求13所述的计算机可读存储介质，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，包括：The computer-readable storage medium according to claim 13, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name comprises :

计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度；Calculate the text similarity between the drug name and the drug generic name in the drug information standard library;

若所述文本相似度达到预设相似度阈值，则将所述文本相似度对应的药品通用名确定为所述药品名称的标准药品通用名。If the text similarity reaches the preset similarity threshold, the drug generic name corresponding to the text similarity is determined as the standard drug generic name of the drug name.
根据权利要求14所述的计算机可读存储介质，其中，所述计算所述药品名称与所述药品信息标准库中的药品通用名的文本相似度的步骤，具体包括：14. The computer-readable storage medium according to claim 14, wherein the step of calculating the text similarity between the drug name and the drug generic name in the drug information standard library specifically comprises:

获取所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度；Acquiring the character string and character string length of the drug name and the character string and character string length of the drug generic name;

根据所述药品名称的字符串及字符串长度与所述药品通用名的字符串及字符串长度，计算所述药品名称与所述药品通用名的编辑距离，所述编辑距离是指所述药品名称变换到与所述药品通用名相同时需要经历的最小变换次数；According to the character string and character string length of the drug name and the character string and character string length of the drug generic name, the edit distance between the drug name and the drug generic name is calculated, and the edit distance refers to the drug The minimum number of times of changing the name to be the same as the generic name of the drug;

根据所述编辑距离确定所述药品名称与所述药品通用名的文本相似度。The text similarity between the drug name and the generic name of the drug is determined according to the edit distance.
根据权利要求15所述的计算机可读存储介质，其中，所述根据所述编辑距离确定所述药品名称与所述药品通用名的文本相似度的步骤，包括：15. The computer-readable storage medium according to claim 15, wherein the step of determining the text similarity between the drug name and the generic name of the drug according to the edit distance comprises:

根据如下公式计算所述药品名称与所述药品通用名的相似度值Sim(a，b)：Calculate the similarity value Sim(a, b) between the drug name and the generic name of the drug according to the following formula:

其中，lev(a,b)表示所述药品名称的字符串a与所述药品通用名的字符串b之间的编辑距离，len(a)表示所述字符串a的字符串长度，len(b)表示所述字符串b的字符串长度，max(len(a),len(b))表示所述字符串a与所述字符串b中的较长字符串长度。Wherein, lev(a,b) represents the edit distance between the string a of the drug name and the string b of the generic name of the drug, len(a) represents the string length of the string a, len( b) represents the length of the character string of the character string b, and max(len(a), len(b)) represents the length of the longer character string of the character string a and the character string b.
根据权利要求14所述的计算机可读存储介质，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，还包括：The computer-readable storage medium according to claim 14, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name, further include:

若与所述药品名称的文本相似度达到预设相似度阈值的药品通用名不止一个，则将与所述药品名称的文本相似度达到预设相似度阈值所对应的药品通用名确定为待定通用名；If there is more than one generic drug name whose text similarity to the drug name reaches the preset similarity threshold, the generic name of the drug corresponding to the text similarity to the drug name that reaches the preset similarity threshold is determined as the pending generic name;

从所述原始用药记录表中获取所述药品名称对应的待分类药品的适用症状信息；Obtain the applicable symptom information of the drug to be classified corresponding to the drug name from the original drug record table;

从所述药品信息标准库中获取所述待定通用名对应的标准药品的适用症状信息；Obtain applicable symptom information of the standard drug corresponding to the undetermined generic name from the drug information standard database;

将与所述待分类药品的适用症状信息相同的标准药品对应的待定通用名确定为所述待分类药品的标准药品通用名。The undetermined generic name corresponding to the standard drug with the same applicable symptom information as the drug to be classified is determined as the standard drug generic name of the drug to be classified.
根据权利要求13所述的计算机可读存储介质，其中，所述将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名的步骤，包括：The computer-readable storage medium according to claim 13, wherein the step of matching the drug name with the drug generic name in the drug information standard library to determine the standard drug generic name corresponding to the drug name comprises :

将所述药品名称与药品信息标准库中的药品通用名进行匹配；Match the name of the drug with the generic name of the drug in the drug information standard library;

将匹配成功的药品名称归入第一类药品名称集，并将匹配到的药品通用名确定为所述第一类药品名称集中各药品名称对应的标准药品通用名；Classify the matched drug names into the first-category drug name set, and determine the matched drug generic names as the standard drug generic names corresponding to each drug name in the first-category drug name set;

将未能匹配成功的药品名称归入第二类药品名称集；Classify the drug names that failed to match into the second category of drug name set;

将所述第一类药品名称集中的药品名称按指定聚类算法进行聚类，得到第一聚类名称子集，所述第一聚类名称子集中包括聚类得到的第一聚类名称；Clustering the drug names in the first type of drug name set according to a designated clustering algorithm to obtain a first cluster name subset, and the first cluster name subset includes the first cluster name obtained by clustering;

将所述第二类药品名称集中的药品名称按指定聚类算法进行聚类，得到第二聚类名称子集，所述第二聚类名称子集中包括聚类得到的第二聚类名称；Clustering the drug names in the second type of drug name set according to a designated clustering algorithm to obtain a second cluster name subset, and the second cluster name subset includes the second cluster name obtained by clustering;

将所述第一聚类名称与所述第二聚类名称进行匹配，根据匹配结果确定所述第二类药品名称集中的药品名称对应的标准药品通用名。The first cluster name is matched with the second cluster name, and the standard drug generic name corresponding to the drug name in the second type of drug name set is determined according to the matching result.
一种药品分类装置，其中，包括：A medicine classification device, which includes:

用药信息提取单元，用于获取原始用药记录表，并从所述原始用药记录表中提取用药信息；The medication information extraction unit is used to obtain the original medication record sheet, and extract medication information from the original medication record sheet;

信息清洗单元，用于按预设清洗规则对所述用药信息进行清洗，得到药品名称；The information cleaning unit is used to clean the medication information according to the preset cleaning rules to obtain the name of the medicine;

通用名匹配单元，用于将所述药品名称与药品信息标准库中的药品通用名进行匹配，确定所述药品名称对应的标准药品通用名；The generic name matching unit is used to match the drug name with the drug generic name in the drug information standard library, and determine the standard drug generic name corresponding to the drug name;

分类需求获取单元，用于获取用户输入的药品分类需求；The classification requirement obtaining unit is used to obtain the pharmaceutical classification requirements input by the user;

分类标签确定单元，用于根据匹配得到的标准药品通用名与所述药品分类需求，确定所述药品名称对应的分类标签，所述分类标签为预先根据药品分类需求以及标准药品通用名预定义的标签；The classification label determining unit is configured to determine the classification label corresponding to the drug name according to the matched standard drug generic name and the drug classification requirement, and the classification label is predefined according to the drug classification requirement and the standard drug generic name Label;

药品分类单元，根据所述分类标签，将所述药品名称对应的药品分类。The medicine classification unit classifies the medicine corresponding to the medicine name according to the classification label.
根据权利要求19所述的药品分类装置，其中，所述通用名匹配单元包括：The medicine classification device according to claim 19, wherein the common name matching unit comprises:

信息匹配模块，用于将所述药品名称与药品信息标准库中的药品通用名进行匹配；The information matching module is used to match the name of the drug with the generic name of the drug in the drug information standard library;

第一药品名称集确定模块，用于将匹配成功的药品名称归入第一类药品名称集，并将匹配到的药品通用名确定为所述第一类药品名称集中各药品名称对应的标准药品通用名；The first drug name set determination module is used to classify the successfully matched drug names into the first class drug name set, and determine the common name of the matched drug as the standard drug corresponding to each drug name in the first class drug name set common name;

第二药品名称集确定模块，用于将未能匹配成功的药品名称归入第二类药品名称集；The second drug name set determining module is used to classify the drug names that failed to match into the second drug name set;

第一聚类模块，用于将所述第一类药品名称集中的药品名称按指定聚类算法进行聚类，得到第一聚类名称子集，所述第一聚类名称子集中包括聚类得到的第一聚类名称；The first clustering module is configured to cluster the drug names in the first type of drug name set according to a specified clustering algorithm to obtain a first cluster name subset, and the first cluster name subset includes clusters The first cluster name obtained;

第二聚类模块，用于将所述第二类药品名称集中的药品名称按指定聚类算法进行聚类，得到第二聚类名称子集，所述第二聚类名称子集中包括聚类得到的第二聚类名称；The second clustering module is used to cluster the drug names in the second type of drug name set according to a specified clustering algorithm to obtain a second cluster name subset, and the second cluster name subset includes clusters The second cluster name obtained;

第一聚类匹配模块，用于将所述第一聚类名称与所述第二聚类名称进行匹配，根据匹配结果确定所述第二类药品名称集中的药品名称对应的标准药品通用名。The first cluster matching module is configured to match the first cluster name with the second cluster name, and determine the standard drug generic name corresponding to the drug name in the second type of drug name set according to the matching result.