CN113742348A

CN113742348A - Patient data matching method in CDR system, main index establishing method and device

Info

Publication number: CN113742348A
Application number: CN202111045885.3A
Authority: CN
Inventors: 刘新辉; 张勇斌
Original assignee: Shanghai Clinbrain Information Technology Co Ltd
Current assignee: Shanghai Clinbrain Information Technology Co Ltd
Priority date: 2021-09-07
Filing date: 2021-09-07
Publication date: 2021-12-03
Anticipated expiration: 2041-09-07
Also published as: CN113742348B

Abstract

The invention provides a patient data matching method in a CDR system, a main index establishing method and a main index establishing device. The patient data matching method in the CDR system comprises the following steps: acquiring data to be matched and confirmed data; sequentially acquiring at least two similarities for each piece of confirmed data based on at least two combinations of the matching fields; and judging whether the matching is successful based on all the similarities, and obtaining the confirmed data matched with the data to be matched. Based on this matching method, a main index of the patient can be constructed and all historical diagnosis and treatment records of the patient can be obtained, which assists diagnosis of the patient's condition and medical research and solves the prior-art problem that no uniform patient identification exists among the business systems of a hospital. In addition, multi-round matching improves the validity of the matching result and can cope with complex data conditions.

Description

Patient data matching method in CDR system, main index establishing method and device

Technical Field

The invention relates to the technical field of data processing, in particular to a patient data matching method in a CDR system, a main index establishing method and a device.

Background

At present, a hospital runs multiple information systems whose patient identifications are inconsistent with one another. Records therefore cannot be associated or cross-indexed to obtain related information, information islands form easily, medical data resources cannot be fully utilized, and the consistency and integrity of patient information across the systems are poor.

In summary, in the prior art, there is a problem that there is no uniform patient identification between the business systems.

Disclosure of Invention

The invention aims to provide a patient data matching method, a main index establishing method and a device in a CDR system, and aims to solve the problem that uniform patient identification does not exist among all business systems of a hospital in the prior art.

In order to solve the above technical problem, according to a first aspect of the present invention, there is provided a patient data matching method in a CDR system, including:

acquiring data to be matched and confirmed data, wherein the data to be matched comprises a matching field, and the confirmed data comprises the matching field;

sequentially acquiring the ith similarity of the data to be matched and each confirmed data based on the ith combination of the matching fields;

judging whether the matching is successful or not based on all the ith similarity, and if so, obtaining a piece of confirmed data matched with the data to be matched based on all the ith similarity;

wherein, the value range of i is all integers from 1 to n, and n is an integer larger than 1.

Optionally, the step of determining whether the matching is successful based on all the ith similarity includes: if the ith similarity corresponding to each piece of confirmed data is smaller than or equal to the ith threshold, the matching is failed; otherwise, the matching is successful;

or,

the step of judging whether the matching is successful or not based on all the ith similarity comprises the following steps: if the sum of all the ith similarity corresponding to each piece of confirmed data is smaller than a preset threshold value, the matching fails; otherwise, the matching is successful.

Optionally, the step of obtaining a piece of confirmed data matched with the data to be matched based on all the ith similarities includes: selecting the confirmed data with the largest sum of all the ith similarities.

Optionally, the step of obtaining a piece of confirmed data matched with the data to be matched based on all the ith similarities includes:

if the ith similarity of at least one piece of confirmed data in the ith set is greater than the ith threshold and i is less than n, forming an (i+1)th set from the confirmed data in the ith set whose ith similarity is greater than the ith threshold, and judging again;

otherwise, selecting the confirmed data with the largest sum of all the ith similarity in the ith set, or selecting the confirmed data with the largest ith similarity in the ith set;

wherein the 1st set is all the confirmed data.

Optionally, the step of obtaining the ith similarity between the data to be matched and the confirmed data includes:

and sequentially obtaining a similarity value corresponding to each matching field in the ith combination, wherein the similarity value is weighted and averaged based on the ith weighting parameter to obtain the ith similarity.

Optionally, each matching field in the data to be matched stores only one attribute value, and each matching field in the confirmed data stores one or more attribute values; the step of obtaining the similarity value corresponding to the matching field comprises:

and performing similarity calculation on the attribute values in the data to be matched and each attribute value in the corresponding matching field in the confirmed data, and obtaining the similarity value after weighted average of calculation results.

Optionally, the 1st combination includes a name field, a gender field, and an identification number field, and the 1st weighting parameter corresponding to the identification number field is greater than 0.5.

Optionally, the matching field includes a name field, and the similarity value corresponding to the name field is calculated according to the following formula:

wherein similarity represents the similarity value, ED_AB represents the edit distance between A and B, max() represents the maximum operation, L_A represents the string length of A, L_B represents the string length of B, A represents the attribute value stored in the name field of the data to be matched, and B represents the attribute value stored in the name field of the confirmed data.
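The formula itself is not reproduced in this text. Given the symbols defined above, and the later statement that two completely different names yield a result of 0, it is presumably the normalized edit-distance similarity. The following is a reconstruction under that assumption, not the verbatim formula of the publication:

```latex
\mathrm{similarity} = 1 - \frac{ED_{AB}}{\max(L_A,\, L_B)}
```

Under this form, identical names give ED_AB = 0 and a similarity of 1, while names with no characters in common give ED_AB = max(L_A, L_B) and a similarity of 0.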

In order to solve the above technical problem, according to a second aspect of the present invention, there is provided a patient master index establishing method in a CDR system, including:

acquiring original data from at least two service systems, wherein the original data comprises matching fields;

the raw data is classified into first data and second data based on a cleaning rule;

the first data generates confirmed data based on a merging rule, and the confirmed data comprises a matching field and a main index field;

the second data is configured as data to be matched, and a matching result is obtained for the data to be matched based on the patient data matching method in the CDR system;

if the matching is successful, combining the current data to be matched and the matched confirmed data;

and if the matching fails, generating temporary index data by the current data to be matched.

Optionally, the matching field includes a name field and an identity card field, and the merge rule includes:

judging whether the identity card fields of the first data and one piece of confirmed data are equal, and judging whether the name fields of the first data and that piece of confirmed data are equal;

if both the identity card field and the name field of the current first data are equal to those of the current piece of confirmed data, merging the current first data and the current confirmed data;

otherwise, the current first data is independently converted into a new confirmed data.

Optionally, each matching field in the first data stores only one attribute value, each matching field in the confirmed data stores one or more attribute values, and the step of determining whether the matching fields of the first data and the confirmed data are equal includes:

if the attribute value of the matching field of the first data is a null value, judging that the attribute values are not equal;

if the attribute value of the matching field of the first data is not a null value and is not equal to any attribute value of the matching field of the confirmed data, judging that the attribute values are not equal;

and if the attribute value of the matching field of the first data is not a null value and the attribute value of the matching field of the first data is equal to one of the attribute values of the matching field of the confirmed data, judging that the attribute values are equal.
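The three cases above can be sketched as a small predicate. This is a minimal sketch, assuming that a null value is represented by `None` and that a confirmed field holds a list of values; the function name `field_equal` and the sample names are hypothetical.

```python
from typing import Optional, Sequence


def field_equal(first_value: Optional[str], confirmed_values: Sequence[str]) -> bool:
    """Judge equality of one matching field between first data and confirmed data."""
    if first_value is None:
        # Case 1: a null attribute value is never judged equal.
        return False
    # Cases 2 and 3: equal only if it matches one of the stored values.
    return first_value in confirmed_values


print(field_equal(None, ["Zhang San"]))
print(field_equal("Li Si", ["Zhang San"]))
print(field_equal("Zhang San", ["Zhang San", "Zhang Shan"]))
```

Only the last call is judged equal; whether an empty string should also count as null is a policy choice the text does not settle.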

Optionally, the step of merging the data to be merged and the confirmed data includes:

the data to be merged and the confirmed data are sequentially judged about each matching field, and if the attribute value of the matching field of the data to be merged is not a null value and the attribute value of the matching field of the data to be merged is not equal to the attribute value of the matching field of the confirmed data, the current attribute value of the data to be merged is stored in the matching field of the confirmed data;

wherein the data to be merged comprises the first data and the data to be matched.

In order to solve the technical problem, according to a third aspect of the present invention, a patient master index creating apparatus is provided, which includes a matching module for executing the data matching method.

Compared with the prior art, in the patient data matching method in a CDR system, the main index establishing method and the device provided by the invention, the patient data matching method comprises the following steps: acquiring data to be matched and confirmed data; sequentially acquiring at least two similarities for each piece of confirmed data based on at least two combinations of the matching fields; and judging whether the matching is successful based on all the similarities, and obtaining the confirmed data matched with the data to be matched. Based on this matching method, a main index of the patient can be constructed and all historical diagnosis and treatment records of the patient can be obtained, which assists diagnosis of the patient's condition and medical research and solves the prior-art problem that no uniform patient identification exists among the business systems of a hospital. In addition, multi-round matching improves the validity of the matching result and can cope with complex data conditions.

Drawings

It will be appreciated by those skilled in the art that the drawings are provided for a better understanding of the invention and do not constitute any limitation to the scope of the invention. Wherein:

FIG. 1 is a schematic flow chart of a patient data matching method in a CDR system according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a patient main index establishing method in the CDR system according to an embodiment of the present invention.

Detailed Description

To further clarify the objects, advantages and features of the present invention, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. It is to be noted that the drawings are in greatly simplified form and are not to scale, but are merely intended to facilitate and clarify the explanation of the embodiments of the present invention. Further, the structures illustrated in the drawings are often part of actual structures. In particular, the drawings may have different emphasis points and may sometimes be scaled differently.

As used in this application, the singular forms "a", "an" and "the" include plural referents; the term "or" is generally used in a sense including "and/or"; the terms "a" and "an" are generally used in a sense including "at least one"; and the term "at least two" is generally used in a sense including "two or more". The terms "first", "second" and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or as implicitly indicating the number of technical features referred to. Thus, a feature defined as "first", "second" or "third" may explicitly or implicitly include one or at least two of such features. "One end" and "the other end", and "proximal end" and "distal end", generally refer to two corresponding parts, which include not only the end points. The terms "mounted" and "connected" should be understood broadly, e.g., as a fixed connection, a detachable connection, or an integral connection; as a mechanical or an electrical connection; as a direct connection or an indirect connection through an intervening medium; or as an internal connection or any other relationship between two elements. Furthermore, as used in the present invention, the disposition of one element with another element generally only means that there is a connection, coupling, fit or driving relationship between the two elements, which may be direct or indirect through intermediate elements. It cannot be understood as indicating or implying any spatial positional relationship between the two elements; that is, an element may be in any orientation inside, outside, above, below or to one side of another element, unless the content clearly indicates otherwise. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific situation.

The core idea of the invention is to provide a patient data matching method in a CDR (Clinical Data Repository) system, a main index establishing method and a device, so as to solve the prior-art problem that no uniform patient identification exists among the business systems of a hospital.

The following description refers to the accompanying drawings.

Referring to fig. 1 to fig. 2, fig. 1 is a schematic flow chart illustrating a patient data matching method in a CDR system according to an embodiment of the present invention; fig. 2 is a flowchart illustrating a patient primary index establishing method in the CDR system according to an embodiment of the present invention.

As shown in fig. 1, the present embodiment provides a patient data matching method in a CDR system, including:

S100, acquiring data to be matched and confirmed data, wherein the data to be matched comprises a matching field, and the confirmed data comprises the matching field;

S200, sequentially obtaining the ith similarity of the data to be matched and each piece of confirmed data based on the ith combination of the matching fields;

S300, judging whether the matching is successful based on all the ith similarities, and if so, obtaining a piece of confirmed data matched with the data to be matched based on all the ith similarities;

wherein, the value of i is all integers from 1 to n, and n is an integer larger than 1.

In step S100, the inclusion of the matching field should be understood as follows. Suppose the field name of one of the matching fields is "F", and the attribute value of the "F" field of a certain piece of data (either the data to be matched or the confirmed data, referred to here simply as data for convenience) is null. Depending on the specification or standard in use, that piece of data may contain a string such as "F: NULL" or "F: ", or may contain no string representing the "F" field at all. However, as long as some piece of data among all the data processed by the method includes a character string representing the "F" field, or the method for parsing, reading or post-processing the data includes an operation related to the "F" field, that piece of data should be understood as including the "F" field.

In step S200, n calculations are performed in total; the ith calculation uses the ith combination and yields the ith similarity. The k1th calculation and the k2th calculation differ in that the k1th and k2th combinations are different and/or the calculation manners are different, where k1 ≠ k2 and k1 and k2 are each any integer from 1 to n.

In step S300, the scheme for judging whether the matching is successful and the scheme for obtaining the matched data can be understood with reference to the subsequent content of this embodiment. A person skilled in the art may also modify the specific schemes provided in this embodiment, and such modifications should likewise be understood as falling within the protection scope of the technical solution of the present invention.

With the configuration, through multiple rounds of fuzzy matching, the data to be matched can find the closest matching data, and the establishment of a subsequent global index value is facilitated. The method is a core method for solving the problem that uniform patient identification does not exist among all business systems of a hospital.

In an embodiment, the step of determining whether the matching is successful based on all the ith similarities includes: if the ith similarity corresponding to each piece of confirmed data is less than or equal to the ith threshold (that is, the condition holds for every i from 1 to n), the matching fails; otherwise, the matching is successful.

Assume that there are 3 pieces of confirmed data, numbered 1, 2, and 3, respectively. The value of n is 3, the 1st threshold is 0.9, the 2nd threshold is 0.9, and the 3rd threshold is 0.9. The similarities between the data to be matched and the confirmed data are shown in table 1.

TABLE 1 Similarity of the matching data

Confirmed data No.	1st similarity	2nd similarity	3rd similarity
1	0.8	0.7	0.6
2	0.6	0.7	0.8
3	0.3	0.2	0.3

Since the ith similarity of each of the confirmed data in table 1 is smaller than the ith threshold, the matching is considered to be failed.
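The per-round threshold criterion can be checked against the values of table 1 with a few lines of code. This is a minimal sketch; the function name `match_succeeded` and the dictionary layout are assumptions made for illustration.

```python
from typing import Dict, List


def match_succeeded(similarities: Dict[int, List[float]],
                    thresholds: List[float]) -> bool:
    """Matching fails only if every i-th similarity of every confirmed
    record is less than or equal to the i-th threshold."""
    return any(
        sim > threshold
        for sims in similarities.values()
        for sim, threshold in zip(sims, thresholds)
    )


# Values of table 1: confirmed data number -> [1st, 2nd, 3rd similarity].
table_1 = {1: [0.8, 0.7, 0.6], 2: [0.6, 0.7, 0.8], 3: [0.3, 0.2, 0.3]}
print(match_succeeded(table_1, [0.9, 0.9, 0.9]))  # False: no value exceeds 0.9
```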

In another embodiment, the step of determining whether the matching is successful based on all the ith similarities comprises: if the sum of all the ith similarity corresponding to each piece of confirmed data is smaller than a preset threshold value, the matching fails; otherwise, the matching is successful.

Assume that the confirmed data are 3 pieces and are numbered 1, 2, and 3, respectively. The value of n is 3, and the preset threshold value is 2.7. The similarity between the data to be matched and the confirmed data is shown in table 2.

TABLE 2 similarity of the matching data

Since the sum of all similarities of each of the confirmed data in table 2 is less than 2.7, the matching is considered to fail.

In an embodiment, the step of obtaining a piece of confirmed data matching the data to be matched based on all the ith similarities includes: selecting the confirmed data with the largest sum of all the ith similarities.

Assume that the confirmed data are 3 pieces and are numbered 1, 2, and 3, respectively. The value of n is 3. The similarity between the data to be matched and the confirmed data is shown in table 3.

TABLE 3 Similarity of the matching data

Confirmed data No.	1st similarity	2nd similarity	3rd similarity
1	0.8	0.7	0.8
2	0.6	0.7	0.8
3	0.3	0.2	0.3

Since the sum of all the similarities of the confirmed data numbered 1 in table 3 is 2.3 (0.8 + 0.7 + 0.8), which is the maximum value, the confirmed data numbered 1 is selected as the matching data.
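Selection by largest sum, applied to the values of table 3, can be sketched as follows. The variable names are assumptions for illustration; sums are rounded for display only because of floating-point arithmetic.

```python
# Values of table 3: confirmed data number -> [1st, 2nd, 3rd similarity].
table_3 = {1: [0.8, 0.7, 0.8], 2: [0.6, 0.7, 0.8], 3: [0.3, 0.2, 0.3]}

# Sum of all similarities per confirmed record.
sums = {number: round(sum(sims), 2) for number, sims in table_3.items()}

# The record with the largest sum is the matched record.
best = max(table_3, key=lambda number: sum(table_3[number]))

print(sums)
print(best)
```

Note that `max` returns the first record encountered when several records share the largest sum, which is one simple form of tie-breaking rule.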

It should be understood that although the sum of all the similarities of the data numbered 1 in table 3 is smaller than the preset threshold 2.7 of the previous example, this embodiment does not limit the condition under which matching is judged to have failed. For instance, this embodiment may be combined with a scheme in which the matching is considered successful when the sum is greater than 1, a scheme in which the 1st to 3rd thresholds are all 0.5, or other possible schemes. The example here only illustrates the criterion for selecting the matched data, not the criterion for judging whether the matching is successful.

It should be understood that when at least two pieces of data meet the condition (e.g., two pieces of data share the same maximum), one of them is selected according to an additional rule, for example selecting the data with the highest 1st similarity, selecting at random, selecting the data with the earliest creation time, or applying another comprehensive rule. In most cases, it will not happen that more than one piece of confirmed data has exactly the same largest sum of all the ith similarities; the additional rule is set only to prevent program errors and can therefore be a simple rule. Descriptions with similar logic later in this specification can be understood in the same way.

In another embodiment, the step of obtaining a piece of confirmed data matching the data to be matched based on all the ith similarities includes:

S301, if the ith similarity of at least one piece of confirmed data in the ith set is greater than the ith threshold and i is less than n, forming an (i+1)th set from the confirmed data in the ith set whose ith similarity is greater than the ith threshold, and judging again;

S302, otherwise, selecting the confirmed data with the largest sum of all the ith similarities in the ith set;

wherein the 1st set is all the confirmed data.

Assume that there are 5 pieces of confirmed data, numbered 1, 2, 3, 4, and 5, respectively. The value of n is 3, the 1st threshold is 0.6, the 2nd threshold is 0.75, and the 3rd threshold is 0.65. The similarities between the data to be matched and the confirmed data are shown in table 4.

TABLE 4 similarity of the matching data

In round 1, the 1st similarity of the confirmed data numbered 3 is 0.6, which is not greater than the 1st threshold, so the 2nd set consists of the confirmed data numbered 1, 2, 4 and 5. In round 2, the 3rd set consists of the confirmed data numbered 4 and 5. For the 3rd set, i equals n, and therefore the data numbered 4 is selected by "selecting the confirmed data having the largest sum of all the ith similarities in the ith set". As can be seen from this example, although the sum of all the similarities of the data numbered 3 is the largest, that data is not the final matched data.

In another example, the 2nd threshold is 0.8, and all other conditions are exactly the same as in the previous example. In this case, no data in the 2nd set has a 2nd similarity greater than the 2nd threshold, and therefore the data with the largest sum of all similarities in the 2nd set, that is, the data numbered 1, is selected.

The core idea of the above logic is to select according to a mechanism similar to a knockout competition, and if the similarity of a certain piece of the confirmed data in a certain round is low, the confirmed data is excluded from the candidate list.
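The knockout-style selection of steps S301 and S302 can be sketched as below. Because table 4 is not reproduced in this text, the similarity values are hypothetical; they were chosen only so that they are consistent with the outcomes described above (data numbered 4 is selected with a 2nd threshold of 0.75, data numbered 1 with a 2nd threshold of 0.8, and data numbered 3 has the largest overall sum). The function name `knockout_select` is likewise an assumption.

```python
from typing import Dict, List


def knockout_select(sims: Dict[int, List[float]], thresholds: List[float]) -> int:
    """Narrow the candidate set round by round (S301), then pick the
    candidate with the largest sum of similarities (S302)."""
    n = len(thresholds)
    current = list(sims)  # the 1st set is all the confirmed data
    for i in range(n):
        survivors = [r for r in current if sims[r][i] > thresholds[i]]
        if survivors and i < n - 1:
            current = survivors  # form the (i+1)th set and judge again
        else:
            break  # nobody passed this round, or this is the last round
    return max(current, key=lambda r: sum(sims[r]))


# Hypothetical stand-in for table 4: number -> [1st, 2nd, 3rd similarity].
sims = {
    1: [0.90, 0.70, 0.70],
    2: [0.70, 0.60, 0.50],
    3: [0.60, 0.95, 0.95],
    4: [0.70, 0.80, 0.70],
    5: [0.65, 0.78, 0.60],
}
print(knockout_select(sims, [0.6, 0.75, 0.65]))
print(knockout_select(sims, [0.6, 0.80, 0.65]))
```

With these stand-in values the first call selects record 4 and the second selects record 1, while record 3, which has the largest overall sum, is eliminated in round 1.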

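In yet another embodiment, step S301 is the same as in the previous embodiment, and step S302 is instead:

S302, otherwise, selecting the confirmed data with the largest ith similarity in the ith set;

wherein the 1st set is all the confirmed data.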

The main idea of the above embodiment is substantially the same as that of the previous embodiment, except that the data with the largest ith similarity is selected last, and the specific implementation process of the data can be understood with reference to the previous embodiment.

Further, the step of obtaining the ith similarity between the data to be matched and the confirmed data includes: sequentially obtaining a similarity value corresponding to each matching field in the ith combination, and taking a weighted average of the similarity values based on the ith weighting parameters to obtain the ith similarity.

For example, the ith combination includes the matching fields "C", "D", and "E", where the ith weighting parameter of "C" is 0.2, the ith weighting parameter of "D" is 0.3, the ith weighting parameter of "E" is 0.5, the similarity value corresponding to "C" is 0.7, the similarity value corresponding to "D" is 0.5, and the similarity value corresponding to "E" is 0.8. The final similarity is then 0.2 × 0.7 + 0.3 × 0.5 + 0.5 × 0.8 = 0.69.
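The weighted average in this example can be reproduced with a short function. This is a minimal sketch; the function name `weighted_similarity` is an assumption, and dividing by the total weight is added so that the result stays a true average even if the weights do not sum to 1.

```python
from typing import Dict


def weighted_similarity(values: Dict[str, float], weights: Dict[str, float]) -> float:
    """Weighted average of per-field similarity values for one combination."""
    total_weight = sum(weights[field] for field in values)
    weighted_sum = sum(values[field] * weights[field] for field in values)
    return weighted_sum / total_weight


weights_i = {"C": 0.2, "D": 0.3, "E": 0.5}  # i-th weighting parameters
values_i = {"C": 0.7, "D": 0.5, "E": 0.8}   # per-field similarity values
print(round(weighted_similarity(values_i, weights_i), 2))  # 0.69
```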

It should be understood that when i takes different values, the ith weighting parameter corresponding to the same matching field may be different.

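Further, each matching field in the data to be matched stores only one attribute value, and each matching field in the confirmed data stores one or more attribute values; the step of obtaining the similarity value corresponding to the matching field comprises: performing similarity calculation between the attribute value in the data to be matched and each attribute value in the corresponding matching field of the confirmed data, and taking a weighted average of the calculation results to obtain the similarity value.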

When the matching fields are "C", "D", and "E", one possible form of the data to be matched is shown in table 5.

TABLE 5 Exemplary form of the data to be matched

Field name	C	D	E
Attribute value	3	8	6

One possible form of the confirmed data is shown in table 6.

TABLE 6 Exemplary form of the confirmed data

In one piece of the confirmed data shown in table 6, the attribute values of the "C" field are 3, 4, and 7.
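Comparing a single value against a multi-valued field can be sketched as below, using the "C" values stated above (3 in the data to be matched; 3, 4 and 7 in the confirmed data). The pairwise similarity function and the weights are not fixed by the text, so exact-match scoring and the occurrence counts used here are assumptions for illustration only, as is the function name `field_similarity`.

```python
from typing import Callable, Optional, Sequence


def field_similarity(value, stored_values: Sequence,
                     pair_similarity: Callable[[object, object], float],
                     weights: Optional[Sequence[float]] = None) -> float:
    """Weighted average of the similarity between one value and each stored value."""
    if not stored_values:
        return 0.0  # assumption: an empty confirmed field contributes no similarity
    if weights is None:
        weights = [1.0] * len(stored_values)  # assumption: equal weights by default
    scores = [pair_similarity(value, stored) for stored in stored_values]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def exact_match(a, b) -> float:
    """Hypothetical pairwise similarity: 1.0 if equal, else 0.0."""
    return 1.0 if a == b else 0.0


print(field_similarity(3, [3, 4, 7], exact_match))
# Hypothetical weights, e.g. how often each value occurred in historical data.
print(field_similarity(3, [3, 4, 7], exact_match, weights=[5, 1, 1]))
```

With equal weights the result is one third; weighting the value 3 more heavily raises it to five sevenths.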

It should be understood that, in an actual service method, the data to be matched and the confirmed data further include other fields related to services, and the present specification does not limit the storage manner of the above data when storing non-matching fields.

In a preferred embodiment, the 1st combination comprises a name field, a gender field and an identification number field, and the 1st weighting parameter corresponding to the identification number field is greater than 0.5. For example, the 1st weighting parameter of the name field is 0.1, the 1st weighting parameter of the gender field is 0.1, and the 1st weighting parameter of the identification number field is 0.8. With this configuration, the calculated 1st similarity is more discriminative.

In some embodiments, the 2nd combination may include a contact phone field whose 2nd weighting parameter is greater than 0.5, and the other fields of the 2nd combination may be set according to different requirements. The 3rd combination may include a home address field whose 3rd weighting parameter is greater than 0.5, and the other fields of the 3rd combination may be set according to different requirements.

The matching field comprises a name field, and the similarity value corresponding to the name field is calculated according to the following formula:

The edit distance, also called the Levenshtein distance, is the minimum number of edit operations required to convert one character string into another. The permitted edit operations are replacing one character with another, inserting a character, and deleting a character. The edit distance was first proposed by the Russian scientist Levenshtein. With this configuration, on the one hand the similarity between name strings can be calculated, and on the other hand, when two names are completely different, the calculation result is 0, which agrees with expectation.
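A name similarity based on the edit distance can be sketched as follows. The formula is not reproduced in this text, so the sketch assumes the normalized form 1 - ED_AB / max(L_A, L_B), which is consistent with the symbols defined for it and with the stated result of 0 for completely different names. The function names and the handling of two empty strings are assumptions.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # replace, or keep if equal
            ))
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Assumed form: 1 - ED_AB / max(L_A, L_B)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0  # assumption: two empty names are not treated as a match
    return 1 - edit_distance(a, b) / longest


print(name_similarity("张三", "张三"))    # identical names
print(name_similarity("张三", "李四"))    # completely different names
print(name_similarity("张三丰", "张三"))  # one character deleted
```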

Referring to fig. 2, the method for establishing a patient main index in a CDR system includes:

S10, acquiring original data from at least two service systems, wherein the original data comprises matching fields;

S20, classifying the original data into first data and second data based on a cleaning rule;

S31, generating confirmed data from the first data based on a merging rule, wherein the confirmed data comprises a matching field and a main index field;

S41, configuring the second data as data to be matched, and obtaining a matching result for the data to be matched based on the data matching method;

S42, if the matching is successful, merging the current data to be matched and the matched confirmed data;

and S43, if the matching fails, generating temporary index data from the current data to be matched.

In fig. 2, the accurate data is the first data, the fuzzy data is the second data, and the fuzzy matching is the data matching method described above. The CDR system data in fig. 2 is the original data acquired in step S10, i.e., the data originating from at least two service systems. The merging into accurate data in step S42 should be understood as merging into accurate data for which the main index has already been generated, i.e., the confirmed data.

The processing flows for stock data and incremental data do not actually differ. The only difference is that when stock data is processed (i.e., when the patient main index establishing method in the CDR system is run for the first time), the number of pieces of confirmed data is initially 0, whereas when incremental data is processed, some confirmed data already exists.

The generation rule of the main index field can be set according to actual requirements and is not elaborated here. The cleaning rule in step S20 can likewise be set according to actual needs; in one embodiment, data whose name field is null and whose identity card field is null is classified as the second data, and the rest is classified as the first data. Other rules may be set in other embodiments.
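The example cleaning rule of step S20 can be sketched as a simple partition. This is a minimal sketch, assuming records are dictionaries, null is represented by `None`, and the key names `name` and `id_card` stand for the name field and the identity card field; all of these are illustrative assumptions.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, Optional[str]]


def classify(raw_records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Split raw data into first data (accurate) and second data (fuzzy)."""
    first_data, second_data = [], []
    for record in raw_records:
        if record.get("name") is None and record.get("id_card") is None:
            second_data.append(record)  # both key fields null: needs fuzzy matching
        else:
            first_data.append(record)
    return first_data, second_data


raw = [
    {"name": "Zhang San", "id_card": "ID-001", "phone": "555-0100"},
    {"name": None, "id_card": None, "phone": "555-0100"},
    {"name": "Li Si", "id_card": None, "phone": None},
]
first_data, second_data = classify(raw)
print(len(first_data), len(second_data))  # 2 1
```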

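Further, the matching field includes a name field and an identity card field, and the merge rule includes: judging whether the identity card fields of the first data and a piece of confirmed data are equal and whether their name fields are equal; if both are equal, merging the current first data and the current confirmed data; otherwise, independently converting the current first data into a new piece of confirmed data.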

That is, equal data is merged, and unequal data is independently converted into a new piece of confirmed data. The conversion process may include copying the whole content of the first data and adding the main index field; it may also include other steps required by the business logic, which those skilled in the art can set according to common knowledge and which are not described here. Theoretically, two pieces of data that do not meet the condition may actually refer to the same patient. In practice, however, the number of errors caused by this rule is found to be very small, and such errors can be corrected manually after they occur, so this embodiment adopts the above logic.
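The decision made by the merge rule can be sketched as follows: find a confirmed record whose identity card and name both equal those of the first data, and otherwise convert the first data into a new confirmed record. The record layout, the key names, and the counter-based main index are all assumptions for illustration; the specification leaves the main index generation rule open.

```python
from itertools import count
from typing import Dict, List, Optional

# Hypothetical main-index generator; the real rule is set per deployment.
_next_main_index = count(1)


def find_merge_target(first: Dict[str, Optional[str]],
                      confirmed: List[dict]) -> Optional[dict]:
    """Return the confirmed record whose identity card AND name equal the first data."""
    id_card, name = first.get("id_card"), first.get("name")
    if id_card is None or name is None:
        return None  # a null value is never judged equal
    for record in confirmed:
        if id_card in record["id_card"] and name in record["name"]:
            return record
    return None


def to_confirmed(first: Dict[str, Optional[str]]) -> dict:
    """Convert first data into a new confirmed record with a main index field."""
    return {
        "main_index": next(_next_main_index),
        "id_card": [v for v in [first.get("id_card")] if v is not None],
        "name": [v for v in [first.get("name")] if v is not None],
    }


confirmed = [{"main_index": 0, "id_card": ["ID-001"], "name": ["Zhang San"]}]
for first in ({"id_card": "ID-001", "name": "Zhang San"},
              {"id_card": "ID-001", "name": "Li Si"}):
    if find_merge_target(first, confirmed) is None:
        confirmed.append(to_confirmed(first))
print(len(confirmed))  # 2
```

The second record shares the identity card value but not the name, so it becomes a new confirmed record; this is the kind of case the text says may need manual correction afterwards.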

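Each matching field in the first data stores only one attribute value, and each matching field in the confirmed data stores one or more attribute values; this can also be understood with reference to the foregoing description of tables 5 and 6. The step of determining whether the matching fields of the first data and the confirmed data are equal comprises: if the attribute value of the matching field of the first data is null, judging that the attribute values are not equal; if it is not null and is not equal to any attribute value of the matching field of the confirmed data, judging that the attribute values are not equal; and if it is not null and is equal to one of the attribute values of the matching field of the confirmed data, judging that the attribute values are equal.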

The weight used in the weighted average may be set according to the number of occurrences of each attribute in the history data, or may be set according to another method.

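The step of merging the data to be merged and the confirmed data comprises: judging each matching field of the data to be merged and the confirmed data in turn, and if the attribute value of the matching field of the data to be merged is not null and is not equal to any attribute value of that matching field of the confirmed data, storing the attribute value in the matching field of the confirmed data. The data to be merged comprises the first data and the data to be matched.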

For example, the content of the data to be merged is shown in table 5, the content of the confirmed data is shown in table 6, and the merged data is shown in table 7.

TABLE 7 Exemplary form of the merged confirmed data

The manner of merging the non-matching fields of the data to be merged and the confirmed data may be set according to actual needs and is not described here.
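The merging of matching fields can be sketched as below. The data to be merged uses the values of table 5, and the "C" values of the confirmed data (3, 4 and 7) are those stated in the text; since tables 6 and 7 are not reproduced here, the "D" and "E" values of the confirmed data are hypothetical, as is the function name `merge_into_confirmed`.

```python
from typing import Dict, List, Optional, Sequence


def merge_into_confirmed(to_merge: Dict[str, Optional[int]],
                         confirmed: Dict[str, List[int]],
                         matching_fields: Sequence[str]) -> Dict[str, List[int]]:
    """Store each non-null value that the confirmed field does not yet contain."""
    for field in matching_fields:
        value = to_merge.get(field)
        if value is not None and value not in confirmed[field]:
            confirmed[field].append(value)
    return confirmed


to_merge = {"C": 3, "D": 8, "E": 6}                  # values of table 5
confirmed = {"C": [3, 4, 7], "D": [8], "E": [5]}     # D and E are hypothetical
print(merge_into_confirmed(to_merge, confirmed, ["C", "D", "E"]))
# {'C': [3, 4, 7], 'D': [8], 'E': [5, 6]}
```

Only the "E" field gains a value, because 3 and 8 are already stored in "C" and "D".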

Based on the method, a CDR system can be developed. The CDR system integrates the medical data of each system of the hospital, and an EMPI (Enterprise Master Patient Index) system establishes a patient master index for the CDR system to manage the medical data in a unified way. The accuracy of the patient master index depends on the accuracy of the patient information matching algorithm. The EMPI system provides patient master index generation and patient master index query functions. By using the patient master index in the EMPI system, doctors and related personnel can quickly find all historical treatment records of a patient in the CDR system to assist in disease diagnosis and medical research.

The embodiment also provides a patient master index establishing device, which comprises a matching module, wherein the matching module is used for executing the patient data matching method in the CDR system.

Optionally, the patient master index establishing device further includes:

an acquisition module for acquiring original data from at least two service systems, the original data including matching fields;

a classification module for classifying the raw data into first data and second data based on a cleaning rule;

a merging module, configured to generate confirmed data from the first data based on a merging rule, where the confirmed data includes a matching field and a primary index field;

an input module for configuring the second data as data to be matched and inputting the data to be matched into the matching module; and

a processing module for processing data based on the matching result of the matching module: if the matching is successful, the current data to be matched is merged with the matched confirmed data; if the matching fails, temporary index data is generated from the current data to be matched.

The specific workflow of the above device can be understood by referring to the description of the patient master index establishing method in the CDR system in the present specification.
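The behaviour of the processing module can be sketched as a simple dispatch on the matching result. This is an illustrative sketch only: the function name, the `temporary_index` key, the `TMP000001`-style identifier format, and the use of `None` to signal a failed match are assumptions, and the merge step is supplied by the caller.

```python
def process_match_result(to_match, matched_confirmed, merge, temp_index_store):
    """Handle one piece of data to be matched after the matching module ran.

    matched_confirmed: the matched confirmed record, or None if matching failed.
    merge(to_match, matched_confirmed): merges in place on success.
    temp_index_store: list collecting temporary index data on failure.
    """
    if matched_confirmed is not None:
        # Matching succeeded: merge into the matched confirmed data.
        merge(to_match, matched_confirmed)
        return matched_confirmed
    # Matching failed: generate temporary index data from the current record.
    temp_record = dict(to_match)
    temp_record["temporary_index"] = f"TMP{len(temp_index_store) + 1:06d}"
    temp_index_store.append(temp_record)
    return temp_record
```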

The patient master index establishing device can solve the prior-art problem that the business systems of a hospital lack a uniform patient identification.

The above description is only for the purpose of describing the preferred embodiments of the present invention, and is not intended to limit the scope of the present invention, and any variations and modifications made by those skilled in the art according to the above disclosure are within the scope of the present invention.

Claims

1. A method of patient data matching in a CDR system, comprising:

2. The method for matching patient data in a CDR system of claim 1, wherein the step of determining whether the matching is successful based on all of the ith similarities comprises: if the ith similarity corresponding to each piece of confirmed data is smaller than or equal to the ith threshold, the matching is failed; otherwise, the matching is successful;

or,

3. The method for matching patient data in a CDR system according to claim 2, wherein said step of obtaining a piece of said confirmed data matching said data to be matched based on all of said ith similarities comprises: selecting the confirmed data with the largest sum of all the ith similarities.

4. The method for matching patient data in a CDR system according to claim 2, wherein said step of obtaining a piece of said confirmed data matching said data to be matched based on all of said ith similarities comprises:

wherein the 1st set is all the confirmed data.

5. The method for matching patient data in a CDR system according to any of claims 1 to 4, wherein the step of obtaining the ith similarity between the data to be matched and the confirmed data comprises:

6. The method of matching patient data in a CDR system of claim 5, wherein each of said matching fields in said data to be matched stores only one attribute value, and each of said matching fields in said confirmed data stores one or more attribute values; the step of obtaining the similarity value corresponding to the matching field comprises:

7. The method of claim 5, wherein the 1st combination comprises a name field, a gender field, and an identification number field, and the 1st weighting parameter corresponding to the identification number field is greater than 0.5.

8. The method of matching patient data in a CDR system of claim 5, wherein said matching field includes a name field, the method of obtaining said similarity value corresponding to said name field comprising: calculated according to the following formula:

9. A method for establishing a patient main index in a CDR system is characterized by comprising the following steps:

the second data is configured as data to be matched, and a matching result for the data to be matched is obtained based on the patient data matching method in a CDR system according to any one of claims 1-8;

10. The patient master index building method in a CDR system of claim 9, wherein the matching fields include a name field and an identity card field, and wherein the merge rule comprises:

11. The method of claim 10, wherein each of the matching fields of the first data stores only one attribute value, each of the matching fields of the confirmed data stores one or more attribute values, and the step of determining whether the matching fields of the first data and the confirmed data are equal comprises:

12. The method of claim 11, wherein each matching field of the data to be matched stores only one attribute value, and the step of merging the data to be merged and the confirmed data comprises:

13. A patient master index building apparatus comprising a matching module for performing the patient data matching method in the CDR system according to any one of claims 1 to 8.