CN109344132B - User information merging method, computer readable storage medium and terminal device - Google Patents

User information merging method, computer readable storage medium and terminal device

Info

Publication number
CN109344132B
Authority
CN
China
Prior art keywords
information
user information
keywords
user
group
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811018263.XA
Other languages
Chinese (zh)
Other versions
CN109344132A (en
Inventor
程相
张昆轮
邓乾喜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd
Priority to CN201811018263.XA
Publication of CN109344132A
Application granted
Publication of CN109344132B

Landscapes

  • Storage Device Security (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention is applicable to the technical field of data processing and provides a user information merging method, a computer readable storage medium and a terminal device. The method comprises the following steps: acquiring at least two groups of user information, and respectively extracting information type keywords in each group of user information; respectively extracting the remaining information in each group of user information according to the information type keywords, wherein the remaining information is the information in the user information other than the information type keywords; comparing the acquired at least two groups of user information according to the information type keywords and the remaining information; and, if identical user information exists among the at least two groups of user information, storing one group of user information and deleting the user information identical to that group. The method effectively reduces the redundancy of user information and prevents redundant information from occupying a large amount of storage space.

Description

User information merging method, computer readable storage medium and terminal device
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method for merging user information, a computer readable storage medium, and a terminal device.
Background
With the development of information technology, application systems of all kinds are emerging in large numbers, the volume of data to be collected keeps growing, and data processing tasks are becoming heavier. Data collection is important basic work in information technology, but collected data currently comes in differing formats. Taking user information as an example, the user information may come from several different application systems, and the formats in which those systems collect information are inconsistent. Because of the inconsistent formats, information redundancy and duplication may occur. Existing merging and deduplication of user information is mostly done manually, which is slow and inefficient, hinders subsequent data analysis and maintenance, and prevents the potential value of users from being mined in time.
Disclosure of Invention
In view of this, the embodiments of the present invention provide a method for merging user information, a computer readable storage medium, and a terminal device, so as to solve the problems of redundant user information and occupation of a large amount of storage space in the prior art.
In a first aspect of the embodiment of the present invention, a method for merging user information is provided, which may include:
acquiring at least two groups of user information, and respectively extracting information type keywords in each group of user information;
respectively extracting the residual information in each group of user information according to the information type keywords, wherein the residual information is the information except the information type keywords in the user information;
comparing the acquired at least two groups of user information according to the information type keywords and the residual information;
if the at least two groups of user information have the identical user information, one group of user information is stored, and the user information identical to the group of user information is deleted.
In a second aspect of the embodiments of the present invention, there is provided a computer-readable storage medium storing a computer program which, when executed by a processor, performs the steps of:
acquiring at least two groups of user information, and respectively extracting information type keywords in each group of user information;
respectively extracting the residual information in each group of user information according to the information type keywords, wherein the residual information is the information except the information type keywords in the user information;
comparing the acquired at least two groups of user information according to the information type keywords and the residual information;
if the at least two groups of user information have the identical user information, one group of user information is stored, and the user information identical to the group of user information is deleted.
In a third aspect of the embodiment of the present invention, there is provided a terminal device including a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring at least two groups of user information, and respectively extracting information type keywords in each group of user information;
respectively extracting the residual information in each group of user information according to the information type keywords, wherein the residual information is the information except the information type keywords in the user information;
comparing the acquired at least two groups of user information according to the information type keywords and the residual information;
if the at least two groups of user information have the identical user information, one group of user information is stored, and the user information identical to the group of user information is deleted.
Compared with the prior art, the embodiment of the invention has the beneficial effects that:
according to the embodiment of the invention, at least two groups of user information are compared according to the information type keywords and the residual information, and the identical user information can be quickly found in the at least two groups of user information by the method; after the identical multiple sets of user information are found, only one set of user information is stored, and the rest of user information identical to the set of user information is deleted.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments or the description of the prior art will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic implementation flow chart of a method for merging user information according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a device for merging user information according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a terminal device according to an embodiment of the present invention.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present invention. It will be apparent, however, to one skilled in the art that the present invention may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present invention with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in this specification and the appended claims, the term "if" may be interpreted as "when" or "once" or "in response to a determination" or "in response to detection" depending on the context. Similarly, the phrase "if it is determined" or "if a [described condition or event] is detected" may be interpreted, depending on the context, as "upon determination" or "in response to determination" or "upon detection of a [described condition or event]" or "in response to detection of a [described condition or event]".
In order to illustrate the technical scheme of the invention, the following description is made by specific examples.
Fig. 1 is a schematic flow chart of an implementation of a method for merging user information according to an embodiment of the present invention, as shown in the drawing, the method may include the following steps:
step S101, at least two groups of user information are obtained, and information type keywords in each group of user information are respectively extracted.
Wherein the information type keyword may include at least one of:
identity keywords, gender keywords, contact keywords, aliases of identity keywords, aliases of gender keywords, aliases of contact keywords.
The identity keyword comprises at least one of the following: name, identification number.
The gender keyword includes: sex.
The contact keyword comprises at least one of the following: mobile phone number, email box.
The alias of the identity keyword comprises at least one of the following: name, identity card.
The alias of the gender keyword comprises: sex information.
The alias of the contact keyword comprises at least one of the following: handset number, mailbox.
In practical applications, the identity keyword may further include a fingerprint, a face image, a driver license number, a passport number, and the like. The contact keyword may also include WeChat, QQ, MSN, etc. An alias of a keyword is another name for that keyword, and a keyword may have any number of aliases as long as each alias is consistent with the meaning of the keyword. For example, the aliases of the identity keyword may also include a title, a certificate number, etc., and the aliases of the contact keyword may also include a Q number, a QQ number, a WeChat ID, a WeChat account, etc.
In practical applications, the information type keywords may be predefined before they are extracted from the user information. Predefining the information type keywords includes predefining the identity keywords, the gender keywords and the contact keywords. It further includes: predefining aliases of the identity keywords and establishing a correspondence between the identity keywords and their aliases; predefining aliases of the gender keywords and establishing a correspondence between the gender keywords and their aliases; and predefining aliases of the contact keywords and establishing a correspondence between the contact keywords and their aliases.
For example, the predefined contact keywords are "mobile phone number" and "email box", and the predefined aliases of the contact keywords are "handset number" and "mailbox". "Mobile phone number" is associated with "handset number", and "email box" is associated with "mailbox". In other words, "handset number" is an alias of "mobile phone number", and "mailbox" is an alias of "email box".
Of course, in practical application, other types of keywords besides identity, gender and contact manner may be used, which is not limited herein.
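The predefined keywords and the correspondence between keywords and aliases described above can be sketched as a lookup table. This is only a minimal illustration: the names KEYWORD_ALIASES, CANONICAL and canonical_keyword are assumptions made for the example and do not appear in the embodiment, and the table holds only a few of the keywords and aliases listed above.

```python
# Predefined information type keywords mapped to their predefined aliases.
# The table layout and all identifiers here are illustrative assumptions.
KEYWORD_ALIASES = {
    "name": [],
    "identification number": ["identity card"],
    "sex": ["sex information"],
    "mobile phone number": ["handset number"],
    "email box": ["mailbox"],
}

# Reverse lookup: every keyword or alias points to its keyword.
CANONICAL = {keyword: keyword for keyword in KEYWORD_ALIASES}
for keyword, aliases in KEYWORD_ALIASES.items():
    for alias in aliases:
        CANONICAL[alias] = keyword


def canonical_keyword(term):
    """Return the keyword that a keyword or alias stands for, or None if unknown."""
    return CANONICAL.get(term)
```

With such a table, "handset number" and "mobile phone number" resolve to the same keyword, which is what later allows two groups of user information that use different names for the same field to be compared.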
Step S102, extracting the residual information in each group of user information according to the information type keywords, wherein the residual information is the information except the information type keywords in the user information.
For example, assume that the user information is "name Zhang San, sex male, mobile phone number 12345678901". The information type keywords are "name", "sex" and "mobile phone number", and the remaining information is "Zhang San", "male" and "12345678901".
In the embodiment of the present invention, the extracting the remaining information in each group of user information according to the information type keyword includes:
dividing the user information according to the information type keywords to obtain at least two pieces of sub information, wherein each piece of sub information only comprises one information type keyword.
And eliminating the information type key words from the sub-information to obtain the residual information of the sub-information.
In practical applications, dividing the user information according to the information type keywords means dividing each group of user information into at least two pieces of sub-information, each of which contains only one information type keyword. Specifically, the information type keywords may be detected in the user information, and the text running from the first character of each information type keyword to the character immediately preceding the first character of the next information type keyword may be taken as one piece of sub-information.
For example, the user information is "name Zhang San, sex male, mobile phone number 12345678901". The detected information type keywords are "name", "sex" and "mobile phone number". The text from the first character of "name" to the character preceding the first character of "sex" is taken as one piece of sub-information, so the first piece of sub-information is "name Zhang San"; the text from the first character of "sex" to the character preceding the first character of "mobile phone number" is taken as one piece of sub-information, so the second piece of sub-information is "sex male". By analogy, the third piece of sub-information is "mobile phone number 12345678901". Removing the information type keyword from each piece of sub-information gives: the remaining information of the first piece of sub-information is "Zhang San", the remaining information of the second piece is "male", and the remaining information of the third piece is "12345678901".
It should be noted that, the above is only an example of dividing the user information into sub-information, and other methods may be used to divide the user information into sub-information, so long as one information type keyword and the remaining information corresponding to the information type keyword can be divided into one sub-information, and the specific dividing method is not limited.
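The division and removal steps of S102 can be sketched as follows. This is one possible reading under stated assumptions, not the patented implementation: the keyword list KEYWORDS, the function name split_user_info, the use of regular expressions, and the trimming of commas and spaces around each value are all choices made for the example.

```python
import re

# Hypothetical keyword list; aliases could be added to it in the same way.
KEYWORDS = ["name", "sex", "mobile phone number"]


def split_user_info(text, keywords=KEYWORDS):
    """Split user information into (keyword, remaining information) pairs."""
    # Try longer keywords first so that a keyword which is a prefix of
    # another keyword cannot shadow the longer one.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    matches = list(pattern.finditer(text))
    pairs = []
    for i, match in enumerate(matches):
        # One piece of sub-information runs from the first character of this
        # keyword to the character before the first character of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sub_info = text[match.start():end]
        # Remove the keyword, then trim the separators left around the value.
        remaining = sub_info[len(match.group()):].strip(" ,;:")
        pairs.append((match.group(), remaining))
    return pairs
```

Called on "name Zhang San, sex male, mobile phone number 12345678901", the sketch yields the three keyword and remaining-information pairs of the example above. It does not handle a value that itself contains a keyword; a real system would need a stricter input format for that case.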
And step S103, comparing the acquired at least two groups of user information according to the information type keywords and the residual information.
In the embodiment of the present invention, the comparing the acquired at least two sets of user information according to the information type keyword and the remaining information includes:
and correlating the information type keywords in the sub-information with the rest information of the sub-information to obtain a correlation result.
And comparing the acquired at least two groups of user information according to the information type keywords, the residual information and the association result.
Continuing the example in step S102, the information type keyword in the first piece of sub-information is "name" and its remaining information is "Zhang San". Associating "Zhang San" with "name" records that the information type keyword corresponding to "Zhang San" is "name".
In practical applications, comparing at least two groups of user information according to the information type keywords, the remaining information and the association result means first finding the information type keywords and the remaining information in each group of user information, and then comparing whether the information type keywords in the groups are the same. If the information type keywords are the same, the remaining information is compared next. Judging whether the remaining information is the same requires comparing both the content of the remaining information and, according to the association result, the information type keywords corresponding to that remaining information. If two groups of user information have the same information type keywords and the same remaining information, the two groups of user information are the same.
Illustratively, there are two groups of user information A and B, where the name field of A is introduced by the keyword "name" and the name field of B is introduced by an alias of "name"; both groups read "name Zhang San, sex male". First, the information type keywords in the two groups are compared. A and B each contain only two information type keywords. The "name" keyword in A and the corresponding term in B are the same information type keyword (the term used in B is an alias of "name"), and "sex" in A and "sex" in B are the same information type keyword, so the information type keywords in A and B are the same. The remaining information is compared next, and A and B each contain only two pieces of remaining information. "Zhang San" in A has the same content as "Zhang San" in B, and the information type keyword corresponding to both pieces of remaining information is "name", so "Zhang San" in A and "Zhang San" in B are the same remaining information; "male" in A has the same content as "male" in B, and the information type keyword corresponding to both pieces of remaining information is "sex", so "male" in A and "male" in B are the same remaining information. In summary, all information type keywords and all remaining information in A and B are the same, so A and B are the same user information.
It should be noted that, the above is only an example of how to compare at least two sets of user information, and other comparison methods may also be available, so long as whether the sets of user information are the same or different can be obtained, and the comparison method is not specifically limited.
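The comparison in step S103 can be sketched as below. The sketch assumes each group of user information has already been divided into (keyword, remaining information) pairs; the alias table ALIAS_TO_KEYWORD and the function names associate and same_user_info are hypothetical, and a dictionary is used as one possible form of the association result.

```python
# Hypothetical alias table: alias -> information type keyword.
ALIAS_TO_KEYWORD = {
    "handset number": "mobile phone number",
    "mailbox": "email box",
}


def associate(pairs):
    """Build the association result: keyword -> remaining information."""
    return {ALIAS_TO_KEYWORD.get(keyword, keyword): value for keyword, value in pairs}


def same_user_info(pairs_a, pairs_b):
    """Return True if two groups have the same keywords and remaining information."""
    assoc_a, assoc_b = associate(pairs_a), associate(pairs_b)
    # First compare the information type keywords, with aliases resolved.
    if assoc_a.keys() != assoc_b.keys():
        return False
    # Then compare the remaining information associated with each keyword.
    return all(assoc_a[keyword] == assoc_b[keyword] for keyword in assoc_a)
```

Because the remaining information is looked up through its keyword, equal content under different keywords is not treated as a match. The dictionary form is a simplification: a group that repeats the same keyword would keep only the last value.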
In the embodiment of the present invention, the comparing the acquired at least two sets of user information includes:
detecting each group of user information respectively to obtain a detection result of each group of user information;
determining whether the at least two sets of user information are all complete user information according to the detection result of each set of user information;
and if the at least two groups of user information are all complete user information, comparing the acquired at least two groups of user information.
In practical applications, before the groups of user information are compared, it is first determined whether each group of user information is complete. If any group is incomplete, no comparison is made; the comparison is made only when the groups are complete.
In the embodiment of the present invention, the detecting each set of user information to obtain a detection result of each set of user information includes:
detecting whether the user information contains the information type keyword.
And if the user information does not contain the information type keyword, the detection result of the current user information is that the user information is incomplete.
If the user information contains the information type keyword, detecting whether the user information contains the residual information associated with the information type keyword.
And if the user information contains the residual information related to the information type keyword, the detection result of the current user information is that the user information is complete.
And if the user information does not contain the residual information related to the information type keyword, the detection result of the current user information is that the user information is incomplete.
In practical applications, judging whether the user information is complete means judging whether the user information contains at least one information type keyword and whether the remaining information corresponding to the information type keyword is non-empty. If both conditions are satisfied, the user information is complete. If either condition is not satisfied, the user information is incomplete.
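The completeness detection can be sketched as follows, again assuming each group is given as (keyword, remaining information) pairs. The function names are hypothetical, and the sketch adopts one reading of the condition: every detected keyword must have non-empty remaining information.

```python
def is_complete(pairs):
    """Detect whether one group of user information is complete."""
    if not pairs:
        # No information type keyword was found in the user information.
        return False
    # Each keyword must have non-empty remaining information associated with it.
    return all(value.strip() for _, value in pairs)


def all_complete(groups):
    """Return True only if every group is complete, so comparison may proceed."""
    return all(is_complete(pairs) for pairs in groups)
```

A caller would run all_complete on the acquired groups first and go on to the comparison only when it returns True.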
Step S104, if the at least two groups of user information are completely the same, one group of user information is stored, and the user information same as the group of user information is deleted.
Illustratively, there are three groups of user information A, B and C. The comparison shows that A and B are identical user information, and that C differs from both A and B. A is saved, B is deleted, and C is saved. Alternatively, B may be saved, A deleted, and C saved. It suffices that one of A and B is saved and the other is deleted.
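The save-one, delete-the-rest behaviour of step S104 can be sketched as below. Each group is assumed to be a dictionary from keyword to remaining information with aliases already resolved; the function name merge_user_info and the sample values (including the made-up name "Li Si") are illustrative only. The sketch keeps the first group seen, which is one of the choices the example above permits.

```python
def merge_user_info(groups):
    """Keep one group out of each set of identical groups.

    Returns (kept, deleted), where deleted holds the redundant groups.
    """
    kept, deleted, seen = [], [], set()
    for group in groups:
        # A frozenset of items ignores the order in which fields appear
        # (treating field order as irrelevant is an assumption of this sketch).
        key = frozenset(group.items())
        if key in seen:
            deleted.append(group)
        else:
            seen.add(key)
            kept.append(group)
    return kept, deleted
```

For groups A, B and C where B repeats A, the sketch returns A and C as kept and B as deleted.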
According to the embodiment of the invention, at least two groups of user information are compared according to the information type keywords and the residual information, and the identical user information can be quickly found in the at least two groups of user information by the method; after the identical multiple sets of user information are found, only one set of user information is stored, and the rest of user information identical to the set of user information is deleted.
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution. The execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation process of the embodiment of the present invention.
Fig. 2 is a schematic diagram of a device for merging user information provided in an embodiment of the present invention, and for convenience of explanation, only a portion relevant to the embodiment of the present invention is shown.
The user information merging device shown in fig. 2 may be a software unit, a hardware unit, or a combined software and hardware unit built into an existing terminal device; it may also be integrated into the terminal device as an independent add-on, or exist as an independent terminal device.
As shown in the figure, the device 2 for combining user information includes:
an obtaining unit 21 is configured to obtain at least two sets of user information, and extract information type keywords in each set of user information.
And an extracting unit 22, configured to extract, according to the information type keyword, remaining information in each set of user information, where the remaining information is information in the user information except the information type keyword.
And the comparing unit 23 is configured to compare the acquired at least two sets of user information according to the information type keyword and the remaining information.
And the deleting unit 24 is configured to store one set of user information and delete the user information identical to the set of user information if the at least two sets of user information have identical user information.
Optionally, the extracting unit 22 includes:
the dividing module is used for dividing the user information according to the information type keywords to obtain at least two pieces of sub information, wherein each piece of sub information only comprises one information type keyword.
And the rejecting module is used for rejecting the information type keywords from the sub-information to obtain the residual information of the sub-information.
Optionally, the comparing unit 23 includes:
and the association module is used for associating the information type keyword in the sub-information with the rest information of the sub-information to obtain an association result.
And the first comparison module is used for comparing the acquired at least two groups of user information according to the information type keywords, the residual information and the association result.
Optionally, the comparing unit 23 further includes:
and the detection module is used for respectively detecting each group of user information to obtain a detection result of each group of user information.
And the determining module is used for determining whether the at least two groups of user information are all complete user information according to the detection result of each group of user information.
And the second comparison module is used for comparing the acquired at least two sets of user information if the at least two sets of user information are all complete user information.
Optionally, the detection module includes:
and the first detection sub-module is used for detecting whether the user information contains the information type keyword.
And the first result submodule is used for detecting that the current user information is incomplete if the user information does not contain the information type keyword.
And the second detection sub-module is used for detecting whether the user information contains the residual information associated with the information type keyword if the user information contains the information type keyword.
And the second result submodule is used for detecting the current user information as the complete user information if the user information contains the residual information related to the information type keyword.
And the third result submodule is used for detecting that the current user information is incomplete if the user information does not contain the residual information related to the information type keyword.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present invention. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
Fig. 3 is a schematic diagram of a terminal device according to an embodiment of the present invention. As shown in fig. 3, the terminal device 3 of this embodiment includes: a processor 30, a memory 31 and a computer program 32 stored in said memory 31 and executable on said processor 30. The processor 30, when executing the computer program 32, implements the steps in each of the above embodiments of the user information merging method, such as steps S101 to S104 shown in fig. 1. Alternatively, the processor 30, when executing the computer program 32, performs the functions of the modules/units of the apparatus embodiments described above, such as the functions of the units 21 to 24 shown in fig. 2.
Illustratively, the computer program 32 may be partitioned into one or more modules/units that are stored in the memory 31 and executed by the processor 30 to complete the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions for describing the execution of the computer program 32 in the terminal device 3. For example, the computer program 32 may be divided into an acquisition unit, an extraction unit, a comparison unit, a deletion unit, each unit functioning in particular as follows:
the acquisition unit is used for acquiring at least two groups of user information and respectively extracting information type keywords in each group of user information.
And the extraction unit is used for respectively extracting the residual information in each group of user information according to the information type keywords, wherein the residual information is the information except the information type keywords in the user information.
And the comparison unit is used for comparing the acquired at least two groups of user information according to the information type keywords and the residual information.
And the deleting unit is used for storing one group of user information and deleting the user information identical to the group of user information if the at least two groups of user information have the identical user information.
Optionally, the extracting unit includes:
the dividing module is used for dividing the user information according to the information type keywords to obtain at least two pieces of sub information, wherein each piece of sub information only comprises one information type keyword.
And the rejecting module is used for rejecting the information type keywords from the sub-information to obtain the residual information of the sub-information.
Optionally, the comparison unit includes:
an association module, configured to associate the information type keyword in each piece of sub-information with the residual information of that sub-information to obtain an association result; and
a first comparison module, configured to compare the acquired at least two groups of user information according to the information type keywords, the residual information and the association result.
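One way to picture the association result is as a mapping from each keyword to its residual information. The sketch below assumes each group has already been reduced to (keyword, residual) pairs; the function names are hypothetical.

```python
def associate(pairs: list[tuple[str, str]]) -> dict[str, str]:
    """Association result: each keyword mapped to its residual information."""
    return {keyword: residual for keyword, residual in pairs}


def is_identical(group_a: list[tuple[str, str]],
                 group_b: list[tuple[str, str]]) -> bool:
    """Compare two groups by keywords, residual information and association."""
    assoc_a, assoc_b = associate(group_a), associate(group_b)
    same_keywords = assoc_a.keys() == assoc_b.keys()
    same_residual = sorted(assoc_a.values()) == sorted(assoc_b.values())
    same_association = assoc_a == assoc_b
    return same_keywords and same_residual and same_association
```

The association check matters because two groups can share the same keywords and the same residual strings while pairing them differently, and such groups should not be merged.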
Optionally, the comparison unit further includes:
a detection module, configured to detect each group of user information separately to obtain a detection result for each group of user information;
a determining module, configured to determine, according to the detection result of each group of user information, whether the at least two groups of user information are all complete user information; and
a second comparison module, configured to compare the acquired at least two groups of user information if the at least two groups of user information are all complete user information.
Optionally, the detection module includes:
a first detection sub-module, configured to detect whether the user information contains the information type keyword;
a first result sub-module, configured to determine that the current user information is incomplete if the user information does not contain the information type keyword;
a second detection sub-module, configured to detect, if the user information contains the information type keyword, whether the user information contains the residual information associated with that keyword;
a second result sub-module, configured to determine that the current user information is complete if the user information contains the residual information associated with the information type keyword; and
a third result sub-module, configured to determine that the current user information is incomplete if the user information does not contain the residual information associated with the information type keyword.
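The completeness check and the gating of the comparison can be sketched as below. The text does not say whether every keyword or only some keyword must be present, so this example assumes that each entry of a hypothetical `REQUIRED_KEYWORDS` tuple must appear with non-empty residual information; each group is represented as a keyword-to-residual mapping.

```python
from typing import Optional

# Assumed set of keywords a complete record must contain.
REQUIRED_KEYWORDS = ("name", "gender", "phone")


def is_complete(associations: dict[str, str]) -> bool:
    """Detection result for one group of user information."""
    for keyword in REQUIRED_KEYWORDS:
        if keyword not in associations:
            return False  # keyword missing: incomplete
        if not associations[keyword].strip():
            return False  # keyword present but no residual: incomplete
    return True           # every keyword has residual information: complete


def compare_if_complete(group_a: dict[str, str],
                        group_b: dict[str, str]) -> Optional[bool]:
    """Compare only when both groups are complete; None means comparison skipped."""
    if not (is_complete(group_a) and is_complete(group_b)):
        return None
    return group_a == group_b
```

Returning `None` for the skipped case keeps "not compared" distinct from "compared and different", which is a design choice of this sketch rather than something the text prescribes.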
The terminal device 3 may be a computing device such as a desktop computer, a notebook computer, a handheld computer or a cloud server. The terminal device may include, but is not limited to, the processor 30 and the memory 31. Those skilled in the art will appreciate that fig. 3 is merely an example of the terminal device 3 and does not limit it; the terminal device 3 may include more or fewer components than illustrated, may combine certain components, or may use different components. For example, the terminal device may further include an input-output device, a network access device, a bus, etc.
The processor 30 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor or any conventional processor.
The memory 31 may be an internal storage unit of the terminal device 3, such as a hard disk or a memory of the terminal device 3. The memory 31 may also be an external storage device of the terminal device 3, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the terminal device 3. Further, the memory 31 may include both an internal storage unit and an external storage device of the terminal device 3. The memory 31 is used for storing the computer program as well as other programs and data required by the terminal device. The memory 31 may also be used for temporarily storing data that has been output or is to be output.
In the foregoing embodiments, each embodiment is described with its own emphasis. For parts that are not described or illustrated in detail in a particular embodiment, refer to the related descriptions of the other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative: the division into modules or units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, all or part of the flow of the method embodiments described above may also be implemented by a computer program instructing related hardware. The computer program may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content covered by the computer readable medium may be expanded or restricted as required by legislation and patent practice in a given jurisdiction; for example, in certain jurisdictions, the computer readable medium does not include electrical carrier signals and telecommunications signals.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims (8)

1. A method for merging user information, comprising:
acquiring at least two groups of user information, and respectively extracting information type keywords in each group of user information;
respectively extracting the residual information in each group of user information according to the information type keywords, wherein the residual information is the information except the information type keywords in the user information;
comparing the acquired at least two groups of user information according to the information type keywords and the residual information;
if the at least two groups of user information have the identical user information, one group of user information is stored, and the user information identical to the group of user information is deleted;
wherein the step of respectively extracting the residual information in each group of user information according to the information type keywords comprises:
dividing the user information according to the information type keywords to obtain at least two pieces of sub information, wherein each piece of sub information only comprises one information type keyword, and the information type keyword comprises: identity keywords, gender keywords, contact information keywords, aliases of the identity keywords, aliases of the gender keywords and aliases of the contact information keywords;
and removing the information type keyword from each piece of sub-information to obtain the residual information of the sub-information.
2. The method for merging user information according to claim 1, wherein comparing the acquired at least two groups of user information according to the information type keywords and the residual information comprises:
associating the information type keyword in each piece of sub-information with the residual information of the sub-information to obtain an association result;
and comparing the acquired at least two groups of user information according to the information type keywords, the residual information and the association result.
3. The method for merging user information according to claim 2, wherein comparing the acquired at least two groups of user information comprises:
detecting each group of user information respectively to obtain a detection result of each group of user information;
determining whether the at least two groups of user information are all complete user information according to the detection result of each group of user information;
and if the at least two groups of user information are all complete user information, comparing the acquired at least two groups of user information.
4. The method for merging user information according to claim 3, wherein the detecting each group of user information to obtain the detection result of each group of user information includes:
detecting whether the user information contains the information type keyword;
if the user information does not contain the information type keyword, the detection result of the current user information is that the user information is incomplete;
if the user information contains the information type keyword, detecting whether the user information contains the residual information associated with the information type keyword;
if the user information contains the residual information related to the information type keyword, the detection result of the current user information is that the user information is complete;
and if the user information does not contain the residual information related to the information type keyword, the detection result of the current user information is that the user information is incomplete.
5. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the user information merging method according to any one of claims 1 to 4.
6. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, realizes the steps of:
acquiring at least two groups of user information, and respectively extracting information type keywords in each group of user information;
respectively extracting the residual information in each group of user information according to the information type keywords, wherein the residual information is the information except the information type keywords in the user information;
comparing the acquired at least two groups of user information according to the information type keywords and the residual information;
if the at least two groups of user information have the identical user information, one group of user information is stored, and the user information identical to the group of user information is deleted;
wherein the step of respectively extracting the residual information in each group of user information according to the information type keywords comprises:
dividing the user information according to the information type keywords to obtain at least two pieces of sub information, wherein each piece of sub information only comprises one information type keyword, and the information type keyword comprises: identity keywords, gender keywords, contact information keywords, aliases of the identity keywords, aliases of the gender keywords and aliases of the contact information keywords;
and removing the information type keyword from each piece of sub-information to obtain the residual information of the sub-information.
7. The terminal device of claim 6, wherein comparing the acquired at least two groups of user information according to the information type keywords and the residual information comprises:
associating the information type keyword in each piece of sub-information with the residual information of the sub-information to obtain an association result;
and comparing the acquired at least two groups of user information according to the information type keywords, the residual information and the association result.
8. The terminal device of claim 7, wherein comparing the acquired at least two groups of user information comprises:
detecting each group of user information respectively to obtain a detection result of each group of user information;
determining whether the at least two groups of user information are all complete user information according to the detection result of each group of user information;
and if the at least two groups of user information are all complete user information, comparing the acquired at least two groups of user information.
CN201811018263.XA 2018-09-03 2018-09-03 User information merging method, computer readable storage medium and terminal device Active CN109344132B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811018263.XA CN109344132B (en) 2018-09-03 2018-09-03 User information merging method, computer readable storage medium and terminal device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811018263.XA CN109344132B (en) 2018-09-03 2018-09-03 User information merging method, computer readable storage medium and terminal device

Publications (2)

Publication Number Publication Date
CN109344132A CN109344132A (en) 2019-02-15
CN109344132B true CN109344132B (en) 2024-04-02

Family

ID=65296870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811018263.XA Active CN109344132B (en) 2018-09-03 2018-09-03 User information merging method, computer readable storage medium and terminal device

Country Status (1)

Country Link
CN (1) CN109344132B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009075849A (en) * 2007-09-20 2009-04-09 Canon Inc Information processor, information processing method, program thereof, and storage medium
CN103095900A (en) * 2011-11-07 2013-05-08 希姆通信息技术(上海)有限公司 Information integration method in cellphone and cellphone
CN103516856A (en) * 2012-06-26 2014-01-15 腾讯科技(深圳)有限公司 Method and apparatus for information combination
CN104572946A (en) * 2014-12-30 2015-04-29 小米科技有限责任公司 Method and device for processing data of yellow pages
CN107592398A (en) * 2017-08-31 2018-01-16 上海爱优威软件开发有限公司 A kind of intelligent information storage method and system
CN108170731A (en) * 2017-12-13 2018-06-15 腾讯科技(深圳)有限公司 Data processing method, device, computer storage media and server
CN108388675A (en) * 2018-03-26 2018-08-10 深圳市买买提信息科技有限公司 Circulation method and terminal device are drawn in a kind of identity

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A new news information processing model; Li Qian, Zhu Youqin, Wang Yongxian; Journal of Shandong University (Natural Science); 20050630(03); full text *

Also Published As

Publication number Publication date
CN109344132A (en) 2019-02-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant