CN112015854B - Heterogeneous data attribute association method based on self-organizing mapping neural network - Google Patents

Heterogeneous data attribute association method based on self-organizing mapping neural network

Info

Publication number
CN112015854B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010690533.2A
Other languages
Chinese (zh)
Other versions
CN112015854A (en)
Inventor
钱玉洁
张紫薇
张�杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Changzhou Campus of Hohai University
Original Assignee
Changzhou Campus of Hohai University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-07-17
Filing date
2020-07-17
Publication date
2023-07-18
Application filed by Changzhou Campus of Hohai University
Priority to CN202010690533.2A
Publication of CN112015854A
Application granted
Publication of CN112015854B
Status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31: Indexing; Data structures therefor; Storage structures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a heterogeneous data attribute association method based on a self-organizing map (SOM) neural network, which associates attributes across multiple heterogeneous databases by matching entities. First, the input attribute to be matched is taken as the output neuron, all attributes of the first database are taken as input neurons, and a winning neuron is selected according to the matching degree between neurons. Next, all neurons within the range of the winning neuron's neighborhood function are taken as output neurons, all attributes of the next database are taken as input neurons, and a self-organizing iteration is performed. In each such iteration, the neighborhood function of the winning neuron is calculated, a matching-degree reward value is assigned to each output neuron according to its neighborhood function value, and a new winning neuron is selected according to the matching value of each neuron; this process is repeated over all databases. Finally, the winning neurons of all iterations are extracted and the relevant attributes are associated.

Description

Heterogeneous data attribute association method based on self-organizing mapping neural network
Technical Field
The invention relates to a method for associating related attributes in a database, in particular to a heterogeneous data attribute association method based on a self-organizing map (SOM) neural network.
Background
Data attribute association is commonly used in applications that fuse heterogeneous data, such as semantic association in search engines and knowledge fusion in knowledge graphs. For massive heterogeneous data, artificial neural networks are an effective way to model attribute association.
A self-organizing map (SOM) neural network is an unsupervised, competitive artificial neural network. It uses a competitive learning strategy: the network is optimized step by step through competition among neurons, and a neighborhood function preserves the topological structure of the input space. Because heterogeneous data exhibit entity ambiguity, entity coreference and similar phenomena, a SOM based on competitive learning can build an association topology through competition and cooperation among synonymous and near-synonymous entities, which makes it suitable for attribute association over massive heterogeneous data.
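For readers unfamiliar with SOMs, the competition and neighborhood-based cooperation described above can be sketched as a single update step of a conventional one-dimensional SOM. This is a generic textbook-style illustration, not the method of the invention; the function name `som_step`, the Gaussian neighborhood, and the parameters `lr` and `sigma` are assumptions introduced here.

```python
import math


def som_step(weights, x, lr, sigma):
    """One competitive-learning update of a 1-D SOM (illustrative only).

    weights: list of weight vectors, one per unit (updated in place)
    x:       input sample, same length as each weight vector
    lr:      learning rate (assumed parameter)
    sigma:   neighborhood width in grid units (assumed parameter)
    """
    # Competition: the winning unit is the one closest to the input.
    dists = [math.dist(w, x) for w in weights]
    winner = dists.index(min(dists))
    for idx, w in enumerate(weights):
        # Cooperation: Gaussian neighborhood on the grid distance to the
        # winner; it equals 1 at the winner and decays with distance.
        h = math.exp(-((idx - winner) ** 2) / (2.0 * sigma ** 2))
        # Adaptation: each unit moves toward x in proportion to h.
        for d in range(len(w)):
            w[d] += lr * h * (x[d] - w[d])
    return winner


weights = [[0.0, 0.0], [0.2, 0.8], [1.0, 0.0], [0.5, 0.5]]
winner = som_step(weights, [0.9, 0.1], lr=0.5, sigma=1.0)
print(winner, weights[winner])
```

Units near the winner on the grid are pulled toward the input more strongly than distant ones, which is how the neighborhood function keeps similar inputs mapped to nearby units.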
Disclosure of Invention
The invention provides a heterogeneous data attribute association method based on a self-organizing map (SOM) neural network, which associates attributes across multiple heterogeneous databases by matching entities.
The technical scheme adopted by the invention is as follows:
A heterogeneous data attribute association method based on a self-organizing map (SOM) neural network. When attribute association is performed across multiple heterogeneous databases by matching entities,
the method comprises the following specific steps:
Step 1: Initialization.
Perform the first iteration: take the input attribute as the output neuron, take all attributes of the first database as input neurons, and select a winning neuron according to the matching degree between neurons.
Step 2: Calculate reward values.
Take all neurons within the range of the winning neuron's neighborhood function as output neurons, take all attributes of the next database as input neurons, and perform a self-organizing iteration.
Calculate the neighborhood function of the winning neuron, and assign its value as the reward value to every neuron within the neighborhood radius. The winning neuron receives the highest reward value, and the more similar another neuron is to the winning neuron, the higher its reward value.
Step 3: Compare the matching degree between the output neurons and the input neurons, and select as the new winning neuron the input neuron that has the highest matching degree with any output neuron. This requires (number of output neurons) × (number of input neurons) matching calculations, and each matching degree calculation incorporates the reward value assigned to the corresponding neuron.
Step 4: Iterate. Repeat steps 2 and 3 until all databases have been traversed.
Step 5: Finally, extract the winning neurons of all iterations and associate the relevant attributes.
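Steps 1 to 5 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: attributes are plain strings, the matching degree is approximated by `difflib.SequenceMatcher` string similarity, the neighborhood is taken to be all attributes whose similarity to the winner is at least `1 - radius`, and the reward value is simply that similarity. The names `match`, `associate`, and `radius` are hypothetical.

```python
from difflib import SequenceMatcher


def match(a, b):
    """Hypothetical matching degree in [0, 1]: case-insensitive string similarity."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def associate(query, databases, radius=0.5):
    """Return one winning attribute per database, in traversal order."""
    # Output neuron -> reward value. Step 1 starts from the query attribute.
    outputs = {query: 1.0}
    winners = []
    for attributes in databases:  # step 4: traverse all databases
        # Step 3 (step 1 on the first pass): reward-weighted best match
        # of a candidate input neuron against any output neuron.
        def score(candidate):
            return max(r * match(candidate, out) for out, r in outputs.items())

        winner = max(attributes, key=score)
        winners.append(winner)
        # Step 2: attributes close enough to the winner become the next
        # output neurons; the reward is assumed to equal the similarity,
        # so the winner itself always receives the highest reward (1.0).
        sims = {a: match(a, winner) for a in attributes}
        outputs = {a: s for a, s in sims.items() if s >= 1.0 - radius}
    return winners  # step 5: the winners are the associated attributes


databases = [
    ["customer_name", "customer_id", "order_date"],
    ["client_name", "client_no", "created_at"],
    ["name", "identifier", "timestamp"],
]
print(associate("name", databases))
```

Carrying the winner's neighbors forward, rather than the winner alone, lets a later database match an attribute that resembles a near-synonym of the winner even when it does not resemble the winner itself.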
Preferably, in step 1, the winning neuron is selected according to the matching degree of the neurons using formula (1),
where i is the input neuron index and n is the set of input neurons.
Preferably, the reward value in step 2 is calculated using formula (2),
where j is the output neuron index; δ is a constant between 0 and 1, set according to the correlation between databases; k is the iteration number; g is the highest matching value; and λ is the neighborhood radius.
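Formula (2) itself is not reproduced in this text, so the sketch below does not claim to be it. It shows one possible reward function that uses the same symbols (δ, k, g, λ) and satisfies the two properties stated in step 2: the winning neuron receives the highest reward, and the reward falls as a neuron becomes less similar to the winner. The Gaussian shape, the δ^k decay, the hard cutoff at λ, and the name `reward` are all assumptions made for illustration.

```python
import math


def reward(distance, k, g=1.0, delta=0.9, lam=0.5):
    """Hypothetical reward value f(j) for an output neuron j.

    distance: dissimilarity between neuron j and the winning neuron
              (0 for the winner itself); an assumed input
    k:        iteration number
    g:        highest matching value
    delta:    constant in (0, 1) reflecting the correlation between databases
    lam:      neighborhood radius
    """
    if distance > lam:
        return 0.0  # outside the neighborhood radius: no reward
    # Assumed form: Gaussian falloff around the winner, damped per iteration.
    return g * (delta ** k) * math.exp(-(distance ** 2) / (2.0 * lam ** 2))


for d in (0.0, 0.2, 0.4, 0.6):
    print(d, reward(d, k=1))
```

Under these assumptions a smaller `delta` makes rewards carried over from earlier databases fade faster, which would suit weakly correlated databases.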
Preferably, the matching degree calculation in step 3 incorporates the reward value assigned to each neuron, using formula (3),
where j is the output neuron index and k is the set of output neurons.
Beneficial effects: the invention provides a heterogeneous data attribute association method based on a self-organizing map (SOM) neural network, which associates attributes across multiple heterogeneous databases by matching entities. The method has the following advantages: 1) because it is based on unsupervised learning, the training phase needs no human intervention (i.e., no sample labels), and data can be clustered without knowing their categories; 2) features inherently associated with the problem can be identified; 3) it has high generalization capability and can recognize input samples that have never been encountered before.
Drawings
FIG. 1 is a schematic diagram of the first iteration;
FIG. 2 is a schematic diagram of the second iteration;
FIG. 3 is a schematic diagram of the third iteration;
FIG. 4 is a schematic diagram of the fourth iteration;
FIG. 5 is a schematic diagram of the associated attributes.
Detailed Description
In order to better understand the technical solutions in the present application, the following description will clearly and completely describe the technical solutions in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
The technical scheme of the invention is further described in detail below with reference to the accompanying drawings:
A heterogeneous data attribute association method based on a self-organizing map (SOM) neural network. When attribute association is performed across multiple heterogeneous databases by matching entities, the specific steps are as follows:
Step 1: Initialization. As shown in FIG. 1, take the input attribute as the output neuron, take all attributes of database D as input neurons, and select a winning neuron according to the matching degree between neurons.
Step 2: Calculate reward values. As shown in FIG. 2, the output neurons are all neurons within the neighborhood radius of the winning neuron, the input neurons are the attributes of database C, and the next iteration is performed. Calculate the neighborhood function of the winning neuron, and assign its value as the reward value to every neuron within the neighborhood radius. The winning neuron receives the highest reward value, and the more similar another neuron is to the winning neuron, the higher its reward value.
Step 3: Compare the matching degree between the output neurons and the input neurons, and select as the new winning neuron the input neuron that has the highest matching degree with any output neuron. This requires (number of output neurons) × (number of input neurons) matching calculations, and each matching degree calculation incorporates the reward value f(j) assigned to the corresponding neuron.
Step 4: Iterate. Repeat steps 2 and 3, as shown in FIG. 3 and FIG. 4, until all databases have been traversed.
Step 5: Finally, as shown in FIG. 5, extract the winning neurons of all iterations and associate the relevant attributes.
Preferably, the winning neuron is selected according to the matching degree of the neurons using formula (1),
where i is the input neuron index and n is the set of input neurons.
Preferably, in step 2, the neighborhood function value is assigned as the reward value to every neuron within the neighborhood radius. The reward value is calculated using formula (2),
where j is the output neuron index; δ is a constant between 0 and 1, set according to the correlation between databases; k is the iteration number; g is the highest matching value; and λ is the neighborhood radius.
Preferably, the matching degree calculation in step 3 incorporates the reward value assigned to each neuron, using formula (3),
where j is the output neuron index and k is the set of output neurons.
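The reward-weighted comparison of step 3 can be illustrated as a table of scores over every (input neuron, output neuron) pair. Formula (3) is not reproduced in this text, so the product `f(j) * match(i, j)` used below is an assumed stand-in, and the token-overlap similarity `jaccard` and the function `select_winner` are hypothetical helpers.

```python
def jaccard(a, b):
    """Hypothetical matching degree: overlap of underscore-separated tokens."""
    ta, tb = set(a.lower().split("_")), set(b.lower().split("_"))
    return len(ta & tb) / len(ta | tb)


def select_winner(inputs, rewards, match=jaccard):
    """Return (winner, matched_output, score) over all input/output pairs.

    inputs:  attribute names of the next database (input neurons)
    rewards: dict mapping each output neuron j to its reward value f(j)
    """
    # One score per pair: len(inputs) * len(rewards) matching calculations.
    scores = {
        (i, j): f_j * match(i, j)
        for i in inputs
        for j, f_j in rewards.items()
    }
    (winner, matched), best = max(scores.items(), key=lambda kv: kv[1])
    return winner, matched, best


rewards = {"customer_name": 1.0, "customer_id": 0.4}
inputs = ["client_name", "client_id", "created_at"]
print(select_winner(inputs, rewards))
```

In this example `client_name` and `client_id` each share one of three tokens with their closest output neuron, so their unweighted matching degrees tie; the higher reward carried by `customer_name` is what makes `client_name` the winner.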
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the present invention, and these are intended to fall within the scope of the present invention.

Claims (4)

1. A heterogeneous data attribute association method based on a self-organizing map neural network, characterized in that, when attribute association is performed across a plurality of heterogeneous databases by matching entities, the method comprises the following steps:
Step 1: Initialization.
Perform the first iteration: take the input attribute as the output neuron, take all attributes of the first database as input neurons, and select a winning neuron according to the matching degree between neurons;
Step 2: Calculate reward values.
Take all neurons within the range of the winning neuron's neighborhood function as output neurons, take all attributes of the next database as input neurons, and perform a self-organizing iteration;
calculate the neighborhood function, and assign its value as the reward value to every neuron within the neighborhood radius; the winning neuron receives the highest reward value, and the more similar another neuron is to the winning neuron, the higher its reward value;
Step 3: Compare the matching degree between the output neurons and the input neurons, select as the winning neuron the input neuron that has the highest matching degree with any output neuron, and perform (number of output neurons) × (number of input neurons) matching calculations; each matching degree calculation incorporates the reward value assigned to the corresponding neuron in step 2;
Step 4: Iterate, repeating step 2 and step 3 until all databases have been traversed;
Step 5: Finally, extract the winning neurons of all iterations and associate the relevant attributes.
2. The heterogeneous data attribute association method based on the self-organizing map neural network according to claim 1, wherein in step 1 the winning neuron is selected according to the matching degree of the neurons using formula (1),
where i is the input neuron index and n is the set of input neurons.
3. The heterogeneous data attribute association method based on the self-organizing map neural network according to claim 1, wherein the reward value in step 2 is calculated using formula (2),
where j is the output neuron index; δ is a constant between 0 and 1, set according to the correlation between databases; k is the iteration number; g is the highest matching value; and λ is the neighborhood radius.
4. The heterogeneous data attribute association method based on the self-organizing map neural network according to claim 3, wherein the matching degree calculation in step 3 incorporates the reward value assigned to each neuron, using formula (3),
where i is the input neuron index; j is the output neuron index; n is the set of input neurons; and k is the set of output neurons.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202010690533.2A (CN112015854B, en) | 2020-07-17 | 2020-07-17 | Heterogeneous data attribute association method based on self-organizing mapping neural network

Publications (2)

Publication Number | Publication Date
CN112015854A (en) | 2020-12-01
CN112015854B (en) | 2023-07-18

Family

ID=73499544

Family Applications (1)

Application Number | Title | Priority Date | Filing Date | Status
CN202010690533.2A (CN112015854B, en) | Heterogeneous data attribute association method based on self-organizing mapping neural network | 2020-07-17 | 2020-07-17 | Active

Country Status (1)

Country Link
CN (1) CN112015854B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN112816884B * | 2021-03-01 | 2022-08-02 | 中国人民解放军国防科技大学 | Method, device and equipment for monitoring health state of satellite lithium ion battery

Citations (3)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN105550375A * | 2016-02-01 | 2016-05-04 | 北京天广汇通科技有限公司 | Heterogeneous data integrating method and system
CN107545046A * | 2017-08-17 | 2018-01-05 | 北京奇安信科技有限公司 | A kind of fusion method and device of multi-source heterogeneous data
CN110837891A * | 2019-10-23 | 2020-02-25 | 南京大学 | Self-organizing mapping method and system based on SIMD architecture

Non-Patent Citations (1)

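* Cited by examiner, † Cited by third party
Title
Research on semantic integration of heterogeneous databases based on neural networks; 才苗; China Doctoral and Master's Theses Full-text Database (Master's), Engineering Science and Technology II; 2009-11-15; pp. 8-48 *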

Also Published As

Publication number Publication date
CN112015854A (en) 2020-12-01

Legal Events

Code | Title
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant