CN113159279A - Cross-domain knowledge assistance method and system based on neural network and deep learning - Google Patents


Info

Publication number
CN113159279A
CN113159279A (application CN202110289106.8A; granted as CN113159279B)
Authority: CN (China)
Prior art keywords: data, training, module, data training, storage module
Prior art date
Legal status
Granted
Application number
CN202110289106.8A
Other languages
Chinese (zh)
Other versions
CN113159279B (en)
Inventor
邢廷炎
周长兵
杨艳霞
Current Assignee
China University of Geosciences Beijing
Original Assignee
China University of Geosciences Beijing
Priority date
Filing date
Publication date
Application filed by China University of Geosciences Beijing filed Critical China University of Geosciences Beijing
Priority to CN202110289106.8A
Publication of CN113159279A
Application granted
Publication of CN113159279B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/21 Design, administration or maintenance of databases
    • G06F 16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F 16/278 Data partitioning, e.g. horizontal or vertical partitioning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a cross-domain knowledge assistance system based on a neural network and deep learning, comprising a plurality of devices (1) distributed across different knowledge fields, a deep learning coordination module (2) and a global data storage module (9). Each device (1) comprises a data cleaning module (3), a data acquisition module (4) and a data reading module (5). Each device (1) further comprises a single-machine storage module (7), which is connected in data communication with the data cleaning module (3), the data acquisition module (4) and the data reading module (5) respectively. In the cross-domain knowledge assistance method and system, data cleaning is performed on the data records before training, so that abnormal records are removed; this preserves the accuracy of the data records and hence the accuracy of the data model.

Description

Cross-domain knowledge assistance method and system based on neural network and deep learning
Technical Field
The invention relates to the technical field of intelligent manufacturing, in particular to a cross-domain knowledge assistance method and system based on a neural network and deep learning.
Background
The 21st century has shifted from preliminary automation to a highly automated, that is, intelligent, era. Intelligence brings real help to people's lives, work, industrial production and management: smart homes and household appliances based on artificial intelligence, industrial robots and industrial monitoring robots, among others, appear ever more often in daily life as science and technology develop rapidly. The great improvements in the production and life of human society depend on technological progress. Automated operation and intelligent control, however, typically require a computer or microcomputer to handle numerous logical relationships, and therefore a large number of mathematical and logical calculations, which necessarily raises the demands on the logical computing power of the processor; the processing capability of large-scale or very-large-scale integrated circuits also directly affects production cost. In this era of intelligent production and intelligent manufacturing, devices are not only intelligent and automated but have also moved from running independently to operating cooperatively, so cross-domain and cross-device operation and cooperation are indispensable; intelligent computation and artificial intelligence across different devices or fields are necessarily involved, which further increases the demand on processor computing capacity.
On the other hand, an important line of research toward artificial intelligence is to form a mapping between input and output by simulating how humans handle affairs: the events a person faces are the input, and the processing means or techniques adopted are the output. Deep learning is a more recent approach to solving this mapping problem; its concept originates from research on artificial neural networks. A multi-layer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representations of attribute categories or features, in order to discover distributed feature representations of data. It is a young field within machine-learning research, whose motivation is to build neural networks that simulate the human brain's mechanisms for analyzing, learning and interpreting data.
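The multi-layer perceptron described above can be sketched in a few lines. This is a generic illustration only: the patent specifies neither layer sizes nor activation functions, so the `mlp_forward` name, the ReLU activations and the layer dimensions are all assumptions.

```python
import numpy as np

def mlp_forward(x, weights, biases):
    """Forward pass of a multi-layer perceptron: each hidden layer
    combines lower-level features into a more abstract representation."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(0.0, a @ W + b)   # ReLU hidden layers
    W, b = weights[-1], biases[-1]
    return a @ W + b                     # linear output layer

# Tiny example: 4 inputs -> two hidden layers of 8 units -> 2 outputs
rng = np.random.default_rng(0)
dims = [4, 8, 8, 2]
weights = [rng.normal(size=(m, n)) * 0.1 for m, n in zip(dims[:-1], dims[1:])]
biases = [np.zeros(n) for n in dims[1:]]
y = mlp_forward(rng.normal(size=(1, 4)), weights, biases)
```

Stacking more hidden layers deepens the hierarchy of feature combinations, which is the "deep" in deep learning.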
For example, patent CN110489395A discloses a method for automatically acquiring knowledge from multi-source heterogeneous data, aiming to provide a knowledge-acquisition method with greater integrity, universality and convenience that benefits knowledge transfer. The method is realized by the following technical scheme: a concept-entity-attribute-relation-label knowledge model of an entity object is defined top-down or bottom-up; data are then stored directly and acquired by identification software such as crawler software and OCR (optical character recognition) software to obtain knowledge data, completing the conversion from heterogeneous data sources to heterogeneous knowledge sources; entity-attribute-relation triple instantiations under the known knowledge mode are obtained through a structured knowledge-generation method; knowledge and knowledge models are then updated using a long short-term memory network (LSTM) model in a publisher-completer cooperation mode, yielding a workflow for expanding and supplementing new knowledge; and the knowledge model formed by knowledge modeling produces a data stream containing instantiated triples of concepts, entities, relations and attribute values.
Patent CN111461153A discloses a crowd-feature deep learning method, which includes establishing various crowd-feature classification databases and further comprises the following steps: 1) comparing the crowd-feature databases with database classification models to form twelve behavior types along six opposed dimensions, and constructing the crowd-feature classification databases as a judgment of crowd behavior tendency; 2) comparing effectively identified behavior features with the multi-crowd feature database, determining the crowd-feature types of target people, and updating the multi-crowd feature database; 3) applying a deep learning model according to the user's different application scenarios and optimizing the crowd-feature database to form a deep learning database suited to the user; 4) matching corresponding service modes according to the different behavior features of the crowd. The method provides behavior-feature data support for social activities, enabling publicity and marketing actions, job-site selection and service-mode matching tailored to individuals.
Patent CN109716382A discloses a system for managing, evaluating and monitoring a subject's compliance with task-performance requirements within an action plan, comprising: an optical sensor that captures the subject's facial expression, eye movements, gaze point and head pose during a compliance assessment and monitoring segment; a domain-knowledge database comprising concept data entities and task data entities, each concept data entity containing knowledge and skill content items and each task data entity containing presentation content material items; a subject module that assesses the subject's emotional and cognitive states using the sensory data collected by the optical sensor; and a trainer module that, after each task data entity is completed, selects the next task data entity for delivery and presentation based on the subject's knowledge of the related concept data entities, the probability of understanding the skill content items, and the probability of the subject achieving a target level of compliance.
Patent CN112308240A discloses a system for edge-side machine cooperation and optimization based on federated learning, which includes R federated learning systems (R ≥ 1), a model-parameter distribution unit and a model training and optimization unit. The i-th federated learning system comprises M_i edge-side machines with unevenly distributed operating experience, where M_i ≥ 2 and i = 1, 2, …, R. The model-parameter distribution unit distributes the initial parameters of federated learning to the M_i edge-side machines, receives intermediate model parameters, and aggregates and updates them to obtain new model parameters. The model training and optimization unit trains a local operation model based on the initial parameters and each machine's own operation data, transmits the intermediate model parameters obtained after training to the model-parameter distribution unit, and derives a system cooperative-operation model from the new model parameters. The local operation model is a model that performs work responses according to different operating environments.
Patent CN110750591A discloses an artificial intelligence knowledge management system implemented on a computer system, in which an input management module manages the input data of the neural-network algorithms used in developing an artificial intelligence model; an artificial intelligence model management module manages the models and provides selection among them; an output management module manages the output data generated by the neural-network algorithms during model development; and a calculation-result management module manages the result of each calculation, provides parameters for adjusting the artificial intelligence model, recalculates the generated output data, and builds a knowledge base, which blockchain technology turns into distributed records scattered across multiple blockchain nodes.
Patent CN110427406A discloses a method and device for mining the relationships among personnel related to organizations. The method comprises the following steps: acquiring all-dimension data-information sets of the natural persons related to an organization and of the organizations they belong to; acquiring a feature subset of each dimension of those organizations after clustering by natural-person names or other attribute information; combining the organizations to which natural persons of the same kind belong, and performing vector transformation according to the similarity feature of each combination; training a classification model for same-named persons on the similarity vectors and predicting classification results with the model; and, according to the classification results, merging identical natural persons and aggregating the associated natural persons, their organizations and the associated organization data sets to generate the organization's related-personnel relationship structure. This embodiment can accurately and intuitively uncover the interrelations of related personnel across different organizations, meeting the need to establish relations among the related personnel of isolated, dispersed organizations.
It can be seen that current artificial-intelligence and cross-device knowledge-assisted learning technologies have the following defects:
1. In the prior art, a reinforcement-learning training model usually learns, optimizes and controls using only the data it collects itself; the processing of that data rarely involves cross-domain knowledge assistance or assistance from other training models, even though achieving intelligence requires sufficient data and the associations between related data. Moreover, on the data-processing side there is no effective and fast processing method that would allow the data training model to be trained as soon as possible.
2. In the prior art, in order to synthesize and summarize various types of data, many organizations and scholars have proposed solutions to the dilemmas of data islands and data privacy, yet an effective method for safely accessing and processing data from multiple sources is still unavailable.
3. In the prior art, the size and number of the data records are not considered during training. When all data are trained directly to obtain a model, the data volume easily becomes too large, so that on the one hand the computation is heavy and difficult, and on the other hand the excessive volume easily makes the trained data model inaccurate.
4. In the prior art, abnormal data records that may exist among the data records are not preliminarily cleaned, so abnormal data easily cause anomalies in the model obtained by training.
In view of the above technical problems, it is desirable to provide a technical solution for obtaining a data model that can perform data training quickly while reducing the demands on the data-processing system, so as to process data quickly; the prior art, however, has not provided an effective solution. It is therefore desirable to provide a cross-domain knowledge assistance method and system based on a neural network and deep learning that solve these problems.
Disclosure of Invention
In view of the above technical problems, an object of the present invention is to provide a method and a system for assisting cross-domain knowledge based on neural network and deep learning, so as to solve the problems in the background art.
In order to achieve the purpose, the invention provides the following technical scheme:
A cross-domain knowledge assistance system based on a neural network and deep learning comprises a plurality of devices distributed across different knowledge fields, a deep learning coordination module and a global data storage module; each device comprises a data cleaning module, a data acquisition module and a data reading module; each device further comprises a single-machine storage module, which is connected in data communication with the data cleaning module, the data acquisition module and the data reading module respectively;
a data training fusion submodule is arranged on some of the devices; a device provided with the data training fusion submodule also has a local data storage module, which is connected in data communication with the data training fusion submodule and the data reading module respectively;
the deep learning coordination module, the data reading modules, the data training fusion submodules, the local data storage modules and the global data storage module are connected in data communication through a data communication network; the deep learning coordination module schedules all devices participating in data training and learning, their working modules, and the data progress;
during data training and learning, while a device is in operation the data acquisition module acquires the operation data and state data of the device to form data records, which are stored in the single-machine storage module of the device; the data cleaning module reads the data records stored in the single-machine storage module, analyzes each record using mathematical-statistical methods and set requirements, and deletes any record found to be obviously unreasonable;
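The statistical cleaning step above might look like the following minimal sketch. The patent does not specify which statistical test marks a record as "obviously unreasonable", so the median/MAD deviation rule, the `clean_records` name and the threshold `k` are illustrative assumptions.

```python
import statistics

def clean_records(records, k=5.0):
    """Delete a data record as 'obviously unreasonable' when any of its
    fields deviates from that field's median by more than k times the
    median absolute deviation (a simple robust statistical test)."""
    n_fields = len(records[0])
    med, mad = [], []
    for i in range(n_fields):
        col = [r[i] for r in records]
        m = statistics.median(col)
        med.append(m)
        mad.append(statistics.median(abs(v - m) for v in col) or 1e-9)
    return [r for r in records
            if all(abs(v - m) <= k * d for v, m, d in zip(r, med, mad))]

# Five records of (sensor_a, sensor_b); the last has an anomalous first field.
records = [(1.0, 10.0), (1.1, 9.8), (0.9, 10.2), (1.0, 10.1), (50.0, 10.0)]
cleaned = clean_records(records)   # the (50.0, 10.0) record is removed
```

A median-based test is used here rather than mean/standard deviation because a single extreme outlier inflates the standard deviation and can mask itself.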
the deep learning coordination module groups all the devices, dividing them into several groups according to a set rule while ensuring that each group contains at least one data training fusion submodule; it sends the grouping information to the data reading modules, the data training fusion submodules and the local data storage modules, and modifies their read permissions for the data records;
the data training fusion submodule establishes data communication with the data reading modules of its assigned group according to the read permissions distributed by the deep learning coordination module, so that it can read the data records stored in the single-machine storage modules through the data reading modules and perform data learning and training to obtain a data training submodel, which it stores in the local data storage module and the global data storage module; when the data training fusion submodule of each group performs data training, it reads a previously trained data training submodel stored in the global data storage module as the initial model, so as to converge quickly to its own data training submodel;
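Warm-starting training from a stored submodel, as described above, can be illustrated with a toy linear model trained by gradient descent. The linear model, learning rate and `train_submodel` name are assumptions for illustration; the patent only specifies that a stored submodel serves as the initial model for faster convergence.

```python
import numpy as np

def train_submodel(X, y, init_w=None, lr=0.1, steps=5):
    """Gradient-descent training of a linear submodel; when a stored
    submodel is available, its weights serve as the initial model so
    that training starts near a good solution and converges faster."""
    w = np.zeros(X.shape[1]) if init_w is None else init_w.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)   # least-squares gradient
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w

cold = train_submodel(X, y)                        # start from scratch
warm = train_submodel(X, y, init_w=true_w + 0.01)  # warm start from a stored model
```

With the same small step budget, the warm-started run ends much closer to the true parameters than the cold start, which is the convergence benefit the paragraph claims.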
the number of data training submodels stored in the global data storage module is fixed; when the number of submodels obtained exceeds this fixed value, the earliest-stored submodels are deleted and the number of deletions is recorded; when a specified number of submodels have been deleted, data training for the groups is stopped;
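The fixed-size store with a deletion counter can be sketched as follows. `SubmodelStore`, the FIFO eviction policy and the loop are an illustrative reading of the paragraph under stated assumptions, not the patent's actual implementation.

```python
from collections import deque

class SubmodelStore:
    """Global store keeping at most `capacity` submodels (first-in,
    first-out).  When the count exceeds capacity the earliest-stored
    submodel is deleted and counted; training of the groups stops
    once `stop_after` deletions have occurred."""
    def __init__(self, capacity, stop_after):
        self.capacity = capacity
        self.stop_after = stop_after
        self.models = deque()
        self.deleted = 0

    def add(self, model):
        self.models.append(model)
        if len(self.models) > self.capacity:
            self.models.popleft()    # evict the earliest-stored submodel
            self.deleted += 1

    def should_stop(self):
        return self.deleted >= self.stop_after

store = SubmodelStore(capacity=3, stop_after=2)
round_no = 0
while not store.should_stop():
    round_no += 1
    store.add(f"submodel-{round_no}")   # one training round per iteration
```

After five rounds the store holds the three newest submodels and two deletions have occurred, so the stopping criterion fires.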
finally, a total deep-data training model is obtained from all remaining data training submodels by parameter weighting, and is sent to the global data storage module for storage.
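"Parameter weighting" of the remaining submodels can be read as a weighted average of corresponding parameters, in the style of federated averaging. The patent does not give the weighting formula, so the equal-weight default and the `aggregate` name are assumptions.

```python
def aggregate(submodels, weights=None):
    """Combine submodels (each a flat parameter list) into the total
    deep-data training model by a weighted average of corresponding
    parameters; equal weights are used when none are given."""
    n = len(submodels)
    if weights is None:
        weights = [1.0] * n
    s = sum(weights)
    w = [x / s for x in weights]        # normalize the weights
    return [sum(wi * m[j] for wi, m in zip(w, submodels))
            for j in range(len(submodels[0]))]

subs = [[1.0, 2.0], [3.0, 4.0]]
total_equal = aggregate(subs)                    # plain average
total_weighted = aggregate(subs, weights=[3, 1]) # first submodel counts 3x
```

In federated-averaging practice the weights are often proportional to each group's record count, which would fit the grouping-by-volume scheme described elsewhere in this patent.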
Preferably, the deep learning coordination module randomly extracts a certain number of data records from each group and sends them to the global data storage module to verify the total deep-data training model: when the data records are verified and the deviation between the model output and the data in the records meets the model precision requirement, the total model is established; otherwise, the deep learning coordination module increases the number of times the data training fusion submodules of the groups must obtain data training submodels, and the process of obtaining them is carried out again. Finally, a total deep-data training model is obtained from all remaining data training submodels by parameter weighting and sent to the global data storage module for storage.
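The verification step might be sketched as below. The tolerance-based precision check and all names here are assumptions, since the patent does not define the precision requirement concretely.

```python
def verify_model(model_fn, samples, tolerance):
    """Validate the total model on randomly drawn data records: accept
    it only when every predicted output is within `tolerance` of the
    output recorded in the data (a simple stand-in for 'model
    precision requirement')."""
    return all(abs(model_fn(x) - y) <= tolerance for x, y in samples)

model = lambda x: 2.0 * x                 # toy total model: y = 2x
good = verify_model(model, [(1.0, 2.0), (2.0, 4.1)], tolerance=0.2)
bad = verify_model(model, [(1.0, 2.0), (2.0, 5.0)], tolerance=0.2)
```

When `verify_model` returns false, the coordination module would schedule additional submodel-training rounds, as the paragraph describes.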
Preferably, data communication and data access between the data reading module, the data training fusion sub-module, the local data storage module, the global data storage module and the deep learning coordination module are performed in an encrypted manner, so as to ensure the security of data records in transmission.
Preferably, when the data cleaning module cleans the data records, it uses the existing historical data records or the data training submodel stored in the local data storage module to perform preliminary cleaning: each record is analyzed, and records deviating beyond a certain degree are removed, making the cleaning of the data records more accurate.
Preferably, when the data in a device's single-machine storage module are cleaned and analyzed, the removed unreasonable data records are sent to the deep learning coordination module, which processes them and analyzes the cause of the abnormality so that the parameters of the data training submodel can be modified subsequently.
Preferably, when the deep learning coordination module groups all the devices, it obtains the data-record volume of each device in advance; devices with large record volumes are grouped together and devices with small record volumes are grouped together, so that during training records from high-volume devices do not inundate the data of low-volume devices, which preserves the accuracy of the total deep-data training model.
Preferably, when the deep learning coordination module groups all the devices, it obtains the data-record volume of each device in advance; groups of devices with large record volumes contain fewer devices, while groups of devices with small record volumes contain more, so that each group holds a moderate number of data records and the computational load of every data training fusion submodule remains appropriate.
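The two grouping preferences above (similar volumes grouped together; fewer devices in the high-volume groups) can both be satisfied by sorting devices by record volume and cutting the sorted list into contiguous groups of roughly equal total volume. This greedy rule and the `group_devices` name are illustrative assumptions, not the patent's actual grouping rule.

```python
def group_devices(volumes, n_groups):
    """Sort devices by record volume (descending) and cut the sorted
    list into contiguous groups of roughly equal total volume: devices
    of similar volume land in the same group, and groups made of
    large-volume devices naturally contain fewer devices."""
    items = sorted(volumes.items(), key=lambda kv: -kv[1])
    target = sum(volumes.values()) / n_groups    # ideal volume per group
    groups, current, total = [], [], 0
    for dev, vol in items:
        current.append(dev)
        total += vol
        if total >= target and len(groups) < n_groups - 1:
            groups.append(current)               # close this group
            current, total = [], 0
    groups.append(current)
    return groups

volumes = {"A": 100, "B": 90, "C": 10, "D": 8, "E": 7, "F": 5}
groups = group_devices(volumes, 2)
```

With these sample volumes the high-volume group holds only two devices (A, B) while the low-volume group holds four, yet both carry a comparable total record count, matching both preferences at once.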
Preferably, while the data training fusion submodules perform data training on the groups to obtain data training submodels, the deep learning coordination module regroups the devices each time training has been completed a certain number of times, and the data training fusion submodules then train again to obtain new submodels, so as to improve the accuracy of the data training submodels more quickly.
In another aspect, the present application further provides a cross-domain knowledge assistance method based on a neural network and deep learning, applied to the cross-domain knowledge assistance system described above and comprising the following steps:
step S1: initialize the cross-domain knowledge assistance system based on the neural network and deep learning. The deep learning coordination module obtains the data-record volume of each device in advance and groups all the devices so that devices with large record volumes are grouped together and devices with small record volumes are grouped together, preventing high-volume records from inundating low-volume data during training; groups with large record volumes contain fewer devices and groups with small record volumes contain more, so that each group's record volume is moderate. Each group is guaranteed at least one data training fusion submodule; the grouping information is sent to the data reading modules, the data training fusion submodules and the local data storage modules, and their read permissions for the data records are modified;
step S2: the data cleaning module reads the data records stored in the single-machine storage module, analyzes each record using mathematical-statistical methods and set requirements, and deletes any record that is obviously unreasonable;
step S3: the data training fusion submodule establishes data communication with the data reading modules of its group according to the read permissions distributed by the deep learning coordination module, reads the data records stored in the single-machine storage modules through the data reading modules, and performs data learning and training to obtain a data training submodel;
step S4: the data training submodel is stored in the local data storage module and the global data storage module; when the data training fusion submodule of each group trains, it reads the data training submodels of other groups stored in the global data storage module as the initial model, so as to converge quickly to its own data training submodel;
step S5: the number of data training submodels stored in the global data storage module is fixed; when the number of submodels obtained exceeds this value, the earliest-stored submodels are deleted and the deletions are counted; when a specified number of submodels have been deleted, data training for the groups is stopped;
step S6: a total deep-data training model is obtained from all remaining data training submodels by parameter weighting and sent to the global data storage module for storage.
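Steps S2 through S6 can be strung together in a toy end-to-end run with scalar "submodels". Everything here is an illustrative assumption: the trivial cleaning rule stands in for S2, the per-group mean stands in for submodel training, and equal-weight averaging stands in for the parameter weighting of S6.

```python
import statistics

def run_pipeline(groups, capacity=4, stop_after=3):
    """Toy run of steps S2-S6 for one grouping: each pass over the
    groups trains one scalar 'submodel' per group, keeps at most
    `capacity` submodels in the global store (deleting and counting the
    earliest when full), stops after `stop_after` deletions, and then
    averages the surviving submodels into the total model."""
    store, deleted = [], 0
    while deleted < stop_after:
        for records in groups:
            clean = [r for r in records if abs(r) < 1e6]   # S2: trivial cleaning
            store.append(statistics.fmean(clean))          # S3/S4: train and store
            if len(store) > capacity:                      # S5: fixed-size store
                store.pop(0)
                deleted += 1
            if deleted >= stop_after:
                break
    return statistics.fmean(store)                         # S6: parameter weighting

groups = [[1.0, 1.2, 0.8], [3.0, 3.2, 2.8]]   # two groups of data records
total = run_pipeline(groups)
```

With these two groups the per-group submodels are 1.0 and 3.0, and after three deletions the store holds two of each, so the total model averages to 2.0.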
Preferably, while the data training fusion submodules perform data training on the groups to obtain data training submodels, the deep learning coordination module regroups the devices each time training has been completed a certain number of times, and the data training fusion submodules then train again to obtain new submodels, so as to improve the accuracy of the data training submodels more quickly.
Compared with the prior art, the invention has the beneficial effects that:
1. The cross-domain knowledge assistance method and system based on the neural network and deep learning adopt deep learning and neural-network technology and use existing models to optimize the parameters during data training, accelerating the generation of the data training model, making the training more accurate, and reducing the overall cost of data processing.
2. In the cross-domain knowledge assistance method and system based on the neural network and deep learning, when devices are grouped for data training the data-record volume of each device is obtained in advance; devices with large record volumes are grouped together and devices with small record volumes together, so that high-volume records do not inundate low-volume data during training, preserving the accuracy of the total deep-data training model. Meanwhile, groups with large record volumes contain fewer devices and groups with small record volumes contain more, so that each group's record count is moderate and the computational load of all the data training fusion submodules is appropriate.
3. In the cross-domain knowledge assistance method and system based on the neural network and deep learning, data cleaning is performed on the data records before training, so that abnormal records are removed, ensuring the accuracy of the data records and hence the accuracy of the data model.
Drawings
FIG. 1 is a schematic view of the overall structure of the present invention;
FIG. 2 is a schematic diagram of a data flow structure of each module according to the present invention.
In the figure: 1. device; 2. deep learning coordination module; 3. data cleaning module; 4. data acquisition module; 5. data reading module; 6. data training fusion submodule; 7. single-machine storage module; 8. local data storage module; 9. global data storage module; 10. group.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The first embodiment is as follows:
a cross-domain knowledge assistance system based on neural network and deep learning comprises a plurality of devices 1 distributed in different knowledge fields, a deep learning coordination module 2 and a global data storage module 9; each device 1 comprises a data cleaning module 3, a data acquisition module 4 and a data reading module 5; the device 1 further comprises a single-computer storage module 7, and the single-computer storage module 7 is in data communication connection with the data cleaning module 3, the data acquisition module 4 and the data reading module 5 respectively;
a data training fusion sub-module 6, wherein the data training fusion sub-module 6 is arranged on part of the equipment 1; a local data storage module 8 is arranged on the device 1 provided with the data training fusion submodule 6, and the local data storage module 8 is respectively in data communication connection with the data training fusion submodule 6 and the data reading module 5;
the deep learning coordination module 2, the data reading module 5, the data training fusion submodule 6, the local data storage module 8 and the global data storage module 9 are in data communication connection through a data communication network; the deep learning coordination module 2 schedules all the devices 1 participating in data training and learning, their working modules, and the progress of the data;
during data training and learning, while the device 1 is in operation, the data acquisition module 4 collects the operation data and state data of the device 1 to form data records, which are stored in the single-machine storage module 7 of the device 1; the data cleaning module 3 reads the data records stored in the single-machine storage module 7 and analyzes each data record using mathematical statistics and the set requirements; when a data record is found to be obviously unreasonable, it is deleted;
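The embodiment does not prescribe a concrete cleaning rule beyond "mathematical statistics and set requirements". One plausible realization, sketched here in Python, is an outlier filter based on the modified z-score (median absolute deviation); the function name and the cutoff value are illustrative assumptions, not part of the disclosure.

```python
import statistics

def clean_records(values, cutoff=3.5):
    """Drop obviously unreasonable values using the modified z-score
    (median absolute deviation), which stays robust even when the
    outliers it is trying to detect inflate the spread of the data."""
    median = statistics.median(values)
    mad = statistics.median(abs(v - median) for v in values)
    if mad == 0:
        return list(values)          # degenerate case: keep everything
    return [v for v in values
            if abs(0.6745 * (v - median) / mad) <= cutoff]
```

The MAD-based score is preferred here over a plain 3-sigma rule because a single extreme record in a small sample can inflate the standard deviation enough to hide itself.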
the deep learning coordination module 2 groups all the devices 1, dividing them into a plurality of groups 10 according to a certain rule while ensuring that at least one data training fusion submodule 6 exists in each group 10; it then sends the grouping information to the data reading module 5, the data training fusion submodule 6 and the local data storage module 8, and modifies the data-record reading authority of the data reading module 5, the data training fusion submodule 6 and the local data storage module 8;
the data training fusion submodule 6 establishes a data communication connection with the data reading module 5 of its corresponding group according to the reading authority distributed by the deep learning coordination module 2, so that the data training fusion submodule 6 reads the data records stored in the single-machine storage module 7 through the data reading module 5, performs data learning training to obtain a data training submodel, and stores the data training submodel in the local data storage module 8 and the global data storage module 9; when the data training fusion submodule 6 of each group performs data training to obtain a data training submodel, it reads the data training submodels obtained by the other groups and stored in the global data storage module 9 as the initial model, so that training converges rapidly to the data training submodel;
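The warm-start behaviour described above — initializing a group's training from a submodel produced by another group — can be sketched with a toy gradient-descent trainer. The linear model, learning rate and epoch counts are illustrative assumptions; the patent does not prescribe a model family.

```python
def train_submodel(data, init_params=None, lr=0.05, epochs=200):
    """Fit y ≈ w*x + b to (x, y) pairs by gradient descent. Starting
    from another group's submodel (init_params) instead of from zeros
    lets training converge in far fewer epochs."""
    w, b = init_params if init_params is not None else (0.0, 0.0)
    n = len(data)
    for _ in range(epochs):
        gw = sum((w * x + b - y) * x for x, y in data) / n
        gb = sum((w * x + b - y) for x, y in data) / n
        w -= lr * gw
        b -= lr * gb
    return w, b
```

In this sketch a cold start needs roughly four times as many epochs as a start from a nearby submodel fetched from the global store, which is the effect the embodiment relies on.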
the number of data training submodels stored in the global data storage module 9 is a fixed value; when the number of obtained data training submodels exceeds this value, the earliest stored data training submodel is deleted and the number of deleted submodels is recorded; when a specified number of data training submodels have been deleted, data training on the groups is stopped;
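The fixed-capacity behaviour of the global data storage module 9 — evicting the earliest submodel, counting evictions, and halting training after a set number of deletions — can be sketched as follows. The class and its parameter names are hypothetical, chosen only to make the bookkeeping explicit.

```python
from collections import deque

class SubmodelStore:
    """Global store keeping at most `capacity` submodels. The earliest
    entries are evicted and counted; training is considered finished
    once `stop_after` evictions have occurred."""
    def __init__(self, capacity=5, stop_after=10):
        self.capacity = capacity
        self.stop_after = stop_after
        self.models = deque()
        self.deleted = 0

    def add(self, model):
        self.models.append(model)
        if len(self.models) > self.capacity:
            self.models.popleft()      # delete the earliest stored submodel
            self.deleted += 1

    def training_stopped(self):
        return self.deleted >= self.stop_after
```

The eviction count acts as an implicit round counter: each deletion means a full store of fresher submodels has replaced an older one.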
a total deep data training model is then obtained from all remaining data training submodels by parameter weighting, and is sent to the global data storage module 9 for storage.
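The parameter-weighting step that merges the surviving submodels into the total deep data training model resembles federated averaging. A minimal sketch, under the assumptions that each submodel is a flat list of parameters and that each weight is, for example, the group's record count:

```python
def weighted_merge(submodels, weights=None):
    """Combine group submodels into the total model by a weighted
    average of their parameters; equal weights are used when none
    are supplied."""
    if weights is None:
        weights = [1.0] * len(submodels)
    total = sum(weights)
    n_params = len(submodels[0])
    return [sum(w * m[i] for m, w in zip(submodels, weights)) / total
            for i in range(n_params)]
```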
Preferably, the deep learning coordination module 2 randomly extracts a certain amount of data records from the data records of each group and sends them to the global data storage module 9, where they are used to verify the total deep data training model. When the data records are verified with the total deep data training model and the output data and the data in the records meet the model precision requirements, the total deep data training model is established; otherwise, the deep learning coordination module 2 increases the number of times the data training fusion submodule 6 of each group acquires the data training submodel, and the process of acquiring the data training submodels is carried out again. Finally, a total deep data training model is obtained from all remaining data training submodels by parameter weighting and sent to the global data storage module 9 for storage.
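The verification step above can be sketched as an acceptance test on randomly drawn records. The record layout, function names and the tolerance are assumptions for illustration; the embodiment only requires that the model output and the recorded data "meet model precision requirements".

```python
def validate_total_model(model_fn, sample_records, tolerance=0.1):
    """Check the total model on records drawn from each group: accept
    it only if every prediction is within `tolerance` of the recorded
    value; otherwise another round of submodel training is scheduled."""
    errors = [abs(model_fn(r["x"]) - r["y"]) for r in sample_records]
    return max(errors) <= tolerance
```

A failed check would trigger the coordination module to raise the per-group training count and repeat the submodel acquisition process, as described above.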
Preferably, data communication and data access between the data reading module 5, the data training fusion sub-module 6, the local data storage module 8, the global data storage module 9 and the deep learning coordination module 2 are performed in an encrypted manner, so as to ensure the security of data records in transmission.
Preferably, when the data cleaning module 3 cleans the data records, it first uses the existing historical data records or the data training submodel stored in the local data storage module 8 to perform a preliminary cleaning, analyzing each data record and removing any record that deviates beyond a certain degree, so that the cleaning of data records is more accurate.
Preferably, when the data cleaning module 3 cleans and analyzes the data in the single-machine storage module 7 of the device 1 for anomalies, the removed unreasonable data records are sent to the deep learning coordination module 2, which processes them and analyzes the causes of the abnormal records, so that the parameters of the data training submodel can be modified subsequently.
Preferably, when the deep learning coordination module 2 groups all the devices 1, it obtains the data record volume of each device 1 in advance; during grouping, devices with large data record volumes are placed in the same groups and devices with small data record volumes in other groups, which prevents records from devices with large data volumes from drowning out the data of devices with small data volumes during training and improves the accuracy of the total deep data training model.
Preferably, when the deep learning coordination module 2 groups all the devices 1, it obtains the data record volume of each device 1 in advance; during grouping, the groups with large data record volumes contain fewer devices 1 while the groups with small data record volumes contain more devices 1, so that the number of data records in each group is moderate and the computational load of all the data training fusion submodules 6 is appropriate.
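A greedy packing heuristic is one way to realize the grouping policy of the two preferences above: high-volume devices end up in small groups, low-volume devices in larger ones, and per-group record totals stay roughly even. The function and its `target_records` parameter are hypothetical — the patent only says the devices are grouped "according to a certain rule".

```python
def group_devices(volumes, target_records):
    """Sort devices by record volume (descending) and greedily pack
    them into groups whose total record count reaches roughly
    `target_records`, so that a single high-volume device can fill a
    group by itself while many low-volume devices share one group."""
    order = sorted(volumes, key=volumes.get, reverse=True)
    groups, current, current_total = [], [], 0
    for dev in order:
        current.append(dev)
        current_total += volumes[dev]
        if current_total >= target_records:
            groups.append(current)
            current, current_total = [], 0
    if current:
        groups.append(current)      # leftover low-volume devices
    return groups
```

Because the devices are visited in descending volume order, large and small record volumes never mix within a group, which is exactly the flooding-avoidance property the description asks for.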
Preferably, when the data training submodels are obtained by performing data training on the groups through the data training fusion submodules 6, each time a certain number of training rounds is completed the deep learning coordination module 2 regroups the devices 1 and then performs data training again with the data training fusion submodules 6 to obtain new data training submodels, thereby improving the accuracy of the data training submodels.
The second embodiment is as follows:
a cross-domain knowledge assistance method based on a neural network and deep learning, applied to the cross-domain knowledge assistance system described above, comprises the following steps:
step S1, the cross-domain knowledge assistance system based on the neural network and deep learning is initialized; the deep learning coordination module 2 obtains the data record volume of each device 1 in advance, and when all the devices 1 are grouped, devices with large data record volumes are placed in the same groups and devices with small data record volumes in other groups, to prevent records from devices with large data volumes from drowning out the data of devices with small data volumes during training; the groups with large data record volumes contain fewer devices 1 and the groups with small data record volumes contain more devices 1, so that the number of data records in each group is moderate; it is ensured that at least one data training fusion submodule 6 exists in each group; the grouping information is sent to the data reading module 5, the data training fusion submodule 6 and the local data storage module 8, and the data-record reading authority of the data reading module 5, the data training fusion submodule 6 and the local data storage module 8 is modified;
step S2, the data cleaning module 3 reads the data records stored in the single-machine storage module 7, analyzes each data record using mathematical statistics and the set requirements, and deletes any data record that is found to be obviously unreasonable;
step S3, the data training fusion submodule 6 establishes a data communication connection with the data reading module 5 of its corresponding group according to the reading authority distributed by the deep learning coordination module 2, so that the data training fusion submodule 6 reads the data records stored in the single-machine storage module 7 through the data reading module 5 and performs data learning training to obtain a data training submodel;
step S4, the data training submodel is stored in the local data storage module 8 and the global data storage module 9, wherein, when the data training fusion submodule 6 of each group performs data training to obtain a data training submodel, the data training submodels obtained by the other groups and stored in the global data storage module 9 are read as the initial model, so that training converges rapidly to the data training submodel;
step S5, the number of data training submodels stored in the global data storage module 9 is a fixed value; when the number of obtained data training submodels exceeds this value, the earliest stored data training submodel is deleted and the number of deleted submodels is recorded; when a specified number of data training submodels have been deleted, data training on the groups is stopped;
step S6, a total deep data training model is obtained from all remaining data training submodels by parameter weighting and sent to the global data storage module 9 for storage.
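Steps S3 to S6 can be tied together in a compact end-to-end sketch. The one-parameter "submodel" (a group mean) and the store sizes are toy assumptions, chosen only to make the control flow — bounded storage, eviction counting, the stop condition and the weighted merge — concrete.

```python
def run_training_rounds(groups_data, capacity=3, stop_after=4):
    """Minimal sketch of steps S3–S6: each group trains a toy
    one-parameter submodel (the mean of its records), submodels pass
    through a bounded global store that evicts the earliest entry,
    and the survivors are merged by record-count weighting."""
    store, deleted = [], 0
    while deleted < stop_after:
        for records in groups_data:
            submodel = sum(records) / len(records)   # toy "training"
            store.append((submodel, len(records)))
            if len(store) > capacity:
                store.pop(0)                         # evict earliest
                deleted += 1
            if deleted >= stop_after:
                break
    total_weight = sum(n for _, n in store)
    return sum(m * n for m, n in store) / total_weight  # total model
```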
Preferably, when the data training submodels are obtained by performing data training on the groups through the data training fusion submodules 6, each time a certain number of training rounds is completed the deep learning coordination module 2 regroups the devices 1 and then performs data training again with the data training fusion submodules 6 to obtain new data training submodels, thereby improving the accuracy of the data training submodels.
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
Although embodiments of the present invention have been shown and described, it will be appreciated by those skilled in the art that changes, modifications, substitutions and alterations can be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims and their equivalents.

Claims (10)

1. A cross-domain knowledge assistance system based on neural network and deep learning comprises a plurality of devices (1) distributed in different knowledge fields, a deep learning coordination module (2) and a global data storage module (9); each device (1) comprises a data cleaning module (3), a data acquisition module (4) and a data reading module (5); the device (1) further comprises a single-machine storage module (7), and the single-machine storage module (7) is in data communication connection with the data cleaning module (3), the data acquisition module (4) and the data reading module (5) respectively;
a data training fusion submodule (6), wherein the data training fusion submodule (6) is arranged on part of the equipment (1); a local data storage module (8) is arranged on the device (1) provided with the data training fusion submodule (6), and the local data storage module (8) is respectively in data communication connection with the data training fusion submodule (6) and the data reading module (5);
the deep learning coordination module (2), the data reading module (5), the data training fusion submodule (6), the local data storage module (8) and the global data storage module (9) are in data communication connection through a data communication network; the deep learning coordination module (2) schedules all the devices (1) participating in data training and learning, their working modules, and the progress of the data;
the method is characterized in that:
during data training and learning, while the device (1) is in operation, the data acquisition module (4) collects operation data and state data of the device (1) to form data records, which are stored in the single-machine storage module (7) of the device (1); the data cleaning module (3) reads the data records stored in the single-machine storage module (7) and analyzes each data record using mathematical statistics and set requirements; when a data record is found to be obviously unreasonable, the data record is deleted;
the deep learning coordination module (2) groups all the devices (1), dividing them into a plurality of groups (10) according to a certain rule while ensuring that at least one data training fusion submodule (6) exists in each group (10), sends the grouping information to the data reading module (5), the data training fusion submodule (6) and the local data storage module (8), and modifies the data-record reading authority of the data reading module (5), the data training fusion submodule (6) and the local data storage module (8);
the data training fusion submodule (6) establishes a data communication connection with the data reading module (5) of its corresponding group according to the reading authority distributed by the deep learning coordination module (2), so that the data training fusion submodule (6) reads the data records stored in the single-machine storage module (7) through the data reading module (5), performs data learning training to obtain a data training submodel, and stores the data training submodel in the local data storage module (8) and the global data storage module (9); when the data training fusion submodule (6) of each group performs data training to obtain a data training submodel, it reads the data training submodels obtained by the other groups and stored in the global data storage module (9) as an initial model, so that training converges rapidly to the data training submodel;
the number of data training submodels stored in the global data storage module (9) is a fixed value; when the number of obtained data training submodels exceeds this value, the earliest stored data training submodel is deleted and the number of deleted submodels is recorded; when a specified number of data training submodels have been deleted, data training on the groups is stopped;
and a total deep data training model is obtained from all remaining data training submodels by parameter weighting and sent to the global data storage module (9) for storage.
2. The system of claim 1, wherein: the deep learning coordination module (2) randomly extracts a certain amount of data records from the data records of each group and sends them to the global data storage module (9), where they are used to verify the total deep data training model; when the data records are verified with the total deep data training model and the output data and the data in the records meet the model precision requirements, the total deep data training model is established; otherwise, the deep learning coordination module (2) increases the number of times the data training fusion submodule (6) of each group acquires the data training submodel, and the process of acquiring the data training submodels is carried out again; finally, a total deep data training model is obtained from all remaining data training submodels by parameter weighting and sent to the global data storage module (9) for storage.
3. The system of claim 1, wherein the system comprises: data communication and data access between the data reading module (5), the data training fusion sub-module (6), the local data storage module (8), the global data storage module (9) and the deep learning coordination module (2) are performed in an encrypted mode, so that the safety of data records in transmission is guaranteed.
4. The system of claim 1, wherein: when the data cleaning module (3) cleans the data records, it first uses the existing historical data records or the data training submodel stored in the local data storage module (8) to perform a preliminary cleaning, analyzing each data record and removing any record that deviates beyond a certain degree, so that the cleaning of data records is more accurate.
5. The system of claim 4, wherein: when the data cleaning module (3) cleans and analyzes the data in the single-machine storage module (7) of the device (1) for anomalies, the removed unreasonable data records are sent to the deep learning coordination module (2), which processes them and analyzes the causes of the abnormal records, so that the parameters of the data training submodel can be modified subsequently.
6. The system of claim 1, wherein: when the deep learning coordination module (2) groups all the devices (1), the data record volume of each device (1) is obtained in advance; during grouping, devices with large data record volumes are placed in the same groups and devices with small data record volumes in other groups, which prevents records from devices with large data volumes from drowning out the data of devices with small data volumes during training and improves the accuracy of the total deep data training model.
7. The system of claim 6, wherein: when the deep learning coordination module (2) groups all the devices (1), the data record volume of each device (1) is obtained in advance; during grouping, the groups with large data record volumes contain fewer devices (1) while the groups with small data record volumes contain more devices (1), so that the number of data records in each group is moderate and the computational load of all the data training fusion submodules (6) is appropriate.
8. The system of claim 1, wherein: when the data training submodels are obtained by performing data training on the groups through the data training fusion submodules (6), each time a certain number of training rounds is completed the deep learning coordination module (2) regroups the devices (1) and then performs data training again with the data training fusion submodules (6) to obtain new data training submodels, thereby improving the accuracy of the data training submodels.
9. A cross-domain knowledge assistance method based on neural network and deep learning, comprising the cross-domain knowledge assistance system based on neural network and deep learning of any one of claims 1 to 8, characterized by comprising the following steps:
step S1, initializing the cross-domain knowledge assistance system based on the neural network and deep learning, wherein the deep learning coordination module (2) obtains the data record volume of each device (1) in advance; when all the devices (1) are grouped, devices with large data record volumes are placed in the same groups and devices with small data record volumes in other groups, to prevent records from devices with large data volumes from drowning out the data of devices with small data volumes during training; the groups with large data record volumes contain fewer devices (1) and the groups with small data record volumes contain more devices (1), so that the number of data records in each group is moderate; it is ensured that at least one data training fusion submodule (6) exists in each group; the grouping information is sent to the data reading module (5), the data training fusion submodule (6) and the local data storage module (8), and the data-record reading authority of the data reading module (5), the data training fusion submodule (6) and the local data storage module (8) is modified;
step S2, the data cleaning module (3) reads the data records stored in the single-machine storage module (7), analyzes each data record using mathematical statistics and set requirements, and deletes any data record that is found to be obviously unreasonable;
step S3, the data training fusion submodule (6) establishes a data communication connection with the data reading module (5) of its corresponding group according to the reading authority distributed by the deep learning coordination module (2), so that the data training fusion submodule (6) reads the data records stored in the single-machine storage module (7) through the data reading module (5) and performs data learning training to obtain a data training submodel;
step S4, storing the data training submodel in the local data storage module (8) and the global data storage module (9), wherein, when the data training fusion submodule (6) of each group performs data training to obtain a data training submodel, the data training submodels obtained by the other groups and stored in the global data storage module (9) are read as an initial model, so that training converges rapidly to the data training submodel;
step S5, the number of data training submodels stored in the global data storage module (9) is a fixed value; when the number of obtained data training submodels exceeds this value, the earliest stored data training submodel is deleted and the number of deleted submodels is recorded; when a specified number of data training submodels have been deleted, data training on the groups is stopped;
and step S6, obtaining a total deep data training model from all remaining data training submodels by parameter weighting, and sending the total deep data training model to the global data storage module (9) for storage.
10. The method of claim 9, wherein: when the data training submodels are obtained by performing data training on the groups through the data training fusion submodules (6), each time a certain number of training rounds is completed the deep learning coordination module (2) regroups the devices (1) and then performs data training again with the data training fusion submodules (6) to obtain new data training submodels, thereby improving the accuracy of the data training submodels.
CN202110289106.8A 2021-03-18 2021-03-18 Cross-domain knowledge assistance method and system based on neural network and deep learning Active CN113159279B (en)

Publications (2)

Publication Number Publication Date
CN113159279A true CN113159279A (en) 2021-07-23
CN113159279B CN113159279B (en) 2023-06-23

Family

ID=76887721


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113837108A (en) * 2021-09-26 2021-12-24 重庆中科云从科技有限公司 Face recognition method and device and computer readable storage medium
CN115479447A (en) * 2022-08-26 2022-12-16 丽智电子(昆山)有限公司 Cloud monitoring and alarming system for temperature of raw material storage freezer

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111212110A (en) * 2019-12-13 2020-05-29 清华大学深圳国际研究生院 Block chain-based federal learning system and method
CN111709534A (en) * 2020-06-19 2020-09-25 深圳前海微众银行股份有限公司 Federal learning method, device, equipment and medium based on evolution calculation
CN111723948A (en) * 2020-06-19 2020-09-29 深圳前海微众银行股份有限公司 Federal learning method, device, equipment and medium based on evolution calculation
CN112116103A (en) * 2020-09-17 2020-12-22 北京大学 Method, device and system for evaluating personal qualification based on federal learning and storage medium
CN112308240A (en) * 2020-11-02 2021-02-02 清华大学 Edge side machine cooperation and optimization system based on federal learning
CN112347754A (en) * 2019-08-09 2021-02-09 国际商业机器公司 Building a Joint learning framework
CN112487456A (en) * 2020-12-07 2021-03-12 北京明略昭辉科技有限公司 Federal learning model training method and system, electronic equipment and readable storage medium


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CHEN ZHANG等: "A survey on federated learning", 《KNOWLEDGE-BASED SYSTEMS》 *
成艺: "联合学习环境下保护隐私的数据聚合技术研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant