CN114037024A - Multitask neural network based data identification system and method - Google Patents

Multitask neural network based data identification system and method

Info

Publication number
CN114037024A
Authority
CN
China
Legal status
Pending
Application number
CN202210012305.9A
Other languages
Chinese (zh)
Inventor
杨德顺
罗晓忠
孙海航
肖罗
徐建宇
Current Assignee
Xinjian Intelligent Control Shenzhen Technology Co ltd
Original Assignee
Xinjian Intelligent Control Shenzhen Technology Co ltd
Application filed by Xinjian Intelligent Control Shenzhen Technology Co ltd filed Critical Xinjian Intelligent Control Shenzhen Technology Co ltd

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention provides a data identification system and method based on a multitask neural network. The system comprises: a data identification model, constructed on the basis of a neural network, comprising a first main-branch neural network and a plurality of second main-branch neural networks that are added per task with the first main-branch neural network as the trunk; and a task flow manager for adding, according to task progress and with the first main-branch neural network as the trunk, a plurality of second main-branch neural networks having unique identification features, the identification features also being used for task-flow communication between the second main-branch neural networks and the first main-branch neural network. Corresponding information flows between the first and second main-branch neural networks: the first main-branch neural network chiefly extracts general features for every task, while each second main-branch neural network chiefly provides a different attention mechanism for a different task, so that the corresponding task function is completed better.

Description

Multitask neural network based data identification system and method
Technical Field
The invention relates to the technical field of data identification and data identification model construction, and in particular to a multitask neural network data identification system and method that can learn new tasks quickly.
Background
A traditional data identification model is trained on sample fields, searches for the content features of target fields, and generates a corresponding rule model. Searching the target fields entails an enormous workload and slows down the optimization of the data model.
An Artificial Neural Network (ANN), or simply Neural Network (NN), is a mathematical or computational model that mimics the structure and function of a biological neural network. A data identification model can be built on an artificial neural network to ease the optimization difficulties of traditional data identification models. A neural network is a computational model formed by a large number of interconnected nodes (neurons). Each node applies a particular output function, called the excitation or activation function. The connection between every two nodes carries a weighted value, called a weight, applied to the signal passing through that connection; these weights constitute the memory of the artificial neural network. The output of the network differs according to its connection topology, weight values, and activation functions. The network itself is usually an approximation of some algorithm or natural function, and may also express a logic strategy.
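The node model just described — weighted connections feeding an activation function — can be sketched in a few lines. This is an illustrative example only; the input values, weights, and the choice of `tanh` as the activation are assumptions, not taken from the patent:

```python
import numpy as np

def neuron(x, w, b):
    """One artificial neuron: a weighted sum of inputs plus a bias,
    passed through an activation (excitation) function."""
    z = np.dot(w, x) + b      # the connection weights are the network's "memory"
    return np.tanh(z)         # the activation function shapes the node's output

x = np.array([0.5, -1.0, 2.0])   # input signal
w = np.array([0.1, 0.4, -0.2])   # learned connection weights
y = neuron(x, w, b=0.05)         # a single bounded output in (-1, 1)
```

Changing the weights, the bias, or the activation function changes the node's output for the same input, which is exactly why the network's behaviour is determined by its connection topology, weight values, and activation functions.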
Adding a new task to a trained neural network causes "catastrophic forgetting": while learning the new task, the network forgets or loses some previously learned abilities, so that performance on the original tasks drops sharply.
The common current solution is multi-task learning, which pools the training data of all tasks and jointly trains a multi-task neural network on them. However, as the number of tasks grows, the samples of a new task are diluted in the pooled data, the network learns the new task inefficiently, and the data of previous tasks must be revisited, so the data demand is huge.
In summary, existing neural networks lose performance on original tasks when learning a new task; in multi-task learning the new-task samples are diluted, new tasks are learned inefficiently, data of previous tasks must be revisited, and the data demand is huge.
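The dilution problem admits a simple back-of-the-envelope illustration; the sample counts below are hypothetical and serve only to show the effect:

```python
# Hypothetical sample counts, chosen only to illustrate the dilution effect.
old_tasks = 9
samples_per_old_task = 10_000
new_task_samples = 1_000

total = old_tasks * samples_per_old_task + new_task_samples
new_share = new_task_samples / total   # fraction of pooled data from the new task
print(f"{new_share:.1%}")              # the new task supplies ~1.1% of the pool
```

With nine existing tasks in the pool, barely one sample in a hundred comes from the new task, so the joint training signal for the new task is weak even though all the old data must still be processed.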
Disclosure of Invention
In view of the above, the present invention provides a data identification system and method based on a multitask neural network.
The technical scheme adopted by the invention is as follows:
a multitask neural network based data identification system, comprising:
a data identification model, constructed on the basis of a neural network, comprising a first main-branch neural network and a plurality of second main-branch neural networks that are added per task with the first main-branch neural network as the trunk; and
a task flow manager for adding, according to task progress and with the first main-branch neural network as the trunk, a plurality of second main-branch neural networks having unique identification features;
wherein the identification features are also used for task-flow communication between the second main-branch neural networks and the first main-branch neural network.
Preferably, the first main-branch neural network is used to extract identification features for each task.
Preferably, the second main-branch neural networks provide corresponding attention mechanisms for different tasks so as to complete the corresponding task functions.
Preferably, for a new task, the task flow manager selectively reuses the general features among the identification features of the first main-branch neural network to construct the second main-branch neural network, thereby reducing the weights the second main-branch neural network needs to train.
Preferably, the task flow manager constructs the second main-branch neural network from the extracted identification features of the already trained first main-branch neural network.
Preferably, when the first main-branch neural network is trained, its training weights may be randomly initialized or preset, and the training weight of the first main-branch neural network is smaller than that of the second main-branch neural networks.
Preferably, when a new task flow enters, the task flow manager adds one or more new second main-branch neural networks with unique identification features for learning the new task, and for this new task the task flow manager selectively freezes the first main-branch neural network and the other second main-branch neural networks.
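The selective-freezing behaviour above can be sketched as follows. This is a minimal illustration, not the claimed embodiment; the parameter names (`trunk`, `branch_A`, `branch_B`), sizes, and the plain SGD step are all assumptions:

```python
import numpy as np

# Hypothetical parameter layout: one trunk (first main-branch) network and two
# task branches; only the newly added branch is left unfrozen.
params = {
    "trunk":    {"w": np.ones(3), "frozen": True},   # first main-branch network
    "branch_A": {"w": np.ones(3), "frozen": True},   # branch of an earlier task
    "branch_B": {"w": np.ones(3), "frozen": False},  # newly added branch
}

def sgd_step(params, grads, lr=0.1):
    """Apply one gradient step, skipping frozen weights so the
    original tasks' abilities are preserved intact."""
    for name, p in params.items():
        if not p["frozen"]:
            p["w"] -= lr * grads[name]

grads = {name: np.full(3, 2.0) for name in params}
sgd_step(params, grads)   # only branch_B's weights move
```

Because the frozen weights never change, the original tasks keep their exact behaviour while the new branch is being trained.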
The invention also provides a data identification method based on a multitask neural network, comprising the following steps: obtaining a current data identification model, the data identification model being constructed on the basis of a neural network and having a first main-branch neural network and a plurality of second main-branch neural networks added per task with the first main-branch neural network as the trunk;
when the first main-branch neural network is trained, its training weights may be randomly initialized or preset, and the training weight of the first main-branch neural network is smaller than that of the second main-branch neural networks;
when a new task flow enters, the task flow manager adds one or more new second main-branch neural networks with unique identification features for learning the new task; at this time, the task flow manager selectively freezes the first main-branch neural network and the other second main-branch neural networks for the new task.
Preferably, for a new task, the task flow manager selectively reuses the general features among the identification features of the first main-branch neural network to construct the second main-branch neural network, thereby reducing the weights the second main-branch neural network needs to train.
Preferably, the task flow manager constructs the second main-branch neural network from the extracted identification features of the already trained first main-branch neural network.
By keeping intact the weights used by the original tasks within the neural network structure, the invention solves the problem that performance on the original tasks degrades when the network learns a new task, so that the network retains its original-task performance while learning the new task.
By constructing a new weak-branch neural network from the highly general features of the trained first main-branch neural network, the invention reduces the weights the new second main-branch neural network needs to train, makes the network parameters lighter, and improves the learning efficiency for the new task.
By preserving the structural integrity of the original tasks' neural network and training only the weights of the newly added second main-branch neural network, the data identification model can forgo the huge data of the original tasks when training the newly added task. This solves the problems in multi-task learning of diluted new-task samples and huge data demand: the new task can be learned with only a small amount of its own data.
Drawings
The invention is illustrated and described below, by way of example only and without limiting its scope, with reference to the following drawings, in which:
FIG. 1 is a schematic block diagram of the system of the present invention;
FIG. 2 is a flow chart of the present invention;
FIG. 3 is a block diagram of the present invention during initial training of a data stream;
FIG. 4 is a block diagram of the present invention in which training of new data streams is added.
Detailed Description
In order to make the objects, technical solutions, design methods, and advantages of the present invention more apparent, the present invention will be further described in detail by specific embodiments with reference to the accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The data identification structure of the invention constructs corresponding neural networks according to the tasks. The constructed networks comprise a first main-branch neural network and second main-branch neural networks added according to the number of tasks, with corresponding information flowing between them: the first main-branch neural network chiefly extracts general features for every task, while each second main-branch neural network chiefly provides a different attention mechanism for a different task, thereby better completing the corresponding task function.
When a new task is added, the general features of the shallow, middle, and deep layers of the first main-branch neural network can be used selectively, according to the characteristics of the new task, to construct a new second main-branch neural network; alternatively, similar shallow-, middle-, and deep-layer general features can be reused to reduce the weights the new second main-branch neural network needs to train and to adapt to the new task more quickly.
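The multi-depth feature reuse just described can be sketched as follows. This is an illustrative simplification under assumed shapes; the layer count, ReLU activations, and matrix names (`Ws`, `V`) are not from the patent:

```python
import numpy as np

def trunk_features(x, Ws):
    """First main-branch network: returns the features of every depth
    (shallow, middle, deep) so branches can tap any of them."""
    feats, h = [], x
    for W in Ws:
        h = np.maximum(0.0, W @ h)   # one ReLU layer
        feats.append(h)
    return feats

def new_branch(feats, V):
    """A new second branch reuses selected trunk features (here shallow
    and deep) instead of relearning them; only V needs training."""
    reused = np.concatenate([feats[0], feats[-1]])
    return V @ reused
```

Because the branch consumes already-computed trunk features, the only weights that ever need gradient updates for the new task are those in `V`, which is what keeps the new branch light.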
The technical scheme specifically includes the following:
referring to fig. 1 and 2, the present invention provides a multitask neural network based data recognition system, including:
a data identification model, constructed on the basis of a neural network, comprising a first main-branch neural network and a plurality of second main-branch neural networks that are added per task with the first main-branch neural network as the trunk; and
a task flow manager for adding, according to task progress and with the first main-branch neural network as the trunk, a plurality of second main-branch neural networks having unique identification features.
The identification features are also used for task-flow communication between the second main-branch neural networks and the first main-branch neural network. The first main-branch neural network extracts identification features for each task; the second main-branch neural networks provide corresponding attention mechanisms for different tasks so as to complete the corresponding task functions.
In the above, for a new task the task flow manager selectively reuses the general features among the identification features of the first main-branch neural network to construct the second main-branch neural network, thereby reducing the weights the second main-branch neural network needs to train. Alternatively, the task flow manager constructs the second main-branch neural network from the general features among the extracted identification features of the already trained first main-branch neural network.
When the first main-branch neural network is trained, its training weights may be randomly initialized or preset, and the training weight of the first main-branch neural network is smaller than that of the second main-branch neural networks.
When a new task flow enters, the task flow manager adds one or more new second main-branch neural networks with unique identification features for learning the new task; at this time, the task flow manager selectively freezes the first main-branch neural network and the other second main-branch neural networks for the new task.
The invention also provides a data identification method based on a multitask neural network, comprising: obtaining a current data identification model, the data identification model being constructed on the basis of a neural network and having a first main-branch neural network and a plurality of second main-branch neural networks added per task with the first main-branch neural network as the trunk.
When the first main-branch neural network is trained, its training weights may be randomly initialized or preset, and the training weight of the first main-branch neural network is smaller than that of the second main-branch neural networks.
Referring to fig. 3, when a new task flow enters, the task flow manager adds one or more new second main-branch neural networks with unique identification features for learning the new task; at this time, the task flow manager selectively freezes the first main-branch neural network and the other second main-branch neural networks for the new task. For the new task, the task flow manager selectively reuses the general features among the identification features of the first main-branch neural network to construct the second main-branch neural network, thereby reducing the weights it needs to train; alternatively, it constructs the second main-branch neural network from the general features among the extracted identification features of the already trained first main-branch neural network.
During initial training, the weights of the first main-branch neural network may be initialized randomly or from the weights of a preset training model. The first main-branch neural network uses a small learning rate, while each second main-branch neural network uses a learning rate, chosen according to the task characteristics, that is larger than that of the first (strong trunk) network; training ends after a specified number of iterations. The first main-branch neural network thereby obtains highly general features that satisfy multiple tasks. When a new task is added, the shallow-, middle-, and deep-layer general features of the first main-branch neural network can be used selectively, according to the characteristics of the new task, to construct a new second main-branch neural network, or similar shallow-, middle-, and deep-layer general features can be reused to reduce the weights the new second main-branch neural network needs to train, so as to adapt to the new task more quickly.
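The asymmetric learning rates can be sketched with per-group update rules. The 0.001/0.01 split is an illustrative assumption chosen to echo the small-trunk/large-branch relationship, not a value from the patent:

```python
# Illustrative learning-rate split: the trunk's rate is a tenth of the branch
# rate, so general features change slowly while a branch adapts quickly.
lr = {"trunk": 0.001, "branch": 0.01}

def sgd_update(w, grad, group):
    """One gradient step using the learning rate of the given group."""
    return w - lr[group] * grad

# The same gradient moves a branch weight ten times further than a trunk weight.
moved_trunk = 1.0 - sgd_update(1.0, 1.0, "trunk")
moved_branch = 1.0 - sgd_update(1.0, 1.0, "branch")
```

Keeping the trunk's rate small stabilizes the shared general features across tasks, while the larger branch rate lets each task-specific network converge within few iterations.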
By constructing the new weak-branch neural network from the highly general features of the trained first main-branch neural network, the method reduces the weights the new second main-branch neural network needs to train, makes the network parameters lighter, and improves the learning efficiency for the new task.
In the above, the learning rate of the first main-branch neural network is below 10%, with 90% of the resources allocated to the second main-branch neural networks according to the task characteristics; training ends after a specified number of iterations. The first (strong trunk) neural network obtains highly general features that satisfy multiple tasks.
Referring to fig. 4, during the second training a second main-branch neural network must be added for learning the new task. The weights of the first main-branch neural network and of the other second main-branch neural networks can be frozen, and only the weights of the newly added weak-branch neural network are trained. At this point, only a small amount of data and few training iterations are needed to finish training the new weak-branch network, so the overall multi-task neural network quickly adapts to the new task and reaches the performance the new task requires without affecting the performance of the original tasks.
Therefore, by preserving the structural integrity of the original tasks' neural network and training only the weights of the newly added second main-branch neural network, the data identification model need not use the huge data of the original tasks when training the newly added task. This solves the problems in multi-task learning of diluted new-task samples and huge data demand: the new task can be learned with only a small amount of its own data.
Having described embodiments of the present invention, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terminology used herein is chosen in order to best explain the principles of the embodiments, the practical application, or improvements made to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A multitask neural network based data identification system, comprising:
a data identification model, constructed on the basis of a neural network, comprising a first main-branch neural network and a plurality of second main-branch neural networks that are added per task with the first main-branch neural network as the trunk; and
a task flow manager for adding, according to task progress and with the first main-branch neural network as the trunk, a plurality of second main-branch neural networks having unique identification features;
wherein the identification features are also used for task-flow communication between the second main-branch neural networks and the first main-branch neural network.
2. The multitask neural network based data identification system of claim 1, wherein the first main-branch neural network is configured to extract identification features for each task.
3. The multitask neural network based data identification system of claim 1, wherein the second main-branch neural networks provide corresponding attention mechanisms for different tasks so as to complete the corresponding task functions.
4. The multitask neural network based data identification system of claim 1, wherein, for a new task, the task flow manager selectively reuses the general features among the identification features of the first main-branch neural network to construct the second main-branch neural network, thereby reducing the weights the second main-branch neural network needs to train.
5. The multitask neural network based data identification system of claim 4, wherein the task flow manager constructs the second main-branch neural network from the extracted identification features of the already trained first main-branch neural network.
6. The multitask neural network based data identification system of claim 4 or 5, wherein, when the first main-branch neural network is trained, its training weights may be randomly initialized or preset, and the training weight of the first main-branch neural network is smaller than that of the second main-branch neural networks.
7. The multitask neural network based data identification system of claim 4 or 5, wherein, when a new task flow enters, the task flow manager adds one or more new second main-branch neural networks with unique identification features for learning the new task, the task flow manager at this time selectively freezing the first main-branch neural network and the other second main-branch neural networks for the new task.
8. A data identification method based on a multitask neural network, comprising the following steps:
obtaining a current data identification model, the data identification model being constructed on the basis of a neural network and having a first main-branch neural network and a plurality of second main-branch neural networks added per task with the first main-branch neural network as the trunk;
when the first main-branch neural network is trained, randomly initializing or presetting its training weights, the training weight of the first main-branch neural network being smaller than that of the second main-branch neural networks; and
when a new task flow enters, adding, by the task flow manager, one or more new second main-branch neural networks with unique identification features for learning the new task, the task flow manager at this time selectively freezing the first main-branch neural network and the other second main-branch neural networks for the new task.
9. The multitask neural network based data identification method of claim 8, wherein, for a new task, the task flow manager selectively reuses the general features among the identification features of the first main-branch neural network to construct the second main-branch neural network, thereby reducing the weights the second main-branch neural network needs to train.
10. The multitask neural network based data identification method of claim 8, wherein the task flow manager constructs the second main-branch neural network from the extracted identification features of the already trained first main-branch neural network.
CN202210012305.9A 2022-01-07 2022-01-07 Multitask neural network based data identification system and method Pending CN114037024A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210012305.9A CN114037024A (en) 2022-01-07 2022-01-07 Multitask neural network based data identification system and method

Publications (1)

Publication Number Publication Date
CN114037024A true CN114037024A (en) 2022-02-11

Family

ID=80141366

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210012305.9A Pending CN114037024A (en) 2022-01-07 2022-01-07 Multitask neural network based data identification system and method

Country Status (1)

Country Link
CN (1) CN114037024A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114565064A (en) * 2022-04-26 2022-05-31 心鉴智控(深圳)科技有限公司 Method, system and equipment for identifying multitask learning deep network

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106845549A (en) * 2017-01-22 2017-06-13 珠海习悦信息技术有限公司 A kind of method and device of the scene based on multi-task learning and target identification
CN109726812A (en) * 2017-10-31 2019-05-07 通用电气公司 Feature ordering neural network and method generate the method for simplifying feature set model
CN109961105A (en) * 2019-04-08 2019-07-02 上海市测绘院 A kind of Classification of High Resolution Satellite Images method based on multitask deep learning
CN111046973A (en) * 2019-12-26 2020-04-21 北京市商汤科技开发有限公司 Multitask detection method and device and storage medium
CN111328400A (en) * 2017-11-14 2020-06-23 奇跃公司 Meta-learning for multi-task learning of neural networks
CN113554156A (en) * 2021-09-22 2021-10-26 中国海洋大学 Multi-task learning model construction method based on attention mechanism and deformable convolution

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
SHIKUN LIU ET AL.: "End-to-End Multi-Task Learning with Attention", 《2019 IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination