CN112101417A - Continuous learning method and device based on condition batch normalization - Google Patents

Continuous learning method and device based on condition batch normalization

Info

Publication number
CN112101417A
Authority
CN
China
Prior art keywords
task
loss function
feature vector
classification
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010818695.XA
Other languages
Chinese (zh)
Inventor
丁贵广
项刘宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN202010818695.XA priority Critical patent/CN112101417A/en
Publication of CN112101417A publication Critical patent/CN112101417A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods


Abstract

The invention provides a continuous learning method and device based on conditional batch normalization, which comprises the following steps: acquiring an original feature vector of the current task, and converting the original feature vector into a task-adaptive feature vector through a conditional batch normalization transformation of a neural network; performing category classification on the task-adaptive feature vector through a category classifier, and establishing a category classification loss function according to the category classification result; performing task classification with a task classifier according to the conditional batch normalization transformation parameters, and establishing a task classification loss function according to the task classification result; and training the samples corresponding to the current new task according to the category classification loss function and the task classification loss function. In this way a dedicated feature space is created for each classification task, so that interference between different tasks is reduced and catastrophic forgetting of old tasks when learning a new task is alleviated.

Description

Continuous learning method and device based on condition batch normalization
Technical Field
The invention relates to the technical field of artificial intelligence and deep learning, in particular to a continuous learning method and device based on condition batch normalization.
Background
With the rapid development of deep learning and neural network technology, deep learning is now widely applied in daily life, for example in target recognition, target detection, face recognition, and so on. However, a neural network trained by a current deep learning algorithm is usually single-task: once training is completed, the tasks the network can perform are fixed, and it is difficult to change the network further during use.
For example, a neural network capable of recognizing cats and dogs is trained with a target recognition algorithm; after training, the network can only perform the cat-and-dog recognition task and can hardly recognize other animals or objects. As the demands placed on neural networks grow, the ability of a network to keep learning has also become a research hotspot.
To enable a neural network to continuously expand its structure and learn new tasks, it is usually necessary to add randomly initialized parameters to the network and train it with the data set of the new task. In this process the neural network often suffers from "catastrophic forgetting", i.e. its performance on old tasks drops dramatically after it has been trained on the new task and data set.
In recent years, many continual learning algorithms have been proposed to solve the "catastrophic forgetting" problem of neural networks. The mainstream continual learning algorithms can be broadly divided into model constraint methods, network expansion methods and data replay methods.
The model constraint method restrains the amount by which model parameters may change during training on a new task, so that neurons important for old tasks are not changed too much. Classical examples are EWC (Elastic Weight Consolidation) and LwF (Learning without Forgetting). The former computes the Fisher information matrix of each neuron as an importance weight, weights the deviation of each neuron from its previous value by this importance, and adds the weighted sum to the loss function as a constraint. The latter records the prediction scores of the original network on the new task data, uses these scores as labels, and constrains the network output with a knowledge distillation loss function during training on the new task.
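For illustration only, a minimal sketch of such an EWC-style penalty is given below (it describes the prior-art idea above, not the method claimed in this application); the dictionaries fisher and old_params and the coefficient lam are assumptions, taken to have been computed and stored after training on the previous task.

import torch

def ewc_penalty(model, old_params, fisher, lam=1.0):
    # EWC-style constraint: penalize each parameter's deviation from its
    # value after the old task, weighted by its (diagonal) Fisher information.
    loss = torch.tensor(0.0)
    for name, p in model.named_parameters():
        if name in fisher:
            loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return lam * loss

The penalty is added to the new-task loss, so that parameters with large Fisher values, i.e. those important for the old task, are changed the least.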
The network expansion method dynamically expands parts of the network so that it can handle new and old tasks at the same time. For example, the current neural network may be pruned after each task is learned, and the pruned network parameters re-initialized for learning subsequent tasks. However, this kind of method is often limited by the network capacity or the number of network parameters, and is difficult to apply in continual learning scenarios with many tasks. The data replay method reduces catastrophic forgetting by keeping a fixed-size set of representative samples from the data sets of old tasks and adding these samples to the training set for joint training when learning a new task. A representative method is iCaRL (Incremental Classifier and Representation Learning), which builds on the knowledge distillation of LwF: a fixed-size subset of old-task data is stored and "replayed" during training on the new task, and a knowledge distillation loss function constrains the predictions of the network trained on the new task from drifting too far on the old-task data, thereby achieving continual learning.
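Similarly, the distillation-based replay idea used by LwF and iCaRL can be sketched roughly as follows (again prior art, for illustration only); the temperature T and the recorded old-network logits are assumptions made for the example.

import torch.nn.functional as F

def distillation_loss(new_logits, old_logits, T=2.0):
    # LwF/iCaRL-style knowledge distillation: keep the new network's predictions
    # on replayed or new-task data close to the scores recorded from the old network.
    p_old = F.softmax(old_logits / T, dim=1)
    log_p_new = F.log_softmax(new_logits / T, dim=1)
    return -(p_old * log_p_new).sum(dim=1).mean() * (T * T)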
Disclosure of Invention
The present invention is directed to solving, at least to some extent, one of the technical problems in the related art.
Therefore, the invention aims to provide a continuous learning method and a continuous learning device based on condition batch normalization, and aims to create a special feature space for each classification task, so that interference among different tasks is reduced, and the phenomenon that an old task is forgotten catastrophically when a new task is learned is reduced.
The continuous learning method based on conditional batch normalization comprises the following steps: acquiring an original feature vector of the current task, and converting the original feature vector into a task-adaptive feature vector through a conditional batch normalization transformation of a neural network; performing category classification on the task-adaptive feature vector through a category classifier, and establishing a category classification loss function according to the category classification result; performing task classification with a task classifier according to the conditional batch normalization transformation parameters, and establishing a task classification loss function according to the task classification result; and training the samples corresponding to the current new task according to the category classification loss function and the task classification loss function.
In addition, the continuous learning method based on the condition batch normalization according to the invention can also have the following additional technical characteristics:
According to some embodiments of the invention, the method may further include: obtaining a data replay set and the initial parameters of the neural network; and constraining the neural network according to the data replay set and the initial parameters of the neural network.
According to some embodiments of the present invention, the formula for converting the original feature vector into the task-adaptive feature vector through the conditional batch normalization transform of the neural network is:

f̂ = γ_i · (f − E[f]) / √(Var[f] + ε) + β_i

where f is the original feature vector of the current task, E[f] is the mean of the original feature vector of the current task, Var[f] is the variance of the original feature vector of the current task, ε is a constant, β_i and γ_i are the conditional batch normalization transformation parameters, and f̂ is the task-adaptive feature vector.

According to some embodiments of the invention, the class classification loss function is:

L_cls = CrossEntropy(w_cls(f̂), y_i)

where L_cls is the class classification loss function, y_i is the true category label, w_cls denotes the class classifier parameters, and CrossEntropy(A, B) is a function that computes the cross-entropy loss between A and B.

According to some embodiments of the invention, the task classification loss function is:

L_task = CrossEntropy(w_task(concat[β, γ]), T_i)

where L_task is the task classification loss function, T_i is the i-th task, w_task denotes the task classifier parameters, and concat[β, γ] denotes the concatenated adaptive batch normalization parameters.
According to some embodiments of the invention, training a sample corresponding to a current new task according to a class classification loss function and a task classification loss function, including adding the class classification loss function and the task classification loss function to obtain a total loss function; and training the sample corresponding to the current new task according to the total loss function.
In order to achieve the above object, a second embodiment of the present invention provides a continuous learning apparatus based on conditional batch normalization, including a conversion module, a category classification module, a task classification module and a training module, wherein the conversion module is used for acquiring an original feature vector of the current task and converting the original feature vector into a task-adaptive feature vector through a conditional batch normalization transformation of a neural network; the category classification module is used for performing category classification on the task-adaptive feature vector through a category classifier and establishing a category classification loss function according to the category classification result; the task classification module is used for performing task classification through a task classifier according to the conditional batch normalization transformation parameters and establishing a task classification loss function according to the task classification result; and the training module is used for training the samples corresponding to the current new task according to the category classification loss function and the task classification loss function.
In addition, the continuous learning apparatus based on the condition batch normalization according to the above embodiment of the present invention may further have the following additional technical features:
further, in a possible implementation manner of the embodiment of the present application, the method may further include:
the acquisition module is used for acquiring the data replay set and initial parameters of the neural network;
and the constraint module is used for constraining the neural network according to the data replay set and the initial parameters of the neural network.
Further, in a possible implementation manner of the embodiment of the present application, the formula used by the conversion module to convert the original feature vector into the task-adaptive feature vector through the conditional batch normalization transformation of the neural network is:

f̂ = γ_i · (f − E[f]) / √(Var[f] + ε) + β_i

where f is the original feature vector of the current task, E[f] is the mean of the original feature vector of the current task, Var[f] is the variance of the original feature vector of the current task, ε is a constant, β_i and γ_i are the conditional batch normalization transformation parameters, and f̂ is the task-adaptive feature vector.
Further, in a possible implementation manner of the embodiment of the present application, the conversion module is specifically configured to add the category classification loss function and the task classification loss function to obtain a total loss function; and training the sample corresponding to the current new task according to the total loss function.
The continuous learning method based on condition batch normalization provided by the embodiment of the invention has the following beneficial effects:
the continuous learning method based on the condition batch normalization comprises the following steps: acquiring an original feature vector of a current task, and converting the original feature vector into a task self-adaptive feature vector through condition batch normalization transformation of a neural network; performing category classification on the task self-adaptive feature vectors through a category classifier, and establishing a category classification loss function according to a category classification result; acquiring a task classifier to classify the tasks according to the transformation parameters of the condition batch normalization, and establishing a task classification loss function according to a task classification result; and training the samples corresponding to the current new task according to the class classification loss function and the task classification loss function. Thus, a special feature space is created for each classification task, so that interference between different tasks is reduced, and the phenomenon that an old task is catastrophically forgotten when a new task is learned is reduced.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The foregoing and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a schematic flow chart of a continuous learning method based on condition batch normalization according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an algorithm of a continuous learning method based on condition batch normalization according to an embodiment of the present invention;
fig. 3 is a schematic structural diagram of a continuous learning apparatus based on condition batch normalization according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
The invention aims to provide a continuous learning method and a continuous learning device based on condition batch normalization, so that a special feature space is created for each classification task, interference among different tasks is reduced, and the phenomenon that an old task is forgotten catastrophically when a new task is learned is reduced.
The following describes a continuous learning method and apparatus based on conditional batch normalization proposed according to an embodiment of the present invention with reference to the accompanying drawings.
As the applications of neural networks expand, more and more application scenarios require a neural network to keep learning new functions during use. For example, when a neural network is applied to image recognition, it may be required to keep learning new tasks while in use: a network that can recognize animals may need to additionally learn to recognize fruits during use, while its accuracy on animal recognition remains essentially unchanged. As another example, when a neural network is applied to pedestrian attribute recognition, it may be required to keep learning new tasks during use, such as learning to recognize the age attribute of people in surveillance video.
A traditional neural network, once trained, can only perform its fixed tasks, which hinders the development of neural networks to some extent; a neural network capable of continuous learning is therefore urgently needed.
Fig. 1 is a schematic flow chart of a continuous learning method based on condition batch normalization according to an embodiment of the present invention. As shown in fig. 1, the continuous learning method based on conditional batch normalization includes:
step 101, obtaining an original feature vector of a current task, and converting the original feature vector into a task self-adaptive feature vector through condition batch normalization transformation of a neural network.
Specifically, a data replay set of the current task is used to train the neural network and obtain the original feature vector of the current task, and the obtained original feature vector is converted into a task-adaptive feature vector according to a preset conditional batch normalization transformation of the neural network. The transformation can be performed according to formula (1):

f̂ = γ_i · (f − E[f]) / √(Var[f] + ε) + β_i    (1)

where f is the original feature vector of the current task, E[f] is the mean of the original feature vector of the current task, Var[f] is the variance of the original feature vector of the current task, ε is a constant, β_i and γ_i are the conditional batch normalization transformation parameters, and f̂ is the task-adaptive feature vector.
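For illustration, a minimal PyTorch sketch of the transformation in formula (1) is given below; the class name, the per-task parameter layout and the use of batch statistics are assumptions made for the example, not a prescribed implementation.

import torch
import torch.nn as nn

class ConditionalBatchNorm(nn.Module):
    # Task-conditional batch normalization over feature vectors (formula (1)):
    # each task i owns its own affine parameters (gamma_i, beta_i).
    def __init__(self, num_features, num_tasks, eps=1e-5):
        super().__init__()
        self.eps = eps
        self.gamma = nn.Parameter(torch.ones(num_tasks, num_features))
        self.beta = nn.Parameter(torch.zeros(num_tasks, num_features))

    def forward(self, f, task_id):
        # normalize the raw feature vector with batch statistics
        mean = f.mean(dim=0, keepdim=True)                 # E[f]
        var = f.var(dim=0, unbiased=False, keepdim=True)   # Var[f]
        f_norm = (f - mean) / torch.sqrt(var + self.eps)
        # scale and shift with the parameters of the current task
        return self.gamma[task_id] * f_norm + self.beta[task_id]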
Step 102, performing category classification on the task-adaptive feature vector through a category classifier, and establishing a category classification loss function according to the category classification result.
Here, the task-adaptive feature vector f̂ obtained through the transformation of formula (1) is moved into a dedicated feature space, in which category classification and task classification predictions can further be performed on f̂, so that the interference of the new task with the original tasks is reduced.
Specifically, the task-adaptive feature vector f̂ is input into the class classifier w_cls for category classification, and a class classification loss function is generated according to the class classification result. The class classification loss function can be obtained according to formula (2):

L_cls = CrossEntropy(w_cls(f̂), y_i)    (2)

where L_cls is the class classification loss function, y_i is the true category label, w_cls denotes the class classifier parameters, and CrossEntropy(A, B) is a function that computes the cross-entropy loss between A and B.
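A corresponding sketch of the class-classification branch in formula (2) is shown below; the feature and class dimensions and the linear form of w_cls are assumptions made for the example.

import torch.nn as nn
import torch.nn.functional as F

num_features, num_classes = 512, 100          # assumed sizes, for illustration
w_cls = nn.Linear(num_features, num_classes)  # class classifier

def class_loss(f_hat, y):
    # L_cls = CrossEntropy(w_cls(f_hat), y), formula (2)
    return F.cross_entropy(w_cls(f_hat), y)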
Step 103, performing task classification with a task classifier according to the conditional batch normalization transformation parameters, and establishing a task classification loss function according to the task classification result.
After the transformation of formula (1), the original feature vector is scaled and translated and thereby moved into a dedicated feature space. The task classifier w_task can be understood as a classifier designed to guarantee the uniqueness of the feature space of each task-adaptive feature vector: its input is the translation factor β_i and the scaling factor γ_i, and its output is a prediction of the task ID (identity).
Specifically, the conditional batch normalization transformation parameters corresponding to the task-adaptive feature vector are input into the task classifier for task classification, and a task classification loss function is generated according to the task classification result. The task classification loss function can be obtained according to formula (3):

L_task = CrossEntropy(w_task(concat[β, γ]), T_i)    (3)

where L_task is the task classification loss function, T_i is the i-th task, w_task denotes the task classifier parameters, and concat[β, γ] denotes the concatenated adaptive batch normalization parameters.
Step 104, training the samples corresponding to the current new task according to the class classification loss function and the task classification loss function.

Specifically, as shown in FIG. 2, after the class classifier and the task classifier make their predictions from the two directions, the class classification loss function and the task classification loss function are generated, and the total loss L is obtained according to formula (4):

L = L_task + L_cls    (4)

The samples corresponding to the current new task are then trained according to the total loss L, so that the neural network obtains feature vectors suitable for the new task.
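Combining the two branches, a single training step on a batch of the new task might look like the following sketch; the toy encoder, the optimizer choice and the hyper-parameters are assumptions, and cbn, class_loss and task_loss refer to the sketches above.

import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512))  # toy backbone, illustrative only

params = (list(encoder.parameters()) + list(cbn.parameters())
          + list(w_cls.parameters()) + list(w_task.parameters()))
optimizer = torch.optim.SGD(params, lr=0.01)

def train_step(x, y, task_id):
    f = encoder(x)                  # original feature vector of the current task
    f_hat = cbn(f, task_id)         # task-adaptive feature vector, formula (1)
    loss = class_loss(f_hat, y) + task_loss(cbn, task_id)  # L = L_cls + L_task, formula (4)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()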
In the embodiment of the invention, the continuous learning method based on conditional batch normalization acquires the original feature vector of the current task and converts it into a task-adaptive feature vector through the conditional batch normalization transformation of the neural network; performs category classification on the task-adaptive feature vector through a category classifier and establishes a category classification loss function according to the category classification result; performs task classification with a task classifier according to the conditional batch normalization transformation parameters and establishes a task classification loss function according to the task classification result; and trains the samples corresponding to the current new task according to the category classification loss function and the task classification loss function. In this way a dedicated feature space is created for each classification task, so that interference between different tasks is reduced and catastrophic forgetting of old tasks when learning a new task is alleviated.
In order to implement the above embodiment, the present application further provides a continuous learning apparatus based on condition batch normalization.
Fig. 3 is a schematic structural diagram of a continuous learning apparatus based on condition batch normalization according to an embodiment of the present invention.
As shown in fig. 3, the apparatus includes: a conversion module 301, a category classification module 302, a task classification module 303 and a training module 304.
The conversion module 301 is configured to obtain an original feature vector of a current task, and convert the original feature vector into a task adaptive feature vector through conditional batch normalization transformation of a neural network;
the category classification module 302 is configured to perform category classification on the task adaptive feature vectors through a category classifier, and establish a category classification loss function according to a category classification result;
the task classification module 303 is configured to obtain a transformation parameter of the task classifier for performing task classification on the conditional batch normalization, and establish a task classification loss function according to a task classification result;
and the training module 304 is configured to train a sample corresponding to the current new task according to the class classification loss function and the task classification loss function.
Further, in a possible implementation manner of the embodiment of the present application, the apparatus may further include: an acquisition module 305 and a constraint module 306, wherein,
an obtaining module 305, configured to obtain a data replay set and initial parameters of a neural network;
and a constraint module 306, configured to constrain the neural network according to the data replay set and the initial parameters of the neural network.
Further, in a possible implementation manner of the embodiment of the present application, the conversion module 301 is specifically configured to convert the original feature vector into the task-adaptive feature vector through the conditional batch normalization transformation of the neural network according to formula (5):

f̂ = γ_i · (f − E[f]) / √(Var[f] + ε) + β_i    (5)

where f is the original feature vector of the current task, E[f] is the mean of the original feature vector of the current task, Var[f] is the variance of the original feature vector of the current task, ε is a constant, β_i and γ_i are the conditional batch normalization transformation parameters, and f̂ is the task-adaptive feature vector.

Further, in a possible implementation manner of the embodiment of the present application, the conversion module is specifically configured to:
adding the category classification loss function and the task classification loss function to obtain a total loss function;
and training the sample corresponding to the current new task according to the total loss function.
It should be noted that the foregoing explanation of the method embodiment is also applicable to the apparatus of this embodiment, and is not repeated herein.
According to the continuous learning device based on conditional batch normalization, the conversion module acquires the original feature vector of the current task and converts it into a task-adaptive feature vector through the conditional batch normalization transformation of a neural network; the category classification module performs category classification on the task-adaptive feature vector through a category classifier and establishes a category classification loss function according to the category classification result; the task classification module performs task classification through a task classifier according to the conditional batch normalization transformation parameters and establishes a task classification loss function according to the task classification result; and the training module trains the samples corresponding to the current new task according to the category classification loss function and the task classification loss function. In this way a dedicated feature space is created for each classification task, so that interference between different tasks is reduced and catastrophic forgetting of old tasks when learning a new task is alleviated.
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first", "second" and "first" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "a plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, and the program may be stored in a computer readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims (10)

1. A continuous learning method based on condition batch normalization is characterized by comprising the following steps:
acquiring an original feature vector of a current task, and converting the original feature vector into a task self-adaptive feature vector through condition batch normalization transformation of a neural network;
performing category classification on the task self-adaptive feature vector through a category classifier, and establishing a category classification loss function according to a category classification result;
performing task classification with a task classifier according to the conditional batch normalization transformation parameters, and establishing a task classification loss function according to a task classification result;
and training the samples corresponding to the current new task according to the class classification loss function and the task classification loss function.
2. The method of claim 1, further comprising:
acquiring a data replay set and initial parameters of a neural network;
and constraining the neural network according to the data replay set and the initial parameters of the neural network.
3. The method of claim 1, wherein the formula for converting the raw feature vector into a task-adaptive feature vector through a conditional batch normalization transform of a neural network is:

f̂ = γ_i · (f − E[f]) / √(Var[f] + ε) + β_i

wherein f is the original feature vector of the current task, E[f] is the mean of the original feature vector of the current task, Var[f] is the variance of the original feature vector of the current task, ε is a constant, β_i and γ_i are the conditional batch normalization transformation parameters, and f̂ is the task-adaptive feature vector.
4. The method of claim 1, wherein the class classification loss function is:

L_cls = CrossEntropy(w_cls(f̂), y_i)

wherein L_cls is the class classification loss function, y_i is the true category label, w_cls denotes the class classifier parameters, and CrossEntropy(A, B) is a function that computes the cross-entropy loss between A and B.
5. The method of claim 1, wherein the task classification loss function is:

L_task = CrossEntropy(w_task(concat[β, γ]), T_i)

wherein L_task is the task classification loss function, T_i is the i-th task, w_task denotes the task classifier parameters, and concat[β, γ] denotes the concatenated adaptive batch normalization parameters.
6. The method of claim 4 or 5, wherein training the samples corresponding to the current new task according to the class classification loss function and the task classification loss function comprises:
adding the category classification loss function and the task classification loss function to obtain a total loss function;
and training a sample corresponding to the current new task according to the total loss function.
7. A continuous learning apparatus based on conditional batch normalization, comprising:
the conversion module is used for acquiring an original feature vector of a current task and converting the original feature vector into a task self-adaptive feature vector through condition batch normalization transformation of a neural network;
the category classification module is used for carrying out category classification on the task self-adaptive feature vector through a category classifier and establishing a category classification loss function according to a category classification result;
the task classification module is used for performing task classification through a task classifier according to the conditional batch normalization transformation parameters, and establishing a task classification loss function according to a task classification result;
and the training module is used for training the samples corresponding to the current new task according to the class classification loss function and the task classification loss function.
8. The apparatus of claim 7, further comprising:
the acquisition module is used for acquiring the data replay set and initial parameters of the neural network;
and the constraint module is used for constraining the neural network according to the data replay set and the initial parameters of the neural network.
9. The apparatus according to claim 7, wherein the conversion module is specifically configured to convert the raw feature vector into a task-adaptive feature vector through a conditional batch normalization transformation of a neural network according to the formula:

f̂ = γ_i · (f − E[f]) / √(Var[f] + ε) + β_i

wherein f is the original feature vector of the current task, E[f] is the mean of the original feature vector of the current task, Var[f] is the variance of the original feature vector of the current task, ε is a constant, β_i and γ_i are the conditional batch normalization transformation parameters, and f̂ is the task-adaptive feature vector.
10. The apparatus of claim 9, wherein the conversion module is specifically configured to:
adding the category classification loss function and the task classification loss function to obtain a total loss function;
and training a sample corresponding to the current new task according to the total loss function.
CN202010818695.XA 2020-08-14 2020-08-14 Continuous learning method and device based on condition batch normalization Pending CN112101417A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010818695.XA CN112101417A (en) 2020-08-14 2020-08-14 Continuous learning method and device based on condition batch normalization

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010818695.XA CN112101417A (en) 2020-08-14 2020-08-14 Continuous learning method and device based on condition batch normalization

Publications (1)

Publication Number Publication Date
CN112101417A true CN112101417A (en) 2020-12-18

Family

ID=73753829

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010818695.XA Pending CN112101417A (en) 2020-08-14 2020-08-14 Continuous learning method and device based on condition batch normalization

Country Status (1)

Country Link
CN (1) CN112101417A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023070274A1 (en) * 2021-10-25 2023-05-04 Robert Bosch Gmbh A method and an apparatus for continual learning
CN114463605A (en) * 2022-04-13 2022-05-10 中山大学 Continuous learning image classification method and device based on deep learning
CN114463605B (en) * 2022-04-13 2022-08-12 中山大学 Continuous learning image classification method and device based on deep learning

Similar Documents

Publication Publication Date Title
Yap et al. Adaptive image processing: a computational intelligence perspective
Wang et al. Unsupervised learning of visual representations using videos
US20210182567A1 (en) Method for accelerated detection of object in videos, server, and non-transitory computer readable storage medium
Ma et al. Facial expression recognition using constructive feedforward neural networks
Samangouei et al. Explaingan: Model explanation via decision boundary crossing transformations
CN110827129A (en) Commodity recommendation method and device
Shinde et al. Extracting classification rules from modified fuzzy min–max neural network for data with mixed attributes
CN111783902A (en) Data augmentation and service processing method and device, computer equipment and storage medium
CN110598603A (en) Face recognition model acquisition method, device, equipment and medium
Aswolinskiy et al. Time series classification in reservoir-and model-space
CN112101417A (en) Continuous learning method and device based on condition batch normalization
CN110705600A (en) Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium
CN111543988B (en) Adaptive cognitive activity recognition method and device and storage medium
Moreno-Barea et al. Gan-based data augmentation for prediction improvement using gene expression data in cancer
KR20200119042A (en) Method and system for providing dance evaluation service
CN116630816B (en) SAR target recognition method, device, equipment and medium based on prototype comparison learning
CN111445545B (en) Text transfer mapping method and device, storage medium and electronic equipment
CN117033956A (en) Data processing method, system, electronic equipment and medium based on data driving
Dorobanţiu et al. A novel contextual memory algorithm for edge detection
CN111930935B (en) Image classification method, device, equipment and storage medium
Aji et al. Oil palm unstripped bunch detector using modified faster regional convolutional neural network
Liu Pattern recognition using Hilbert space
CN113496251A (en) Device for determining a classifier for identifying an object in an image, device for identifying an object in an image and corresponding method
Zeng et al. Underwater image target detection with cascade classifier and image preprocessing method
CN117912640B (en) Domain increment learning-based depressive disorder detection model training method and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20201218

RJ01 Rejection of invention patent application after publication