CN114692815A - Method for optimizing low-bit model training - Google Patents

Method for optimizing low-bit model training Download PDF

Info

Publication number
CN114692815A
Authority
CN
China
Prior art keywords
training
model
data
network
bit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011617715.3A
Other languages
Chinese (zh)
Inventor
Zhang Dong (张东)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei Ingenic Technology Co ltd
Original Assignee
Hefei Ingenic Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei Ingenic Technology Co ltd filed Critical Hefei Ingenic Technology Co ltd
Priority to CN202011617715.3A priority Critical patent/CN114692815A/en
Publication of CN114692815A publication Critical patent/CN114692815A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a method for optimizing the training of a low-bit model, aiming to overcome the shortcomings of the prior art and to solve the problems of severe precision loss and difficult convergence that existing 2-bit models suffer during training. The method comprises the following steps: S1, training a full-precision model: a full-precision model is trained on a data set; S2, training a low-bit model: a 4-bit model and then a 2-bit model are trained in turn, with different weight-decay coefficients and optimizers adopted at different bit widths.

Description

Method for optimizing low-bit model training
Technical Field
The invention relates to the technical field of image processing, in particular to a method for optimizing low-bit model training.
Background
In recent years, with the rapid development of science and technology, the era of big data has arrived. Deep learning, which uses deep neural networks (DNNs) as its models, has achieved remarkable results in many key areas of machine intelligence, such as image recognition, reinforcement learning, and semantic analysis. The convolutional neural network (CNN) is a typical DNN structure that can effectively extract hidden-layer features from an image and classify it accurately, and in recent years it has been widely applied to image recognition and detection.
In the prior art, the ReLU function is usually adopted when training a full-precision model; because full-precision numbers cover a wide range of real values, they can satisfy the numerical range required during training. When training at low bit widths, however, the limited bit width restricts the representable range, so the model cannot converge effectively during training and the precision of the final model is unsatisfactory.
Commonly used terms in the art include:
Convolutional neural network (CNN): a type of feedforward neural network that contains convolution computations and has a deep structure.
Quantization: the process of approximating the continuous values of a signal (or a large set of possible discrete values) by a finite (smaller) set of discrete values.
Low bit: quantizing data to a bit width of 8, 4, or 2 bits; at 2 bits, for example, only 2^2 = 4 distinct levels can be represented, which is why 2-bit training is particularly prone to precision loss.
Disclosure of Invention
To address these issues, the method aims to overcome the shortcomings of the prior art and to solve the problems of severe precision loss and difficult convergence that existing 2-bit models suffer during training.
The low-bit model is fine-tuned on the basis of a full-precision model: first, a full-precision model is trained on a data set until it reaches the target precision, and then a low-bit model is fine-tuned starting from that full-precision model.
Specifically, the invention provides a method for optimizing low-bit model training, which comprises the following steps:
S1, training a full-precision model: training a full-precision model based on the data set;
S2, training a low-bit model: a 4-bit model and then a 2-bit model are trained in turn, with different weight-decay coefficients and optimizers adopted at different bit widths.
The step S1 further includes:
S1.1, training data:
the data set used to train the model is ImageNet1000, a subset of the ImageNet data set with about 1.2 million training images, 50,000 validation images, and 150,000 test images, covering 1,000 classes;
S1.2, establishing the model:
the basic neural network model adopted for training is MobileNet V1, a network built on depthwise separable convolutions;
S1.3, training the network:
the basic procedure for training the network is: set the weight-decay coefficient to 0.0005, first train 60 epochs with the Adam optimizer, and then switch to the SGD optimizer until training is finished (a training-schedule sketch follows this list);
S1.4, testing the network: the trained network is evaluated on the test set.
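For illustration only, the following PyTorch-style sketch shows the S1.3 training schedule. The 60-epoch Adam phase, the 0.0005 weight-decay coefficient, and the switch to SGD come from the steps above, while the learning rates, the total epoch count, and the hypothetical model and train_loader arguments are placeholder assumptions, not part of the patented method.

    import torch

    def train_full_precision(model, train_loader, total_epochs=120):
        """Sketch of the S1.3 schedule: weight decay 0.0005, Adam for the first
        60 epochs, then SGD until training finishes (learning rates and
        total_epochs are illustrative placeholders)."""
        criterion = torch.nn.CrossEntropyLoss()
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=5e-4)
        for epoch in range(total_epochs):
            if epoch == 60:  # after 60 epochs, switch from Adam to SGD as in S1.3
                optimizer = torch.optim.SGD(model.parameters(), lr=1e-2,
                                            momentum=0.9, weight_decay=5e-4)
            for images, labels in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(images), labels)
                loss.backward()
                optimizer.step()
        return model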
The step S2 further includes:
S2.1, data quantization: quantizing the data to be quantized to obtain low-bit data;
S2.2, carrying out low-bit model training:
S2.2.1, training the 4-bit model;
S2.2.2, training the 2-bit model;
S2.2.3, testing the network;
S2.2.4, outputting the network.
Step S2.1, quantization is performed according to formula 1:
[Equation 1 appears as an image in the original publication and is not reproduced here.]
Variable description: Wf is the full-precision data (an array), Wq is the simulated quantized data, maxw is the maximum value in the full-precision data Wf, minw is the minimum value in the full-precision data Wf, and b is the quantized bit width.
In step S2.2.1, training the 4-bit model: during training, the weights and activations are quantized to 4 bits, and the weight-decay coefficient is set to 0, i.e., no weight decay is used; the model is trained with the Adam optimizer until convergence.
In step S2.2.2, training the 2-bit model: step S2.2.1 yields a model whose weights and activations are quantized to 4 bits; starting from that model, a model with weights and activations quantized to 2 bits is trained. When training the 2-bit model, the weight-decay coefficient is set to 0.00005, the Adam optimizer is used for the first 60 epochs, then the learning rate is reduced to 0.0001 and the SGD optimizer is used to train the model until convergence.
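As an illustrative sketch only, the code below strings steps S2.2.1 and S2.2.2 together. The weight-decay values, the 60-epoch Adam phase, and the 0.0001 SGD learning rate are taken from the text, while the hypothetical quantize_model quantization-aware-training wrapper, the Adam learning rate, and the epoch counts are assumptions.

    import torch

    def run_epochs(model, train_loader, optimizer, num_epochs):
        criterion = torch.nn.CrossEntropyLoss()
        for _ in range(num_epochs):
            for images, labels in train_loader:
                optimizer.zero_grad()
                loss = criterion(model(images), labels)
                loss.backward()
                optimizer.step()

    def train_low_bit(fp_model, train_loader, quantize_model, epochs=120):
        """Sketch of S2.2: fine-tune a 4-bit model from the full-precision model,
        then a 2-bit model from the 4-bit model."""
        # S2.2.1: quantize weights and activations to 4 bits, no weight decay, Adam.
        model_4bit = quantize_model(fp_model, bits=4)   # hypothetical QAT wrapper
        opt = torch.optim.Adam(model_4bit.parameters(), lr=1e-3, weight_decay=0.0)
        run_epochs(model_4bit, train_loader, opt, epochs)

        # S2.2.2: initialise the 2-bit model from the 4-bit model, weight decay 0.00005,
        # Adam for the first 60 epochs, then SGD with learning rate 0.0001.
        model_2bit = quantize_model(model_4bit, bits=2)
        opt = torch.optim.Adam(model_2bit.parameters(), lr=1e-3, weight_decay=5e-5)
        run_epochs(model_2bit, train_loader, opt, 60)
        opt = torch.optim.SGD(model_2bit.parameters(), lr=1e-4,
                              momentum=0.9, weight_decay=5e-5)
        run_epochs(model_2bit, train_loader, opt, epochs - 60)
        return model_2bit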
Thus, the advantages of the present application are: the method is simple, and it improves the precision of the quantized convolutional neural network model by training a full-precision model on a data set, then training a 4-bit model and a 2-bit model in turn, and adopting different weight-decay coefficients and optimizers at different bit widths.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention.
FIG. 1 is a schematic flow diagram of the process of the present invention.
FIG. 2 is a schematic diagram of the training process of the low bit model of the present invention.
Detailed Description
In order that the technical contents and advantages of the present invention can be more clearly understood, the present invention will now be described in further detail with reference to the accompanying drawings.
As shown in FIG. 1, the present invention relates to a method for optimizing low-bit model training, comprising the steps of:
S1, training a full-precision model: training a full-precision model based on the data set;
S2, training a low-bit model: a 4-bit model and then a 2-bit model are trained in turn, with different weight-decay coefficients and optimizers adopted at different bit widths.
Specifically, the method comprises the following steps:
1. Full-precision model training:
1) Training data:
the data set used to train the model is ImageNet1000, a subset of the ImageNet data set with about 1.2 million training images, 50,000 validation images, and 150,000 test images, covering 1,000 classes.
2) Model:
the basic neural network model adopted in training is MobileNet V1, which is built on depthwise separable convolutions (a block sketch follows this list).
3) Training the network:
the basic procedure for training the network is: set the weight-decay coefficient to 0.0005, first train 60 epochs with the Adam optimizer, and then switch to the SGD optimizer until training is finished.
4) Testing the network:
the trained network is evaluated on the test set.
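For background on item 2), a depthwise separable convolution factorizes a standard convolution into a per-channel (depthwise) convolution followed by a 1x1 (pointwise) convolution. The PyTorch sketch below shows this basic MobileNet V1-style building block; the layer sizes are placeholders, and the patent does not specify its exact implementation.

    import torch.nn as nn

    class DepthwiseSeparableConv(nn.Module):
        """Illustrative MobileNet V1-style block: depthwise conv followed by a
        pointwise 1x1 conv, each with batch normalization and ReLU."""
        def __init__(self, in_channels, out_channels, stride=1):
            super().__init__()
            self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                       stride=stride, padding=1,
                                       groups=in_channels, bias=False)  # per-channel convolution
            self.bn1 = nn.BatchNorm2d(in_channels)
            self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False)
            self.bn2 = nn.BatchNorm2d(out_channels)
            self.relu = nn.ReLU(inplace=True)

        def forward(self, x):
            x = self.relu(self.bn1(self.depthwise(x)))
            x = self.relu(self.bn2(self.pointwise(x)))
            return x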
2. Low-bit model training:
Data quantization: the data to be quantized is quantized according to the following formula (Equation 1) to obtain low-bit data.
[Equation 1 appears as an image in the original publication and is not reproduced here; see the reconstruction sketched after the variable description above.]
Variable description: Wf is the full-precision data (an array), Wq is the simulated quantized data, maxw is the maximum value in the full-precision data Wf, minw is the minimum value in the full-precision data Wf, and b is the quantized bit width.
The low-bit model training process is shown in FIG. 2 and is divided into two main steps:
1) Training the 4-bit model:
during training, the weights and activations are quantized to 4 bits, and the weight-decay coefficient is set to 0, i.e., no weight decay is used. The model is trained with the Adam optimizer until convergence.
2) Training the 2-bit model:
the first training step yields a model whose weights and activations are quantized to 4 bits; starting from that model, a model with weights and activations quantized to 2 bits is trained. When training the 2-bit model, the weight-decay coefficient is set to 0.00005, the Adam optimizer is used for the first 60 epochs, then the learning rate is reduced to 0.0001 and the SGD optimizer is used to train the model until convergence.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention, and various modifications and changes may be made to the embodiment of the present invention by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (6)

1. A method for optimizing the training of a low-bit model, the method comprising the steps of:
S1, training a full-precision model: training a full-precision model based on a data set;
S2, training a low-bit model: training a 4-bit model and then a 2-bit model in turn, and adopting different weight-decay coefficients and optimizers at different bit widths.
2. The method for optimizing low-bit model training of claim 1, wherein said step S1 further comprises:
S1.1, training data:
the data set used to train the model is ImageNet1000, a subset of the ImageNet data set with about 1.2 million training images, 50,000 validation images, and 150,000 test images, covering 1,000 classes;
S1.2, establishing the model:
the basic neural network model adopted for training is MobileNet V1, a network built on depthwise separable convolutions;
S1.3, training the network:
the basic procedure for training the network is: set the weight-decay coefficient to 0.0005, first train 60 epochs with the Adam optimizer, and then use the SGD optimizer until training is finished;
S1.4, testing the network: the trained network is evaluated on the test set.
3. The method for optimizing low-bit model training of claim 1, wherein said step S2 further comprises:
S2.1, data quantization: quantizing the data to be quantized to obtain low-bit data;
S2.2, carrying out low-bit model training:
S2.2.1, training the 4-bit model;
S2.2.2, training the 2-bit model;
S2.2.3, testing the network;
S2.2.4, outputting the network.
4. The method for optimizing low-bit model training as claimed in claim 3, wherein in said step S2.1, quantization is performed according to Equation 1:
[Equation 1 appears as an image in the original publication and is not reproduced here.]
Variable description: Wf is the full-precision data (an array), Wq is the simulated quantized data, maxw is the maximum value in the full-precision data Wf, minw is the minimum value in the full-precision data Wf, and b is the quantized bit width.
5. The method for optimizing low-bit model training as claimed in claim 3, wherein in said step S2.2.1, training the 4-bit model: during training, the weights and activations are quantized to 4 bits, and the weight-decay coefficient is set to 0, i.e., no weight decay is used; the model is trained with the Adam optimizer until convergence.
6. The method for optimizing low-bit model training as claimed in claim 5, wherein in said step S2.2.2, training the 2-bit model: step S2.2.1 yields a model whose weights and activations are quantized to 4 bits; starting from that model, a model with weights and activations quantized to 2 bits is trained; when training the 2-bit model, the weight-decay coefficient is set to 0.00005, the Adam optimizer is used for the first 60 epochs, then the learning rate is reduced to 0.0001 and the SGD optimizer is used to train the model until convergence.
CN202011617715.3A 2020-12-31 2020-12-31 Method for optimizing low-bit model training Pending CN114692815A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011617715.3A CN114692815A (en) 2020-12-31 2020-12-31 Method for optimizing low-bit model training

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011617715.3A CN114692815A (en) 2020-12-31 2020-12-31 Method for optimizing low-bit model training

Publications (1)

Publication Number Publication Date
CN114692815A true CN114692815A (en) 2022-07-01

Family

ID=82134849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011617715.3A Pending CN114692815A (en) 2020-12-31 2020-12-31 Method for optimizing low-bit model training

Country Status (1)

Country Link
CN (1) CN114692815A (en)

Similar Documents

Publication Publication Date Title
CN112116030B (en) Image classification method based on vector standardization and knowledge distillation
CN108900346B (en) Wireless network flow prediction method based on LSTM network
CN111564160B (en) Voice noise reduction method based on AEWGAN
CN105096955B (en) A kind of speaker's method for quickly identifying and system based on model growth cluster
CN108847223A (en) A kind of audio recognition method based on depth residual error neural network
CN111161744B (en) Speaker clustering method for simultaneously optimizing deep characterization learning and speaker identification estimation
CN111429947A (en) Speech emotion recognition method based on multi-stage residual convolutional neural network
CN113177558B (en) Radiation source individual identification method based on small sample feature fusion
CN113140220A (en) Lightweight end-to-end speech recognition method based on convolution self-attention transformation network
CN115392285A (en) Deep learning signal individual recognition model defense method based on multiple modes
WO2020253692A1 (en) Quantification method for deep learning network parameters
CN114299995A (en) Language emotion recognition method for emotion assessment
CN113206808B (en) Channel coding blind identification method based on one-dimensional multi-input convolutional neural network
CN114692815A (en) Method for optimizing low-bit model training
CN110619886A (en) End-to-end voice enhancement method for low-resource Tujia language
CN113762500B (en) Training method for improving model precision during quantization of convolutional neural network
CN113762497B (en) Low-bit reasoning optimization method for convolutional neural network model
CN116248202A (en) Method for realizing radio frequency channel calibration based on deep learning
CN112434716B (en) Underwater target data amplification method and system based on condition countermeasure neural network
CN113220892A (en) BERT-based self-adaptive text classification method and device
CN114692814A (en) Quantification method for optimizing neural network model activation
CN108563639B (en) Mongolian language model based on recurrent neural network
CN113593551B (en) Objective evaluation method for interference effect of voice communication based on command word recognition
CN113762495A (en) Method for improving precision of low bit quantization model of convolutional neural network model
Avila et al. Low-bit shift network for end-to-end spoken language understanding

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination