CN107766850B - Face recognition method based on combination of face attribute information - Google Patents
- Publication number
- CN107766850B (application CN201711232374.6A)
- Authority
- CN
- China
- Prior art keywords
- layer
- attribute
- convolution
- face
- loss function
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/172—Classification, e.g. identification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
Abstract
The invention discloses a face recognition method based on the combination of face attribute information, belonging to the technical field of digital image processing. It addresses the problem that existing fusion methods must train multiple DCNN networks and then perform score fusion or feature fusion with further training — a heavy, complicated workload that hinders practical application — by disclosing a new way of fusing identity information and attribute information to improve face recognition accuracy. A face identity authentication network and an attribute recognition network are fused into a single fusion network that learns identity features and face attribute features simultaneously through joint learning, so that face recognition accuracy is improved and the face attributes can also be predicted; the result is a multi-task network. A cost-sensitive weighting function removes the dependence on the target-domain data distribution and achieves balanced training within the source data domain; and the modified fusion framework adds only a few parameters, so the additional computational load is small.
Description
Technical Field
The invention belongs to the technical field of digital image processing, and particularly relates to a face recognition method based on combination of face attribute information.
Background
With the rapid development of deep learning, face recognition technology has advanced rapidly, and many products applying it have appeared. However, current face recognition technology still has many limitations; typical problems such as large side-face poses and lighting changes all reduce the performance of a face recognition system. Many researchers have worked on face pose correction, domain adaptation, and related topics; although much progress has been made, these efforts are still at the exploration stage. Research shows that, even when such conditions vary greatly, the recognition of a person's facial attribute information (such as gender, eyebrow shape, nose bridge height, and so on) is not strongly affected and can still be performed accurately. Therefore, combining face attribute information can improve the accuracy of face recognition.
At present, few multitask frameworks have been applied to face attribute learning. Many existing approaches, while simple in concept, are very burdensome. For example: using AdaBoost to select an independent feature subspace and an independent SVM classifier for each attribute to classify the different attributes; or training a separate DCNN (deep convolutional neural network) for each attribute and then training an independent SVM classifier for classification. Such work is very cumbersome and has little practical value. Rudd et al. proposed a mixed-objective optimization network that learns the different face attributes jointly, greatly reducing the difficulty of the work and making it easier to implement.
On the fusion side, many researchers have tried to add attribute information to face recognition to improve its accuracy. However, there is as yet no mature algorithm for fusing face attribute information with face identity authentication information. Existing fusion methods fall roughly into two categories:
(1) Score-level fusion. The fusion framework is shown in fig. 1: an identity recognition network and n (n > 2) attribute recognition networks are trained separately; an input picture passes through each DCNN (deep convolutional neural network) to extract features, similarity scores (label probability values) are output through a fully connected layer Fc and a softmax layer, and all similarity scores are then added to form a new identity similarity score that predicts the target identity.
(2) Feature-level fusion, which can be further divided into aggregation methods and subspace learning methods. An aggregation method extracts attribute features and identity authentication features with separate networks, then either simply concatenates the two features at the feature level or constrains them to the same dimension and performs element-wise averaging or multiplication. A subspace learning method concatenates the two features and maps the result to a more suitable subspace, learning the fusion parameters with a supervised or unsupervised method; unsupervised learning does not use identity information for the fusion learning, whereas supervised learning does. The feature-level fusion framework is shown in fig. 2 and is similar in structure to the score-level one: an identity recognition network and n attribute recognition networks are trained separately, the picture is input to all networks, the features of the last pooling layer of each network are extracted and fused together by a feature connector, and an SVM or other classifier is then trained on the new feature to produce the prediction output.
Both methods need to train a plurality of DCNNs, and then perform score fusion or feature fusion for further training, which is heavy and complicated in work task and not beneficial to practical application.
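The two prior-art fusion styles above can be contrasted in a short sketch (not from the patent — the score vectors, feature dimensions, and n = 2 attribute networks are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n_classes = 5

# --- Score-level fusion: add the per-network similarity scores ---
identity_scores = rng.random(n_classes)             # from the identity network
attr_scores = [rng.random(n_classes) for _ in range(2)]  # from n attribute networks
fused_scores = identity_scores + sum(attr_scores)   # new identity similarity score
predicted_identity = int(np.argmax(fused_scores))

# --- Feature-level fusion (aggregation): concatenate last-pooling-layer features ---
identity_feat = rng.random(256)                     # pooled identity feature
attr_feats = [rng.random(64) for _ in range(2)]     # pooled attribute features
fused_feat = np.concatenate([identity_feat] + attr_feats)  # fed to an SVM, etc.
```

Either way, one separately trained network per attribute is required, which is exactly the workload the invention avoids.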
Disclosure of Invention
The invention aims to: aiming at the existing problems, a new mode of fusing identity information and attribute information is disclosed to improve the accuracy of face recognition. The invention fuses the face identity authentication network and the attribute identification network to form a fusion network, and simultaneously learns the identity characteristics and the face attribute characteristics by adopting a joint learning mode.
The invention relates to a face recognition method based on combination of face attribute information, which comprises the following steps:
constructing a fusion network model:
a third module BlockC serves as the input layer of the fusion network model; the third module BlockC is connected to a first module BlockA1, and the first module BlockA1 is connected to a first module BlockA2 and a second module BlockB1, respectively; the first module BlockA2 is connected in sequence to a first module BlockA3, a pooling layer in a first global average pooling mode, a first fully connected layer, and a second fully connected layer followed by a Softmax layer, forming the identity recognition network;
the first module BlockA2 and the second module BlockB1 are both connected to a feature connector, and the feature connector is connected in sequence to the second module BlockB2, a pooling layer in a second global average pooling mode, and a third fully connected layer, forming the face attribute recognition network;
wherein the first module BlockA1 stacks 5 Inception structures, the first module BlockA2 stacks 10 Inception structures, and the first module BlockA3 stacks 5 Inception structures; an Inception structure comprises a feature connector, convolution layers, a pooling layer, normalization layers, and an input interface layer, with four parallel convolution paths between the feature connector and the input interface layer: the first path is a convolution layer and a normalization layer connected in series, the convolution layer being connected to the input interface layer with a 1 × 1 convolution kernel; the second path is two convolution layers and a normalization layer connected in series, the convolution layer connected to the input interface layer having a 1 × 1 kernel and the other convolution layer a 3 × 3 kernel; the third path comprises two convolution layers and a normalization layer connected in series, the convolution layer connected to the input interface layer having a 1 × 1 kernel and the other convolution layer a 5 × 5 kernel; the fourth path comprises a pooling layer, a convolution layer, and a normalization layer connected in series, the pooling layer being connected to the input interface layer with max pooling and a 2 × 2 pooling kernel, and the convolution layer having a 1 × 1 kernel;
the second modules, BlockB1 and BlockB2, are convolution structures: the system comprises an input interface layer, a convolution layer with convolution kernel of 1 × 1, a convolution layer with convolution kernel of 3 × 3 and an output interface layer which are connected in sequence;
the third module BlockC comprises an input layer, 3 groups of convolution layer and pooling layer pairs connected in series, and an output interface layer, where the kernels of the convolution layers and pooling layers are 3 × 3 and 2 × 2 respectively and the pooling mode is max pooling;
training the fusion network model:
step 101: collect a training sample set and apply image preprocessing to the training samples, comprising size normalization, image pixel value mean normalization, and random flip normalization; randomly divide the training sample set into several sub-training sets, each containing S samples;
step 102: initialize the neural network parameters and the attribute distribution weights of the attribute loss function, obtaining the network parameters of the first iteration and the attribute distribution weights of the attribute loss function; the attribute distribution weights of the attribute loss function comprise the positive-sample weight P_t^{i+} and the negative-sample weight P_t^{i-} of the attribute loss function, where i is the attribute category index;
step 103: use a sub-training set as the input images of the fusion network model, predict the identity label and each attribute label, compare them with the true labels, and calculate the loss function

l = l_softmax + λ1 · l_centerloss + λ2 · l_multitask,

where l_softmax denotes the loss function of the Softmax layer, l_centerloss the center loss function of the face identity at the first fully connected layer, and l_multitask the face attribute loss function of the third fully connected layer; λ1 and λ2 are preset loss weights with 0 < λ1, λ2 < 1, taken as empirically observed values;

the face attribute loss is

l_multitask = (1/S) · Σ_{j=1}^{S} Σ_{i=1}^{C} P_t^i · max(0, 1 − y_j^i · FC[i]_j),

where FC[i]_j denotes the attribute fully connected layer's output for attribute i on the j-th picture, y_j^i denotes the label of picture j for attribute i, P_t^i denotes the attribute distribution weight (positive- or negative-sample, as appropriate) of the attribute loss function at the t-th iteration for attribute i, and C denotes the number of attribute categories;
step 104: calculate the gradient ∇l(W_t) of the loss function l, where W_t denotes the network parameters at the t-th iteration;

iteratively update the network parameters: W_{t+1} = W_t + V_{t+1}, with V_{t+1} = μ · V_t − β · ∇l(W_t), where β denotes the preset negative-gradient learning rate, μ the weight given to the previous gradient value (a preset value), and V_t the update of the t-th iteration, with V_0 = 0 for the first iteration (taking the initial value of t as 0);

iteratively update the attribute distribution weights of the attribute loss function: P_{t+1}^i = P_t^i · exp(−α · y_i · FC[i]) / Z_i, where α is a scale parameter computed from r = Σ P_t^i · y_i · FC[i], the current normalization variable is Z_i = Σ P_t^i · exp(−α · y_i · FC[i]), FC[i] denotes the current output of the third fully connected layer for attribute i (the S values FC[i]_j composing FC[i]), and y_i denotes the true labels of the current sub-training set for attribute i (the S values y_j^i composing y_i);
step 105: repeat steps 103-104, iteratively updating the network parameters and the attribute distribution weight of each attribute's loss function, until the loss function l converges; save the currently updated network parameters and attribute distribution weights of the attribute loss function;
and (3) identification processing of the image to be identified:
step 201: carrying out size normalization and image pixel value mean value normalization processing on an image to be recognized;
step 202: loading the network parameters saved in the training process;
step 203: input the image to be recognized processed in step 201 into the fusion network model constructed by the invention, perform forward propagation, and predict the identity label and the C attribute labels through the second and third fully connected layers respectively, where the identity label is the index label corresponding to the maximum probability value output by the second fully connected layer and the softmax layer, and the attribute labels are output directly by the third fully connected layer.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
(1) the invention provides a new fusion framework in which the face attributes supervise the face recognition learning task; under this framework the invention not only improves face recognition accuracy but can also predict the face's attribute features, forming a multi-task network;
(2) the multi-task learning framework is improved by adopting a cost-sensitive weighting function, achieving balanced training in the source data domain without depending on the target-domain data distribution;
(3) the modified fusion framework only adds a few parameters, and the additional calculation load is small. Compared with the existing method of independently training the face attribute network and the identity recognition network, extracting the features and fusing the features, the method reduces the workload and the operation load to a certain extent, and is more convenient for practical deployment and application.
Drawings
FIG. 1 is a diagram of a prior art score-level fusion framework;
FIG. 2 is a diagram of a prior art feature-level fusion framework;
FIG. 3 is a schematic view of the fusion framework of the present invention;
FIG. 4 is a schematic diagram of the first module Block A of the fusion framework of the present invention;
FIG. 5 is a schematic diagram of the second module Block B of the fusion framework of the present invention;
fig. 6 is a schematic structural diagram of a third module BlockC of the fusion framework of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the following embodiments and accompanying drawings.
Referring to fig. 3, the fusion network model of the present invention comprises first, second, and third modules — BlockA, BlockB, and BlockC — pooling layers, fully connected layers FC, a feature connector (Filter concatenation), and a Softmax layer. The third module BlockC is the input layer of the fusion network model and is connected to the first module BlockA1; the first module BlockA1 is connected to the first module BlockA2 and the second module BlockB1, respectively. The first module BlockA2 is connected in sequence to a first module BlockA3, a pooling layer (global average pooling mode), a first fully connected layer (FC 1024), and a second fully connected layer (FC N) followed by a Softmax layer, forming the identity recognition network. The first module BlockA2 and the second module BlockB1 are both connected to the feature connector — that is, the features obtained by BlockA2 and BlockB1 are stacked in depth by the feature connector — and the feature connector is connected in sequence to the second module BlockB2, a pooling layer (global average pooling mode), and a third fully connected layer (FC 8), forming the face attribute recognition network.
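The depth stacking performed by the feature connector can be sketched as a channel-axis concatenation (the channel counts and 14 × 14 spatial size below are illustrative assumptions, not values stated in the patent):

```python
import numpy as np

# Hypothetical feature maps in (channels, H, W) layout.
blockA2_out = np.zeros((320, 14, 14))  # features from BlockA2
blockB1_out = np.zeros((128, 14, 14))  # features from BlockB1

# Feature connector: stack in depth, i.e. concatenate along the channel axis.
stacked = np.concatenate([blockA2_out, blockB1_out], axis=0)
# BlockB2 then consumes a 448-channel input of the same spatial size.
```

The spatial dimensions must match; only the channel count grows, which is why the added BlockB branch stays cheap.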
The first fully connected layer is used to generate a center loss (centerloss), that is, to make the face features of each identity cluster around the corresponding identity's center, reducing the intra-class distance and increasing the inter-class distance; the output dimension of the first fully connected layer depends on the feature dimension of the input image, for example 1024. The second fully connected layer outputs N-dimensional fully connected features (N being the number of identity categories, i.e. N people) and produces the final identity information through the softmax layer. The output dimension of the third fully connected layer depends on the number of preset face attributes; that is, the recognition results for the different attributes (whether each corresponding attribute is present) are obtained through the third fully connected layer, where the face attributes include gender, mouth size, lip thickness, eye size, eyebrow thickness, nose bridge height, nose size, forehead width, and so on.
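The center loss generated at the first fully connected layer can be written as L_c = ½ Σ_j ||x_j − c_{y_j}||², pulling each feature toward its identity's center. A minimal numpy sketch (feature dimension and values are illustrative assumptions):

```python
import numpy as np

def center_loss(features, labels, centers):
    """L_c = 1/2 * sum_j ||x_j - c_{y_j}||^2 over the mini-batch."""
    diffs = features - centers[labels]   # deviation of each feature from its center
    return 0.5 * np.sum(diffs ** 2)

feats = np.array([[1.0, 0.0],    # sample of identity 0, exactly at its center
                  [0.0, 1.0]])   # sample of identity 1, off-center
labels = np.array([0, 1])
centers = np.array([[1.0, 0.0],
                    [0.0, 0.0]])
loss = center_loss(feats, labels, centers)  # only the second sample contributes
```

Minimizing this loss shrinks intra-class distance while the softmax loss keeps classes apart.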
The first module BlockA(i) is a block in which Inception structures are stacked (connected in series) and is used to extract the shallow, intermediate, and high-level features of a picture; that is, each BlockA(i) is a different number of Inception structures, where BlockA1 stacks 5 Inception structures, BlockA2 stacks 10, and BlockA3 stacks 5. The Inception structure is shown in fig. 4 and comprises a feature connector (Filter concatenation), convolution layers (conv), a pooling layer (pooling), normalization layers (batch normalization), and an input interface layer (Previous layer), with four parallel convolution paths between the feature connector and the input interface layer: the first path is a convolution layer and a normalization layer connected in series, the convolution layer being connected to the input interface layer with a 1 × 1 convolution kernel; the second path is two convolution layers and a normalization layer connected in series, the convolution layer connected to the input interface layer having a 1 × 1 kernel and the other a 3 × 3 kernel; the third path comprises two convolution layers and a normalization layer connected in series, the convolution layer connected to the input interface layer having a 1 × 1 kernel and the other a 5 × 5 kernel; the fourth path comprises a pooling layer, a convolution layer, and a normalization layer connected in series, the pooling layer being connected to the input interface layer with max pooling and a 2 × 2 pooling kernel, and the convolution layer having a 1 × 1 kernel.
The normalization layers restrain large-scale changes of the parameters, so that training tends to be stable and deeper training of the network becomes easy; they also accelerate convergence and provide a certain regularization effect, preventing the model from overfitting.
The first module BlockA(i) of the invention reduces the channel dimension of the shallow and intermediate features through 1 × 1 convolutions and takes their weighted sum to form primary and partial intermediate features of the face attributes, then learns these features through 3 × 3 convolution kernels to finally form the high-level face attribute features. In a neural network, the shallow and intermediate features contain a certain amount of general information, while the high-level features are task-specific features guided by the learning objective. The present invention combines the shallow and intermediate features of the identity recognition network to learn the high-level face attribute features, so that the convolution module BlockB, which adds only a small number of parameters, enables the identity features and the attributes to be learned at the same time.
The second modules BlockB(i) are all the convolution structure shown in fig. 5, comprising, connected in sequence, an input interface layer (Previous layer), a convolution layer with a 1 × 1 kernel, a convolution layer with a 3 × 3 kernel, and an output interface layer (Top layer).
Referring to fig. 6, the third module BlockC comprises an input layer, 3 groups of convolution layer and pooling layer pairs connected in series, and an output interface layer (Top layer), where the kernels of the convolution layers and pooling layers are 3 × 3 and 2 × 2 respectively and the pooling mode is max pooling.
By adding a small number of parameters, the face attribute features and identity features are fused and the networks are trained synchronously, improving face recognition accuracy. The added parameters are mainly those of BlockB and the fully connected layer of the multitask classifier.
Added BlockB parameters: let M denote the number of input feature maps, N1 the number of 1 × 1 convolution kernels, and N2 the number of 3 × 3 convolution kernels; the BlockB parameter count is Numparam1 = M · N1 + 9 · N1 · N2;
Added fully connected layer parameters: let A denote the input feature dimension of the fully connected layer and C the number of attribute feature types, so that the output dimension of the fully connected layer is C; the parameter count is Numparam2 = A · C;
For example, an application scenario with M = 128, N1 = 64, N2 = 128 gives Numparam1 = 81920; and with A of the order of 10^3 and C of the order of 10^2, Numparam2 is typically of the order of 10^5. The overall added parameter count is small compared with the millions of parameters of the whole face recognition network.
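The parameter counts above (bias terms ignored, as in the text) can be checked directly:

```python
def blockb_params(M, N1, N2):
    # 1x1 convolutions contribute M*N1 weights, 3x3 convolutions 9*N1*N2.
    return M * N1 + 9 * N1 * N2

def fc_params(A, C):
    # Fully connected layer: input dimension A, output dimension C.
    return A * C

n1 = blockb_params(128, 64, 128)   # the example scenario in the text
n2 = fc_params(1024, 100)          # A ~ 10^3, C ~ 10^2 -> order 10^5
```

With these values n1 works out to 81920, consistent with the example.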
The invention uses the shallow and intermediate features of face identity recognition to further learn a portion of the intermediate features and the final high-level attribute features, then outputs the attribute prediction through the fully connected layer. With C denoting the number of attribute feature types, the output dimension of the fully connected layer is C. For an attribute i with 1 ≤ i ≤ C and fully connected output FC[i], the classification result Y[i] is taken as the sign of FC[i] and the error E[i] is the corresponding misclassification indicator against the true label y_i ∈ {−1, +1}; the loss is the hinge loss L_i = max(0, 1 − y_i · FC[i]).
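A minimal sketch of this per-attribute hinge loss with labels in {−1, +1} (the output values below are illustrative assumptions):

```python
import numpy as np

def attribute_hinge_loss(fc_out, y):
    """L_i = max(0, 1 - y * FC[i]) for each attribute output."""
    return np.maximum(0.0, 1.0 - y * fc_out)

fc = np.array([2.0, 0.5, -1.0])   # hypothetical fully connected outputs
y = np.array([1.0, -1.0, -1.0])   # true attribute labels
losses = attribute_hinge_loss(fc, y)
predictions = np.sign(fc)         # classification result Y[i] = sign(FC[i])
```

Only the second attribute, whose output violates the margin, incurs a loss.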
In multitask optimization, data imbalance is a problem that must be solved, so the losses of the individual attributes cannot simply be added. The invention therefore defines a hybrid objective function that uses the distribution of the attributes in the data domain to perform a weighted summation of each attribute's loss; for the weighting, a cost-sensitive weighting function is used:

P_{t+1}^i = P_t^i · exp(−α · y_i · FC[i]) / Z_i,

where α is a scale parameter computed from r = Σ P_t^i · y_i · FC[i], the current normalization variable is Z_i = Σ P_t^i · exp(−α · y_i · FC[i]), FC[i] denotes the current output of the third fully connected layer for attribute i (the S values FC[i]_j composing FC[i]), and y_i denotes the true labels of the current sub-training set for attribute i (the S values y_j^i composing y_i). This yields the multitask loss function

l_multitask = (1/N) · Σ_{j=1}^{N} Σ_{i=1}^{C} P_t^i · max(0, 1 − y_j^i · FC[i]_j),

where N is the number of pictures in a batch, C is the number of attribute types, FC[i]_j denotes the attribute fully connected layer's output for attribute i on the j-th picture, and y_j^i denotes the label of picture j for attribute i.
The overall system loss function is then

l = l_softmax + λ1 · l_centerloss + λ2 · l_multitask,

where l_softmax denotes the loss function of the Softmax layer, l_centerloss the center loss function of the face identity at the first fully connected layer, and l_multitask the face attribute loss function corresponding to the third fully connected layer; λ1 and λ2 are preset loss weights with 0 < λ1, λ2 < 1, taken as empirically observed values, with suggested values 0.08 and 0.02. The whole system is thus trained under the joint supervision of the identity recognition loss and the attribute recognition loss, optimizing the parameters and fusing attribute recognition and identity recognition at the parameter level, rather than at the feature level or final similarity-score level as in existing fusion methods.
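The joint objective combines the three losses with the suggested empirical weights; the component loss values below are illustrative assumptions:

```python
def total_loss(l_softmax, l_center, l_multitask, lam1=0.08, lam2=0.02):
    """l = l_softmax + lambda1 * l_centerloss + lambda2 * l_multitask."""
    return l_softmax + lam1 * l_center + lam2 * l_multitask

l = total_loss(2.0, 5.0, 10.0)  # 2.0 + 0.08*5.0 + 0.02*10.0
```

With λ1 = 0.08 and λ2 = 0.02, the softmax identity loss dominates and the attribute losses act as auxiliary supervision.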
The face recognition method based on the fusion network model constructed by the invention mainly comprises two processes of training and recognition, which are specifically as follows:
1. training process:
step 101: collect a training sample set and apply image preprocessing to the training samples, comprising size normalization, image pixel value mean normalization, and random flip normalization (left-right flipping, which also enlarges the training sample set). For example, scale the picture size to 128 × 128 × 3 or 112 × 112 × 3 (H × W × C, where H is the picture height, W the picture width, and C the number of channels, 3 indicating an RGB color picture), then apply mean normalization and random flip normalization;
randomly dividing the training sample set into a plurality of sub-training sets, wherein the sample number of each sub-training set is S;
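A minimal sketch of the preprocessing in step 101 — mean normalization plus random horizontal flip (size normalization would be done with an image library and is omitted; the H × W × C layout and input size are assumptions taken from the text):

```python
import numpy as np

def preprocess(img, rng):
    img = img.astype(np.float64)
    img -= img.mean()            # image pixel value mean normalization
    if rng.random() < 0.5:       # random left-right flip
        img = img[:, ::-1, :]    # reverse the width axis of an H x W x C image
    return img

rng = np.random.default_rng(42)
out = preprocess(np.full((112, 112, 3), 100, dtype=np.uint8), rng)
```

Flipping only augments the data; it never changes the label of an identity or attribute.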
step 102: initialize the neural network parameters (for example with the Xavier method) and the attribute distribution weights of the attribute loss function, thereby obtaining the network parameters of the first iteration and the attribute distribution weights of the attribute loss function. The attribute distribution weights comprise a positive-sample weight P_0^{i+} and a negative-sample weight P_0^{i-} for each attribute i — the initialization weights of the positive- and negative-sample loss functions for attribute i — set from N_i^+ and N_i^-, the numbers of positive and negative samples with attribute i in the training sample set (for example P_0^{i+} = N_i^- / (N_i^+ + N_i^-) and P_0^{i-} = N_i^+ / (N_i^+ + N_i^-), so that the rarer class receives the larger weight);
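The cost-sensitive initialization in step 102 can be sketched as follows; the exact form is an assumption consistent with the counts named in the text, chosen so the rarer class gets the larger weight:

```python
def init_attr_weights(n_pos, n_neg):
    """Initial positive/negative weights for one attribute from its class counts."""
    total = n_pos + n_neg
    return n_neg / total, n_pos / total  # (P0_i_plus, P0_i_minus)

# Hypothetical imbalanced attribute: 200 positive, 800 negative samples.
p_plus, p_minus = init_attr_weights(n_pos=200, n_neg=800)
```

The minority (positive) class here receives weight 0.8, compensating for its scarcity in the source data domain.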
step 103: use a sub-training set as the input images of the fusion network model constructed by the invention, predict the identity label and the C attribute labels, compare them with the true labels, and calculate the loss function l = l_softmax + λ1 · l_centerloss + λ2 · l_multitask; in this embodiment the preferred values of λ1 and λ2 are 0.08 and 0.02 respectively. Here l_multitask = (1/S) · Σ_{j=1}^{S} Σ_{i=1}^{C} P_t^i · max(0, 1 − y_j^i · FC[i]_j), where FC[i]_j denotes the attribute fully connected layer's output for attribute i on the j-th picture, y_j^i denotes the label of picture j for attribute i, and P_t^i denotes the attribute distribution weight (positive- or negative-sample, as appropriate) of the attribute loss function at the t-th iteration for attribute i;
step 104: calculate the gradient ∇l(W_t) of the loss function, where W_t denotes the network parameters at the t-th iteration;
update the network parameters at step t + 1: W_{t+1} = W_t + V_{t+1}, with V_{t+1} = μ · V_t − β · ∇l(W_t), where β denotes the preset negative-gradient learning rate, μ the weight given to the previous gradient value (a preset value), and V_t the update of the t-th iteration, with V_0 = 0 for the first iteration (taking the initial value of t as 0);
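The momentum update above is a standard SGD-with-momentum step; a numpy sketch (the learning rate, momentum, and gradient values are illustrative assumptions):

```python
import numpy as np

def momentum_step(W, V, grad, beta=0.01, mu=0.9):
    """V_{t+1} = mu*V_t - beta*grad(l(W_t));  W_{t+1} = W_t + V_{t+1}."""
    V_next = mu * V - beta * grad
    return W + V_next, V_next

W = np.array([1.0, -2.0])
V = np.zeros(2)                  # V_0 = 0 on the first iteration
grad = np.array([0.5, -0.5])
W, V = momentum_step(W, V, grad)
```

The momentum term μ · V_t smooths successive updates, which the text relies on for stable convergence.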
update the attribute distribution weights of the attribute loss function at step t + 1: P_{t+1}^i = P_t^i · exp(−α · y_i · FC[i]) / Z_i; the attribute distribution weights of the positive- and negative-sample loss functions are both updated in this way, where α is a scale parameter computed from r = Σ P_t^i · y_i · FC[i], the current normalization variable is Z_i = Σ P_t^i · exp(−α · y_i · FC[i]), FC[i] denotes the current output of the third fully connected layer for attribute i (the S values FC[i]_j composing FC[i]), and y_i denotes the true labels of the current sub-training set for attribute i (the S values y_j^i composing y_i);
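A sketch of this boosting-style, cost-sensitive reweighting: confidently correct samples shrink their weight, misclassified ones grow it, and Z_i renormalizes. The value of α and the batch values are illustrative assumptions:

```python
import numpy as np

def update_attr_weight(P, y, fc, alpha=0.5):
    """P_{t+1} proportional to P_t * exp(-alpha * y * FC), normalized by Z."""
    scores = P * np.exp(-alpha * y * fc)  # one term per sample in the batch
    Z = scores.sum()                      # current normalization variable Z_i
    return scores / Z

y = np.array([1.0, -1.0, 1.0])            # true labels
fc = np.array([2.0, -1.5, -0.5])          # last sample is misclassified
P = np.full(3, 1.0 / 3.0)                 # current (uniform) weights
P_next = update_attr_weight(P, y, fc)
```

After the update the misclassified sample carries the largest weight, so it dominates the next iteration's attribute loss.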
step 105: repeat steps 103 and 104, iteratively updating the network parameters and the attribute distribution weight of each attribute's loss function, until l converges; then save the currently updated network parameters and attribute distribution weights of the attribute loss function.
2. The identification process comprises the following steps:
step 201: carrying out size normalization and image pixel value mean value normalization processing on an image to be recognized;
step 202: loading the network parameters saved in the training process;
step 203: input the image to be recognized processed in step 201 into the fusion network model constructed by the invention, compute the forward pass, and predict the identity label and the C attribute labels through the two fully connected layers (the second and third). In the specific embodiment, the identity label is the index label corresponding to the maximum probability value obtained through the second fully connected layer and softmax; the face attribute labels are output through FC 8, each face attribute label Y[i] being taken as the sign of the corresponding output FC[i].
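The inference step can be sketched as softmax-argmax for the identity plus sign readout for the attributes (the output vectors below are illustrative assumptions):

```python
import numpy as np

def predict(identity_fc, attribute_fc):
    """Identity = argmax of softmax over the N-dim identity output;
    attributes = sign of the attribute fully connected output."""
    exp = np.exp(identity_fc - identity_fc.max())  # numerically stable softmax
    probs = exp / exp.sum()
    identity = int(np.argmax(probs))               # index label of max probability
    attributes = np.sign(attribute_fc)             # +1 / -1 per attribute
    return identity, attributes

identity, attrs = predict(np.array([0.1, 2.3, -0.7]),
                          np.array([1.2, -0.4]))
```

Since argmax is invariant to softmax, the probabilities matter only when a calibrated confidence is also needed.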
while the invention has been described with reference to specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except mutually exclusive features and/or steps.
Claims (2)
1. A face recognition method based on the combination of face attribute information, characterized by comprising the following steps:
constructing a fusion network model:
a third module BlockC serves as the input layer of the fusion network model and is connected with a first module BlockA1; the first module BlockA1 is connected with a first module BlockA2 and a second module BlockB1 respectively; the first module BlockA2 is sequentially connected with a first module BlockA3, a pooling layer in a first global average pooling mode, a first fully-connected layer, a second fully-connected layer and a Softmax layer to form an identity recognition network;
the first module BlockA2 and the second module BlockB1 are both connected with a feature concatenation layer, and the feature concatenation layer is sequentially connected with the second module BlockB2, a pooling layer in a second global average pooling mode and a third fully-connected layer to form a face attribute recognition network;
wherein the first module BlockA1 stacks 5 Inception structures, the first module BlockA2 stacks 10 Inception structures, and the first module BlockA3 stacks 5 Inception structures; each Inception structure comprises a feature concatenation layer, convolution layers, a pooling layer, normalization layers and an input interface layer, with four parallel convolution paths between the feature concatenation layer and the input interface layer: the first path is a convolution layer and a normalization layer connected in series, wherein the convolution layer is connected with the input interface layer and its convolution kernel is 1×1; the second path is two convolution layers and a normalization layer connected in series, wherein the convolution kernel of the convolution layer connected with the input interface layer is 1×1 and the convolution kernel of the other convolution layer is 3×3; the third path comprises two convolution layers and a normalization layer connected in series, wherein the convolution kernel of the convolution layer connected with the input interface layer is 1×1 and the convolution kernel of the other convolution layer is 5×5; the fourth path comprises a pooling layer, a convolution layer and a normalization layer sequentially connected in series, wherein the pooling layer is connected with the input interface layer, the pooling mode is maximum pooling, the pooling kernel is 2×2, and the convolution kernel of the convolution layer is 1×1;
the second modules BlockB1 and BlockB2 are convolution structures, each comprising an input interface layer, a convolution layer with a 1×1 convolution kernel, a convolution layer with a 3×3 convolution kernel and an output interface layer connected in sequence;
the third module BlockC comprises an input layer, 3 serially connected groups of convolution-layer and pooling-layer pairs, and an output interface layer, wherein the kernels of the convolution layers and the pooling layers are 3×3 and 2×2 respectively, and the pooling mode is maximum pooling;
training the fusion network model:
step 101: collect a training sample set and perform image preprocessing on the training samples, including size normalization, pixel-value mean normalization and random-angle flipping; randomly divide the training sample set into a plurality of sub-training sets, each containing S samples;
step 102: initialize the neural network parameters and the attribute distribution weights of the attribute loss function to obtain the network parameters and attribute distribution weights of the first iteration, wherein the attribute distribution weights of the attribute loss function comprise the attribute distribution weight p_{i+}^t of the attribute loss function of the positive samples and the attribute distribution weight p_{i−}^t of the attribute loss function of the negative samples, i being the face attribute class index;
step 103: use a sub-training set as the input images of the fusion network model, predict the identity label and each attribute label, compare them with the real labels, and calculate the loss function L = L_s + λ1·L_c + λ2·L_a, where L_s denotes the loss function of the Softmax layer, L_c denotes the center loss function of the face identity at the first fully-connected layer, L_a denotes the face attribute loss function at the third fully-connected layer, and λ1 and λ2 are preset loss weights with 0 < λ1, λ2 < 1, their values determined by experimental observation;
wherein FC[i]_j denotes the current output of the third fully-connected layer for attribute i on the j-th picture, y_i[j] denotes the real label of picture j for attribute i, p_{i±}^t denotes the attribute distribution weight of the attribute loss function of the positive or negative samples of the t-th iteration corresponding to attribute i, and C denotes the number of attribute classes;
step 104: calculate the gradient ∇L(W_t) of the loss function L, where W_t denotes the network parameters of the t-th iteration;
iteratively update the network parameters: W_{t+1} = W_t + V_{t+1}, where V_{t+1} = μ·V_t − β·∇L(W_t), β denotes the preset learning rate on the negative gradient, μ denotes the weight of the previous gradient value, V_t denotes the gradient term of the t-th iteration, the gradient term of the first iteration is 0, and the weight μ is a preset value;
iteratively update the attribute distribution weights of the attribute loss function: p_i^{t+1}[j] = p_i^t[j]·exp(−α·y_i[j]·FC[i]_j) / Z_i, where the scale parameter α is computed from r = Σ_j p_i^t[j]·y_i[j]·FC[i]_j, the current normalization variable is Z_i = Σ_j p_i^t[j]·exp(−α·y_i[j]·FC[i]_j), FC[i]_j denotes the current output of the third fully-connected layer for attribute i (the S values FC[i]_j compose FC[i]), and y_i[j] denotes the true label of the current sub-training set for attribute i (the S values y_i[j] compose y_i);
step 105: repeat steps 103 and 104, iteratively updating the network parameters and the attribute distribution weight of the attribute loss function of each attribute until the loss function L converges, and store the currently updated network parameters and attribute distribution weights of the attribute loss function;
identification processing of the image to be recognized:
step 201: perform size normalization and pixel-value mean normalization on the image to be recognized;
step 202: loading the network parameters saved in the training process;
step 203: input the image to be recognized processed in step 201 into the fusion network model, perform forward propagation, and predict the identity label and the C face attribute labels through the second and third fully-connected layers respectively, wherein the identity label is obtained as the index label corresponding to the maximum probability value through the second fully-connected layer and the softmax layer, and the face attribute labels are output directly through the third fully-connected layer.
2. The method of claim 1, wherein the initial values of the attribute distribution weight p_{i+}^1 of the attribute loss function of the positive samples and the attribute distribution weight p_{i−}^1 of the attribute loss function of the negative samples are p_{i+}^1 = 1/(2·N_i^+) and p_{i−}^1 = 1/(2·N_i^−), where N_i^+ and N_i^− represent the numbers of positive and negative samples with attribute i in the training sample set.
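The momentum-based parameter update of step 104 in claim 1 can be sketched in plain Python; the standard momentum form V_{t+1} = μ·V_t − β·g_t, W_{t+1} = W_t + V_{t+1} is used here as a reconstruction of the claim's garbled formula, with illustrative names:

```python
def momentum_step(W, V, grad, beta, mu):
    """One parameter update per step 104 (reconstructed form):
    V_{t+1} = mu * V_t - beta * grad(L)(W_t);  W_{t+1} = W_t + V_{t+1}.
    beta: preset learning rate on the negative gradient;
    mu:   weight of the previous gradient value;
    V starts at 0 for the first iteration."""
    V_next = [mu * v - beta * g for v, g in zip(V, grad)]
    W_next = [w + dv for w, dv in zip(W, V_next)]
    return W_next, V_next

W, V = [1.0, -2.0], [0.0, 0.0]          # first iteration: V_1 = 0
W, V = momentum_step(W, V, grad=[0.5, -0.5], beta=0.1, mu=0.9)
```

On the first step the momentum term is zero, so the update reduces to plain gradient descent; subsequent steps accumulate a fraction μ of the previous velocity, smoothing the trajectory of W.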
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711232374.6A CN107766850B (en) | 2017-11-30 | 2017-11-30 | Face recognition method based on combination of face attribute information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107766850A CN107766850A (en) | 2018-03-06 |
CN107766850B true CN107766850B (en) | 2020-12-29 |
Family
ID=61276369
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711232374.6A Active CN107766850B (en) | 2017-11-30 | 2017-11-30 | Face recognition method based on combination of face attribute information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107766850B (en) |
Families Citing this family (36)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108509862B (en) * | 2018-03-09 | 2022-03-25 | 华南理工大学 | Rapid face recognition method capable of resisting angle and shielding interference |
CN108520213B (en) * | 2018-03-28 | 2021-10-19 | 五邑大学 | Face beauty prediction method based on multi-scale depth |
CN108846380B (en) * | 2018-04-09 | 2021-08-24 | 北京理工大学 | Facial expression recognition method based on cost-sensitive convolutional neural network |
CN110555340B (en) * | 2018-05-31 | 2022-10-18 | 赛灵思电子科技(北京)有限公司 | Neural network computing method and system and corresponding dual neural network implementation |
CN109033938A (en) * | 2018-06-01 | 2018-12-18 | 上海阅面网络科技有限公司 | A kind of face identification method based on ga s safety degree Fusion Features |
JP7113674B2 (en) * | 2018-06-15 | 2022-08-05 | パナソニック インテレクチュアル プロパティ コーポレーション オブ アメリカ | Information processing device and information processing method |
CN108898125A (en) * | 2018-07-10 | 2018-11-27 | 深圳市巨龙创视科技有限公司 | One kind being based on embedded human face identification and management system |
CN109214286B (en) * | 2018-08-01 | 2021-05-04 | 中国计量大学 | Face recognition method based on deep neural network multi-layer feature fusion |
CN109191184A (en) * | 2018-08-14 | 2019-01-11 | 微梦创科网络科技(中国)有限公司 | Advertisement placement method and system based on image recognition |
CN109359515A (en) * | 2018-08-30 | 2019-02-19 | 东软集团股份有限公司 | A kind of method and device that the attributive character for target object is identified |
CN109344713B (en) * | 2018-08-31 | 2021-11-02 | 电子科技大学 | Face recognition method of attitude robust |
CN109508627A (en) * | 2018-09-21 | 2019-03-22 | 国网信息通信产业集团有限公司 | The unmanned plane dynamic image identifying system and method for shared parameter CNN in a kind of layer |
CN109359599A (en) * | 2018-10-19 | 2019-02-19 | 昆山杜克大学 | Human facial expression recognition method based on combination learning identity and emotion information |
CN109711386B (en) * | 2019-01-10 | 2020-10-09 | 北京达佳互联信息技术有限公司 | Method and device for obtaining recognition model, electronic equipment and storage medium |
CN110069994B (en) * | 2019-03-18 | 2021-03-23 | 中国科学院自动化研究所 | Face attribute recognition system and method based on face multiple regions |
CN111723613A (en) * | 2019-03-20 | 2020-09-29 | 广州慧睿思通信息科技有限公司 | Face image data processing method, device, equipment and storage medium |
CN110009051A (en) * | 2019-04-11 | 2019-07-12 | 浙江立元通信技术股份有限公司 | Feature extraction unit and method, DCNN model, recognition methods and medium |
CN110084216B (en) * | 2019-05-06 | 2021-11-09 | 苏州科达科技股份有限公司 | Face recognition model training and face recognition method, system, device and medium |
CN110135389A (en) * | 2019-05-24 | 2019-08-16 | 北京探境科技有限公司 | Face character recognition methods and device |
CN110348387B (en) * | 2019-07-12 | 2023-06-27 | 腾讯科技(深圳)有限公司 | Image data processing method, device and computer readable storage medium |
CN110516569B (en) * | 2019-08-15 | 2022-03-08 | 华侨大学 | Pedestrian attribute identification method based on identity and non-identity attribute interactive learning |
CN110956116B (en) * | 2019-11-26 | 2023-09-29 | 上海海事大学 | Face image gender identification model and method based on convolutional neural network |
CN111046759A (en) * | 2019-11-28 | 2020-04-21 | 深圳市华尊科技股份有限公司 | Face recognition method and related device |
CN111275057B (en) * | 2020-02-13 | 2023-06-20 | 腾讯科技(深圳)有限公司 | Image processing method, device and equipment |
CN111353411A (en) * | 2020-02-25 | 2020-06-30 | 四川翼飞视科技有限公司 | Face-shielding identification method based on joint loss function |
CN111401294B (en) * | 2020-03-27 | 2022-07-15 | 山东财经大学 | Multi-task face attribute classification method and system based on adaptive feature fusion |
CN111428671A (en) * | 2020-03-31 | 2020-07-17 | 杭州博雅鸿图视频技术有限公司 | Face structured information identification method, system, device and storage medium |
CN111507248B (en) * | 2020-04-16 | 2023-05-26 | 成都东方天呈智能科技有限公司 | Face forehead region detection and positioning method and system based on low-resolution thermodynamic diagram |
CN111680595A (en) * | 2020-05-29 | 2020-09-18 | 新疆爱华盈通信息技术有限公司 | Face recognition method and device and electronic equipment |
CN112507312B (en) * | 2020-12-08 | 2022-10-14 | 电子科技大学 | Digital fingerprint-based verification and tracking method in deep learning system |
CN112990270B (en) * | 2021-02-10 | 2023-04-07 | 华东师范大学 | Automatic fusion method of traditional feature and depth feature |
CN113139460A (en) * | 2021-04-22 | 2021-07-20 | 广州织点智能科技有限公司 | Face detection model training method, face detection method and related device thereof |
CN113705439B (en) * | 2021-08-27 | 2023-09-08 | 中山大学 | Pedestrian attribute identification method based on weak supervision and metric learning |
CN114360009B (en) * | 2021-12-23 | 2023-07-18 | 电子科技大学长三角研究院(湖州) | Multi-scale characteristic face attribute recognition system and method in complex scene |
CN117079337B (en) * | 2023-10-17 | 2024-02-06 | 成都信息工程大学 | High-precision face attribute feature recognition device and method |
CN117894083B (en) * | 2024-03-14 | 2024-06-28 | 中电科大数据研究院有限公司 | Image recognition method and system based on deep learning |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106203395A (en) * | 2016-07-26 | 2016-12-07 | 厦门大学 | Face character recognition methods based on the study of the multitask degree of depth |
CN106355170A (en) * | 2016-11-22 | 2017-01-25 | Tcl集团股份有限公司 | Photo classifying method and device |
CN106815566A (en) * | 2016-12-29 | 2017-06-09 | 天津中科智能识别产业技术研究院有限公司 | A kind of face retrieval method based on multitask convolutional neural networks |
CN107038429A (en) * | 2017-05-03 | 2017-08-11 | 四川云图睿视科技有限公司 | A kind of multitask cascade face alignment method based on deep learning |
Non-Patent Citations (2)
Title |
---|
"Patch-based face hallucination with multitask deep neural network";Wei-jen Ko;《2016 ICME》;20160829;第11-15页 * |
"Face Recognition Method Based on Convolutional Neural Network"; Chen Yaodan; 《Journal of Northeast Normal University》; 20160630; Vol. 48, No. 2; pp. 70-76 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||