CN106951858A - Method and device for recognizing kinship relations between persons based on a deep convolutional network - Google Patents

Method and device for recognizing kinship relations between persons based on a deep convolutional network

Info

Publication number
CN106951858A
CN106951858A (application CN201710163830.XA)
Authority
CN
China
Prior art keywords
convolution
facial image
layer
network
neural networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710163830.XA
Other languages
Chinese (zh)
Inventor
郭金林
白亮
李珏
老松杨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National University of Defense Technology
Original Assignee
National University of Defense Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National University of Defense Technology filed Critical National University of Defense Technology
Priority to CN201710163830.XA priority Critical patent/CN106951858A/en
Publication of CN106951858A publication Critical patent/CN106951858A/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Biomedical Technology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method and device for recognizing kinship relations between persons based on a deep convolutional network, comprising: inputting face images and pre-processing them; constructing a convolutional neural network and setting convolution kernels; repeatedly applying convolution and pooling operations to the face images in the convolutional neural network using the convolution kernels; performing regression on the convolved and pooled images to obtain identity features; extracting association features between the identity features; and recognizing the kinship relation between the face images according to the association features. The invention enables kinship recognition between persons.

Description

Method and device for recognizing kinship relations between persons based on a deep convolutional network
Technical field
The present invention relates to the field of artificial intelligence, and in particular to a method and device for recognizing kinship relations between persons based on a deep convolutional network.
Background technology
Research on face images has always been a highly important topic in computer vision. It is important because the face expresses a great deal of personal information and plays a special role in social life. In the field of artificial intelligence, imitating human visual cognition of faces has already achieved great success, and in many applications such as face recognition and identity verification, computer vision can now successfully substitute for humans. Recognizing the kinship relation between persons from face images, however, remains a novel and challenging task.
Studying person relations from face images is a problem that has arisen in recent years. Several related databases and algorithms have been proposed in succession, but most existing databases are too small in scale and differ in their standards. The first kinship recognition contest was held in 2014; it evaluated existing methods under a unified measurement system and established two kinship databases, KinFaceW-I and KinFaceW-II.
Over the past five years, kinship recognition from face images in psychology, biology and computer vision has broadly divided into two schools: methods based on hand-crafted descriptors and methods based on similarity learning. In descriptor-based methods, important features such as skin color, gradient histograms, Gabor gradient orientation pyramids, saliency information, self-similarity features and dynamic expressions have been extracted as conventional face representations; a feature descriptor based on spatial pyramids has also been proposed as the face-image feature, with an improved SVM classifying the feature distance between two individuals. In similarity-learning methods, subspace and metric learning are used to learn a better feature space in which to measure the similarity of face samples. Representative algorithms include subspace learning and neighborhood-space metric learning, which fuse multiple features and learn a discriminative metric that widens the gap between non-kin pairs and reduces the distance between kin pairs, so as to achieve recognition.
However, when machine vision attempts to simulate human vision, it is often difficult to imitate human social experience. The existing way for artificial intelligence to compensate for this shortcoming is to use large amounts of manually labeled data, so that sufficient training yields more robust pattern-recognition algorithms. Relation recognition between persons is much harder than ordinary face recognition: the objects compared change from one appearance corresponding to one identity, to a pair of faces and a particular relation, and this relation is defined by humans. Moreover, while one person possesses only one identity, the relations between persons and person pairs can form complex many-to-many relationships.
For the problem that the prior art can only perform face recognition and cannot recognize relations between persons, no effective solution currently exists.
Summary of the invention
In view of this, an object of the present invention is to propose a method and device for recognizing kinship relations between persons based on a deep convolutional network, capable of recognizing the kinship relation between persons.
Based on the above object, the technical scheme provided by the present invention is as follows:
According to one aspect of the invention, there is provided a method for recognizing kinship relations between persons based on a deep convolutional network, comprising:
inputting face images and pre-processing them;
constructing a convolutional neural network and setting convolution kernels;
repeatedly applying convolution and pooling operations to the face images in the convolutional neural network using the convolution kernels;
performing regression on the convolved and pooled images to obtain identity features;
extracting association features between the identity features;
recognizing the kinship relation between the face images according to the association features.
In some embodiments, inputting face images and pre-processing them comprises:
inputting the face images to be recognized;
performing face detection and rotation correction on the face images;
cutting the face images into samples of a specified size.
In some embodiments, constructing the convolutional neural network and setting the convolution kernels comprises:
training initial network values according to a layer-wise greedy algorithm;
adjusting the network parameters according to the back-propagation algorithm;
setting multiple convolution kernels according to local receptive fields and weight sharing.
In some embodiments, training initial network values according to the layer-wise greedy algorithm comprises:
training the parameters of each layer of the convolutional neural network in order;
using the output of previously trained layers as the input of the next layer;
determining the initial network values from the trained parameters of each layer.
In some embodiments, adjusting the network parameters according to the back-propagation algorithm comprises:
determining the cost function from the result of forward propagation of the data-set samples through the neural network;
determining the residual of each neuron in every layer of the neural network from the cost function;
computing the partial derivative of the cost function with respect to each neuron parameter in every layer from the residuals;
adjusting the network parameters according to these partial derivatives and the network learning rate.
In some embodiments, setting multiple convolution kernels according to local receptive fields means setting multiple convolution kernels, each of which convolves a part of the face image of specified size rather than the whole image; setting multiple convolution kernels according to weight sharing means setting multiple convolution kernels, each of which samples part of a specified region, with the unsampled parts regarded as having the same features as the sampled parts.
In some embodiments, repeatedly applying convolution and pooling operations to the face images in the convolutional neural network using the convolution kernels comprises:
convolving the face images in the convolutional neural network with the convolution kernels to generate feature maps;
applying max-pooling to the feature maps and updating them;
convolving the feature maps in the convolutional neural network with the convolution kernels and updating them;
repeating the above steps until the feature maps become compressed data with spatial invariance.
In some embodiments, the kinship relations between the face images include father-son, father-daughter, mother-son and mother-daughter relations.
In some embodiments, constructing the convolutional neural network means constructing it from data-set samples with the kinship category as the main cue; performing regression on the convolved and pooled images to obtain identity features yields the probability that a face image possesses a specific feature.
According to another aspect of the present invention, there is also provided an electronic device comprising at least one processor and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor so that the at least one processor can perform the above method.
As can be seen from the above, the technical scheme provided by the invention, by inputting face images and pre-processing them, constructing a convolutional neural network and setting convolution kernels, repeatedly applying convolution and pooling operations to the face images in the convolutional neural network using the convolution kernels, performing regression on the convolved and pooled images to obtain identity features, extracting association features between the identity features, and recognizing the kinship relation between the face images according to the association features, enables kinship recognition between persons.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical schemes of the prior art more clearly, the accompanying drawings needed in the embodiments are briefly described below. Obviously, the drawings described below are only some embodiments of the invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a method for recognizing kinship relations between persons based on a deep convolutional network according to an embodiment of the present invention;
Fig. 2 is a structural diagram of the deep neural network in the method according to an embodiment of the present invention;
Fig. 3 is a diagram of the convolution regions of multiple convolution kernels in the deep convolutional neural network of the method according to an embodiment of the present invention;
Fig. 4 is a model diagram of the deep convolutional neural network in the method according to an embodiment of the present invention;
Fig. 5 is an overall structural diagram of the deep convolutional auto-encoding neural network in the method according to an embodiment of the present invention;
Fig. 6 is a structural diagram of the deep identity convolutional neural network in the method according to an embodiment of the present invention;
Fig. 7 is a flow chart of the max-pooling operation in the method according to an embodiment of the present invention;
Fig. 8 is a hardware architecture diagram of an example of an electronic device implementing the method of the invention.
Detailed description of the embodiments
To make the object, technical scheme and advantages of the present invention clearer, the technical scheme in the embodiments of the present invention is further described clearly, completely and in detail below in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention fall within the scope of protection of the invention.
Based on the above object, a method for recognizing kinship relations between persons based on a deep convolutional network is provided according to one embodiment of the present invention.
As shown in Fig. 1, the face-image kinship recognition method based on a convolutional network provided by the embodiment of the invention comprises:
Step S101: inputting face images and pre-processing them;
Step S103: constructing a convolutional neural network and setting convolution kernels;
Step S105: repeatedly applying convolution and pooling operations to the face images in the convolutional neural network using the convolution kernels;
Step S107: performing regression on the convolved and pooled images to obtain identity features;
Step S109: extracting association features between the identity features;
Step S111: recognizing the kinship relation between the face images according to the association features.
In some embodiments, inputting face images and pre-processing them comprises:
inputting the face images to be recognized;
performing face detection and rotation correction on the face images;
cutting the face images into samples of a specified size.
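The pre-processing steps end with cutting each face image into a fixed-size sample. The patent gives no concrete cropping routine, so the following is only a minimal NumPy sketch of that final step under the assumption of a simple centered crop; the function name `center_crop` and the 64-pixel sample size are illustrative, not taken from the patent, and detection and rotation correction are assumed to have happened upstream.

```python
import numpy as np

def center_crop(img, size):
    """Crop an H x W (x C) image to a centered `size` x `size` patch.

    A minimal stand-in for "cutting the face image into a sample of a
    specified size"; assumes the face is roughly centered after detection
    and rotation correction.
    """
    h, w = img.shape[:2]
    if h < size or w < size:
        raise ValueError("image smaller than crop size")
    top = (h - size) // 2
    left = (w - size) // 2
    return img[top:top + size, left:left + size]

face = np.zeros((100, 120, 3))   # stand-in for a detected, rotated face image
sample = center_crop(face, 64)
print(sample.shape)              # (64, 64, 3)
```

In practice the crop would follow the bounding box returned by a face detector rather than the image center; the centered crop is only the simplest placeholder.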
In some embodiments, constructing the convolutional neural network and setting the convolution kernels comprises:
training initial network values according to a layer-wise greedy algorithm;
adjusting the network parameters according to the back-propagation algorithm;
setting multiple convolution kernels according to local receptive fields and weight sharing.
In some embodiments, training initial network values according to the layer-wise greedy algorithm comprises:
training the parameters of each layer of the convolutional neural network in order;
using the output of previously trained layers as the input of the next layer;
determining the initial network values from the trained parameters of each layer.
In some embodiments, adjusting the network parameters according to the back-propagation algorithm comprises:
determining the cost function from the result of forward propagation of the data-set samples through the neural network;
determining the residual of each neuron in every layer of the neural network from the cost function;
computing the partial derivative of the cost function with respect to each neuron parameter in every layer from the residuals;
adjusting the network parameters according to these partial derivatives and the network learning rate.
In some embodiments, setting multiple convolution kernels according to local receptive fields means setting multiple convolution kernels, each of which convolves a part of the face image of specified size rather than the whole image; setting multiple convolution kernels according to weight sharing means setting multiple convolution kernels, each of which samples part of a specified region, with the unsampled parts regarded as having the same features as the sampled parts.
In some embodiments, repeatedly applying convolution and pooling operations to the face images in the convolutional neural network using the convolution kernels comprises:
convolving the face images in the convolutional neural network with the convolution kernels to generate feature maps;
applying max-pooling to the feature maps and updating them;
convolving the feature maps in the convolutional neural network with the convolution kernels and updating them;
repeating the above steps until the feature maps become compressed data with spatial invariance.
In some embodiments, the kinship relations between the face images include father-son, father-daughter, mother-son and mother-daughter relations.
In some embodiments, constructing the convolutional neural network means constructing it from data-set samples with the kinship category as the main cue; performing regression on the convolved and pooled images to obtain identity features yields the probability that a face image possesses a specific feature.
In summary, by means of the technical scheme of the present invention, which inputs face images and pre-processes them, constructs a convolutional neural network and sets convolution kernels, repeatedly applies convolution and pooling operations to the face images using the convolution kernels, performs regression on the convolved and pooled images to obtain identity features, extracts association features between the identity features, and recognizes the kinship relation between the face images according to the association features, kinship recognition between persons can be carried out.
Based on the above object, according to a second embodiment of the present invention, a method for recognizing kinship relations between persons based on a deep convolutional network is provided.
The purpose of machine learning is to learn a function from samples and then use this function to predict the values of future samples. Finding this function requires extensive work; building a deep learning network is one way to do it. In supervised learning, assume there is a training sample set (x_i, y_i); then a neural network can use a model h_{W,b}(x) to represent a nonlinear function, where (W, b) are the parameters used to fit the data.
A neural network is composed of many neurons connected to one another, the output of one neuron serving as the input of the next. Fig. 2 shows a typical deep neural network. The network parameters are (W, b), where W^{(l)}_{ij} is the coupling parameter (the weight on the connection) between unit j of layer l and unit i of layer l+1, and b^{(l+1)}_i is the bias term of unit i of layer l+1. a^{(l)}_i denotes the output value of unit i of layer l. For a given parameter set (W, b), the neural network computes its output according to the function h_{W,b}(x):
z^{(i+1)} = W^{(i)} a^{(i)} + b^{(i)},  a^{(i+1)} = f(z^{(i+1)})    (1)
h_{W,b}(x) = a^{(n)}    (2)
where a^{(1)} = x is the input. The process in which the input data is computed through the network parameters to produce the output activation values is called forward propagation. The function f(·) is called the activation function; the sigmoid function can be used as the activation function:
f(z) = 1 / (1 + e^{-z})    (3)
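Formulas (1) to (3) can be illustrated with a short NumPy sketch of forward propagation through a small fully connected network with sigmoid activations. The layer sizes and random weights below are arbitrary, chosen only for demonstration; this is not an implementation from the patent.

```python
import numpy as np

def sigmoid(z):
    # activation function f(z) = 1 / (1 + e^-z), formula (3)
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    """Forward propagation: z(l+1) = W(l) a(l) + b(l), a(l+1) = f(z(l+1))."""
    a = x
    for W, b in params:
        a = sigmoid(W @ a + b)
    return a

rng = np.random.default_rng(0)
layers = [4, 3, 2]                       # a tiny 4-3-2 network
params = [(rng.standard_normal((m, n)), np.zeros(m))
          for n, m in zip(layers[:-1], layers[1:])]
out = forward(rng.standard_normal(4), params)
print(out.shape)                         # (2,)
```

Because the sigmoid squashes its input, every activation of the output lies strictly between 0 and 1, which is what lets the text later interpret values near 0 as suppressed and values near 1 as activated.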
Although the conciseness and strong feature-learning ability of deep networks were already explored more than a decade ago, their real rise is recent work, because before the appearance of the greedy algorithm, training such networks was enormously difficult. The embodiments of the present invention describe two algorithms of great importance to deep neural networks: one is the layer-wise greedy algorithm, the other the back-propagation (reverse conduction) algorithm.
Layer-wise greedy algorithm: the conventional way of training a deep neural network is to randomly initialize the network parameters, compute the network activation values, and adjust the parameters from the difference between the network output and the labels until the network converges. This causes the following problems: random initialization can make the network converge to a local minimum, and moreover the influence of the overall error adjustment on the low-level parameters is too small, so the low-level hidden layers are hard to train effectively. The layer-wise greedy algorithm greatly improves the training of deep neural networks, further improving network performance. Its main idea is to train the parameters of each layer in order, one layer per training stage: the output of the first l trained layers is used as the input of layer l+1, and the initial values of the whole network are set from the individually trained parameters of each layer. Then, in a top-down supervised learning stage, the back-propagation algorithm is applied to the whole network according to the labels to adjust the network parameters.
Back-propagation algorithm: for a data set {(x^{(1)}, y^{(1)}), …, (x^{(m)}, y^{(m)})}, after forward propagation of a sample through the neural network yields the result y = h_{W,b}(x), the cost function of a single sample (x, y) can be defined as
J(W, b; x, y) = (1/2) ||h_{W,b}(x) − y||²
The overall cost function over the data set is
J(W, b) = (1/m) Σ_{i=1}^{m} J(W, b; x^{(i)}, y^{(i)}) + (λ/2) Σ_l Σ_i Σ_j (W^{(l)}_{ji})²    (4)
The purpose of the second term in the formula is to reduce the magnitude of the weights and prevent overfitting.
To find parameters (W, b) that minimize the cost function of the network, gradient descent can be used to update the parameters in continuous iterative optimization, where α is the learning rate:
W^{(l)}_{ij} := W^{(l)}_{ij} − α ∂J(W, b)/∂W^{(l)}_{ij}    (5)
b^{(l)}_i := b^{(l)}_i − α ∂J(W, b)/∂b^{(l)}_i    (6)
The back-propagation algorithm is used to compute the partial derivatives ∂J/∂W^{(l)}_{ij} and ∂J/∂b^{(l)}_i:
First, the neural network performs forward propagation to obtain the output values of every layer.
For a network of n layers, the residual of each neuron i of the output layer n is computed:
δ^{(n)}_i = −(y_i − a^{(n)}_i) · f′(z^{(n)}_i)    (7)
This residual represents the contribution of neuron i to the error between the final output value and the actual value.
For the other layers l below the output layer, the residuals continue to be computed backwards:
δ^{(l)} = ((W^{(l)})^T δ^{(l+1)}) · f′(z^{(l)})    (8)
The meaning of "reverse conduction" is embodied in the two steps above, i.e. successive derivation from back to front.
The partial derivatives are then computed from the residuals in order to update the weights:
∂J/∂W^{(l)}_{ij} = a^{(l)}_j δ^{(l+1)}_i,  ∂J/∂b^{(l)}_i = δ^{(l+1)}_i
After the partial derivatives have been computed, the network weights can be updated according to formulas (5) and (6), gradually reducing the value of J(W, b) and finally solving the neural network.
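The residual recursion and the resulting partial derivatives can be checked numerically. The sketch below is a hedged illustration, not the patent's implementation: it runs back-propagation through a two-layer sigmoid network with the squared-error cost of a single sample (the weight-decay and averaging terms of the overall cost are omitted for brevity) and compares one analytic partial derivative against a finite difference.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, W1, b1, W2, b2):
    z1 = W1 @ x + b1; a1 = sigmoid(z1)
    z2 = W2 @ a1 + b2; a2 = sigmoid(z2)
    return z1, a1, z2, a2

def cost(x, y, W1, b1, W2, b2):
    # single-sample squared-error cost J = (1/2) ||h(x) - y||^2
    return 0.5 * np.sum((forward(x, W1, b1, W2, b2)[3] - y) ** 2)

def backprop(x, y, W1, b1, W2, b2):
    z1, a1, z2, a2 = forward(x, W1, b1, W2, b2)
    # output-layer residual: delta = -(y - a) * f'(z), with f'(z) = a(1 - a)
    d2 = (a2 - y) * a2 * (1 - a2)
    # back-propagated residual: delta(l) = (W^T delta(l+1)) * f'(z(l))
    d1 = (W2.T @ d2) * a1 * (1 - a1)
    # dJ/dW(l) = delta(l+1) outer a(l), dJ/db(l) = delta(l+1)
    return np.outer(d2, a1), d2, np.outer(d1, x), d1

rng = np.random.default_rng(1)
W1, b1 = rng.standard_normal((3, 4)), rng.standard_normal(3)
W2, b2 = rng.standard_normal((2, 3)), rng.standard_normal(2)
x, y = rng.standard_normal(4), np.array([0.0, 1.0])

dW2, db2, dW1, db1 = backprop(x, y, W1, b1, W2, b2)

# check one analytic partial derivative against a finite difference
eps = 1e-6
Wp = W1.copy(); Wp[0, 0] += eps
num = (cost(x, y, Wp, b1, W2, b2) - cost(x, y, W1, b1, W2, b2)) / eps
print(abs(num - dW1[0, 0]) < 1e-4)   # True
```

This kind of gradient check is a standard sanity test for any hand-written back-propagation before plugging it into the gradient-descent updates.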
An autoencoder (Auto-Encoder, AE) is an unsupervised learning algorithm. The deep autoencoder makes use of the existing deep structure of the neural network shown in Fig. 3; it is a neural network that reconstructs its input as its output. The function learned is h_{W,b}(x) ≈ x; the network likewise applies the layer-wise greedy algorithm for training and the back-propagation algorithm for parameter adjustment. The input is transformed into different representations as the number of layers varies, and these representations are features of the original input. In order to reconstruct the original input, the autoencoder must learn the important features hidden in the data.
Learning an identity function seems simple, but a sparsity constraint forces the deep autoencoder to learn meaningful features. Let the input be a vector of dimension n, and let hidden layer L2 of the network have m hidden neurons. What the AE accomplishes is a transformation of the input within its domain. If the limitation m < n is now imposed, the AE is forced to learn a compressed representation of the input. If the input data are meaningless data that are totally independent of one another, the learning result is meaningless; but if the input data contain inter-related rules and structure, the algorithm can learn features that are more representative than the original data.
The sparsity principle is added from biological inspiration: biological research shows that only a small fraction of neurons are activated when human vision responds to some input, while the remaining majority of neurons are all suppressed. The sparsity constraint likewise forces most neurons to be suppressed. Since the sigmoid function given in formula (3) is applied as the activation function, an output close to 0 is regarded as the suppressed state and an output close to 1 as the activated state.
Adding the sparsity principle, the sparsity factor is defined as
Σ_{j=1}^{m} KL(ρ || ρ̂_j) = Σ_{j=1}^{m} [ ρ log(ρ/ρ̂_j) + (1−ρ) log((1−ρ)/(1−ρ̂_j)) ]    (9)
where ρ̂_j denotes the average activation of hidden neuron j, and the sparsity parameter ρ is assigned a small value, i.e. ρ̂_j is required to be close to ρ. KL is the relative entropy; the relative-entropy computation makes the value of the sparsity factor increase monotonically as the difference between ρ̂_j and ρ grows, and the value of the sparsity factor is zero when the two are equal.
The cost function of the entire deep autoencoder is
J_sparse(W, b) = J(W, b) + β Σ_{j=1}^{m} KL(ρ || ρ̂_j)    (10)
where J(W, b) is defined in formula (4) above, and β is the parameter controlling the weight of the sparsity term.
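The sparsity factor of formula (9) is easy to compute directly. The sketch below is illustrative only; the target activation 0.05 is a typical choice, not a value the patent specifies. It shows that the penalty is zero when the measured average activations equal ρ and grows as they diverge from it.

```python
import numpy as np

def kl_sparsity(rho, rho_hat):
    """Sum over hidden units of KL(rho || rho_hat_j), formula (9)."""
    return np.sum(rho * np.log(rho / rho_hat)
                  + (1 - rho) * np.log((1 - rho) / (1 - rho_hat)))

rho = 0.05                               # target average activation
rho_hat = np.array([0.05, 0.2, 0.5])     # measured average activations
penalty = kl_sparsity(rho, rho_hat)
print(penalty > 0)                       # True: the first unit contributes 0,
                                         # the other two are penalized

# the total sparse-autoencoder cost of formula (10) would then be
# J(W, b) + beta * penalty, with beta weighting the sparsity term
```

During training, ρ̂_j is the hidden activation averaged over the training samples, so minimizing this penalty drives most hidden units toward the suppressed state.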
Convolutional neural networks (Convolutional Neural Networks, CNN) arose from the inspiration of the structure of the visual system, and are at present the deep model that works best on image pattern-recognition problems, having achieved the current best results on ImageNet.
A convolutional neural network can learn a mapping from input to output, implicitly learning the features hidden in the data during this process without any exact mathematical expression. Its various properties give it a great advantage in image problems. The design of CNN neurons makes the network extremely well adapted to the structure of image data; the characteristics of local receptive fields and weight sharing reduce computational complexity and also yield a certain spatial invariance, and the successively deepened hierarchical computation turns the raw data into increasingly abstract features.
An ordinary neural network is computed in a fully connected way, as shown in Fig. 3; full connection means that each neuron in the hidden layer must traverse every pixel of the input image, which directly produces an enormous amount of computation.
In order to reduce the number of parameters, convolutional neural networks adopt local receptive fields. This agrees with the way the human visual system perceives the outside world: the local field of view is experienced first, and the local views are synthesized to grasp global information. In real natural images, meaningful content is distributed locally rather than globally, and each neuron does not need to perceive all pixels. The convolution operation with a convolution kernel, shown in Fig. 3, directly reduces the number of parameters required for the computation.
The operation that further reduces parameters is weight sharing. The idea of weight sharing can be applied because in natural images not all content has its own distinct features; the content of different parts can share the same features, and a feature of one part may also apply to another part. From a statistical point of view, a feature is independent of the position where it occurs. A feature learned at some position can be used as a kind of detector: when this feature is convolved with other positions of the sample, what is obtained are the different activation values of the whole large image for this feature.
If only one convolution kernel of size 10×10 is set, 100 features are obtained, and such feature extraction is insufficient. Adding multiple convolution kernels, as shown in Fig. 3, makes it possible to learn more features and complete sufficient feature extraction. Each convolution kernel generates a new image by the convolution operation, called a feature map (Feature Map). The number of feature maps equals the number of convolution kernels; as described above, regarding a convolution kernel as a detector, a feature map actually reflects the response of the original image to the feature represented by that kernel.
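The parameter savings from local receptive fields and weight sharing can be made concrete with a small count, using the 100 × 100 input, 10 × 10 kernels and 91 × 91 feature maps of the worked example in this section. This is a back-of-the-envelope illustration, not a statement from the patent.

```python
# weight counts for one hidden layer with 91 * 91 = 8281 units
# (one unit per valid position of a 10 x 10 window on a 100 x 100 image)
full = (100 * 100) * (91 * 91)   # fully connected: every unit sees every pixel
local = (10 * 10) * (91 * 91)    # local receptive fields: a 10 x 10 window each
shared = 100 * (10 * 10)         # + weight sharing: 100 kernels of 10 x 10
print(full, local, shared)       # 82810000 828100 10000
```

Local connectivity alone cuts the weight count by a factor of 100 here, and sharing each 10 × 10 kernel across all positions reduces it to just 100 weights per kernel, independent of image size.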
The convolution operation is computed by the following formula:
x^{(l)}_j = f( Σ_{i∈M_j} x^{(l−1)}_i * k^{(l)}_{ij} + b^{(l)}_j )    (11)
where M_j denotes the set of input maps selected for the j-th feature map of the convolution operation, k^{(l)}_{ij} is the convolution kernel, and * denotes the convolution operation.
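A direct NumPy sketch of a single "valid" convolution between one input map and one kernel illustrates the core of formula (11); it deliberately omits the sum over M_j, the activation f and the bias, and, as is common in CNN implementations, computes cross-correlation rather than a flipped-kernel convolution. It also reproduces the (100 − 10 + 1) feature-map arithmetic used in this section.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """'Valid' 2-D convolution of one map with one kernel: slide the
    kernel over the image with no padding, taking a weighted sum of
    each window (cross-correlation convention)."""
    H, W = img.shape
    h, w = kernel.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + h, j:j + w] * kernel)
    return out

img = np.arange(100.0 * 100.0).reshape(100, 100)   # stand-in 100 x 100 image
fmap = conv2d_valid(img, np.ones((10, 10)))        # one 10 x 10 kernel
print(fmap.shape)        # (91, 91): (100 - 10 + 1) squared = 8281 responses
```

Real implementations vectorize this double loop, but the naive version makes the sliding-window structure of the operation explicit.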
The features obtained by convolution reduce the dimensionality of the raw data, but these data are still too large. For example, if the input image is a 100 × 100 gray-level image and 100 convolution kernels of size 10 × 10 are defined, these hundred kernels are convolved with the image, and the size of each resulting feature map is (100 − 10 + 1) × (100 − 10 + 1) = 8,281. Since there are 100 features, the total size of all feature maps amounts to 828,100. If such feature maps were applied directly to tasks such as training a classifier, difficulties of computation and over-fitting would still be faced.
The use of the convolution operation and weight sharing rests on the relatively "static" property of images: as tacitly assumed above, different positions may share the same feature. To handle large images, features at different positions can therefore be aggregated statistically, replacing each region with the average value (Average-Pooling) or the maximum (Max-Pooling) of that region. This operation is called pooling (Pooling). Pooling effectively performs a spatial down-sampling that not only reduces the feature dimensionality but also confers a degree of spatial invariance. Max-pooling is computed as:

p_i = max over (m, n) in R_i of x(m, n)

In the formula, R_i denotes the region to be pooled; within a region of one stride, the maximum of the region is taken as the representative of that region.
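Max-pooling as defined above can be sketched in a few lines of NumPy, assuming non-overlapping s × s regions (a common convention; the patent's exact stride handling is not spelled out):

```python
import numpy as np

def max_pool(fmap, s):
    """Non-overlapping s x s max-pooling: each region is replaced by its maximum."""
    m, n = fmap.shape
    m, n = (m // s) * s, (n // s) * s              # drop any ragged border
    blocks = fmap[:m, :n].reshape(m // s, s, n // s, s)
    return blocks.max(axis=(1, 3))

fmap = np.array([[1., 3., 2., 0.],
                 [4., 2., 1., 5.],
                 [0., 1., 3., 2.],
                 [2., 0., 4., 1.]])
pooled = max_pool(fmap, 2)
# each 2x2 region collapses to its maximum: [[4, 5], [2, 4]]
```

The same function reproduces the size halving used later in the ConvFID description (for example, 60 × 52 maps pooled with s = 2 become 30 × 26).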
The two-dimensional design of convolution kernels and the spatial down-sampling operation are well suited to the characteristics of image data. Pooling over a contiguous range of the image down-samples features that come from the same convolution kernel, i.e. responses to the same feature, so pooling endows the features with translation invariance. Convolutional neural networks thus have unique advantages in image processing, which can be summarised as follows:
First, the special structure of local receptive fields and shared weights is better adapted to image data; the layout imitates biological neural networks, and the network complexity is substantially lower than that of other neural network models.
Second, the features extracted by a CNN come from learning on the data rather than from hand engineering, making them more effective and more general. A CNN can incorporate a multilayer perceptron and take images directly as input, handling problems such as classification and recognition directly while extracting image features.
Third, the weight-sharing property of CNNs ensures that network operations support parallel computation; this greatly improves the efficiency of network training, which is particularly important in the big-data era.
In practical CNN construction, a common model uses multiple convolutional layers, with convolutional and pooling layers alternating and fully connected layers added at the end. In the lower layers of a CNN, the features learned are generally local; features become more global as the hierarchy deepens, finally realising the feature extraction of the input data.
The deep neural network shown in Fig. 4 is a classical CNN architecture; the model is computed in parallel on two GPUs. The parameters of the first, second, fourth and fifth layers are split into two parts and trained in parallel: identical data are trained on the two different GPUs, and the resulting outputs are directly connected as the input of the next layer.
The input is a colour image of size 224 × 224 × 3.
The first layer is a convolutional layer with 96 kernels of size 11 × 11, 48 on each GPU.
The second layer is a pooling layer using max-pooling, with a pooling kernel of size 2 × 2.
The third layer is a convolutional layer with 256 kernels of size 5 × 5, 128 on each GPU.
The fourth layer is a pooling layer using max-pooling, with a pooling kernel of size 2 × 2.
The fifth layer is a convolutional layer with 384 kernels of size 3 × 3, 192 on each GPU, fully connected to the previous layer.
The sixth layer is a convolutional layer with 384 kernels of size 3 × 3, 192 on each GPU; no pooling layer is added between this convolutional layer and the previous one.
The seventh layer is a convolutional layer with 256 kernels of size 5 × 5, 128 on each GPU.
The eighth layer is a pooling layer using max-pooling, with a pooling kernel of size 2 × 2.
The ninth layer is a fully connected layer: the pooled feature maps of the eighth layer are concatenated into a 4,096-dimensional vector as the input of this layer.
The tenth layer is a fully connected layer: the 4,096-dimensional vector is fed to a Softmax layer for Softmax regression, and the 1,000-dimensional output vector represents the probabilities that the picture belongs to each category.
This model won the 2012 ImageNet LSVRC competition with a top-5 error rate of 15.3%. The training set of this CNN contains about 1,270,000 pictures, the validation set 50,000 and the test set 150,000.
In the depth model shown in Fig. 4, the last layer is a Softmax layer. Softmax regression is a multi-class classifier commonly used in depth models. The error between the label output by the network and the given true label can be back-propagated. When the classification result is chosen as the network output, the entire deep network is regarded as a classifier. When what is required is not the classification result but an intermediate value, the activation values of the high-level neurons of the deep neural network are the required features.
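The Softmax layer just described maps raw scores to class probabilities. A minimal sketch follows, numerically stabilised by subtracting the maximum score before exponentiating (a standard trick not spelled out in the text):

```python
import numpy as np

def softmax(z):
    """Softmax regression output: exponentiate and normalise to probabilities.
    Subtracting the max first avoids overflow without changing the result."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

scores = np.array([2.0, 1.0, 0.1])   # raw scores for three hypothetical classes
p = softmax(scores)                   # probabilities summing to 1; class 0 is most likely
```

Choosing argmax of p gives the classification result; keeping an earlier layer's activations instead gives the features referred to above.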
In fact, every layer of a deep neural network is another representation of the raw data; as the network deepens, networks are generally designed to be deeper and more compact, and the activation values of deeper hidden layers tend to have greater expressive power.
The embodiment of the present invention holds that, to identify whether a certain relation exists between two people, the two persons must first be understood. First, the identity features representing the two persons are extracted respectively; this process is based on the deep convolutional part of the auto-encoder network, i.e. the Deep ConvFID Net in the figure. After the respective identity features are obtained, the relation between them is learned; this process is based on a deep auto-encoder, i.e. the Deep AEFP in the figure. The present invention gives in detail the construction and training of the two different deep neural networks to be built, and effectively combines the two networks to extract relational features.
Current research shows that, although a deep convolutional network can perform feature extraction and classification simultaneously, its face-recognition accuracy on facial images is not in itself high; the present invention therefore applies a deep convolutional network to extract identity features representing personal identity. After the identity features of a pair of persons are obtained, a multilayer auto-encoder is used to seek the relation between the two. The idea of the auto-encoder is to reconstruct the target value from the input; it is expected that an intermediate value found in this reconstruction process represents the close relation between input and output. The present invention combines the two kinds of deep networks into a new deep convolutional auto-encoding neural network (Deep Convolutional Auto-Encoder Networks, CNN-AE Net); this depth model is shown in Fig. 5. The deep convolutional auto-encoding neural network designed by the present invention takes a pair of persons as input and finally learns the relational features between the pair.
The entire deep convolutional auto-encoding neural network is defined as CNN-AE. In this depth model, the input images first pass through a convolutional neural network, defined as ConvFID Net (Convolutional networks for Facial ID). Through the ConvFID network, the original input is converted into the more identity-representative FID (Facial ID). The FIDs of a pair of persons then serve as the input of a multilayer auto-encoder; the upper arrows in Fig. 5 represent the forward computation of the auto-encoder, and the lower arrows its backward feedback. This multilayer auto-encoder is defined as AE-FP (Auto-Encoder for Face Pairs). The activation values of a high layer of the network are taken as the relational-feature vector RF (Relational Features).
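The AE-FP forward pass can be sketched as follows: a pair of 160-dimensional FID vectors is concatenated, passed through two hidden layers, and the inner hidden layer's activation is read off as RF. The layer widths (200 and 80) and the random weights here are illustrative assumptions, not values from the patent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def aefp_forward(fid1, fid2, params):
    """Forward pass of a two-hidden-layer auto-encoder over a concatenated FID pair.
    The activation of the inner hidden layer is taken as the relational feature RF."""
    x = np.concatenate([fid1, fid2])                  # 320-d input
    h1 = sigmoid(params["W1"] @ x + params["b1"])
    rf = sigmoid(params["W2"] @ h1 + params["b2"])    # RF layer
    out = sigmoid(params["W3"] @ rf + params["b3"])   # reconstruction of the input
    return rf, out

rng = np.random.default_rng(1)
params = {"W1": rng.standard_normal((200, 320)) * 0.1, "b1": np.zeros(200),
          "W2": rng.standard_normal((80, 200)) * 0.1,  "b2": np.zeros(80),
          "W3": rng.standard_normal((320, 80)) * 0.1,  "b3": np.zeros(320)}
rf, recon = aefp_forward(rng.standard_normal(160), rng.standard_normal(160), params)
```

Training would minimise the reconstruction error between `recon` and the concatenated input; only the RF activations are kept for the later classification step.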
The input facial images of the pair of persons (Person 1 and Person 2) are defined as (p1, p2); the deep convolutional auto-encoding network built by the present invention completes the following learning process:
To obtain an effective FID, an efficient ConvFID must be built. Fig. 6 gives the structure of the deep convolutional neural network ConvFID used to obtain identity features. The figure shows the details of the deep network, including the size and number of convolution kernels, the size and number of feature maps after convolution, the number of down-sampling layers and the down-sampling stride. Softmax regression serves as the last layer, used to match identity features with identity labels. The last convolutional layer is a fully connected layer; the network ultimately maps the input image to a 160-dimensional vector, which serves as its identity feature.
To express image sizes, the present invention uniformly uses the form X × Y × C, where (X, Y) is the image size and C the number of channels. A convolution kernel can also be regarded as a small image with a two-dimensional structure, so the same expression is used.
As shown in Fig. 6, the input is a colour image of size 63 × 55 × 3. Note that, to obtain a better network effect, the present invention used inputs of different sizes in training. When images of other scales serve as network input, the size of the feature map output by each convolution operation changes; the last convolutional layer can then be modified so that the fully connected layer remains a 160-dimensional vector.
The input image enters ConvFID and passes through the first convolution kernels, of size 4 × 4, 20 in total. The convolution operation is defined by the following formula (14):

x^l(i, j) = sum over u = 1..k1, v = 1..k2 of x^(l-1)(i+u-1, j+v-1) · k(u, v)   (14)

For each pixel (i, j) of the input image x^(l-1), where l denotes the layer index of the neural network, the convolution operation multiplies a k1 × k2 neighbourhood by the kernel element-wise and sums, where k1, k2 denote the kernel size; in the first convolutional layer, k1 = k2 = 4.
Sliding a kernel of size (k1, k2) over an image of size (m, n), a "valid" convolution yields an image of size (m-k1+1, n-k2+1), while a "full" convolution enlarges it to (m+k1-1, n+k2-1); the sizes quoted below correspond to valid convolution.
When the convolution operation is actually implemented in Matlab, the conv2 function is used. At run time, the function first rotates the convolution kernel by 180 degrees.
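The 180-degree rotation mentioned here is the difference between true convolution (what Matlab's conv2 computes) and the plain sliding sum of products (cross-correlation) that most CNN code actually uses. The relation can be verified in NumPy:

```python
import numpy as np

def correlate2d_valid(x, k):
    """Sliding sum of products (cross-correlation), 'valid' region only."""
    k1, k2 = k.shape
    out = np.empty((x.shape[0] - k1 + 1, x.shape[1] - k2 + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(x[i:i + k1, j:j + k2] * k)
    return out

def convolve2d_valid(x, k):
    """True 2-D convolution = cross-correlation with the kernel rotated 180 degrees,
    which is what Matlab's conv2 effectively does before sliding."""
    return correlate2d_valid(x, np.rot90(k, 2))

rng = np.random.default_rng(2)
x = rng.standard_normal((6, 6))
k = rng.standard_normal((3, 3))
flipped = np.rot90(k, 2)          # the 180-degree rotation of the kernel
```

For learned kernels the distinction is immaterial (the network simply learns the flipped weights), but it matters when comparing against conv2 output.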
After the first convolutional layer, the input image becomes 20 feature maps of size 60 × 52. These feature maps then undergo the max-pooling operation; the principle of max-pooling is clearly illustrated in Fig. 7.
Max-pooling is specifically expressed by the following formula (15):

p = max over (m, n) in R of x(m, n)   (15)

To obtain a degree of spatial invariance, and to compress the data further, the feature maps obtained are max-pooled as shown in Fig. 7. In simple terms, within a region of the selected stride, i.e. an s × s pooling region in (15), the maximum of the region replaces the values of that region, and the overall size of the feature map is compressed.
The feature maps after the first pooling form the P2 layer in Fig. 6, comprising 20 feature maps of size 30 × 26. These then pass through the next layer's 40 kernels of size 3 × 3, generating the C3 layer of 40 feature maps of size 28 × 24. After pooling, the 40 feature maps of the P4 layer are half the size of the previous layer. As shown in Fig. 6, after several convolutions and poolings, the C7 layer has 80 feature maps of size 4 × 3; through 80 kernels of size 3 × 3 these become the 2 × 1 × 80 feature vector of the C8 layer, which is connected to the C9 layer in a fully connected manner. The FID layer shown in the figure is also a fully connected layer, of dimension 160.
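The layer sizes stated above can be checked arithmetically with valid convolution and 2 × 2 pooling. The C5/P6 stage is not spelled out in the text; a 3 × 3 convolution followed by a 2 × 2 pool is assumed here because it makes the stated C7 size of 4 × 3 come out:

```python
def conv_size(hw, k):
    """Output size of a 'valid' convolution with a k x k kernel."""
    return (hw[0] - k + 1, hw[1] - k + 1)

def pool_size(hw, s=2):
    """Output size of non-overlapping s x s pooling (integer division)."""
    return (hw[0] // s, hw[1] // s)

size = (63, 55)               # input image (height, width)
size = conv_size(size, 4)     # C1: 4x4 kernels -> 60 x 52
c1 = size
size = pool_size(size)        # P2 -> 30 x 26
size = conv_size(size, 3)     # C3: 3x3 -> 28 x 24
size = pool_size(size)        # P4 -> 14 x 12
size = conv_size(size, 3)     # C5 (assumed): 3x3 -> 12 x 10
size = pool_size(size)        # P6 (assumed) -> 6 x 5
size = conv_size(size, 3)     # C7: 3x3 -> 4 x 3
c7 = size
```

Under these assumptions every stated size is reproduced, and a final 3 × 3 valid convolution of the 4 × 3 maps gives 2 × 1 per map, i.e. 160 values across 80 maps, matching the FID dimension.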
The last layer is a Softmax regression layer, which gives the prediction of a person's identity against the given labels. It should be noted here that the classification ability possessed by the network itself is not the final purpose; if the person-recognition accuracy is to be improved, other classifiers can be used after the FID is obtained. Because its design is oriented towards multi-class problems, Softmax regression is widely used in all kinds of depth models.
In multi-class problems Softmax performs well, the classifier treating the labels as mutually exclusive.
Recognising kinship from facial images is an extension of the field of face analysis, and this work can broaden the applications of artificial intelligence. For a family, kinship recognition can help build a family tree, or even trace an extended clan. In finding lost children, a much-discussed social topic, machine-vision methods can rapidly assist human decision-making.
How to identify the kinship between persons from facial images is the problem studied by the present invention. The present invention extracts in turn the identity features and relational features of the persons, and recognises their kinship on the basis of the relational features. The process and settings of the verification algorithm are described in detail, and the results are compared and analysed in many respects.
The present invention selects data samples for kinship recognition from the data sets KinFaceW-I and KinFaceW-II, covering father-son, father-daughter, mother-son and mother-daughter relations. Both data sets contain facial images of parents and children collected from public web data under unconstrained conditions, with no restriction on the persons' pose, illumination, expression, age, ethnicity or background. The difference between the two data sets is that in KinFaceW-I the two facial images of a kin pair are obtained from different photographs, whereas in KinFaceW-II the two facial images of a kin pair are obtained from the same photograph.
The two databases contain the four kinship relations defined above: father-son, father-daughter, mother-son and mother-daughter. The KinFaceW-I database has 156 father-son, 134 father-daughter, 116 mother-son and 127 mother-daughter pairs. In the KinFaceW-II database, each of the four relations contains 250 pairs of facial images.
The databases have been manually annotated, and some negative samples are given. In the verification set of the KinFaceW-I database, 156 positive pairs and 156 negative pairs are given, each relation contributing roughly 27 pairs of facial images. In the KinFaceW-II database, the data are split into five parts, one of which serves as the test set; the test set contains 250 positive pairs and 250 negative pairs in total, with 50 positive and 50 negative pairs per relation.
After the KinFaceW-I and KinFaceW-II databases are obtained, the images are cropped to a size of 63 × 55 × 3 to fit the designed ConvFID model, and each sample image is additionally sampled into patches of the same size at different positions to train ConvFID.
The algorithm of the present invention is divided into two stages: extracting relational features and recognising the relation. In the relational-feature extraction stage, the main task is to pre-train the depth model; forward propagation with the trained model then yields the features. The relation-recognition stage is further divided into training and testing parts. The present invention states the whole configuration realised by the algorithm in order of steps, divided into the deep-network training part, the relational-feature extraction part and the part that recognises kinship using relational features.
Training stage of the deep convolutional neural network ConvFID:
Training data: the YouTubeFace database, using 47,850 pictures in total for training.
Training environment: Python 2.7 & Theano 0.7 [65] under OS X Yosemite; processor 2.7 GHz Intel Core i5, memory 8 GB.
Training time: 6 ConvFID networks were trained in total, with 20 iterations per network and each iteration taking about 480 s on average; the total training time was about 16 hours.
Training stage of the deep auto-encoder AEFP:
Training data: 1,000 pairs of facial images from the KinFaceW-II data set.
Training environment: Matlab R2012b [66] under Windows 7; processor G2030, memory 4 GB.
Training time: 300 network iterations, about 17 minutes in total.
Relational-feature extraction stage:
Extracted data: the relational features of all images in the KinFaceW-I/II data sets.
Extraction environment: the identity features FID are extracted under OS X with Python 2.7, taking 217 s.
The relational features RF are extracted under Windows 7 with Matlab.
Kinship recognition stage:
Training and test sets: the relational features RF of the facial images designated by the evaluation rule.
Recognition environment: Matlab & LibSVM under Windows 7.
The evaluation rule of the algorithm follows the practice described above: for verification and recognition tasks there are generally two evaluation standards, and in all recognition rates mentioned by the present invention, the Open-set standard is employed, because the desired kinship recognition system should be able to judge unknown images without the system being redesigned.
In setting the training and test sets, each relation in the two databases is divided into five parts, so that the ratio of training set to test set is about 4:1. It is worth noting that the negative samples are also generated from these two data sets: a parent is chosen, and then a facial image that is not his or her offspring is chosen at random; such a pair of data serves as a negative sample.
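The five-fold split and the negative-pair generation just described can be sketched as follows; the naming scheme for the pairs is purely illustrative:

```python
import random

def five_fold_split(pairs, n_folds=5, seed=0):
    """Shuffle positive pairs and cut them into n_folds near-equal folds;
    one fold serves as the test set, the rest as training (about a 4:1 ratio)."""
    rng = random.Random(seed)
    pairs = pairs[:]
    rng.shuffle(pairs)
    return [pairs[i::n_folds] for i in range(n_folds)]

def make_negative_pairs(pairs, seed=0):
    """For each parent, pair them with a child drawn from a different true pair."""
    rng = random.Random(seed)
    children = [c for _, c in pairs]
    negatives = []
    for parent, child in pairs:
        wrong = child
        while wrong == child:             # resample until the child is not their own
            wrong = rng.choice(children)
        negatives.append((parent, wrong))
    return negatives

positives = [(f"father_{i}", f"son_{i}") for i in range(250)]
folds = five_fold_split(positives)
negatives = make_negative_pairs(positives)
```

This yields as many negative pairs as positive ones, matching the balanced test sets described for KinFaceW-II.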
In learning the relational features of kinship from facial images, this section takes the father-son relation as an example and defines it as the F-S relation. The data are the pre-processed facial images from the KinFaceW-I and KinFaceW-II databases.
The pre-processed facial images are forward-propagated through the trained CNN-AE model to obtain the feature maps and identity FIDs.
For the defined input facial images (p1, p2), each is first passed through the computation shown in formula (16):

FID_1 = ConvFID_W,b(p1),  FID_2 = ConvFID_W,b(p2)   (16)

Here (W, b) are the parameters of the deep convolutional neural network ConvFID, which, after the extensive training on the YouTubeFace database described above, showed superior performance in testing. FID_1 and FID_2 are considered more efficient representations of the raw data (p1, p2), representing the two different input persons.
Here it should be noted that the responses of the high-level neurons are found to be sparse: the number of neurons that actually participate in expressing the FID identity feature is small.
Clearly, not all neurons respond; the high-level neurons have specific patterns and can recognise semantic information of a higher level.
For the overall realisation of the algorithm, a piece of pseudo-code is given here.
Algorithm: recognising the relational features of the father-son relation from facial images based on deep learning.
Input:
Father's facial image F;Son's facial image S;
The defined network CNNAEW,b (comprising two parts: ConvFIDW,b and AEFPW,b)
ConvFIDW,b:{input,layer1,layer2,…layer9,FID_layer,Softmax layer};
AEFPW,b:{input,layer1,layer2,RF layer,output},{input,layer1,output};
Step:
F=P1;S=P2;RF (label)=N;
Mark all faces; faces in the same kinship relation are given the same label;
Forward calculation FIDF=ConvFIDW,b(F);FIDS=ConvFIDW,b(S);
Unsupervised calculating:
Output_layer_F=AEFPW,b(FIDF);
Minimize(output_layer_S,output_layer_F);
RF (F, S)=AEFPW,b layer3(FIDF,FIDS);
Extract the RF of all person relations in the training and verification sets;
Perform two-class classification on RF using an SVM classifier.
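The final step classifies the RF vectors into two classes. The patent uses LibSVM; as an illustrative stand-in, the sketch below trains a linear soft-margin SVM with Pegasos-style sub-gradient descent on toy data taking the place of real RF vectors (the cluster geometry, dimensions and hyper-parameters are all assumptions):

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style sub-gradient descent on the hinge loss with L2 regularisation.
    Labels y must be in {-1, +1}; returns the weight vector w."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)                     # decaying step size
            if y[i] * (w @ X[i]) < 1:                 # margin violated: hinge gradient
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
            else:                                     # only the regulariser acts
                w = (1 - eta * lam) * w
    return w

# toy stand-in for relational features: two well-separated clusters
rng = np.random.default_rng(3)
X = np.vstack([rng.standard_normal((50, 8)) + 2.0,
               rng.standard_normal((50, 8)) - 2.0])
y = np.array([1] * 50 + [-1] * 50)
w = train_linear_svm(X, y)
accuracy = (np.sign(X @ w) == y).mean()
```

In the actual experiments a kernel SVM via LibSVM could of course be substituted; the two-class protocol (kin vs. non-kin) is the same.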
When selecting the relational feature, the chosen hidden layer was determined by experimental comparison among hidden layer 1, hidden layer 2 and the output layer of AEFP. As the number of iterations increases during network training, at 400 iterations the relational features extracted from the AEFP network achieve a recognition rate of 73.8% on father-son recognition after the SVM classifier.
It can be seen from the above that the technical scheme provided by the invention, by the technical means of inputting facial images and pre-processing them, building a convolutional neural network and setting convolution kernels, repeatedly performing convolution and pooling operations on the facial images in the convolutional neural network using the convolution kernels, performing regression on the convolved and pooled images to obtain identity features, extracting the relational features between identity features, and recognising the kinship between facial images according to the relational features, can recognise the kinship between persons.
Based on the above purpose, according to a third embodiment of the present invention there is provided an embodiment of an electronic device that performs the above kinship recognition method.
The electronic device performing the method includes at least one processor and a memory communicatively connected with the at least one processor, wherein the memory stores instructions executable by the at least one processor, the instructions being executed by the at least one processor so that the at least one processor can perform any one of the methods described above.
Fig. 8 shows the hardware structure diagram of an embodiment of the electronic device for performing the method provided by the present invention.
Taking the electronic device shown in Fig. 8 as an example, the electronic device includes a processor 801 and a memory 802, and may further include an input device 803 and an output device 804.
The processor 801, memory 802, input device 803 and output device 804 may be connected by a bus or in other ways; in Fig. 8, connection by a bus is taken as the example.
As a non-volatile computer-readable storage medium, the memory 802 can be used to store non-volatile software programs, non-volatile computer-executable programs and modules, such as the program instructions/modules corresponding to the kinship recognition method in the embodiments of the present application. By running the non-volatile software programs, instructions and modules stored in the memory 802, the processor 801 executes the various functional applications and data processing of the server, i.e. realises the kinship recognition method of the above method embodiments.
The memory 802 may include a program storage area and a data storage area, where the program storage area can store the operating system and the applications required for at least one function, and the data storage area can store data created through the use of the kinship recognition device, etc. In addition, the memory 802 may include high-speed random-access memory and may also include non-volatile memory, such as at least one magnetic-disk storage device, flash device or other non-volatile solid-state storage device. In some embodiments, the memory 802 optionally includes memory located remotely relative to the processor 801. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks and combinations thereof.
The input device 803 can receive input numeric or character information and produce key-signal inputs related to the user settings and function control of the kinship recognition device. The output device 804 may include a display device such as a display screen.
The one or more modules are stored in the memory 802 and, when executed by the processor 801, perform the kinship recognition method of any of the above method embodiments.
Any embodiment of the electronic device performing the method can achieve the same or similar effect as any corresponding foregoing method embodiment.
Those of ordinary skill in the art will appreciate that all or part of the flow of the above embodiment methods can be completed by a computer program instructing the relevant hardware; the program can be stored in a computer-readable storage medium and, when executed, may include the flow of each of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), etc. The computer-program embodiment can achieve the same or similar effect as any corresponding foregoing method embodiment.
In addition, typically, the devices and equipment described in the disclosure may be various electric terminal devices, such as mobile phones, personal digital assistants (PDA), tablet computers (PAD) and smart televisions, or large-scale terminal devices such as servers; therefore the scope of protection of the disclosure should not be limited to a certain specific type of device or equipment. The client described in the disclosure may be applied to any of the above electric terminal devices in the form of electronic hardware, computer software or a combination of both.
In addition, the method according to the disclosure may also be implemented as a computer program executed by a CPU, and the computer program may be stored in a computer-readable storage medium. When the computer program is executed by the CPU, it performs the above functions defined in the method of the disclosure.
In addition, the above method steps and system units may also be realised with a controller and a computer-readable storage medium storing a computer program that causes the controller to realise the above step or unit functions.
In addition, it should be appreciated that the computer-readable storage medium (e.g. memory) described herein may be volatile memory or non-volatile memory, or may include both volatile and non-volatile memory. By way of example and not limitation, non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM), which may serve as an external cache. By way of example and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double-data-rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM) and direct Rambus RAM (DRRAM). The storage devices of the disclosed aspects are intended to include, but not be limited to, these and other suitable types of memory.
Those skilled in the art will also understand that the various illustrative logical blocks, modules, circuits and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software or a combination of both. To clearly illustrate this interchangeability of hardware and software, the functions of the various exemplary components, blocks, modules, circuits and steps have been described in general terms. Whether such functions are implemented as software or as hardware depends on the specific application and the design constraints imposed on the overall system. Those skilled in the art may realise the described functions in various ways for each specific application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosure.
The various illustrative logical blocks, modules and circuits described in connection with the disclosure herein may be realised or performed with the following components designed to perform the functions described here: a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination of these components. A general-purpose processor may be a microprocessor, but alternatively the processor may be any conventional processor, controller, microcontroller or state machine. A processor may also be implemented as a combination of computing devices, e.g. a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.
The steps of the method or algorithm described in connection with the disclosure herein may be contained directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor so that the processor can read information from, and write information to, the storage medium. In an alternative, the storage medium may be integrated with the processor. The processor and the storage medium may reside in an ASIC, and the ASIC may reside in a user terminal. In an alternative, the processor and the storage medium may reside in the user terminal as discrete components.
In one or more exemplary designs, the functions may be realised in hardware, software, firmware or any combination thereof. If realised in software, the functions may be stored on, or transmitted over, a computer-readable medium as one or more instructions or code. Computer-readable media include computer storage media and communication media, the latter including any medium that helps transfer a computer program from one position to another. A storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer. By way of example and not limitation, the computer-readable medium may include RAM, ROM, EEPROM, CD-ROM or other optical-disc storage devices, magnetic-disk storage devices or other magnetic storage devices, or any other medium that can be used to carry or store the required program code in the form of instructions or data structures and that can be accessed by a general-purpose or special-purpose computer or processor. In addition, any connection may properly be termed a computer-readable medium. For example, if software is sent from a website, server or other remote source using a coaxial cable, optical-fibre cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio and microwave, then the coaxial cable, optical-fibre cable, twisted pair, DSL or wireless technologies such as infrared, radio and microwave are included in the definition of medium. As used herein, disks and discs include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), floppy disks and Blu-ray discs, where disks generally reproduce data magnetically and discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
Disclosed exemplary embodiment, but disclosed exemplary embodiment should be noted, it should be noted that without departing substantially from On the premise of the scope of the present disclosure that claim is limited, it may be many modifications and change.According to disclosure described herein Function, step and/or the action of the claim to a method of embodiment are not required to perform with any particular order.Although in addition, this public affairs The element opened can be described or required in individual form, it is also contemplated that it is multiple, it is unless explicitly limited odd number.
It should be appreciated that use in the present invention, unless context clearly supports exception, singulative " one " (" a ", " an ", " the ") is intended to also include plural form.It is to be further understood that use in the present invention " and/ Or " refer to include any of one or more than one project listed in association and be possible to combine.
Above-mentioned embodiment of the present disclosure sequence number is for illustration only, and the quality of embodiment is not represented.
One of ordinary skill in the art will appreciate that realizing that all or part of step of above-described embodiment can be by hardware To complete, the hardware of correlation can also be instructed to complete by program, described program can be stored in a kind of computer-readable In storage medium, storage medium mentioned above can be read-only storage, disk or CD etc..

Claims (10)

1. A method for recognizing person kinship based on a deep convolutional network, characterized in that it comprises:
inputting a facial image and preprocessing it;
constructing a convolutional neural network and setting convolution kernels;
repeatedly performing convolution and pooling operations on the facial image in the convolutional neural network using the convolution kernels;
performing regression on the image after the convolution and pooling operations to obtain identity features;
extracting relational features between the identity features;
recognizing the kinship between facial images according to the relational features.
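As a reading aid only, the six steps of claim 1 can be laid out as a pipeline skeleton. Everything below is a simplified stand-in, not the claimed deep network: `identity_features` (column means instead of a CNN plus regression), `relational_feature` (absolute difference), and the 0.05 threshold are all hypothetical choices made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

def identity_features(face):
    # Stand-in for the convolution/pooling/regression stages: reduce a face
    # image to a small identity-feature vector (column means, illustrative only).
    return face.mean(axis=0)

def relational_feature(f1, f2):
    # Relational feature between two identity features: the elementwise
    # absolute difference (a common, but here hypothetical, choice).
    return np.abs(f1 - f2)

def recognize_kinship(face_a, face_b, threshold=0.05):
    # Declare kinship when the relational feature is small overall.
    r = relational_feature(identity_features(face_a), identity_features(face_b))
    return bool(r.mean() < threshold)

parent = rng.uniform(size=(8, 8))                      # toy "facial image"
child = parent + rng.normal(scale=0.01, size=(8, 8))   # a near-duplicate face
```

With these stand-ins, two nearly identical faces produce a small relational feature and are declared kin; the real patent replaces each stage with learned network components.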
2. The method according to claim 1, characterized in that inputting a facial image and preprocessing it comprises:
inputting the facial image to be recognized;
performing face detection and rotation correction on the facial image;
cropping the facial image into samples of a specified size.
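Assuming detection and rotation correction have already produced an aligned image, the final cropping step of claim 2 might look like the following sketch (the center-crop strategy and the 64-pixel sample size are illustrative assumptions, not specified by the claim):

```python
import numpy as np

def crop_to_sample(image, size):
    # Center-crop an aligned face image to a square sample of the given side
    # length. Face detection and rotation correction are assumed done upstream.
    h, w = image.shape[:2]
    top = (h - size) // 2
    left = (w - size) // 2
    return image[top:top + size, left:left + size]

# A 100x120 "image" cropped to a hypothetical 64x64 sample size.
img = np.arange(100 * 120).reshape(100, 120)
sample = crop_to_sample(img, 64)
```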
3. The method according to claim 1, characterized in that constructing a convolutional neural network and setting convolution kernels comprises:
training the initial network values according to a layer-wise greedy algorithm;
adjusting the network parameters according to a back-propagation algorithm;
setting multiple convolution kernels according to a local receptive field method and a weight sharing method.
4. The method according to claim 3, characterized in that training the initial network values according to a layer-wise greedy algorithm comprises:
training the parameters of each layer of the convolutional neural network in sequence;
using the output of each previously trained layer as the input of the next layer;
determining the initial network values according to the trained parameters of each layer.
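The layer-wise greedy scheme of claim 4 can be sketched with tied-weight linear autoencoders: each layer is trained on the previous layer's output, and the trained weights become the initial network values. The autoencoder objective, layer sizes, learning rate, and step count are all assumptions for illustration; the claim does not fix a per-layer training objective.

```python
import numpy as np

rng = np.random.default_rng(0)

def train_layer(x, hidden, steps=1000, lr=0.02):
    # Greedy pretraining of one layer as a tied-weight linear autoencoder:
    # minimize ||x W W^T - x||^2 by gradient descent on W, then return the
    # learned weights together with the layer's output code x W.
    w = rng.normal(scale=0.1, size=(x.shape[1], hidden))
    for _ in range(steps):
        e = x @ w @ w.T - x                # reconstruction error
        w -= lr * (x.T @ e + e.T @ x) @ w  # gradient of 0.5 * ||e||^2
    return w, x @ w

# Toy "data-set samples", scaled so x^T x is roughly the identity.
x = rng.normal(size=(20, 8)) / np.sqrt(20.0)
w1, h1 = train_layer(x, 6)     # train layer 1 on the raw input
w2, h2 = train_layer(h1, 4)    # layer 1's output is layer 2's input
init_values = [w1, w2]         # initial network values for later fine-tuning
```

The key structural point matches the claim: `train_layer` is run once per layer, in order, with each call consuming the previous layer's output.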
5. The method according to claim 3, characterized in that adjusting the network parameters according to a back-propagation algorithm comprises:
determining a cost function according to the result of forward propagation of the data-set samples through the neural network;
determining the residual of each neuron in every layer of the neural network according to the cost function;
computing the partial derivatives of the cost function with respect to the parameters of each neuron in every layer according to the residuals;
adjusting the network parameters according to the partial derivatives of the cost function with respect to each neuron's parameters and the network learning rate.
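Claim 5's sequence (forward propagation, cost, per-neuron residuals, partial derivatives, learning-rate update) maps directly onto a few lines of numpy for a toy one-hidden-layer sigmoid network. All sizes, the squared-error cost, and the learning rate are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# A toy one-hidden-layer network; every size here is an illustrative assumption.
x = rng.normal(size=(5, 3))             # 5 data-set samples, 3 features
y = rng.uniform(size=(5, 2))            # 2 regression targets
w1 = rng.normal(scale=0.5, size=(3, 4))
w2 = rng.normal(scale=0.5, size=(4, 2))
lr = 0.1                                # network learning rate

def forward(x, w1, w2):
    a1 = sigmoid(x @ w1)                # hidden-layer activations
    a2 = sigmoid(a1 @ w2)               # output-layer activations
    return a1, a2

# Forward propagation determines the value of the cost function.
a1, a2 = forward(x, w1, w2)
cost_before = 0.5 * np.sum((a2 - y) ** 2)

# Residual (delta) of each neuron in each layer, derived from the cost function.
d2 = (a2 - y) * a2 * (1 - a2)           # output-layer residuals
d1 = (d2 @ w2.T) * a1 * (1 - a1)        # hidden-layer residuals

# Partial derivatives of the cost w.r.t. each layer's parameters,
# then one parameter adjustment using the learning rate.
w2 = w2 - lr * (a1.T @ d2)
w1 = w1 - lr * (x.T @ d1)

_, a2_after = forward(x, w1, w2)
cost_after = 0.5 * np.sum((a2_after - y) ** 2)
```

A single small gradient step lowers the cost, which is exactly the adjustment the claim describes, repeated over many iterations in practice.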
6. The method according to claim 3, characterized in that setting multiple convolution kernels according to a local receptive field method means: setting multiple convolution kernels, wherein each convolution kernel performs the convolution operation with a part of the facial image of a specified size rather than with the whole image; and setting multiple convolution kernels according to a weight sharing method means: setting multiple convolution kernels, wherein each convolution kernel samples a part within a specified range, and the unsampled parts are regarded as having the same features as the sampled part.
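Both ideas in claim 6 appear in a direct convolution loop: each output unit sees only a local patch (local receptive field), and every patch is weighted by the same kernel (weight sharing). The 5x5 input and the 3x3 mean-filter kernel below are illustrative assumptions only; the patent would use several learned kernels.

```python
import numpy as np

def conv2d_valid(image, kernel):
    # Slide one shared kernel over the image: each output unit depends only
    # on a local patch, and every patch reuses the same kernel weights.
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            patch = image[i:i + kh, j:j + kw]   # local part, not the whole image
            out[i, j] = np.sum(patch * kernel)  # same weights at every position
    return out

img = np.arange(25.0).reshape(5, 5)
kernel = np.ones((3, 3)) / 9.0   # one hypothetical kernel (a mean filter here)
fmap = conv2d_valid(img, kernel)
```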
7. The method according to claim 1, characterized in that repeatedly performing convolution and pooling operations on the facial image in the convolutional neural network using the convolution kernels comprises:
performing convolution on the facial image in the convolutional neural network using the convolution kernels to generate feature maps;
performing a max-pooling operation on the feature maps to update the feature maps;
performing convolution on the feature maps in the convolutional neural network using the convolution kernels to update the feature maps;
repeating the above steps until the feature maps become compressed data with spatial invariance.
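The convolution-then-pooling loop of claim 7 can be sketched with plain numpy. The 16x16 input, the 3x3 mean kernel, and the stopping size are arbitrary illustrative choices; the point is the alternation of convolution and max pooling until the map is small.

```python
import numpy as np

def conv2d_valid(fmap, kernel):
    # Valid convolution of a single-channel feature map with one kernel.
    kh, kw = kernel.shape
    oh, ow = fmap.shape[0] - kh + 1, fmap.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(fmap[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool2(fmap):
    # 2x2 max pooling: keep the strongest response in each block, which is
    # what gives the compressed representation its spatial invariance.
    h, w = fmap.shape[0] // 2 * 2, fmap.shape[1] // 2 * 2
    return fmap[:h, :w].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.random.default_rng(2).normal(size=(16, 16))  # stands in for the face image
kernel = np.full((3, 3), 1.0 / 9.0)                    # one hypothetical kernel
while min(fmap.shape) > 4:          # repeat until the map is a small compressed code
    fmap = conv2d_valid(fmap, kernel)   # convolution generates/updates the feature map
    fmap = max_pool2(fmap)              # max pooling downsamples it
```

Here the map shrinks 16x16 -> 14x14 -> 7x7 -> 5x5 -> 2x2, at which point the loop stops.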
8. The method according to any one of claims 1-7, characterized in that the kinship between facial images includes father-son, father-daughter, mother-son and mother-daughter relationships.
9. The method according to claim 8, characterized in that constructing the convolutional neural network means constructing it from a data-set sample whose main cue is the kind of kinship; and performing regression on the image after the convolution and pooling operations to obtain identity features means obtaining the probability that the facial image has a specific feature.
10. An electronic device, characterized in that it comprises: at least one processor; and a memory communicatively connected with the at least one processor; wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor is able to perform the method according to any one of claims 1-9.
CN201710163830.XA 2017-03-17 2017-03-17 Method and device for recognizing person kinship based on a deep convolutional network Pending CN106951858A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710163830.XA CN106951858A (en) 2017-03-17 2017-03-17 Method and device for recognizing person kinship based on a deep convolutional network


Publications (1)

Publication Number Publication Date
CN106951858A true CN106951858A (en) 2017-07-14

Family

ID=59473400

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710163830.XA Pending CN106951858A (en) Method and device for recognizing person kinship based on a deep convolutional network

Country Status (1)

Country Link
CN (1) CN106951858A (en)



Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120294538A1 (en) * 2011-05-16 2012-11-22 Miki Yamada Search skip region setting function generation method, search skip region setting method, object search method, search skip region setting function generation apparatus, search skip region setting apparatus, and object search apparatus
CN104504376A (en) * 2014-12-22 2015-04-08 厦门美图之家科技有限公司 Age classification method and system for face images
CN105005774A (en) * 2015-07-28 2015-10-28 中国科学院自动化研究所 Method and device for recognizing face kinship relations based on a convolutional neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
李珏 等 (Li Jue et al.): "Recognizing Kinship in Face Images Using Deep Learning" (in Chinese), 《第十一届和谐人机环境联合会议》 (Proceedings of the 11th Joint Conference on Harmonious Human-Machine Environment) *

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107679466A (en) * 2017-09-21 2018-02-09 百度在线网络技术(北京)有限公司 Information output method and device
CN107679466B (en) * 2017-09-21 2021-06-15 百度在线网络技术(北京)有限公司 Information output method and device
US10719693B2 (en) 2017-09-21 2020-07-21 Baidu Online Network Technology (Beijing) Co., Ltd. Method and apparatus for outputting information of object relationship
CN107909038A (en) * 2017-11-16 2018-04-13 北京邮电大学 Social relationship classification model training method and device, electronic device and medium
CN107909038B (en) * 2017-11-16 2022-01-28 北京邮电大学 Social relationship classification model training method and device, electronic equipment and medium
CN107967463A (en) * 2017-12-12 2018-04-27 武汉科技大学 Virtual face recognition method based on synthesized images and deep learning
CN107967463B (en) * 2017-12-12 2021-04-02 武汉科技大学 Virtual face recognition method based on synthetic image and deep learning
CN108009512A (en) * 2017-12-14 2018-05-08 西北工业大学 Person re-identification method based on convolutional neural network feature learning
CN108257081A (en) * 2018-01-17 2018-07-06 百度在线网络技术(北京)有限公司 Method and apparatus for generating pictures
CN108491812A (en) * 2018-03-29 2018-09-04 百度在线网络技术(北京)有限公司 Method and device for generating a face recognition model
CN108491812B (en) * 2018-03-29 2022-05-03 百度在线网络技术(北京)有限公司 Method and device for generating face recognition model
CN110414299A (en) * 2018-04-28 2019-11-05 中山大学 Monkey face kinship analysis method based on computer vision
CN110414299B (en) * 2018-04-28 2024-02-06 中山大学 Monkey face affinity analysis method based on computer vision
CN112424798A (en) * 2018-05-15 2021-02-26 东京工匠智能有限公司 Neural network circuit device, neural network processing method, and execution program of neural network
CN109740536A (en) * 2018-06-12 2019-05-10 北京理工大学 Kinship recognition method based on a feature-fusion neural network
CN109344759A (en) * 2018-06-12 2019-02-15 北京理工大学 Kinship recognition method based on an angular-loss neural network
CN109584417A (en) * 2018-11-27 2019-04-05 电卫士智能电器(北京)有限公司 Access control method and device
CN109493490A (en) * 2018-11-27 2019-03-19 电卫士智能电器(北京)有限公司 Method and device for judging electricity-consumption user rights
CN109784144A (en) * 2018-11-29 2019-05-21 北京邮电大学 Kinship recognition method and system
CN110084264A (en) * 2019-03-07 2019-08-02 山东师范大学 Image classification method and device based on improved stochastic gradient descent
CN110287765A (en) * 2019-05-06 2019-09-27 平安科技(深圳)有限公司 Method, device and storage medium for predicting a baby's appearance based on face recognition
CN110348285A (en) * 2019-05-23 2019-10-18 北京邮电大学 Social relationship recognition method and device based on a semantic enhancement network
CN110263765A (en) * 2019-07-16 2019-09-20 图普科技(广州)有限公司 Image processing method, device and electronic device

Similar Documents

Publication Publication Date Title
CN106951858A Method and device for recognizing person kinship based on a deep convolutional network
CN106709482A Method for recognizing person kinship based on an autoencoder
Mascarenhas et al. A comparison between VGG16, VGG19 and ResNet50 architecture frameworks for Image Classification
CN106980830A Kinship recognition method and device based on a deep convolutional network
US11514305B1 Intelligent control with hierarchical stacked neural networks
CN109829541A Deep neural network incremental training method and system based on learning automata
Teow Understanding convolutional neural networks using a minimal model for handwritten digit recognition
Lee et al. Deep asymmetric multi-task feature learning
CN107609460A Human behavior recognition method fusing spatiotemporal dual-stream networks and an attention mechanism
CN108304788A Face recognition method based on deep neural networks
CN106980831A Kinship recognition method based on an autoencoder
CN106951923B Robot three-dimensional shape recognition method based on multi-view information fusion
CN106503654A Facial emotion recognition method based on a deep sparse autoencoder network
CN111681178B Knowledge distillation-based image defogging method
CN106909938B View-independent behavior recognition method based on deep learning networks
Das et al. Handwritten arabic numeral recognition using a multi layer perceptron
CN110222634A Human posture recognition method based on convolutional neural networks
CN109817276A Protein secondary structure prediction method based on deep neural networks
CN107657204A Construction method of a deep network model, and facial expression recognition method and system
CN106529570B Image classification method based on a deep ridgelet neural network
Xu et al. Face expression recognition based on convolutional neural network
Zhu et al. Indoor scene segmentation algorithm based on full convolutional neural network
CN105404865A Face detection method based on cascaded probability-state restricted Boltzmann machines
Song et al. 1000fps human segmentation with deep convolutional neural networks
Song et al. A road segmentation method based on the deep auto-encoder with supervised learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170714