CN110414372A

CN110414372A - Enhanced face detection method, device, and electronic equipment

Info

Publication number: CN110414372A
Application number: CN201910608863.XA
Authority: CN
Inventors: 陈东; 徐常胜; 姚寒星
Original assignee: Liang Liang Visual Field Beijing Science And Technology Ltd
Current assignee: Liang Liang Visual Field Beijing Science And Technology Ltd
Priority date: 2019-07-08
Filing date: 2019-07-08
Publication date: 2019-11-05

Abstract

The embodiments of the present application disclose an enhanced face detection method, device, and electronic equipment. The enhanced face detection method includes: performing preliminary face detection on a target image to obtain one or more identified regions of the target image; and using a preset discriminator network to judge whether each identified region contains a face image and, if so, marking the identified regions that contain a face in the target image. The discriminator network is a convolutional neural network obtained by interactive training that alternates a preset process for converting low-resolution images into super-resolution images with a preset process for judging the realness of those super-resolution images. The application can detect tiny faces in an image without increasing the false detection rate, and can effectively improve the reliability of the face detection process and the accuracy of the face detection result.

Description

Enhanced face detection method, device, and electronic equipment

Technical field

The present application relates to the technical field of computer data processing, and in particular to an enhanced face detection method, device, and electronic equipment.

Background

With the continuous development of deep learning technology, the detection of specific targets has been a research hotspot in recent years. Face detection, for example, has very high research and commercial value.

Existing deep-learning-based face detectors place relatively high requirements on the size of the face region in an image, so they perform poorly when detecting tiny faces. Specifically, if the detection threshold is set too high, faces are easily missed; if the threshold is set too low, false detections become likely (a region that is not a face is judged to be a face). In a concrete application the threshold therefore has to be tuned as a trade-off for the specific scene, and even then a satisfactory result is often not achieved.

Based on the above problems, a face detection approach is needed that can detect tiny faces without increasing the false detection rate.

Summary of the invention

In view of the problems in the prior art, the present application provides an enhanced face detection method, device, and electronic equipment that can detect tiny faces in an image without increasing the false detection rate, and can effectively improve the reliability of the face detection process and the accuracy of the face detection result.

To solve the above technical problems, the present application provides the following technical solutions:

In a first aspect, the present application provides an enhanced face detection method, comprising:

performing preliminary face detection on a target image to obtain one or more identified regions of the target image;

using a preset discriminator network to judge whether each identified region contains a face image and, if so, marking the identified regions that contain a face in the target image, wherein the discriminator network is a convolutional neural network obtained by interactive training that alternates a preset process for converting low-resolution images into super-resolution images with a preset process for judging the realness of those super-resolution images.
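The two stages of the first aspect can be sketched as follows. This is a minimal illustration, not the implementation of the application: the box format, the function names, and the toy detector and discriminator (which simply thresholds mean brightness) are all assumptions introduced here.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height); hypothetical box format
Image = List[List[float]]        # grayscale image as rows of pixel values


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by box out of the image."""
    x, y, w, h = box
    return [row[x:x + w] for row in image[y:y + h]]


def enhanced_face_detection(
    image: Image,
    preliminary_detector: Callable[[Image], List[Box]],
    discriminator_says_face: Callable[[Image], bool],
) -> List[Box]:
    """Return only the identified regions that the discriminator accepts as faces."""
    candidates = preliminary_detector(image)  # stage 1: preliminary detection
    # stage 2: keep (mark) a region only if the discriminator confirms a face
    return [box for box in candidates if discriminator_says_face(crop(image, box))]


# Toy stand-ins (assumptions): the detector proposes two fixed boxes, and the
# "discriminator" accepts a patch when its mean brightness exceeds 0.5.
def toy_detector(image: Image) -> List[Box]:
    return [(0, 0, 2, 2), (2, 2, 2, 2)]


def toy_discriminator(patch: Image) -> bool:
    values = [v for row in patch for v in row]
    return sum(values) / len(values) > 0.5


image = [
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.0, 0.1],
    [0.1, 0.0, 0.2, 0.1],
    [0.0, 0.1, 0.1, 0.3],
]
marked = enhanced_face_detection(image, toy_detector, toy_discriminator)
print(marked)
```

Because the second stage filters the first stage's proposals, the preliminary detector can in principle run with a permissive threshold while the discriminator rejects the resulting false positives.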

Further, the method also includes:

obtaining training images and the face detection results corresponding to the training images;

performing preliminary face detection on the training images to obtain one or more identified regions of each training image;

cropping the identified regions out of the training images to obtain the original image block corresponding to each identified region;

down-sampling each original image block to obtain the corresponding input image block;

based on each input image block, the corresponding original image block, and the face detection result of the corresponding training image, interactively training the discriminator network by alternating the process of converting low-resolution images into super-resolution images with the process of judging the realness of the super-resolution images.
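The preparation of a training pair (a down-sampled input block together with its original block) can be illustrated as below. The source does not say which down-sampling method is used, so average pooling is an assumption here, as are the function name and the factor of 2 chosen to keep the example small.

```python
from typing import List

Image = List[List[float]]


def downsample(block: Image, factor: int) -> Image:
    """Average-pool a block by `factor` in each dimension (one possible down-sampling)."""
    height, width = len(block), len(block[0])
    if height % factor or width % factor:
        raise ValueError("block size must be divisible by the factor")
    out = []
    for top in range(0, height, factor):
        row = []
        for left in range(0, width, factor):
            cell = [block[top + dy][left + dx]
                    for dy in range(factor) for dx in range(factor)]
            row.append(sum(cell) / len(cell))
        out.append(row)
    return out


# A cropped original block (high resolution) and its low-resolution input block.
original_block = [
    [1.0, 3.0, 0.0, 0.0],
    [5.0, 7.0, 0.0, 4.0],
    [2.0, 2.0, 8.0, 8.0],
    [2.0, 2.0, 8.0, 8.0],
]
input_block = downsample(original_block, 2)
training_pair = (input_block, original_block)  # (generator input, supervision target)
print(input_block)
```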

Further, interactively training the discriminator network, based on each input image block, the corresponding original image block, and the face detection result of the corresponding training image, by alternating the process of converting low-resolution images into super-resolution images with the process of judging the realness of the super-resolution images, comprises:

Image processing step: inputting the input image block into a generator network used to convert low-resolution images into super-resolution images, and taking the output of the generator network as the super-resolution image block corresponding to the input image block;

Realness judgment step: inputting the current super-resolution image block into the discriminator network, and using the corresponding original image block and the face detection result of the corresponding training image as reference sets, so that the discriminator network outputs the realness result of the super-resolution image block according to the reference sets;

Secondary image processing step: if the realness result of the super-resolution image block does not meet a preset requirement, inputting the realness result into the generator network, so that the generator network refines the super-resolution image block according to the realness result of the current super-resolution image block and obtains a new super-resolution image block;

repeating the realness judgment step and the secondary image processing step until the realness result of the current super-resolution image block meets the preset requirement.
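The control flow of these steps (generate, judge, refine, repeat until accepted) can be sketched as a loop. Everything concrete below is an assumption made for illustration: the "image block" is reduced to a single number, the three callables are toy stand-ins for the networks, and the round limit is added only as a safety cap. In actual adversarial training the feedback would normally reach the generator as gradients of a loss rather than as an explicit score.

```python
from typing import Callable, Tuple


def refine_until_accepted(
    input_block: float,
    generate: Callable[[float], float],            # image processing step
    judge: Callable[[float], Tuple[float, bool]],  # realness judgment step
    refine: Callable[[float, float], float],       # secondary image processing step
    max_rounds: int = 100,                         # safety cap (assumption)
) -> Tuple[float, int]:
    """Repeat judge/refine until the realness result meets the requirement."""
    sr_block = generate(input_block)
    for round_index in range(max_rounds):
        score, meets_requirement = judge(sr_block)
        if meets_requirement:
            return sr_block, round_index
        sr_block = refine(sr_block, score)
    raise RuntimeError("realness requirement not met within max_rounds")


# Toy stand-ins: the ideal block is 1.0, the score is closeness to it, and each
# refinement closes half of the remaining gap.
def toy_generate(block: float) -> float:
    return 0.0


def toy_judge(sr_block: float) -> Tuple[float, bool]:
    score = 1.0 - abs(1.0 - sr_block)
    return score, score >= 0.9


def toy_refine(sr_block: float, score: float) -> float:
    return sr_block + (1.0 - score) / 2.0


result = refine_until_accepted(0.25, toy_generate, toy_judge, toy_refine)
print(result)
```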

Further, using the corresponding original image block and the face detection result of the corresponding training image as reference sets, so that the discriminator network outputs the realness result of the super-resolution image block according to the reference sets, comprises:

using the corresponding original image block as a first reference set, so that the discriminator network uses the first reference set to judge whether the super-resolution image block is a high-resolution image;

and using the face detection result of the corresponding training image as a second reference set, so that the discriminator network uses the second reference set to judge whether the super-resolution image block is a face image;

if the super-resolution image block is a high-resolution image and is also a face image, judging whether the super-resolution image block meets a threshold requirement;

wherein the judgment of whether the super-resolution image block is a high-resolution image, the judgment of whether it is a face image, and the judgment of whether it meets the threshold requirement together constitute the realness result of the super-resolution image block;

correspondingly, if the super-resolution image block falls into at least one of the following cases: it is not a high-resolution image, it is not a face image, or it does not meet the threshold requirement, the realness result of the super-resolution image block is determined not to meet the preset requirement.
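How the three judgments combine into one realness (validity) result can be written down compactly. The class and function names, the numeric score, and the default threshold of 0.5 are hypothetical; the source only states that a threshold requirement exists and that it is checked once the block has been judged both high-resolution and a face.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RealnessResult:
    is_high_resolution: bool
    is_face: bool
    meets_threshold: bool

    @property
    def meets_requirement(self) -> bool:
        # Failing any one of the three judgments fails the preset requirement.
        return self.is_high_resolution and self.is_face and self.meets_threshold


def judge(is_high_resolution: bool, is_face: bool,
          score: float, threshold: float = 0.5) -> RealnessResult:
    """Check the threshold only when the first two judgments both pass."""
    passes_first_two = is_high_resolution and is_face
    meets_threshold = passes_first_two and score >= threshold
    return RealnessResult(is_high_resolution, is_face, meets_threshold)


print(judge(True, True, 0.8).meets_requirement)
print(judge(True, False, 0.9).meets_requirement)
print(judge(True, True, 0.3).meets_requirement)
```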

Further, the generator network includes a super-resolution network and a refinement network, wherein the super-resolution network is a convolutional neural network comprising convolutional layers, deconvolution layers, and residual blocks used to avoid vanishing gradients and accelerate convergence, and the refinement network is a convolutional neural network comprising convolutional layers and residual blocks used to avoid vanishing gradients and accelerate convergence;

correspondingly, inputting the input image block into the generator network used to convert low-resolution images into super-resolution images, and taking the output of the generator network as the super-resolution image block corresponding to the input image block, comprises: inputting the input image block into the super-resolution network, so that the super-resolution network raises the resolution of the input image block and outputs a corresponding feature map;

inputting the feature map into a preset heatmap prediction network, so that the heatmap prediction network outputs predicted heatmaps of the facial components corresponding to the feature map, wherein the heatmap prediction network is a convolutional neural network obtained by training in advance;

concatenating the predicted heatmaps with the feature map and inputting the result into the refinement network, so that the refinement network performs detail recovery on the concatenated heatmaps and feature map and obtains the corresponding super-resolution image block.
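The data flow through the generator can be followed by tracking tensor shapes alone. The sketch below does no real computation; it assumes a (channels, height, width) layout, 64 feature channels, and 3 image channels, none of which are stated in the source. The 4x up-scaling and the four facial components follow the description of the embodiments later in this document.

```python
from typing import Tuple

Shape = Tuple[int, int, int]  # (channels, height, width); layout is an assumption


def super_resolution_features(shape: Shape, feature_channels: int = 64,
                              scale: int = 4) -> Shape:
    """Shape of the up-sampled feature map produced by the super-resolution network."""
    _, height, width = shape
    return (feature_channels, height * scale, width * scale)


def predicted_heatmaps(features: Shape, components: int = 4) -> Shape:
    """One heatmap per facial component, at the same spatial size as the features."""
    _, height, width = features
    return (components, height, width)


def concatenate(a: Shape, b: Shape) -> Shape:
    """Channel-wise concatenation; the spatial sizes must match."""
    if a[1:] != b[1:]:
        raise ValueError("spatial sizes differ")
    return (a[0] + b[0], a[1], a[2])


def refined_block(joined: Shape, image_channels: int = 3) -> Shape:
    """The refinement network keeps the spatial size and outputs an image block."""
    _, height, width = joined
    return (image_channels, height, width)


input_block = (3, 16, 16)
features = super_resolution_features(input_block)
heatmaps = predicted_heatmaps(features)
joined = concatenate(heatmaps, features)
sr_block = refined_block(joined)
print(features, heatmaps, joined, sr_block)
```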

Further, inputting the realness result of the super-resolution image block into the generator network, so that the generator network refines the super-resolution image block according to the realness result of the current super-resolution image block and obtains a new super-resolution image block, comprises:

inputting the realness result of the super-resolution image block and the corresponding super-resolution image block into the super-resolution network, so that the super-resolution network raises the resolution of the super-resolution image block, and inputting the output of the super-resolution network together with the realness result of the super-resolution image block into the refinement network, so that the refinement network performs detail recovery on the output of the super-resolution network and obtains a new super-resolution image block.

Further, the discriminator network includes convolutional layers and two fully connected layers, and the two fully connected layers correspond respectively to the two outputs of the discriminator network;

wherein one output is used to output the judgment of whether the super-resolution image block is a high-resolution image, and the other output is used to output the judgment of whether the super-resolution image block is a face image.

In a second aspect, the present application provides an enhanced face detection device, comprising:

a preliminary detection module, configured to perform preliminary face detection on a target image to obtain one or more identified regions of the target image;

a fine detection module, configured to use a preset discriminator network to judge whether each identified region contains a face image and, if so, to mark the identified regions that contain a face in the target image, wherein the discriminator network is a convolutional neural network obtained by interactive training that alternates a preset process for converting low-resolution images into super-resolution images with a preset process for judging the realness of those super-resolution images.

In a third aspect, the present application provides electronic equipment comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the enhanced face detection method when executing the program.

In a fourth aspect, the present application provides a computer-readable storage medium storing computer instructions which, when executed, implement the steps of the enhanced face detection method.

The present application provides an enhanced face detection method, device, and electronic equipment. The enhanced face detection method includes: performing preliminary face detection on a target image to obtain one or more identified regions of the target image; and using a preset discriminator network to judge whether each identified region contains a face image and, if so, marking the identified regions that contain a face in the target image, wherein the discriminator network is a convolutional neural network obtained by interactive training that alternates a preset process for converting low-resolution images into super-resolution images with a preset process for judging the realness of those super-resolution images. The application can detect tiny faces in an image without increasing the false detection rate, can effectively improve the reliability of the face detection process and the accuracy of the face detection result, and generates super-resolution images of higher quality, which effectively improves the efficiency of face detection.

Brief description of the drawings

To explain the technical solutions of the embodiments of this specification or of the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of this specification; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is an architecture diagram of the enhanced face detection system of an embodiment of the present application;

Fig. 2 is an architecture diagram of the enhanced face detection system of an embodiment of the present application that includes an image acquisition component 13;

Fig. 3 is a schematic example of the general structure of a convolutional neural network;

Fig. 4 is a schematic example of the discriminator network in an embodiment of the present application;

Fig. 5 is a schematic example of the super-resolution network in an embodiment of the present application;

Fig. 6 is a schematic example of the refinement network in an embodiment of the present application;

Fig. 7 is a schematic example of the heatmap prediction network in an embodiment of the present application;

Fig. 8 is a schematic example of the residual block in an embodiment of the present application;

Fig. 9 is a schematic application example of the generative adversarial network in an embodiment of the present application;

Fig. 10 is a flow diagram of the online application of the enhanced face detection method in an embodiment of the present application;

Fig. 11 is a flow diagram of online or offline model training for the enhanced face detection method in an embodiment of the present application;

Fig. 12 is a detailed flow diagram of step 050 of the enhanced face detection method in an embodiment of the present application;

Fig. 13 is a detailed flow diagram of step 052 of the enhanced face detection method in an embodiment of the present application;

Fig. 14 is a detailed flow diagram of step 051 of the enhanced face detection method in an embodiment of the present application;

Fig. 15 is a flow diagram of the model training process of the enhanced face detection method in an application example of the present application;

Fig. 16 is a flow diagram of the model application process of the enhanced face detection method in an application example of the present application;

Fig. 17 is a first structural diagram of the enhanced face detection device in an embodiment of the present application;

Fig. 18 is a second structural diagram of the enhanced face detection device in an embodiment of the present application;

Fig. 19 is a hardware block diagram of a server for the enhanced face detection method of an embodiment of the present application.

Detailed description of embodiments

To help those skilled in the art better understand the technical solutions in this specification, the technical solutions in the embodiments of this specification are described clearly and completely below with reference to the drawings of the embodiments. The described embodiments are only some of the embodiments of this specification, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this specification without creative effort fall within the scope of protection of this specification.

Existing face detection technology performs poorly when detecting tiny faces in an image. To address this, the applicant approaches the problems of the prior art from two directions: an overall strategy that combines the target detection task with the super-resolution task, so that super-resolution is used to improve detection accuracy for tiny faces, and an architecture design for a generative adversarial network used for super-resolution reconstruction. The present application accordingly provides an enhanced face detection method, an enhanced face detection device, electronic equipment, and a computer-readable storage medium. Preliminary face detection is performed on a target image to obtain one or more identified regions of the target image; a preset discriminator network then judges whether each identified region contains a face image and, if so, the identified regions that contain a face are marked in the target image. The discriminator network is a convolutional neural network obtained by interactive training that alternates a preset process for converting low-resolution images into super-resolution images with a preset process for judging the realness of those super-resolution images. This makes it possible to detect tiny faces in an image without increasing the false detection rate, to effectively improve the reliability of the face detection process and the accuracy of the face detection result, and to generate super-resolution images of higher quality so as to improve the efficiency of face detection. In turn, it improves the reliability, accuracy, and detection efficiency of practical applications that use the face detection result (for example identity verification, passenger flow statistics, security patrol, access control recognition, and face payment scenarios).

To this end, an embodiment of the present application provides an enhanced face detection system. Referring to Fig. 1, the enhanced face detection system includes a server 11 and a client device 12. The client device 12 may include a display interface, and various types of deep learning libraries may be deployed on the server 11.

The server 11 may build the discriminator network offline. That is, the server 11 may obtain training image data from a historical image database, the client device 12, or a similar source, where each training image carries a corresponding recognition result label. The server 11 may then use a preset face detector (for example a convolutional neural network for face recognition) to perform preliminary face detection on the training images and obtain one or more identified regions of each training image; crop the identified regions out of the training images to obtain the original image block corresponding to each identified region; down-sample each original image block to obtain the corresponding input image block; and, based on each input image block, the corresponding original image block, and the enhanced face detection result of the corresponding training image, interactively train the discriminator network by alternating the process of converting low-resolution images into super-resolution images with the process of judging the realness of the super-resolution images. Afterwards, the client device 12 sends target image data to be recognized to the server 11 online, and the server 11 receives it online. The server 11, on which a deep learning framework is deployed, preprocesses the target image data online or offline and applies the face detector to perform preliminary face detection on it, obtaining one or more identified regions of the target image data. The trained discriminator network then judges whether each identified region contains a face image and, if so, the identified regions that contain a face are marked in the target image. The server 11 may then send the marked target image data to the client device 12 online, so that the user, or another device communicatively connected to the client device 12, can learn the face recognition result of the target image data from the client device 12.

In practical applications, both building the discriminator network and performing the enhanced face detection may be executed on the server side, as in the architecture shown in Fig. 1, or all of the operations may be completed on the client device. The choice can be made according to the processing capability of the client device, the restrictions of the user's usage scenario, and so on. For example, the user may create the model online or offline. The present application does not limit this.

It should be understood that the client device 12 may include a mobile phone, a tablet device, a network set-top box, a portable computer, a desktop computer, a personal digital assistant (PDA), an in-vehicle device, a smart wearable device, an all-in-one machine, and the like, or an app used for enhanced face detection. The smart wearable device may include smart glasses, a smart watch, a smart bracelet, and the like. The present application does not limit the concrete form of the client device 12.

To make the enhanced face detection more efficient and better integrated, referring to Fig. 2, the client device 12 may include an image acquisition component 13, such as a camera. By operating the client device 12, the user can capture an image of the target and then either run the enhanced face detection on the captured target image directly on the client device 12 and view the result, or have the client device 12 send the captured image to the server 11 and receive the enhanced face detection result returned by the server 11.

In one example, a camera is installed on a street, an authorized user of the camera holds a mobile terminal, and both the camera and the mobile terminal are communicatively connected to a server. In the concrete application process, the camera periodically captures images of a certain position in the street at a preset interval and sends each captured image to the server online. The server receives the image online and performs preliminary face detection on the target image online or offline, obtaining one or more identified regions of the target image. It then uses the preset discriminator network to judge whether each identified region contains a face image and, if so, marks the identified regions that contain a face in the target image. The server then sends the marked target image to the mobile terminal online, so that the user can view the face recognition result of the target image data on the mobile terminal he or she holds.

The above client device may have a communication module (a communication unit) through which it can connect to a remote server and exchange data with that server. The server may be a server on the task scheduling center side; in other implementation scenarios it may be an intermediate platform server, for example a server of a third-party platform that has a communication link with the task scheduling center server. The server may be a single computer device, a server cluster composed of multiple servers, or a server architecture of distributed devices.

The server and the client device may communicate using any suitable network protocol, including network protocols not yet developed as of the filing date of the present application. The network protocol may include, for example, TCP/IP, UDP/IP, HTTP, and HTTPS. The network protocol may also include protocols used on top of these, such as RPC (Remote Procedure Call) and REST (Representational State Transfer).

In one or more embodiments of the present application, the face detector, the discriminator network, the super-resolution network, the heatmap prediction network, and the refinement network may all be convolutional neural networks. The general structure of a convolutional neural network may include, in order, an input layer, convolutional layers, pooling layers, fully connected layers, and an output layer. The convolutional, pooling, and fully connected layers together form the hidden layers of the network. Within the hidden layers, the convolution + pooling combination may be repeated as many times as the model requires, and combinations such as convolution + convolution or convolution + convolution + pooling may also be used flexibly; the general structure places no restriction on these choices.

Based on the above, one concrete example of the face detector is the general convolutional neural network structure shown in Fig. 3. The input layer can process multi-dimensional data directly. The convolutional layer performs feature extraction on the input data; it contains multiple convolution kernels, and each element of a kernel corresponds to a weight coefficient and a bias (bias vector). The convolutional layers use the ReLU (Rectified Linear Unit) activation function. A pooling layer follows the convolutional layer: after the convolutional layer extracts features, the output feature map is passed to the pooling layer for feature selection and information filtering. Several convolution + pooling stages are followed by a fully connected layer (FC), and the output layer uses a Softmax activation function to perform the classification for the enhanced face detection.
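The layer sequence just described (convolution, ReLU, pooling, fully connected, Softmax) can be traced on a tiny one-dimensional signal. This is a didactic sketch only: a real face detector works on two-dimensional images with learned weights, and the kernel, weights, and class order used here are made-up values.

```python
import math
from typing import List


def conv1d(signal: List[float], kernel: List[float], bias: float = 0.0) -> List[float]:
    """Valid cross-correlation of the signal with one kernel, plus a bias."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(signal) - k + 1)]


def relu(values: List[float]) -> List[float]:
    return [max(0.0, v) for v in values]


def max_pool(values: List[float], size: int = 2) -> List[float]:
    """Non-overlapping max pooling (feature selection and information filtering)."""
    return [max(values[i:i + size]) for i in range(0, len(values) - size + 1, size)]


def fully_connected(values: List[float], weights: List[List[float]],
                    biases: List[float]) -> List[float]:
    return [sum(w * v for w, v in zip(row, values)) + b
            for row, b in zip(weights, biases)]


def softmax(logits: List[float]) -> List[float]:
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


signal = [0.0, 1.0, 0.0, 1.0, 0.0]
features = max_pool(relu(conv1d(signal, kernel=[1.0, -1.0])))
logits = fully_connected(features, weights=[[1.0, 1.0], [-1.0, -1.0]], biases=[0.0, 0.0])
probabilities = softmax(logits)  # hypothetical class order: (face, non-face)
print(features, logits, probabilities)
```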

Based on the above, one concrete example of the discriminator network is shown in Fig. 4. The discriminator network includes multiple convolutional layers and two fully connected layers, and the two fully connected layers correspond respectively to the two outputs of the discriminator network; each of the two outputs applies a sigmoid function for classification. One output gives the judgment of whether the super-resolution image block is a high-resolution image, shown in the figure as SR image versus HR image; the other output gives the judgment of whether the super-resolution image block is a face image, shown as face image versus non-face image. In other words, the discriminator network is a classification network. Its inputs are the super-resolution image block produced by the generator network and the corresponding real image block used as the supervision signal; the real image block may be a face image or a non-face image. It has two output branches: the first judges whether the image block (the super-resolution image block) is a real image block, and the second judges whether the image block is a face image.
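The two output branches can be illustrated with two linear heads that share one feature vector and each apply a sigmoid. The feature vector stands in for the output of the convolutional layers, and all weights and names are invented for the example rather than taken from the source.

```python
import math
from typing import List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def linear(features: List[float], weights: List[float], bias: float) -> float:
    return sum(w * f for w, f in zip(weights, features)) + bias


def discriminator_heads(
    features: List[float],
    hr_weights: List[float], hr_bias: float,
    face_weights: List[float], face_bias: float,
) -> Tuple[float, float]:
    """Return (probability the block is a real HR image, probability it is a face)."""
    p_high_resolution = sigmoid(linear(features, hr_weights, hr_bias))
    p_face = sigmoid(linear(features, face_weights, face_bias))
    return p_high_resolution, p_face


shared_features = [1.0, 2.0]  # stand-in for the convolutional feature vector
p_hr, p_face = discriminator_heads(
    shared_features,
    hr_weights=[2.0, -1.0], hr_bias=0.0,     # logit 0, so the head is undecided
    face_weights=[1.0, 1.0], face_bias=0.0,  # logit 3, so the head leans "face"
)
print(p_hr, p_face)
```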

Based on the above, one concrete example of the super-resolution network is shown in Fig. 5. The super-resolution network is a convolutional neural network comprising convolutional layers, deconvolution layers, and residual blocks used to avoid vanishing gradients and accelerate convergence. The deconvolution layers perform the up-sampling, and each deconvolution layer doubles the resolution of the previous layer's output; each convolutional layer is followed by batch normalization (BN) and a rectified linear unit (ReLU). The residual blocks are added to alleviate the vanishing gradients that can arise during gradient descent: the shortcut path between layers inside a residual block avoids the vanishing gradient problem and accelerates convergence. The function of the super-resolution network is to raise the resolution of the input low-resolution image: by increasing the dimensions of the image block it enlarges the resolution by a factor of 4. The result is an image block with higher resolution that may still be relatively blurry, which is then passed to the refinement network for refinement.
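The statement that two 2x deconvolution layers give a 4x enlargement can be checked with the usual output-size formula for a transposed convolution (dilation 1): out = (in - 1) * stride - 2 * padding + kernel_size + output_padding. The kernel size 4, stride 2, and padding 1 below are assumed values that happen to double the size exactly; the source does not give the layer hyperparameters.

```python
def deconv_output_size(size: int, kernel_size: int = 4, stride: int = 2,
                       padding: int = 1, output_padding: int = 0) -> int:
    """Spatial output size of a transposed convolution with dilation 1."""
    return (size - 1) * stride - 2 * padding + kernel_size + output_padding


sizes = [16]                 # side length of a low-resolution input block
for _ in range(2):           # two deconvolution layers, each doubling the size
    sizes.append(deconv_output_size(sizes[-1]))
print(sizes)
```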

Based on the above, one concrete example of the refinement network is shown in Fig. 6. The refinement network is a convolutional neural network built from convolutional layers, residual blocks used to avoid vanishing gradients and accelerate convergence, and similar components. Every convolutional layer except the last is followed by a batch normalization layer and a ReLU activation layer. The function of the refinement network is to refine the output of the preceding super-resolution network: under adversarial pressure from the discriminator network, it restores detail in the up-sampled image and learns higher-level image features, moving toward a feature distribution similar to that of the original image. The aim is to narrow the difference in data distribution between the super-resolved image and the original image, and also to let the discriminator network learn higher-level features, which makes it easier to classify whether the block is a face.

Based on the above, a concrete example of the heatmap prediction network is shown in Fig. 7. The heatmap prediction network is a convolutional neural network that uses the Hourglass architecture (a widely recognized architecture), which has a stacked hourglass shape. It processes features at multiple scales through repeated bottom-up and top-down passes and can capture various spatial relationships between different parts. The network outputs a set of heatmaps, each of which gives, for every pixel, the probability that a particular component is present (i.e., a facial component: mouth, nose, eyes, or chin contour). Because the architecture of the heatmap prediction network is already public, it is not described further in the present application.

It is understood that a concrete example of the residual block in the super-resolution network and the refinement network is shown in Fig. 8. It is composed of multiple convolutional layers, rectified linear units, and deconvolution layers. Compared with existing residual block structures that include a batch normalization unit, the residual block provided by the embodiments of the present application can improve the final performance of the model and speed up the training process.

In addition, referring to Fig. 9, the super-resolution network and the refinement network together form the generator network described in one or more embodiments of the present application, and the generator network and the heatmap prediction network, together with the discriminator network, form a generative adversarial network so that the discriminator network is trained effectively. The heatmap prediction network takes the upsampled feature map produced by the super-resolution network as input and outputs heatmaps of four facial components; these are concatenated with the upsampled feature map and fed into the refinement network for super-resolution processing. Using the heatmaps as prior knowledge effectively improves the quality of the super-resolution result for each part of the face, makes the face clearer and more realistic, and produces results consistent with natural facial structure, thereby speeding up network training.

In order to detect tiny faces in an image without increasing the false detection rate, and to effectively improve the reliability of the face detection process and the accuracy of its results, model building and the enhanced face detection process can be implemented by the enhanced face detection system. Although the present application provides method operation steps or apparatus structures as shown in the following embodiments or drawings, the method or apparatus may, on the basis of routine work without creative effort, include more operation steps or module units, or fewer after partial merging. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to that shown in the embodiments or drawings of the present application. When the method or module structure is applied in an actual device, server, or end product, it may be executed sequentially or in parallel according to the embodiments or the method or module structure shown in the drawings (for example, in an environment with parallel processors or multithreaded processing, or even an implementation environment that includes distributed processing or server clusters).

Based on the enhanced face detection system described above, in the embodiments of the present application the server 11 can perform the offline convolutional neural network building process, and the client device 12 and the server 11 can perform the online enhanced face detection process, as described in detail in the following embodiments.

The present application provides an embodiment of the online application of the enhanced face detection method. Referring to Fig. 10, the method may be executed by the aforementioned server or client device, and specifically includes the following:

Step 100: perform preliminary face detection on a target image to obtain one or more identification regions corresponding to the target image.

In step 100, the server may receive from the client device the target image to be recognized and apply the aforementioned face detector, arranged inside the server, to perform face detection (that is, preliminary face detection) on the target image. If the recognition result is that the target image contains at least one face image, the results of the preliminary face detection are cropped from the original target image to obtain at least one identification region corresponding to the target image, that is, at least one image block of the target image corresponding to the at least one identification region.

It is understood that the face detector may run on the current server or may be arranged on a third-party server. In the latter case, after receiving the target image to be recognized from the client device, the current server forwards it to the third-party server, which uses its face detector to perform face detection (that is, preliminary face detection) on the target image. If the recognition result is that the target image contains at least one face image, the results of the preliminary face detection are cropped from the original target image to obtain at least one identification region corresponding to the target image, that is, at least one image block of the target image corresponding to the at least one identification region, and these image blocks are then sent back to the current server.

Step 200: use a preset discriminator network to judge whether the identification region contains a face image and, if so, mark the identification region containing the face image in the target image, where the discriminator network is a convolutional neural network obtained through interactive training between a preset process of converting low-resolution images into super-resolution images and a preset process of judging the authenticity of the super-resolution images.

In step 200, after obtaining the image blocks to be fed to the discriminator network, the server inputs these image blocks one by one as input data into the preset discriminator network, so that the discriminator network judges whether each identification region contains a face image. If the result output by the discriminator network shows that a given identification region contains a face image, that identification region is marked in the target image.

As can be seen from the above description, the enhanced face detection method provided by the embodiments of the present application can detect tiny faces in an image without increasing the false detection rate, and can effectively improve the reliability of the face detection process and the accuracy of the face detection results.

In order to further improve the reliability of the face detection process and the accuracy of the face detection results, the present application also provides an embodiment in which the enhanced face detection method performs model training online or offline. Referring to Fig. 11, the method may be executed by the aforementioned server or client device, and specifically includes the following:

Step 010: obtain training images and the face detection results corresponding to the training images.

Step 020: perform preliminary face detection on the training images to obtain the identification regions corresponding to the training images.

Step 030: crop the corresponding identification regions from the training images to obtain the original image block corresponding to each identification region.

Step 040: downsample each original image block to obtain the corresponding input image block.

Step 050: based on each input image block, the corresponding original image block, and the face detection results corresponding to the training images, train the discriminator network interactively using the process of converting low-resolution images into super-resolution images and the process of judging the authenticity of the super-resolution images.
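Steps 030 and 040 can be sketched as follows. This is an illustration only: images are plain nested lists, the box layout `(top, left, height, width)` and the function names are hypothetical, and average pooling by a factor of 4 stands in for whatever downsampling the implementation actually uses (the factor is an assumption that mirrors the 4-times upsampling of the super-resolution network).

```python
from typing import List, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (top, left, height, width): hypothetical layout


def crop(image: Image, box: Box) -> Image:
    """Step 030: cut one identification region out of the training image."""
    top, left, height, width = box
    return [row[left:left + width] for row in image[top:top + height]]


def downsample(block: Image, factor: int = 4) -> Image:
    """Step 040: average-pool by `factor` (assumed stand-in for the real downsampling).

    Trailing rows and columns that do not fill a whole cell are dropped.
    """
    rows = len(block) // factor
    cols = len(block[0]) // factor if block else 0
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            cell = [block[r * factor + i][c * factor + j]
                    for i in range(factor) for j in range(factor)]
            out_row.append(sum(cell) / len(cell))
        out.append(out_row)
    return out


def make_training_pairs(image: Image, boxes: List[Box],
                        factor: int = 4) -> List[Tuple[Image, Image]]:
    """Return (input image block, original image block) pairs for step 050."""
    pairs = []
    for box in boxes:
        original = crop(image, box)
        pairs.append((downsample(original, factor), original))
    return pairs
```

Each pair keeps the original image block next to its low-resolution input, so the original can later serve as the supervisory signal.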

Referring to Fig. 12, step 050 specifically includes the following:

Step 051: primary image processing step: input the input image block into a generator network for converting low-resolution images into super-resolution images, and take the output of the generator network as the super-resolution image block corresponding to the input image block.

Step 052: authenticity judgment step: input the current super-resolution image block into the discriminator network, and use the corresponding original image block and the face detection result corresponding to the training image as respective standard sets, so that the discriminator network outputs the authenticity result of the super-resolution image block according to the standard sets.

Step 053: secondary image processing step: if the authenticity result of the super-resolution image block does not meet the preset requirement, input the authenticity result into the generator network, so that the generator network refines the super-resolution image block according to its current authenticity result and obtains a new super-resolution image block.

Step 054: repeat the authenticity judgment step and the secondary image processing step until the authenticity result of the current super-resolution image block meets the preset requirement.
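The control flow of steps 051 to 054 can be sketched as a simple loop. The sketch reads the steps literally, per image block; in an actual generative adversarial network the feedback would take the form of gradient updates to the network weights. The callables `generate`, `judge`, and `refine`, and the `max_rounds` safety cap, are hypothetical and not taken from the source.

```python
from typing import Any, Callable, Tuple


def refine_until_authentic(
    input_block: Any,
    generate: Callable[[Any], Any],
    judge: Callable[[Any], Tuple[Any, bool]],
    refine: Callable[[Any, Any], Any],
    max_rounds: int = 10,
) -> Tuple[Any, int]:
    """Run steps 051-054 and return (super-resolution block, refinement rounds used).

    `judge` returns (authenticity result, meets preset requirement).
    `max_rounds` is an assumed cap so the sketch always terminates.
    """
    sr_block = generate(input_block)            # step 051: primary image processing
    for rounds in range(max_rounds):
        result, ok = judge(sr_block)            # step 052: authenticity judgment
        if ok:
            return sr_block, rounds
        sr_block = refine(sr_block, result)     # step 053: secondary image processing
    return sr_block, max_rounds                 # step 054 loop ended by the cap


if __name__ == "__main__":
    # Toy stand-ins: the "block" is a number that must reach 3 to pass.
    block, rounds = refine_until_authentic(
        0.0,
        generate=lambda x: x,
        judge=lambda b: (b, b >= 3),
        refine=lambda b, result: b + 1,
    )
    print(block, rounds)
```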

Referring to Fig. 13, the authenticity judgment step of step 052 specifically includes the following:

Step 0521: input the current super-resolution image block into the discriminator network.

Step 0522: use the corresponding original image block as a first standard set, so that the discriminator network applies the first standard set to judge whether the super-resolution image block is a high-resolution image.

Step 0523: use the face detection result corresponding to the training image as a second standard set, so that the discriminator network applies the second standard set to judge whether the super-resolution image block is a face image.

It is understood that if the super-resolution image block is a high-resolution image and is also a face image, it is then judged whether the super-resolution image block meets a threshold requirement. The judgment of whether the super-resolution image block is a high-resolution image, the judgment of whether it is a face image, and the judgment of whether it meets the threshold requirement together constitute the authenticity result of the super-resolution image block. Correspondingly, if at least one of the following holds (the super-resolution image block is not a high-resolution image, is not a face image, or does not meet the threshold requirement), the authenticity result of the super-resolution image block is determined not to meet the preset requirement.

That is, meeting the preset requirement means meeting the following three conditions simultaneously:

(1) the super-resolution image block is a high-resolution image;

(2) the super-resolution image block is a face image;

(3) the super-resolution image block meets the threshold requirement.
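The three conditions above amount to a logical AND over three judgments. A minimal sketch follows; the class and field names are hypothetical and only illustrate how the authenticity result could be represented.

```python
from dataclasses import dataclass


@dataclass
class AuthenticityResult:
    """The three judgments that make up the authenticity result (names are illustrative)."""
    is_high_resolution: bool   # condition (1)
    is_face: bool              # condition (2)
    meets_threshold: bool      # condition (3)

    def meets_preset_requirement(self) -> bool:
        # All three conditions must hold at the same time.
        return self.is_high_resolution and self.is_face and self.meets_threshold


if __name__ == "__main__":
    print(AuthenticityResult(True, True, False).meets_preset_requirement())
```

Failing any single condition is enough for the result not to meet the preset requirement, which is what triggers the secondary image processing step.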

Based on the above, the generator network includes a super-resolution network and a refinement network, where the super-resolution network is a convolutional neural network that includes convolutional layers, deconvolution layers, and residual blocks that avoid vanishing gradients and accelerate convergence, and the refinement network is a convolutional neural network that includes convolutional layers and residual blocks that avoid vanishing gradients and accelerate convergence. Referring to Fig. 14, the primary image processing step of step 051 specifically includes the following:

Step 0511: input the input image block into the super-resolution network, so that the super-resolution network raises the resolution of the input image block and outputs the corresponding feature map.

Step 0512: input the feature map into a preset heatmap prediction network, so that the heatmap prediction network outputs predicted heatmaps of the facial components corresponding to the feature map, where the heatmap prediction network is a convolutional neural network obtained by training in advance.

It is understood that the facial components refer to the organs and/or facial contour of a human face. At least one of the facial features may be selected as the organ, and the facial contour may also be divided into regions; both can be configured according to the actual application. In one example, the facial components may be the mouth, nose, eyes, and chin contour of the face.

Step 0513: concatenate the predicted heatmaps with the feature map and input them into the refinement network, so that the refinement network performs detail restoration on the concatenated predicted heatmaps and feature map to obtain the corresponding super-resolution image block.

Applying steps 0511 to 0513 enables the present application to generate super-resolution images of higher quality and better effect, thereby speeding up network training.
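The data flow of steps 0511 to 0513 can be sketched as below, assuming the connection in step 0513 is a channel-wise concatenation of the heatmaps with the feature map. The three networks are passed in as hypothetical callables, and feature maps are represented as plain lists of 2-D channels; none of these names come from the source.

```python
from typing import Any, Callable, List

Channel = List[List[float]]  # one 2-D channel of a feature map or heatmap


def concat_channels(feature_map: List[Channel], heatmaps: List[Channel]) -> List[Channel]:
    """Channel-wise concatenation; every channel must share the same height and width."""
    shape = (len(feature_map[0]), len(feature_map[0][0]))
    for channel in heatmaps:
        if (len(channel), len(channel[0])) != shape:
            raise ValueError("heatmap and feature map sizes differ")
    return feature_map + heatmaps


def primary_image_processing(
    input_block: Any,
    sr_net: Callable[[Any], List[Channel]],
    heatmap_net: Callable[[List[Channel]], List[Channel]],
    refine_net: Callable[[List[Channel]], Any],
) -> Any:
    feature_map = sr_net(input_block)           # step 0511: upsampled feature map
    heatmaps = heatmap_net(feature_map)         # step 0512: one heatmap per facial component
    stacked = concat_channels(feature_map, heatmaps)
    return refine_net(stacked)                  # step 0513: detail restoration
```

With four facial components (mouth, nose, eyes, chin contour), the refinement network would receive four more channels than the super-resolution network produced.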

The secondary image processing step of step 053 specifically includes the following:

To further illustrate this scheme, the present application also provides a concrete application example of the enhanced face detection method, which specifically includes the following:

In this application example, a pre-trained face detector first performs face detection on the image, and the regions detected by the face detector are cropped from the image. These image blocks are downsampled and fed in turn into the generative adversarial network, with the original images used as supervisory signals, so that while the generator network performs super-resolution processing on the low-resolution images, the discriminator network discriminates the generated super-resolution images as well as it can. That is, the generator network is responsible for producing super-resolution images whose data distribution resembles that of the original images closely enough to pass for real, while the discriminator network learns more distinguishing features as the super-resolution images evolve and pushes the generator network to produce super-resolution images that are more realistic and richer in detail. During this process, the discriminator network also learns a way to distinguish whether an image is a face: each classification result is compared with the label, and the comparison result is returned to the generator network to obtain super-resolution images that are more discriminative. Under the action of the loss function and the preset threshold, the image blocks evolve toward being clear and easy to distinguish, and the classifier's ability to classify face images improves accordingly. The details are as follows:

(1) To use super-resolution technology to improve object detection accuracy, that is, to combine the object detection task with the super-resolution task to detect tiny faces, the overall scheme is described in two stages:

1. Model training process, referring to Fig. 15.

1) First pre-train the face detector. Once it meets requirements, feed the input image into the face detector and use it to detect suspected face regions in the image (the detection results may include some non-face regions, i.e., false detections). Then crop out all detected image blocks and perform a downsampling operation (an interpolation operation is applied to the image block vector to obtain a low-resolution image block, while the original image block serves as a supervisory signal in subsequent computation) before inputting them into the generator network.

2) The generator network upsamples the input low-resolution image block vector (raising its dimension to the size of the original image block vector) to obtain an upsampled feature map. This feature map is then input into the heatmap prediction network to obtain predicted heatmaps of the facial components. The heatmaps are concatenated with the feature map and input into the refinement network, which continues with refinement (learning more image detail information through its contest with the discriminator network and imparting it to the image block) and generates a super-resolution image that serves as the next input to the discriminator network.

3) The heatmap prediction network is a pre-trained convolutional neural network. After the predicted heatmaps of the facial components are obtained, they are concatenated with the upsampled feature map, injecting the spatial and visibility information of the facial components into the super-resolution process. In this way, higher-level information beyond pixel intensity similarity is exploited and used as an additional prior in face super-resolution. The facial components comprise four parts: mouth, nose, eyes, and chin contour. Predicting heatmaps for these four parts and combining them with the upsampled feature map from the super-resolution network raises the quality of the final face super-resolution image, makes the result for each part of the face more realistic and clear, and effectively addresses the facial distortion that occasionally appears in super-resolution results.

4) The output of the discriminator network has two branches, producing two outputs for the super-resolution image generated in the previous step: the first is the probability that the input (the super-resolution image) is a real image (i.e., the original image block before downsampling in the first step); the second is the probability that the input is a face image.

5) The training process involves two supervisory signals. One is the original image block from which the low-resolution image block was obtained by the downsampling operation; the other is the face box annotated in the training data (i.e., images with face annotations). The first signal supervises the super-resolution process in the discriminator network, and the second supervises the face classification process (face or not) in the discriminator network.

6) For a detailed description of the generator network and the discriminator network, see the section on the architecture design of the generative adversarial network.

2. Model application process, referring to Fig. 16.

1) For an image to be detected, first use the face detector to crop the ROI regions (that is, the regions of interest to the detector, or the parts that the face detector computes may be face regions), then input them into the discriminator network;

2) The discriminator network judges whether each of these regions (image block vectors) is a face: if it is a face, a high score is given (a decimal between 0 and 1); if not, a lower score is given. If the score exceeds a preset threshold, the region is deemed a face and is marked (a box is drawn around the image block region in the original input image); otherwise the region is ignored (i.e., not marked).
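The application-stage decision can be sketched as a threshold filter over the candidate regions. The function name, the `(box, block)` pairing, and the default threshold of 0.5 are assumptions for illustration; `face_score` stands for the face branch of the discriminator network.

```python
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]  # hypothetical (top, left, height, width)


def filter_face_regions(
    regions: List[Tuple[Box, Any]],
    face_score: Callable[[Any], float],
    threshold: float = 0.5,
) -> List[Tuple[Box, float]]:
    """Keep only the regions whose face score exceeds the preset threshold."""
    marked = []
    for box, block in regions:
        score = face_score(block)       # decimal between 0 and 1
        if score > threshold:
            marked.append((box, score)) # this box would be drawn on the input image
        # otherwise the region is ignored, i.e. not marked
    return marked
```

A caller would draw each returned box on the original input image and discard everything else.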

(2) To design a generative adversarial network architecture for super-resolution reconstruction, the architecture design of the generative adversarial network is described below:

A generative adversarial network (GAN) consists of two sub-networks. One is the generator network, which performs super-resolution processing on low-resolution image blocks; it includes a super-resolution network that performs the upsampling operation and a refinement network that performs the refinement operation. Both are convolutional neural networks containing multiple convolutional layers, deconvolution layers, residual blocks, ReLU activation layers, and batch normalization layers. The input of the generator network is a low-resolution image block vector, and its output is the enlarged high-resolution image block vector. The other sub-network is the discriminator network, a convolutional network that classifies the generated super-resolved images. Its output comprises two branches, each performing binary classification and each containing a fully connected layer and a Sigmoid activation layer to produce the final classification result. The result is a two-dimensional vector whose values are decimals between 0 and 1, representing respectively the probability that the super-resolution image block is a real image block and the probability that the image block is a face image.
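The two output branches can be sketched as two independent fully connected layers, each followed by a sigmoid, applied to a shared feature vector. The convolutional layers that would produce `features` are omitted, and the weights, biases, and function names are placeholders rather than values from the source.

```python
import math
from typing import List, Tuple

Head = Tuple[List[float], float]  # (weights, bias) of one fully connected branch


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def linear(features: List[float], weights: List[float], bias: float) -> float:
    """Single-output fully connected layer."""
    return sum(f * w for f, w in zip(features, weights)) + bias


def discriminator_heads(features: List[float], real_head: Head,
                        face_head: Head) -> Tuple[float, float]:
    """Return (probability of real image block, probability of face image)."""
    p_real = sigmoid(linear(features, *real_head))
    p_face = sigmoid(linear(features, *face_head))
    return p_real, p_face


if __name__ == "__main__":
    # Placeholder feature vector and weights, for illustration only.
    print(discriminator_heads([2.0, 1.0], ([1.0, 1.0], 0.0), ([-1.0, -1.0], 0.0)))
```

Because each branch ends in its own sigmoid, the two probabilities are independent and need not sum to 1.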

The idea of the generative adversarial network is as follows: the generator network outputs super-resolution images, and the discriminator network distinguishes each super-resolution image from its corresponding real image, pulling the two apart as far as it can and feeding the classification result back to the generator network. The generator network in turn strives to shorten this distance, producing super-resolution images that are harder to distinguish and whose feature distribution is more similar to that of the real images, until the discriminator network can no longer tell the two apart (that is, until a preset threshold is reached, at which point the feature distribution of the super-resolution image should be very similar to that of the real image). In this process the generator network and the discriminator network compete with each other, advancing toward the goal of generating super-resolution images that better meet requirements.

On this basis, another classification function is added to the discriminator network, which is also the core idea of this technique: letting the discriminator network learn a further judgment, namely the ability to distinguish whether the current image block is a face image. The face annotation information in the data set is used as a supervisory signal, and a yes/no classification task is performed on the image block being discriminated to judge whether it is a face image. Because the information in the image block keeps increasing during the super-resolution process described above, its details are progressively refined and increasingly high-level features are obtained, so the classifier learns more discriminative information features. After training is complete, it can directly produce a classification result as to whether an arbitrary image block is a face.
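One way to combine the two supervisory signals is to sum a binary cross-entropy term for each branch. The source mentions a loss function but does not give its form, so the choice of binary cross-entropy, the `face_weight` balancing factor, and the function names below are assumptions made for illustration.

```python
import math


def bce(p: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for one prediction; p is clamped to avoid log(0)."""
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(p_real: float, p_face: float,
                       is_real: float, is_face: float,
                       face_weight: float = 1.0) -> float:
    """Sum of the real/generated term and the face/non-face term.

    is_real comes from whether the block is an original image block;
    is_face comes from the face annotations in the training data.
    face_weight is an assumed balancing factor.
    """
    return bce(p_real, is_real) + face_weight * bce(p_face, is_face)


if __name__ == "__main__":
    # Confident, correct predictions should give a small loss.
    print(discriminator_loss(0.9, 0.9, 1.0, 1.0))
```

Under this sketch, a block that is classified correctly on both branches contributes little loss, while an error on either branch is penalized separately.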

As can be seen from the above description, for the enhanced face detection method provided by this application example, many experiments and statistics on the detection results show that, when this technical solution is used and the threshold can be lowered within a certain range, accuracy is higher and the false detection rate is lower than for detection results obtained with the same detector alone. This effectively improves detection accuracy for tiny faces in images and reduces the false detection rate. In addition, using the predicted heatmaps as prior knowledge gives better results when super-resolving small face images, which also speeds up training of the overall network to some extent. This technique also offers a concrete solution for detecting other tiny targets in images: after the training data are replaced and the overall architecture is retrained, it can be applied to object detection tasks other than faces.

At the software level, the present application also provides an enhanced face detection apparatus for implementing all or part of the aforementioned enhanced face detection method. The apparatus may be the aforementioned client device 102 or the aforementioned server device 101. Referring to Fig. 17, the enhanced face detection apparatus may include a preliminary detection module 01 and a fine detection module 02, where:

Preliminary detection module 01, configured to perform preliminary face detection on a target image to obtain one or more identification regions corresponding to the target image;

Fine detection module 02, configured to use a preset discriminator network to judge whether the identification region contains a face image and, if so, to mark the identification region containing the face image in the target image, where the discriminator network is a convolutional neural network obtained through interactive training between a preset process of converting low-resolution images into super-resolution images and a preset process of judging the authenticity of the super-resolution images.

The embodiment of the enhanced face detection apparatus provided by the present application can be used to execute the processing flow of the embodiment of the enhanced face detection method described above. Its functions are not repeated here; refer to the detailed description of the method embodiment above.

As can be seen from the above description, the enhanced face detection apparatus provided by the embodiments of the present application can detect tiny faces in an image without increasing the false detection rate, and can effectively improve the reliability of the face detection process and the accuracy of the face detection results.

In order to further improve the reliability of the face detection process and the accuracy of the face detection results, the present application also provides an embodiment in which the enhanced face detection apparatus performs model training online or offline. Referring to Fig. 18, the enhanced face detection apparatus further includes the following:

Historical data acquisition module 001, configured to obtain training images and the enhanced face detection results corresponding to the training images.

Preliminary face detection module 002, configured to perform preliminary face detection on the training images to obtain one or more identification regions corresponding to the training images.

Original image block acquisition module 003, configured to crop the corresponding identification regions from the training images to obtain the original image block corresponding to each identification region.

Downsampling processing module 004, configured to downsample each original image block to obtain the corresponding input image block.

Interactive training module 005, configured to train the discriminator network interactively, based on each input image block, the corresponding original image block, and the enhanced face detection results corresponding to the training images, using the process of converting low-resolution images into super-resolution images and the process of judging the authenticity of the super-resolution images.

The interactive training module 005 specifically includes the following:

Primary image processing unit, configured to execute the primary image processing step: input the input image block into a generator network for converting low-resolution images into super-resolution images, and take the output of the generator network as the super-resolution image block corresponding to the input image block.

Authenticity judgment unit, configured to execute the authenticity judgment step: input the current super-resolution image block into the discriminator network, and use the corresponding original image block and the face detection result corresponding to the training image as standard sets, so that the discriminator network outputs the authenticity result of the super-resolution image block according to the standard sets.

Secondary image processing unit, configured to execute the secondary image processing step: if the authenticity result of the super-resolution image block does not meet the preset requirement, input the authenticity result into the generator network, so that the generator network refines the super-resolution image block according to its current authenticity result and obtains a new super-resolution image block.

Loop unit, configured to repeat the authenticity judgment step and the secondary image processing step until the authenticity result of the current super-resolution image block meets the preset requirement.

The authenticity judgment unit specifically includes the following:

Super-resolution image block input subunit, configured to input the current super-resolution image block into the discriminator network.

High-resolution image judgment subunit, configured to use the corresponding original image block as a first standard set, so that the discriminator network applies the first standard set to judge whether the super-resolution image block is a high-resolution image.

Face image judgment subunit, configured to use the enhanced face detection result corresponding to the training image as a second standard set, so that the discriminator network applies the second standard set to judge whether the super-resolution image block is a face image.

The primary image processing unit is specifically configured to perform the following:

Input the input image block into the super-resolution network, so that the super-resolution network raises the resolution of the input image block and outputs the corresponding feature map;

Input the feature map into a preset heatmap prediction network, so that the heatmap prediction network outputs predicted heatmaps of the facial components corresponding to the feature map, where the heatmap prediction network is a convolutional neural network obtained by training in advance and the facial components include the mouth, nose, eyes, and chin contour of the face;

The secondary image processing unit is specifically configured to perform the following:

The method embodiments provided by the present application may be executed in the client device 12, the server device 11, a computer cluster, or a similar computing device. Taking execution on a server as an example, Figure 19 is a hardware block diagram of a server for the enhanced face detection method of an embodiment of the present invention. As shown in Figure 19, the server device 11 may include one or more processors 1020 (only one is shown in Figure 19; the processor 1020 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 1040 for storing data, and a transmission module 1060 for communication functions. A person of ordinary skill in the art will appreciate that the structure shown in Figure 19 is merely illustrative and does not limit the structure of the above electronic device. For example, the server device 11 may include more or fewer components than shown in Figure 19, or have a configuration different from that shown in Figure 19.

The memory 1040 may be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the enhanced face detection method in the embodiments of the present invention. By running the software programs and modules stored in the memory 1040, the processor 1020 performs various functional applications and data processing, thereby implementing the enhanced face detection method described above. The memory 1040 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 1040 may further include memory located remotely from the processor 1020, and such remote memory may be connected to the server device 11 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The transmission module 1060 is used to receive or send data via a network. A specific example of such a network is a wireless network provided by the communication provider of the server device 11. In one example, the transmission module 1060 includes a network adapter (Network Interface Controller, NIC), which can connect to other network devices through a base station and thereby communicate with the Internet. In one example, the transmission module 1060 may be a radio frequency (RF) module, which communicates with the Internet wirelessly.

Based on the foregoing enhanced face detection, the embodiments of the present application also provide an electronic device including a display screen, a processor, and a memory storing processor-executable instructions. The display screen may include a device that displays information content, such as a touch screen, a liquid crystal display, or a projection device. The types of electronic device may include mobile terminals, dedicated vehicle insurance devices, in-vehicle interaction devices, personal computers, and the like. When executing the instructions, the processor may implement all or part of the enhanced face detection method; for example, the processor may implement the following when executing the instructions:

As can be seen from the above description, the electronic device provided by the embodiments of the present application can detect extremely small faces in an image without increasing the false detection rate, and can effectively improve the reliability of the face detection process and the accuracy of the face detection result.

Based on the foregoing enhanced face detection, the embodiments of the present application also provide a computer-readable storage medium capable of implementing all or part of the steps of the above enhanced face detection method embodiments. A computer program is stored on the computer-readable storage medium, and when executed by a processor, the computer program implements all of the steps of the enhanced face detection method in the above embodiments; for example, the processor implements the following steps when executing the computer program:

As can be seen from the above description, the computer-readable storage medium provided by the embodiments of the present application can detect extremely small faces in an image without increasing the false detection rate, and can effectively improve the reliability of the face detection process and the accuracy of the face detection result.

The foregoing describes specific embodiments of this specification. Other embodiments are within the scope of the appended claims. In some cases, the actions or steps recited in the claims can be performed in an order different from that in the embodiments and still achieve the desired result. In addition, the processes depicted in the drawings do not necessarily require the particular order shown, or a sequential order, to achieve the desired result. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.

The above instructions may be stored in a variety of computer-readable storage media. A computer-readable storage medium may include a physical device for storing information, which typically digitizes the information and then stores it in a medium using electrical, magnetic, or optical means. The computer-readable storage media described in this embodiment may include: devices that store information using electrical energy, such as various memories, for example RAM and ROM; devices that store information using magnetic energy, such as hard disks, floppy disks, magnetic tapes, magnetic core memories, magnetic bubble memories, and USB flash drives; and devices that store information optically, such as CDs or DVDs. Of course, there are also other types of readable storage media, such as quantum memories and graphene memories. The instructions in the devices, servers, clients, or systems described below are as described above.

Although the present application provides the method operation steps described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive effort. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual device or client product executes, the steps may be executed sequentially or in parallel according to the embodiments or the methods shown in the drawings (for example, in a parallel-processor or multi-threaded environment).

The devices or modules illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. For convenience of description, the above devices are described as being divided into various modules by function. When implementing the present application, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Of course, a module that implements a certain function may also be implemented by a combination of multiple submodules or subunits.

The methods, devices, or modules described herein may be implemented by a controller in the form of computer-readable program code, and the controller may be implemented in any appropriate manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the devices it contains for realizing various functions may be regarded as structures within the hardware component. Alternatively, the devices for realizing various functions may be regarded as both software modules implementing the method and structures within the hardware component.

From the above description of the embodiments, those skilled in the art can clearly understand that the present application can be implemented by software plus the necessary hardware. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product, or can be embodied in the course of a data migration. The computer software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes instructions that cause a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute the methods described in the embodiments of the present application or in certain parts of the embodiments.

The systems, devices, modules, or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementing device is a computer. Specifically, the computer may be, for example, a personal computer, a laptop computer, an in-vehicle human-computer interaction device, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an e-mail device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, the instruction device implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of a flowchart and/or one or more blocks of a block diagram.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

Memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

Those skilled in the art will understand that the embodiments of this specification may be provided as a method, a system, or a computer program product. Accordingly, the embodiments of this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects.

The embodiments of this specification may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The embodiments of this specification may also be practiced in distributed computing environments, in which tasks are performed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.

The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. All or part of the present application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, mobile communication terminals, multiprocessor systems, microprocessor-based systems, programmable electronic devices, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices.

All the embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, because the system embodiments are substantially similar to the method embodiments, they are described relatively briefly, and the relevant parts of the method embodiments may be consulted. In this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example", or "some examples" means that a specific feature, structure, material, or characteristic described in connection with that embodiment or example is included in at least one embodiment or example of this specification. In this specification, schematic uses of these terms are not necessarily directed to the same embodiment or example. Moreover, the specific features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. In addition, where no conflict arises, those skilled in the art may combine different embodiments or examples described in this specification, and the features of different embodiments or examples.

Although the present application has been described by way of embodiments, those of ordinary skill in the art will appreciate that the application admits many variations and changes without departing from its spirit, and it is intended that the appended claims cover such variations and changes without departing from the spirit of the application.

The foregoing is merely embodiments of this specification and is not intended to limit the embodiments of this specification. For those skilled in the art, the embodiments of this specification may be subject to various modifications and variations. Any modification, equivalent replacement, improvement, or the like made within the spirit and principle of the embodiments of this specification shall fall within the scope of the claims of the embodiments of this specification.

Claims

1. An enhanced face detection method, characterized by comprising:

Judging, using a preset discriminator network, whether the identification region contains a face image, and if so, marking the identification region containing the face image in the target image, wherein the discriminator network is a convolutional neural network obtained through interactive training that applies a preset process for converting a low-resolution image into a super-resolution image and a preset validity judgment process for the super-resolution image.

2. The enhanced face detection method according to claim 1, characterized by further comprising:

Performing preliminary face detection on the training image to obtain one or more identification regions corresponding to the training image;

Cropping the corresponding identification regions from the training image to obtain the original image block corresponding to each identification region;

Based on each input image block, the corresponding original image block, and the face detection result corresponding to the training image, interactively training the discriminator network using the process for converting a low-resolution image into a super-resolution image and the validity judgment process for the super-resolution image.

3. The enhanced face detection method according to claim 2, characterized in that the interactive training of the discriminator network, based on each input image block, the corresponding original image block, and the face detection result corresponding to the training image, using the process for converting a low-resolution image into a super-resolution image and the validity judgment process for the super-resolution image, comprises:

Primary image processing step: inputting the input image block into a generator network for converting a low-resolution image into a super-resolution image, and taking the output of the generator network as the super-resolution image block corresponding to the input image block;

Validity judgment step: inputting the current super-resolution image block into the discriminator network, and using the corresponding original image block and the face detection result corresponding to the training image as respective standard sets, so that the discriminator network outputs the validity result of the corresponding super-resolution image block according to the standard sets;

Secondary image processing step: if the validity result of the super-resolution image block does not meet the preset requirement, inputting the validity result of the super-resolution image block into the generator network, so that the generator network refines the super-resolution image block according to the validity result of the current super-resolution image block and obtains a new super-resolution image block;

Repeating the validity judgment step and the secondary image processing step until the validity result of the current super-resolution image block meets the preset requirement.

4. The enhanced face detection method according to claim 3, characterized in that using the corresponding original image block and the face detection result corresponding to the training image as respective standard sets, so that the discriminator network outputs the validity result of the corresponding super-resolution image block according to the standard sets, comprises:

Using the corresponding original image block as a first standard set, so that the discriminator network applies the first standard set to judge whether the corresponding super-resolution image block is a high-resolution image;

And using the face detection result corresponding to the training image as a second standard set, so that the discriminator network applies the second standard set to judge whether the corresponding super-resolution image block is a face image;

If the super-resolution image block is the high-resolution image and is also the face image, judging whether the super-resolution image block meets a threshold requirement;

Wherein the judgment result of whether the super-resolution image block is a high-resolution image, the judgment result of whether the super-resolution image block is a face image, and the judgment result of whether the super-resolution image block meets the threshold requirement together constitute the validity result of the super-resolution image block;

Correspondingly, if the super-resolution image block is not the high-resolution image, is not the face image, or does not meet the threshold requirement, the validity result of the super-resolution image block is determined not to meet the preset requirement.

5. The enhanced face detection method according to claim 3, characterized in that the generator network includes a super-resolution network and a refinement network; wherein the super-resolution network is a convolutional neural network including convolutional layers, deconvolution layers, and residual blocks for avoiding vanishing gradients and accelerating algorithm convergence; and the refinement network is a convolutional neural network including convolutional layers and residual blocks for avoiding vanishing gradients and accelerating algorithm convergence;

Correspondingly, inputting the input image block into the generator network for converting a low-resolution image into a super-resolution image, and taking the output of the generator network as the super-resolution image block corresponding to the input image block, comprises: inputting the input image block into the super-resolution network, so that the super-resolution network performs resolution enhancement on the input image block and outputs a corresponding feature map;

Inputting the feature map into a preset heat map prediction network, so that the heat map prediction network outputs predicted heat maps of the facial components corresponding to the feature map; wherein the heat map prediction network is a convolutional neural network trained in advance;

Concatenating the predicted heat maps with the feature map and inputting the result into the refinement network, so that the refinement network performs detail recovery on the concatenated predicted heat maps and feature map and obtains the corresponding super-resolution image block.

6. The enhanced face detection method according to claim 5, characterized in that inputting the validity result of the super-resolution image block into the generator network, so that the generator network refines the super-resolution image block according to the validity result of the current super-resolution image block and obtains a new super-resolution image block, comprises:

Inputting the validity result of the super-resolution image block and the corresponding super-resolution image block into the super-resolution network, so that the super-resolution network performs resolution enhancement on the super-resolution image block, and inputting the output of the super-resolution network and the validity result of the super-resolution image block into the refinement network, so that the refinement network performs detail recovery on the output of the super-resolution network and obtains a new super-resolution image block.

7. The enhanced face detection method according to any one of claims 1 to 6, characterized in that the discriminator network includes convolutional layers and two fully connected layers, and the two fully connected layers respectively correspond to two output ends of the discriminator network;

Wherein one output end is used to output the judgment result of whether the super-resolution image block is a high-resolution image, and the other output end is used to output the judgment result of whether the super-resolution image block is a face image.

8. An enhanced face detection device, characterized by comprising:

A preliminary detection module, for performing preliminary face detection on a target image to obtain one or more identification regions corresponding to the target image;

A fine detection module, for judging, using a preset discriminator network, whether the identification region contains a face image, and if so, marking the identification region containing the face image in the target image, wherein the discriminator network is a convolutional neural network obtained through interactive training that applies a preset process for converting a low-resolution image into a super-resolution image and a preset validity judgment process for the super-resolution image.

9. An electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the enhanced face detection method according to any one of claims 1 to 7.

10. A computer-readable storage medium having computer instructions stored thereon, characterized in that the instructions, when executed, implement the steps of the enhanced face detection method according to any one of claims 1 to 7.