CN108509940A

CN108509940A - Facial image tracking method, device, computer equipment and storage medium

Info

Publication number: CN108509940A
Application number: CN201810359958.8A
Authority: CN
Inventors: 杨帆
Original assignee: Beijing Dajia Internet Information Technology Co Ltd
Current assignee: Beijing Dajia Internet Information Technology Co Ltd
Priority date: 2018-04-20
Filing date: 2018-04-20
Publication date: 2018-09-07
Anticipated expiration: 2038-04-20
Also published as: CN108509940B

Abstract

The embodiments of the invention disclose a facial image tracking method, device, computer equipment and storage medium. The method includes the following steps: obtaining the location information and offset parameter of a facial image in a target image; correcting the location information according to the offset parameter; and defining the corrected location information as the changed position of the facial image in the next frame of the target image. A convolutional neural network model that has been trained to convergence for this task can infer, from the location information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next frame. Once the current target image has been obtained, the location information of the facial image in the next frame can therefore be inferred. When shooting and tracking run at different speeds, this method solves the problem of losing the tracked image during facial image tracking. It also greatly reduces the processing-speed requirement on the tracking system.

Description

Facial image tracking method, device, computer equipment and storage medium

Technical field

The embodiments of the present invention relate to the field of image processing, and in particular to a facial image tracking method, device, computer equipment and storage medium.

Background technology

Face tracking means that, once a face has been detected, its position, size and other information continue to be captured in subsequent frames; it covers both face recognition and face tracking techniques. In settings such as customs, airports, banks and teleconferences, a specific face target needs to be tracked.

In prior-art face key-point technology, a fully convolutional network first extracts features from the face in the previous frame. The position of the face in the current frame is then estimated from its position in the previous frame, and the same fully convolutional network extracts features at the estimated face position in the current frame. Finally, the face features of the previous frame are correlated against the face features of the current frame, and the position with the highest score is taken as the exact position of the face in the current frame.

In the course of research, the inventor found that in the prior art the video frame rate is between 25 and 30 frames per second, whereas a single-process face detection algorithm runs at about 10 frames per second. Prior-art tracking must also compute the features of the face and its neighbouring positions in both the preceding and the following frame. Because feature extraction is computationally expensive, real-time operation is hard to achieve; the response is therefore slow and the target facial image is easily lost.

Summary of the invention

The embodiments of the present invention provide a facial image tracking method, device, computer equipment and storage medium that achieve fast face tracking by means of an offset parameter.

To solve the above technical problem, the technical solution adopted by the embodiments of the invention is to provide a facial image tracking method, including the following steps:

Obtaining the location information and offset parameter of the facial image in a target image;

Correcting the location information according to the offset parameter;

Defining the corrected location information as the changed position of the facial image in the next frame of the target image.

Optionally, before the step of obtaining the location information and offset parameter of the facial image in the target image, the method further includes:

Acquiring the target image from video information;

Inputting the target image into a preset position coordinate model, where the position coordinate model is a convolutional neural network model trained to convergence;

Obtaining the location information of the facial image in the target image as output by the position coordinate model.

Optionally, after the step of defining the corrected location information as the changed position of the facial image in the next frame of the target image, the method further includes:

Performing data correction on a preset tracking box according to the offset parameter;

Moving the corrected tracking box to the changed position in the next frame of the target image, so that the tracking box adapts to the change of the facial image in the next frame.

Optionally, the offset parameter is obtained as follows:

Obtaining the actual position information of the facial image in a preset original sample image;

Changing the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;

Inputting the training sample image into a convolutional neural network model, to calculate the offset parameter between the changed location information of the facial image in the training sample image and the actual position information.

Optionally, the step of inputting the training sample image into the convolutional neural network model to calculate the offset parameter between the changed location information of the facial image in the training sample image and the actual position information includes:

Obtaining the training sample image labelled with the actual position information;

Inputting the training sample image into the convolutional neural network model to obtain the changed location information and the offset parameter of the facial image;

Performing a backtracking operation on the changed position according to the offset parameter to obtain backtracking location information;

Comparing, by means of a loss function, whether the actual position information of the training sample image is consistent with the backtracking location information;

When the actual position information is inconsistent with the backtracking location information, iteratively updating the weights in the convolutional neural network model until the comparison result is consistent.

Optionally, the offset parameter includes a coordinate offset parameter and a scale offset parameter.

Optionally, the image change method includes applying one or more of translation, rotation and scaling to the facial image, alone or in combination.

To solve the above technical problem, the embodiments of the present invention also provide a facial image tracking device, including:

An acquisition module, for obtaining the location information and offset parameter of the facial image in a target image;

A processing module, for correcting the location information according to the offset parameter;

An execution module, for defining the corrected location information as the changed position of the facial image in the next frame of the target image.

Optionally, the device further includes:

A first acquisition submodule, for acquiring the target image from video information;

A first processing submodule, for inputting the target image into a preset position coordinate model, where the position coordinate model is a convolutional neural network model trained to convergence;

A first execution submodule, for obtaining the location information of the facial image in the target image as output by the position coordinate model.

Optionally, the device further includes:

A first correction module, for performing data correction on a preset tracking box according to the offset parameter;

A second execution submodule, for moving the corrected tracking box to the changed position in the next frame of the target image, so that the tracking box adapts to the change of the facial image in the next frame.

Optionally, the device further includes:

A first acquisition submodule, for obtaining the actual position information of the facial image in a preset original sample image;

A second processing submodule, for changing the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;

A third execution submodule, for inputting the training sample image into a convolutional neural network model, to calculate the offset parameter between the changed location information of the facial image in the training sample image and the actual position information.

Optionally, the device further includes:

A second acquisition submodule, for obtaining the training sample image labelled with the actual position information;

A third processing submodule, for inputting the training sample image into the convolutional neural network model to obtain the changed location information and the offset parameter of the facial image;

A first operation submodule, for performing a backtracking operation on the changed position according to the offset parameter to obtain backtracking location information;

A first comparison submodule, for comparing, by means of a loss function, whether the actual position information of the training sample image is consistent with the backtracking location information;

A fourth execution submodule, for iteratively updating the weights in the convolutional neural network model when the actual position information is inconsistent with the backtracking location information, until the comparison result is consistent.

To solve the above technical problem, the embodiments of the present invention also provide computer equipment including a memory and a processor. The memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the steps of the facial image tracking method described above.

To solve the above technical problem, the embodiments of the present invention also provide a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the facial image tracking method described above.

The beneficial effects of the embodiments of the present invention are as follows. A convolutional neural network model that has been trained to convergence for this task can infer, from the location information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next frame. Once the current target image has been obtained, the location information of the facial image in the next frame can therefore be inferred. When shooting and tracking run at different speeds, this method solves the problem of losing the tracked image during facial image tracking. It also greatly reduces the processing-speed requirement on the tracking system.

Description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. Evidently, the drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the basic flow of the facial image tracking method of an embodiment of the present invention;

Fig. 2 is a schematic diagram of the detailed flow for obtaining the facial image location information in an embodiment of the present invention;

Fig. 3 is a schematic flow diagram of correcting the tracking box in an embodiment of the present invention;

Fig. 4 is a schematic flow diagram of obtaining the offset parameter in an embodiment of the present invention;

Fig. 5 is a schematic flow diagram of the offset position model training method of an embodiment of the present invention;

Fig. 6 is a schematic diagram of the basic structure of the facial image tracking device of an embodiment of the present invention;

Fig. 7 is a block diagram of the basic structure of the computer equipment of an embodiment of the present invention.

Detailed description of the embodiments

To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings of the embodiments.

Some of the flows described in the specification, the claims and the above drawings contain multiple operations that appear in a particular order. It should be clearly understood, however, that these operations may be executed out of the order in which they appear here, or in parallel. Operation numbers such as 101 and 102 serve only to distinguish the operations from one another and do not themselves imply any execution order. In addition, these flows may include more or fewer operations, and the operations may be executed sequentially or in parallel. Note that terms such as "first" and "second" herein are used to distinguish different messages, devices, modules and so on; they do not indicate an order, nor do they require that the "first" and "second" items be of different types.

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Evidently, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the invention without creative effort fall within the scope of protection of the present invention.

Embodiment 1

Those skilled in the art will appreciate that "terminal" and "terminal device" as used herein include both devices that have only a wireless signal receiver with no transmitting capability, and devices with receiving and transmitting hardware that are capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "terminal" or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea and/or land), or suited and/or configured to run locally and/or in distributed form at any other location on Earth and/or in space. A "terminal" or "terminal device" as used herein may also be a communication terminal, an Internet terminal or a music/video playback terminal, for example a PDA, a MID (Mobile Internet Device) and/or a mobile phone with music/video playback functions, or a device such as a smart television or set-top box.

VGG is a deep convolutional neural network developed jointly by the Visual Geometry Group at the University of Oxford and researchers at Google DeepMind. VGG explores the relationship between the depth of a convolutional neural network and its performance: by repeatedly stacking small 3*3 convolution kernels and 2*2 max-pooling layers, it successfully builds convolutional neural networks 16 to 19 layers deep. VGG is highly extensible and generalises well when transferred to other image data. Its structure is very simple: the whole network uses the same convolution kernel size (3*3) and the same max-pooling size (2*2) throughout. VGG is still commonly used to extract image features. The trained VGG model parameters are open-sourced on its official website and can be used for retraining on a specific image classification task, which amounts to providing very good initial weights.

In this embodiment, a VGG convolutional neural network model is used for deep learning and content understanding. The embodiment is not limited to this, however: in some alternative embodiments, a CNN convolutional neural network model or a branch model of a CNN convolutional neural network model may be used.

Refer to Fig. 1, which is a schematic diagram of the basic flow of the facial image tracking method of this embodiment.

As shown in Fig. 1, a facial image tracking method includes the following steps:

S1100: obtain the location information and offset parameter of the facial image in a target image;

During facial image tracking in video, target images are extracted from the video information by timed acquisition. For example, one target picture is extracted from the frame images of the video every 0.5 s. The method is not limited to this rate: the acquisition rate can be adapted to the application scenario. The adjustment principle is that the stronger the processing capability of the system and the higher the required tracking accuracy, the shorter the acquisition interval, down to the point where acquisition is synchronised with the frame rate of the camera; otherwise the acquisition interval is longer, but it must not exceed 1 s. A target image is specifically a frame image, acquired from the video information, that contains a facial image.
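The timed acquisition described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name `sample_frame_indices` and its parameters are hypothetical, and the only rules taken from the text are the 0.5 s example interval, the 1 s upper bound, and synchronisation with the camera frame rate as the lower bound.

```python
def sample_frame_indices(fps, n_frames, interval_s=0.5, max_interval_s=1.0):
    """Return the indices of the frames to use as target images.

    All names here are illustrative assumptions, not taken from the source.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    # The text caps the acquisition interval at 1 s.
    interval_s = min(interval_s, max_interval_s)
    # A step of 1 means acquisition is synchronised with the camera:
    # it is not possible to sample more often than frames arrive.
    step = max(1, round(fps * interval_s))
    return list(range(0, n_frames, step))
```

With a 30 fps video and the 0.5 s example interval, every 15th frame becomes a target image; a faster system could lower `interval_s` until `step` reaches 1.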

The location information of the facial image in the target image is extracted by a convolutional neural network model trained to convergence; a model trained to convergence for this task can extract the location information of the facial image from the input target image.

In this embodiment, the location information may be (without limitation) the set of coordinates of the facial contour of the facial image in the target picture, the coordinates of a point characterising the centre of the facial contour, or the set of coordinates of the face key points (eyes, nose and mouth). In this embodiment the location information also includes the size of the facial image, that is, its width and height.

S1200: correct the location information according to the offset parameter;

After the location information of the facial image is obtained, the position coordinates of the facial image are corrected according to the preset offset parameter, where the correction formulas are:

$$\Delta x = (c'_x - c_x)/\omega$$

$$\Delta y = (c'_y - c_y)/h$$

where $\Delta x$ is the offset of the facial image along the x-axis of the two-dimensional coordinate system, $\Delta y$ is its offset along the y-axis, $c'_x$ is the x-coordinate after correction, $c'_y$ is the y-coordinate after correction, $c_x$ is the x-coordinate before correction, $c_y$ is the y-coordinate before correction, $\omega$ is the width of the facial image before correction, and $h$ is the height of the facial image before correction.

By rearranging the above formulas, the position coordinates of the facial image after correction can be inferred from the offset parameter, the coordinate position of the facial image and the size of the facial image.
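The rearrangement mentioned above amounts to solving each formula for the corrected coordinate: $c'_x = c_x + \Delta x \cdot \omega$ and $c'_y = c_y + \Delta y \cdot h$. The short sketch below evaluates this; the function name `correct_center` is an illustrative assumption, not a name from the source.

```python
def correct_center(cx, cy, w, h, dx, dy):
    """Solve the two correction formulas for the corrected centre.

    cx, cy: coordinates before correction
    w, h:   width and height of the facial image before correction
    dx, dy: offsets, expressed as fractions of the width and height
    """
    corrected_cx = cx + dx * w  # from dx = (c'_x - c_x) / w
    corrected_cy = cy + dy * h  # from dy = (c'_y - c_y) / h
    return corrected_cx, corrected_cy
```

Because the offsets are normalised by the face size, the same offset value moves a large face further in pixels than a small one.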

In this embodiment, the step of correcting the location information by the offset parameter is also performed by the above convolutional neural network model: once trained to convergence for this task, the model can correct the face location information after obtaining it.

In some embodiments, the offset parameter also covers correction of the width and height of the facial image, where the correction formulas are:

$$\gamma_\omega = \omega'/\omega$$

$$\gamma_h = h'/h$$

where $\omega$ is the width of the facial image before correction, $h$ is the height of the facial image before correction, $\gamma_\omega$ is the width offset parameter of the facial image, $\gamma_h$ is the height offset parameter of the facial image, $\omega'$ is the width of the facial image after correction, and $h'$ is the height of the facial image after correction.
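Taken together, the coordinate and scale formulas define a four-component offset between two face boxes. The sketch below, under the assumption that a box is stored as a centre point plus width and height, computes that offset for a known before/after pair and applies it to recover the corrected box. The class and function names (`FaceBox`, `Offset`, `offset_between`, `apply_offset`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceBox:
    cx: float  # centre x-coordinate
    cy: float  # centre y-coordinate
    w: float   # width
    h: float   # height


@dataclass(frozen=True)
class Offset:
    dx: float  # coordinate offset along x, as a fraction of the width
    dy: float  # coordinate offset along y, as a fraction of the height
    gw: float  # width scale factor
    gh: float  # height scale factor


def offset_between(before: FaceBox, after: FaceBox) -> Offset:
    """Evaluate the four formulas for a known before/after pair."""
    return Offset(
        dx=(after.cx - before.cx) / before.w,
        dy=(after.cy - before.cy) / before.h,
        gw=after.w / before.w,
        gh=after.h / before.h,
    )


def apply_offset(before: FaceBox, off: Offset) -> FaceBox:
    """Invert the formulas to obtain the corrected box."""
    return FaceBox(
        cx=before.cx + off.dx * before.w,
        cy=before.cy + off.dy * before.h,
        w=before.w * off.gw,
        h=before.h * off.gh,
    )
```

In the method itself the offset would come from the trained network rather than from a known pair; `offset_between` only shows what the network is expected to predict.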

S1300: define the corrected location information as the changed position of the facial image in the next frame of the target image.

The corrected location information is taken as the location information of the facial image in the next frame of the target image, which achieves fast position tracking of the facial image. A convolutional neural network model that has been trained to convergence for this task can infer, from the location information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next frame. Once the current target image has been obtained, the location information of the facial image in the next frame can therefore be inferred. When shooting and tracking run at different speeds, this method solves the problem of losing the tracked image during facial image tracking. It also greatly reduces the processing-speed requirement on the tracking system.

The changed position refers to the actual position of the facial image in the next frame of the target image.

In some embodiments, before the location information in the target image is corrected, the location information of the facial image in the target image has to be extracted. For the specific extraction process, refer to Fig. 2, which is a schematic diagram of the detailed flow for obtaining the facial image location information in this embodiment.

As shown in Fig. 2, the following steps precede step S1100:

S1011: acquire the target image from the video information;

S1012: input the target image into a preset position coordinate model, where the position coordinate model is a convolutional neural network model trained to convergence;

In this embodiment, the position coordinate model is a convolutional neural network model. Trained to convergence for this task, it can extract the location information of the facial image and, after extracting it, can go on to correct the position of the facial image.

S1013: obtain the location information of the facial image in the target image as output by the position coordinate model.

The classification data output by the position coordinate model is the location information of the facial image in the target image.

For example, in this embodiment the convolutional neural network model consists of a position coordinate model and an offset position model, both of which are VGG convolutional neural network models. The two models are cascaded. The position coordinate model obtains the position coordinates of the facial image in the target image; that is, the classification data it outputs is the position coordinates of the facial image in the target image. The offset position model obtains the corrected location information; besides the facial image, its input data also includes the location information output by the position coordinate model.
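The data flow of the cascade can be sketched as below. The two models are passed in as plain callables so the example runs without any trained network; the `fake_*` functions are stand-ins invented for demonstration, and the `(cx, cy, w, h)` box layout is an assumption.

```python
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, w, h), an assumed layout


def run_cascade(
    image: object,
    position_model: Callable[[object], Box],
    offset_position_model: Callable[[object, Box], Box],
) -> Tuple[Box, Box]:
    """Run the two cascaded models on one target image."""
    # Stage 1: locate the face in the current target image.
    current_box = position_model(image)
    # Stage 2: the second model receives the image AND the stage-1 output.
    corrected_box = offset_position_model(image, current_box)
    return current_box, corrected_box


# Stand-in models for demonstration only; the text uses trained VGG networks.
def fake_position_model(image: object) -> Box:
    return (100.0, 50.0, 40.0, 20.0)


def fake_offset_position_model(image: object, box: Box) -> Box:
    cx, cy, w, h = box
    return (cx + 0.25 * w, cy, w, h)
```

The point of the sketch is only the wiring: the second stage cannot run without the first stage's location output.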

In some embodiments, a tracking box has to be overlaid on the facial image during tracking, and once the position in the next frame of the target image is determined, the data of the tracking box must be adjusted accordingly. Refer to Fig. 3, which is a schematic flow diagram of correcting the tracking box in this embodiment.

As shown in Fig. 3, the following steps follow step S1300:

S1311: perform data correction on the preset tracking box according to the offset parameter;

The tracking box overlaid on the facial image in the current target image is corrected according to the offset parameter. Specifically, the tracking box consists of a closed wire frame defined by preset scalar values, and its parameters are controlled by a box control, which controls the position and the size of the box. The position parameter is the location information of the facial image, and the size parameters are the width and height of the facial image. By setting different parameter values, the position and size of the box change accordingly.

After the parameters of the tracking box are corrected according to the offset parameter, the position and size of the tracking box are converted into the tracking box parameters for the next frame of the target image.

S1312: move the corrected tracking box to the changed position in the next frame of the target image, so that the tracking box adapts to the change of the facial image in the next frame.

The position parameter of the corrected tracking box is the changed-position parameter in the next frame of the target image, so the corrected tracking box can move from its current position to the changed position. To match the continuity of video playback, the tracking box moves linearly: the movement between the current location and the corrected location is linear and smooth rather than a direct jump.
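One way to realise the linear, smooth movement described above is to interpolate the box parameters over a few display steps. This is a sketch under assumptions: the text does not say how many intermediate positions are drawn, so `steps` is a hypothetical parameter, and the box is assumed to be a `(cx, cy, w, h)` tuple.

```python
def interpolate_box(start, end, steps):
    """Return `steps` boxes moving linearly from `start` to `end`.

    The first returned box is one step away from `start`; the last one
    equals `end`. Boxes are (cx, cy, w, h) tuples (an assumed layout).
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [
        tuple(s + (e - s) * i / steps for s, e in zip(start, end))
        for i in range(1, steps + 1)
    ]
```

With `steps=1` the box jumps straight to the corrected position; larger values spread the same displacement over several displayed frames.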

Because the parameters of the corrected tracking box change in step with the corrected location information of the facial image, the tracking box adapts to the change of the facial image in the next frame of the target image.

In some embodiments, the offset parameter has to be extracted by a convolutional neural network model trained to convergence for this task. Refer to Fig. 4, which is a schematic flow diagram of obtaining the offset parameter in this embodiment.

As shown in Fig. 4, the facial image tracking method further includes:

S2100: obtain the actual position information of the facial image in a preset original sample image;

In this embodiment, the convolutional neural network model consists of a position coordinate model and an offset position model, both of which are VGG convolutional neural network models. The two models are cascaded. The position coordinate model obtains the position coordinates of the facial image in the target image; that is, the classification data it outputs is the position coordinates of the facial image in the target image. The offset position model obtains the corrected location information; besides the facial image, its input data also includes the location information output by the position coordinate model.

Both the extraction of the offset parameter and the calculation of the corrected position are performed by the offset position model, so this embodiment describes the training method of the offset position model.

Training the offset position model requires original training samples that contain facial images. The face data of the open-source database WIDER FACE are used as the original training sample images.

The face location information in the acquired original training sample images is extracted using a prior-art neural network that has been trained to convergence for obtaining facial image location information. The method is not limited to this, however: in some embodiments, the location information of the original sample images can be labelled manually.

The actual position information is the actual position of the facial image in the original sample image.

S2200: change the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;

The original sample image whose actual position information has been obtained is then changed. Specifically, one or more of translation, rotation and scaling are applied, alone or in combination, to the facial image in the original sample image. The manner of change is random: the number of changes to apply is generated by a random algorithm, and the change corresponding to each is also chosen at random from the three change types. Processing the original sample image with these two random variables derives more training sample images, and the two-variable image change makes the trained offset position model more stable.

The adjustment range for changes to the original sample image is [0.8, 1.2]; that is, the translation, rotation and scaling changes take values in the range [0.8, 1.2]. Because the image changes little from one frame to the next, values in this range better match the needs of practical applications.
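The two random variables described above (how many changes, and which change type each one is) can be sketched as a small plan generator. The sketch only draws the plan; applying it to image pixels is left out. The names `CHANGE_KINDS` and `random_change_plan`, the `max_changes` limit and the use of a uniform distribution over [0.8, 1.2] are assumptions, and the text does not state how a factor in that range maps to a translation distance or a rotation angle.

```python
import random

CHANGE_KINDS = ("translate", "rotate", "scale")


def random_change_plan(rng, max_changes=3, low=0.8, high=1.2):
    """Draw a random list of (kind, factor) changes for one sample image.

    rng: a random.Random instance, so that plans are reproducible.
    """
    # First random variable: how many changes to apply.
    n_changes = rng.randint(1, max_changes)
    # Second random variable: the kind of each change, with its factor
    # drawn from the adjustment range given in the text.
    return [
        (rng.choice(CHANGE_KINDS), rng.uniform(low, high))
        for _ in range(n_changes)
    ]
```

Calling the generator several times on the same original sample image yields several different training sample images, all sharing the original's actual position label.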

S2300, the training sample image is input in convolutional neural networks model, to calculate the training sample figure Offset parameter of the change location information between the actual position information as in after facial image variation.

The actual position information in original sample image is also carried after changing due to training sample image, is needed according to right Convolutional neural networks model is trained, so that convolutional neural networks model can obtain offset parameter.Specifically, convolutional Neural The training of network model, that is, deviation post model is referring to Fig. 5, Fig. 5 is the deviation post model training method flow of the present embodiment Schematic diagram.

As shown in figure 5, step S2300 specifically includes following step：

S2310, acquisition are marked with the training sample image of the actual position information；

When being trained, the input layer of convolutional neural networks model obtains the training sample for being marked with actual position information Image.

S2320, the training sample image is input in convolutional neural networks model to obtain the facial image Change location information and offset parameter；

Training sample image is input to convolutional neural networks model, the classification information of convolutional neural networks model output For offset parameter.Specifically, which is built for the offset parameter and change location to facial image Information is trained.Due to, which is not trained to convergence state, therefore, the offset parameter of output With certain randomness, data discrete is larger, and loss function is needed to verify the data of output.

S2330: performing a backtracking operation on the change position according to the offset parameter to obtain backtracking position information;

The backtracking position information is calculated as follows: the offset parameter is subtracted from the change position information. The offset parameter is the amount of change from the actual position information to the change position information, i.e. adding the offset parameter to the actual position information yields the change position information. The reverse operation, subtracting the offset parameter from the change position information, therefore yields the backtracking position information.
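The backtracking operation described above can be sketched as follows. This is a minimal illustration that assumes positions and offsets are tuples of the same length, for example (x, y, w, h) and (dx, dy, dw, dh); the function name `backtrack` is hypothetical.

```python
def backtrack(change_pos, offset):
    """Subtract the offset from the change position, component-wise."""
    return tuple(c - o for c, o in zip(change_pos, offset))

# Forward relation from the description: actual + offset = change position.
actual_pos = (10, 20, 50, 50)
offset = (3, -2, 5, 5)
change_pos = tuple(a + o for a, o in zip(actual_pos, offset))

# With an exact offset, backtracking recovers the actual position.
backtrack_pos = backtrack(change_pos, offset)
```

When the offset predicted by the model is inexact, the backtracking position differs from the actual position, and that difference is what the loss function measures in the next step.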

S2340: comparing, by means of a loss function, whether the actual position information of the training sample image is consistent with the backtracking position information;

In this embodiment, an L1 loss function is used to verify the classification value of the convolutional neural network model. Specifically, the loss function calculates whether the absolute-value distance between the actual position information of the training sample image and the backtracking position information is less than a preset threshold. When the distance is less than the threshold, the gap between the backtracking position information and the actual position information is small, and the offset parameter meets the requirement; otherwise, the offset parameter does not meet the requirement.
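The threshold check can be sketched as below. The sketch assumes the absolute-value distance is the sum of component-wise absolute differences; the description does not state whether a sum or a mean is used, nor the threshold value, so both are assumptions, and the function names are hypothetical.

```python
def l1_distance(actual_pos, backtrack_pos):
    """Sum of absolute component-wise differences (assumed form of the L1 loss)."""
    return sum(abs(a - b) for a, b in zip(actual_pos, backtrack_pos))

def is_consistent(actual_pos, backtrack_pos, threshold):
    """True when the L1 distance is strictly below the preset threshold."""
    return l1_distance(actual_pos, backtrack_pos) < threshold
```

For example, `l1_distance((10, 20, 50, 50), (11, 19, 50, 52))` evaluates to 4, so the two positions are treated as consistent for a threshold of 5 but not for a threshold of 4.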

S2350: when the actual position information is inconsistent with the backtracking position information, iteratively updating the weights in the convolutional neural network model until the comparison result is consistent.

When the actual position information and the backtracking position information are inconsistent, a back-propagation algorithm is used to correct the weights in the convolutional neural network model, so that the output of the model matches the expected result of the classification judgment information.

Training the convolutional neural network model requires a large number of training sample images. Therefore, steps S2310 to S2350 are performed repeatedly until the convolutional neural network model is trained to convergence.
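The loop over steps S2310 to S2350 can be illustrated with a deliberately small stand-in. The sketch below replaces the convolutional neural network with a hypothetical one-weight model (offset = w * feature) on scalar positions, and replaces back-propagation with a single subgradient step on the L1 loss. The learning rate, threshold, and epoch cap are assumed values, and none of this is the embodiment's actual network.

```python
def train_offset_model(samples, threshold=0.05, lr=0.01, max_epochs=1000):
    """Toy stand-in for steps S2310-S2350 using a one-weight model.

    Each sample is (feature, actual_pos, change_pos), all scalars.
    Returns (w, converged).
    """
    w = 0.0
    for _ in range(max_epochs):
        for feature, actual_pos, change_pos in samples:    # S2310
            offset = w * feature                           # S2320
            backtrack_pos = change_pos - offset            # S2330
            error = actual_pos - backtrack_pos             # S2340: L1 loss is abs(error)
            if abs(error) >= threshold:                    # S2350: inconsistent, update
                # Subgradient of abs(error) with respect to w is sign(error) * feature.
                sign = 1.0 if error > 0 else -1.0
                w -= lr * sign * feature
        # Stop once every sample passes the threshold check.
        if all(abs(a - (c - w * f)) < threshold for f, a, c in samples):
            return w, True
    return w, False

# Hypothetical data generated with a true weight of 0.5.
samples = [(1.0, 10.0, 10.5), (2.0, 20.0, 21.0), (-1.0, 5.0, 4.5)]
w, converged = train_offset_model(samples)
```

In this toy setting the weight moves toward 0.5 and the loop ends once every sample's backtracking position is within the threshold of its actual position, which mirrors the termination condition of step S2350.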

To solve the above technical problem, an embodiment of the present invention further provides a facial image tracking device.

Refer to Fig. 6, which is a schematic diagram of the basic structure of the facial image tracking device of this embodiment.

As shown in Fig. 6, a facial image tracking device includes an acquisition module 2100, a processing module 2200, and an execution module 2300. The acquisition module 2100 is used to obtain the position information and the offset parameter of the facial image in a target image; the processing module 2200 is used to correct the position information according to the offset parameter; and the execution module 2300 is used to define the corrected position information as the change position of the facial image in the next frame of the target image.
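The data flow through the three modules can be sketched as follows. The sketch assumes that the position is an (x, y, w, h) tuple, that the offset parameter is (dx, dy, dw, dh), and that correction means component-wise addition; the description does not specify the correction formula, so the additive form and the function names are assumptions.

```python
def correct_position(position, offset):
    """Processing step: apply the offset to the current position (assumed additive)."""
    return tuple(p + o for p, o in zip(position, offset))

def predict_next_position(position, offset):
    """Execution step: the corrected position is taken as the face position
    in the next frame of the target image."""
    return correct_position(position, offset)

# Acquisition step: in the embodiment these values come from the trained models;
# here they are hypothetical numbers.
current_position = (100, 120, 60, 60)
offset_parameter = (4, -3, 2, 2)
next_position = predict_next_position(current_position, offset_parameter)
```

With these example values, `next_position` is `(104, 117, 62, 62)`, which a tracker could use for the next frame without running detection on that frame first.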

By means of a convolutional neural network model directionally trained to a convergence state, the facial image tracking device can infer, from the position information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next frame of the target image. Therefore, once the current target image is obtained, the position information of the facial image in the next frame of the target image can be inferred. This solves the problem of losing the tracked image when the shooting speed and the tracking processing speed are not synchronized during facial image tracking. At the same time, the requirement on the processing speed of the tracking system is greatly reduced.

In some embodiments, the facial image tracking device further includes a first acquisition submodule, a first processing submodule, and a first execution submodule. The first acquisition submodule is used to capture the target image from video information; the first processing submodule is used to input the target image into a preset position coordinate model, where the position coordinate model is a convolutional neural network model trained to convergence; and the first execution submodule is used to obtain the position information of the facial image in the target image output by the position coordinate model.

In some embodiments, the facial image tracking device further includes a first correction module and a second execution submodule. The first correction module is used to perform data correction on a preset tracking frame according to the offset parameter; the second execution submodule is used to move the corrected tracking frame to the change position in the next frame of the target image, so that the tracking frame adapts to the change of the facial image in the next frame of the target image.

In some embodiments, the facial image tracking device further includes a first acquisition submodule, a second processing submodule, and a third execution submodule. The first acquisition submodule is used to obtain the actual position information of the facial image in a preset original sample image; the second processing submodule is used to change the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image; and the third execution submodule is used to input the training sample image into the convolutional neural network model, to calculate the offset parameter of the change position information of the changed facial image in the training sample image relative to the actual position information.

In some embodiments, the facial image tracking device further includes a second acquisition submodule, a third processing submodule, a first operation submodule, a first comparison submodule, and a fourth execution submodule. The second acquisition submodule is used to obtain the training sample image marked with the actual position information; the third processing submodule is used to input the training sample image into the convolutional neural network model to obtain the change position information and the offset parameter of the facial image; the first operation submodule is used to perform a backtracking operation on the change position according to the offset parameter to obtain backtracking position information; the first comparison submodule is used to compare, by means of a loss function, whether the actual position information of the training sample image is consistent with the backtracking position information; and the fourth execution submodule is used to iteratively update the weights in the convolutional neural network model when the actual position information is inconsistent with the backtracking position information, until the comparison result is consistent.

In some embodiments, the offset parameter includes a coordinate offset parameter and a scale offset parameter.

In some embodiments, the image change method includes applying one or a combination of translation, rotation, and scaling changes to the facial image.

To solve the above technical problem, an embodiment of the present invention further provides a computer device. Refer to Fig. 7, which is a block diagram of the basic structure of the computer device of this embodiment.

Fig. 7 is a schematic diagram of the internal structure of the computer device. As shown in Fig. 7, the computer device includes a processor, a non-volatile storage medium, a memory, and a network interface connected through a system bus. The non-volatile storage medium of the computer device stores an operating system, a database, and computer-readable instructions. The database may store control information sequences, and the computer-readable instructions, when executed by the processor, cause the processor to implement a facial image tracking method. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The memory of the computer device may store computer-readable instructions that, when executed by the processor, cause the processor to perform a facial image tracking method. The network interface of the computer device is used to connect to and communicate with a terminal. Those skilled in the art will understand that the structure shown in Fig. 7 is only a block diagram of the part of the structure relevant to the solution of this application and does not constitute a limitation on the computer device to which the solution is applied. A specific computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.

In this embodiment, the processor is used to perform the specific functions of the acquisition module 2100, the processing module 2200, and the execution module 2300 in Fig. 6, and the memory stores the program code and various data required to execute these modules. The network interface is used for data transmission to a user terminal or a server. The memory in this embodiment stores the program code and data required to execute all the submodules of the facial image tracking device, and the server can call its program code and data to perform the functions of all the submodules.

By means of a convolutional neural network model directionally trained to a convergence state, the computer device can infer, from the position information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next frame of the target image. Therefore, once the current target image is obtained, the position information of the facial image in the next frame of the target image can be inferred. This solves the problem of losing the tracked image when the shooting speed and the tracking processing speed are not synchronized during facial image tracking. At the same time, the requirement on the processing speed of the tracking system is greatly reduced.

The present invention further provides a storage medium storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the facial image tracking method described in any of the above embodiments.

Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc, or a read-only memory (ROM), or a random access memory (RAM), etc.

It should be understood that although the steps in the flowcharts of the drawings are shown in sequence as indicated by the arrows, these steps are not necessarily performed in that sequence. Unless expressly stated herein, there is no strict restriction on the order in which these steps are performed, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same moment but may be performed at different times, and they are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.

The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims

1. A facial image tracking method, characterized by comprising the following steps:

correcting the position information according to the offset parameter;

2. The facial image tracking method according to claim 1, characterized in that, before the step of obtaining the position information and the offset parameter of the facial image in the target image, the method further comprises:

capturing the target image from video information;

inputting the target image into a preset position coordinate model, wherein the position coordinate model is a convolutional neural network model trained to convergence;

3. The facial image tracking method according to claim 1, characterized in that, after the step of defining the corrected position information as the change position of the facial image in the next frame of the target image, the method further comprises:

moving the corrected tracking frame to the change position in the next frame of the target image, so that the tracking frame adapts to the change of the facial image in the next frame of the target image.

4. The facial image tracking method according to claim 1, characterized in that the offset parameter is obtained as follows:

changing the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;

inputting the training sample image into a convolutional neural network model, to calculate the offset parameter of the change position information of the changed facial image in the training sample image relative to the actual position information.

5. The facial image tracking method according to claim 4, characterized in that the step of inputting the training sample image into the convolutional neural network model, to calculate the offset parameter of the change position information of the changed facial image in the training sample image relative to the actual position information, comprises:

inputting the training sample image into the convolutional neural network model to obtain the change position information and the offset parameter of the facial image;

comparing, by means of a loss function, whether the actual position information of the training sample image is consistent with the backtracking position information;

when the actual position information is inconsistent with the backtracking position information, iteratively updating the weights in the convolutional neural network model until the comparison result is consistent.

6. The facial image tracking method according to any one of claims 1 to 5, characterized in that the offset parameter comprises a coordinate offset parameter and a scale offset parameter.

7. The facial image tracking method according to claim 4 or 5, characterized in that the image change method comprises applying one or a combination of translation, rotation, and scaling changes to the facial image.

8. A facial image tracking device, characterized by comprising:

an execution module, used to define the corrected position information as the change position of the facial image in the next frame of the target image.

9. A computer device, comprising a memory and a processor, wherein the memory stores computer-readable instructions that, when executed by the processor, cause the processor to perform the steps of the facial image tracking method according to any one of claims 1 to 7.

10. A storage medium storing computer-readable instructions that, when executed by one or more processors, cause the one or more processors to perform the steps of the facial image tracking method according to any one of claims 1 to 7.