CN108509940B - Facial image tracking, device, computer equipment and storage medium - Google Patents

Facial image tracking, device, computer equipment and storage medium

Info

Publication number
CN108509940B
Authority
CN
China
Prior art keywords
image
facial image
location information
offset parameter
tracking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810359958.8A
Other languages
Chinese (zh)
Other versions
CN108509940A (en
Inventor
杨帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd filed Critical Beijing Dajia Internet Information Technology Co Ltd
Priority to CN201810359958.8A priority Critical patent/CN108509940B/en
Publication of CN108509940A publication Critical patent/CN108509940A/en
Application granted granted Critical
Publication of CN108509940B publication Critical patent/CN108509940B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G06V40/167 Detection; Localisation; Normalisation using comparisons between temporally consecutive images
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

An embodiment of the invention discloses a facial image tracking method, device, computer equipment and storage medium. The method includes the following steps: obtaining the location information and offset parameter of a facial image in a target image; correcting the location information according to the offset parameter; and defining the corrected location information as the changed position of the facial image in the next-frame target image. A convolutional neural network model that has been given targeted training to a convergence state can infer, from the location information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next-frame target image. The location information of the facial image in the next-frame target image can therefore be inferred once the current target image has been obtained. The method solves the problem of losing the tracked image when shooting and tracking processing speeds are out of sync during facial image tracking. It also greatly reduces the processing speed required of the tracking system.

Description

Facial image tracking, device, computer equipment and storage medium
Technical field
The embodiments of the present invention relate to the field of image processing, and in particular to a facial image tracking method, device, computer equipment and storage medium.
Background
Face tracking means that, once a face has been detected, information such as the position and size of the face continues to be captured in subsequent frames; it covers both face recognition and face tracking techniques. In settings such as customs, airports, banks and teleconferences, a specific face target needs to be tracked.
In prior-art face key point techniques, a fully convolutional network first extracts features from the face in the previous frame. The position of the face in the current frame is then estimated from its position in the previous frame, and the same fully convolutional network extracts features from the estimated face location in the current frame. Finally, the face features of the previous frame are correlated against the face features of the current frame, and the position with the highest score is taken as the accurate position of the face in the current frame.
In the course of research, the inventor found that in the prior art the frame rate of video is between 25 and 30 frames per second, while a single process of a face detection algorithm detects at about 10 frames per second. Prior-art tracking also needs to compute the features of the face and its neighboring positions in the preceding and following frames. Because feature extraction is computationally expensive, real-time operation is hard to achieve, so the response is slow and the target facial image is easily lost.
Summary of the invention
The embodiments of the present invention provide a facial image tracking method, device, computer equipment and storage medium that achieve fast face tracking by means of an offset parameter.
To solve the above technical problem, the technical solution adopted by the embodiments of the invention is to provide a facial image tracking method that includes the following steps:
Obtaining the location information and offset parameter of a facial image in a target image;
Correcting the location information according to the offset parameter;
Defining the corrected location information as the changed position of the facial image in the next-frame target image.
Optionally, before the step of obtaining the location information and offset parameter of the facial image in the target image, the method further includes:
Acquiring the target image in video information;
Inputting the target image into a preset position coordinate model, where the position coordinate model is a convolutional neural network model trained to convergence;
Obtaining the location information of the facial image in the target image output by the position coordinate model.
Optionally, after the step of defining the corrected location information as the changed position of the facial image in the next-frame target image, the method further includes:
Performing data correction on a preset tracking box according to the offset parameter;
Moving the corrected tracking box to the changed position in the next-frame target image, so that the tracking box adapts to the change of the facial image in the next-frame target image.
Optionally, the offset parameter is obtained as follows:
Obtaining the actual position information of the facial image in a preset original sample image;
Changing the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;
Inputting the training sample image into a convolutional neural network model, to calculate the offset parameter of the changed location information of the facial image in the training sample image relative to the actual position information.
Optionally, the step of inputting the training sample image into the convolutional neural network model, to calculate the offset parameter of the changed location information of the facial image in the training sample image relative to the actual position information, includes:
Obtaining the training sample image labeled with the actual position information;
Inputting the training sample image into the convolutional neural network model to obtain the changed location information and offset parameter of the facial image;
Performing a backtracking operation on the changed position according to the offset parameter to obtain backtracked location information;
Comparing, by means of a loss function, whether the actual position information of the training sample image is consistent with the backtracked location information;
When the actual position information is inconsistent with the backtracked location information, iteratively updating the weights in the convolutional neural network model until the comparison result is consistent.
Optionally, the offset parameter includes a coordinate offset parameter and a scale offset parameter.
Optionally, the image change method includes applying to the facial image one or more of translation, rotation and scaling changes in combination.
To solve the above technical problem, the embodiments of the present invention also provide a facial image tracking device, comprising:
An acquisition module, for obtaining the location information and offset parameter of a facial image in a target image;
A processing module, for correcting the location information according to the offset parameter;
An execution module, for defining the corrected location information as the changed position of the facial image in the next-frame target image.
Optionally, the device further includes:
A first collection submodule, for acquiring the target image in video information;
A first processing submodule, for inputting the target image into a preset position coordinate model, where the position coordinate model is a convolutional neural network model trained to convergence;
A first execution submodule, for obtaining the location information of the facial image in the target image output by the position coordinate model.
Optionally, the device further includes:
A first correction submodule, for performing data correction on a preset tracking box according to the offset parameter;
A second execution submodule, for moving the corrected tracking box to the changed position in the next-frame target image, so that the tracking box adapts to the change of the facial image in the next-frame target image.
Optionally, the device further includes:
A first acquisition submodule, for obtaining the actual position information of the facial image in a preset original sample image;
A second processing submodule, for changing the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;
A third execution submodule, for inputting the training sample image into a convolutional neural network model, to calculate the offset parameter of the changed location information of the facial image in the training sample image relative to the actual position information.
Optionally, the device further includes:
A second acquisition submodule, for obtaining the training sample image labeled with the actual position information;
A third processing submodule, for inputting the training sample image into the convolutional neural network model to obtain the changed location information and offset parameter of the facial image;
A first operation submodule, for performing a backtracking operation on the changed position according to the offset parameter to obtain backtracked location information;
A first comparison submodule, for comparing, by means of a loss function, whether the actual position information of the training sample image is consistent with the backtracked location information;
A fourth execution submodule, for iteratively updating the weights in the convolutional neural network model when the actual position information is inconsistent with the backtracked location information, until the comparison result is consistent.
Optionally, the offset parameter includes a coordinate offset parameter and a scale offset parameter.
Optionally, the image change method includes applying to the facial image one or more of translation, rotation and scaling changes in combination.
To solve the above technical problem, the embodiments of the present invention also provide computer equipment including a memory and a processor. The memory stores computer-readable instructions which, when executed by the processor, cause the processor to perform the steps of the facial image tracking method described above.
To solve the above technical problem, the embodiments of the present invention also provide a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the steps of the facial image tracking method described above.
The beneficial effects of the embodiments of the present invention are as follows. A convolutional neural network model that has been given targeted training to a convergence state can infer, from the location information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next-frame target image. The location information of the facial image in the next-frame target image can therefore be inferred once the current target image has been obtained. The method solves the problem of losing the tracked image when shooting and tracking processing speeds are out of sync during facial image tracking. It also greatly reduces the processing speed required of the tracking system.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the invention; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the basic facial image tracking method according to an embodiment of the present invention;
Fig. 2 is a schematic flowchart of obtaining the facial image location information according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of correcting the tracking box according to an embodiment of the present invention;
Fig. 4 is a schematic flowchart of obtaining the offset parameter according to an embodiment of the present invention;
Fig. 5 is a schematic flowchart of the offset position model training method according to an embodiment of the present invention;
Fig. 6 is a schematic diagram of the basic structure of the facial image tracking device according to an embodiment of the present invention;
Fig. 7 is a block diagram of the basic structure of the computer equipment according to an embodiment of the present invention.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention.
Some of the flows described in the specification, the claims and the above drawings contain multiple operations that appear in a particular order. It should be clearly understood that these operations may be executed out of the order in which they appear herein, or in parallel. Operation serial numbers such as 101 and 102 serve only to distinguish the different operations and do not themselves represent any execution order. In addition, these flows may include more or fewer operations, and the operations may be executed sequentially or in parallel. Note that terms such as "first" and "second" herein are used to distinguish different messages, devices, modules and so on; they do not represent an order, nor do they require that the "first" and "second" be of different types.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment 1
Those skilled in the art will appreciate that "terminal" and "terminal device" as used herein include both devices that have only a wireless signal receiver with no transmitting capability and devices with receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices with a single-line display, a multi-line display, or no multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio frequency receiver, pager, Internet/intranet access, web browser, notepad, calendar and/or GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio frequency receiver. A "terminal" or "terminal device" as used herein may be portable, transportable, installed in a vehicle (air, sea and/or land), or suited and/or configured to operate locally and/or in distributed form at any other location on Earth and/or in space. A "terminal" or "terminal device" as used herein may also be a communication terminal, an Internet terminal or a music/video playback terminal, for example a PDA, an MID (Mobile Internet Device) and/or a mobile phone with music/video playback functions, or a device such as a smart TV or set-top box.
VGG is a deep convolutional neural network developed jointly by the Visual Geometry Group at the University of Oxford and researchers at Google DeepMind. VGG explores the relationship between the depth of a convolutional neural network and its performance. By repeatedly stacking small 3*3 convolution kernels and 2*2 max pooling layers, VGG successfully built convolutional neural networks 16 to 19 layers deep. VGG is highly extensible and generalizes very well when transferred to other image data. Its structure is very simple: the whole network uses the same convolution kernel size (3*3) and max pooling size (2*2). To date, VGG is still commonly used to extract image features. The trained VGG model parameters are open-sourced on its official website and can be used for retraining on specific image classification tasks, which amounts to providing very good initialization weights.
In this embodiment, a VGG convolutional neural network model is used for deep learning and content understanding. The embodiment is not limited to this: in some alternative embodiments, a CNN convolutional neural network model or a branch model of a CNN convolutional neural network model can be used.
Refer to Fig. 1, which is a schematic flowchart of the basic facial image tracking method of this embodiment.
As shown in Fig. 1, a facial image tracking method includes the following steps:
S1100, obtaining the location information and offset parameter of the facial image in the target image;
During video facial image tracking, target images are extracted from the video information by timed acquisition. For example, one target image is extracted from the frame images of the video information every 0.5 s. The rate is not limited to this: depending on the application scenario, the rate at which target images are acquired can be adapted. The adjustment principle is that the stronger the system's processing capability and the higher the required tracking accuracy, the shorter the acquisition interval, down to the point where acquisition is synchronized with the frequency at which the camera captures images. Otherwise the acquisition interval is longer, but the longest acquisition interval must not exceed 1 s. A target image specifically refers to a frame image, acquired from the video information, that contains a facial image.
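The timed acquisition described above can be sketched as a frame-stride calculation. This is a minimal sketch, not the patent's implementation: the function names `frame_stride` and `sampled_frame_indices` and the choice to floor the stride are assumptions made for illustration; only the 0.5 s example interval and the 1 s upper bound come from the text.

```python
def frame_stride(fps: float, interval_s: float = 0.5, max_interval_s: float = 1.0) -> int:
    """Number of video frames to advance between two sampled target images.

    Hypothetical helper: the 0.5 s default and the 1 s cap follow the text,
    everything else is an illustrative assumption.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    # Clamp the interval: no shorter than one camera frame, no longer than the cap.
    interval = min(max(interval_s, 1.0 / fps), max_interval_s)
    # Flooring avoids exceeding the clamped interval; at least one frame is skipped to.
    return max(1, int(interval * fps))


def sampled_frame_indices(total_frames: int, fps: float, interval_s: float = 0.5) -> list:
    """Indices of the frames that would be taken as target images."""
    return list(range(0, total_frames, frame_stride(fps, interval_s)))
```

With a 30 frames-per-second video and the 0.5 s example interval, every 15th frame becomes a target image; requesting an interval longer than 1 s falls back to the 1 s cap.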
The location information of the facial image in the target image is extracted by a convolutional neural network model trained to convergence. A convolutional neural network model given targeted training to convergence can extract the location information of the facial image in the input target image.
In this embodiment, the location information can be (without limitation): the coordinate set of the facial contour of the facial image in the target image, the coordinates of the center of the facial contour, or the coordinate set of the face key points (eyes, nose and mouth). In this embodiment the location information also includes the size of the facial image, that is, the width and height of the facial image.
S1200, correcting the location information according to the offset parameter;
After the location information of the facial image is obtained, the position coordinates of the facial image are corrected according to the preset offset parameter, where the correction formulas are:
$$\Delta x = (c'_x - c_x)/\omega$$
$$\Delta y = (c'_y - c_y)/h$$
where $\Delta x$ is the offset of the facial image along the x-axis of the two-dimensional coordinate system, $\Delta y$ is the offset of the facial image along the y-axis, $c'_x$ is the x-axis coordinate after correction, $c'_y$ is the y-axis coordinate after correction, $c_x$ is the x-axis coordinate before correction, $c_y$ is the y-axis coordinate before correction, $\omega$ is the width of the facial image before correction, and $h$ is the height of the facial image before correction.
By rearranging the above formulas, the position coordinates of the facial image after correction can be inferred from the offset parameter, the coordinate position of the facial image and the size of the facial image.
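The rearrangement is not written out in the text; solving the two offset definitions for the corrected coordinates gives:

$$c'_x = c_x + \Delta x \cdot \omega, \qquad c'_y = c_y + \Delta y \cdot h$$

As an illustration with made-up numbers (not taken from the patent): for $c_x = 100$, $\omega = 40$ and $\Delta x = 0.25$, the corrected coordinate is $c'_x = 100 + 0.25 \cdot 40 = 110$.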
In this embodiment, the step of correcting the location information by the offset parameter is also performed by the above convolutional neural network model: after obtaining the location information of the facial image, the convolutional neural network model trained to convergence can correct that face location information.
In some embodiments, the offset parameter further includes parameters for correcting the width and height of the facial image, where the correction formulas are:
$$\gamma_\omega = \omega'/\omega$$
$$\gamma_h = h'/h$$
where $\omega$ is the width of the facial image before correction, $h$ is the height of the facial image before correction, $\gamma_\omega$ is the width offset parameter of the facial image, $\gamma_h$ is the height offset parameter of the facial image, $\omega'$ is the width of the facial image after correction, and $h'$ is the height of the facial image after correction.
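Taken together, the coordinate and scale offset parameters determine the corrected face box. The following is a minimal sketch under stated assumptions: the `FaceBox` and `Offset` types and the `apply_offset` and `compute_offset` functions are hypothetical names introduced here, and the location information is assumed to be a center point plus width and height, which is only one of the representations the text allows.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaceBox:
    """Hypothetical container: face center (cx, cy) plus width w and height h."""
    cx: float
    cy: float
    w: float
    h: float


@dataclass(frozen=True)
class Offset:
    """Hypothetical container for the coordinate offsets (dx, dy) and scale offsets (gw, gh)."""
    dx: float
    dy: float
    gw: float = 1.0
    gh: float = 1.0


def compute_offset(before: FaceBox, after: FaceBox) -> Offset:
    """Offset parameters as defined in the text: position change normalized by size, size ratios."""
    return Offset(
        dx=(after.cx - before.cx) / before.w,
        dy=(after.cy - before.cy) / before.h,
        gw=after.w / before.w,
        gh=after.h / before.h,
    )


def apply_offset(box: FaceBox, off: Offset) -> FaceBox:
    """Invert the definitions to obtain the corrected box from the current box and the offsets."""
    return FaceBox(
        cx=box.cx + off.dx * box.w,
        cy=box.cy + off.dy * box.h,
        w=box.w * off.gw,
        h=box.h * off.gh,
    )
```

Normalizing the position offsets by the face width and height makes them independent of how large the face appears in the frame, which is presumably why the formulas divide by $\omega$ and $h$.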
S1300, defining the corrected location information as the changed position of the facial image in the next-frame target image.
The corrected location information is used as the location information of the facial image in the next-frame target image, which achieves rapid position tracking of the facial image. A convolutional neural network model given targeted training to a convergence state can infer, from the location information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next-frame target image. The location information of the facial image in the next-frame target image can therefore be inferred once the current target image has been obtained. The method solves the problem of losing the tracked image when shooting and tracking processing speeds are out of sync during facial image tracking. It also greatly reduces the processing speed required of the tracking system.
The changed position refers to the actual position of the facial image in the next-frame target image.
In some embodiments, before the location information in the target image is corrected, the location information of the facial image in the target image needs to be extracted. For the specific extraction process, refer to Fig. 2, which is a schematic flowchart of obtaining the facial image location information in this embodiment.
As shown in Fig. 2, the following steps precede step S1100:
S1011, acquiring the target image in the video information;
During video facial image tracking, target images are extracted from the video information by timed acquisition. For example, one target image is extracted from the frame images of the video information every 0.5 s. The rate is not limited to this: depending on the application scenario, the rate at which target images are acquired can be adapted. The adjustment principle is that the stronger the system's processing capability and the higher the required tracking accuracy, the shorter the acquisition interval, down to the point where acquisition is synchronized with the frequency at which the camera captures images. Otherwise the acquisition interval is longer, but the longest acquisition interval must not exceed 1 s. A target image specifically refers to a frame image, acquired from the video information, that contains a facial image.
S1012, inputting the target image into a preset position coordinate model, where the position coordinate model is a convolutional neural network model trained to convergence;
The location information of the facial image in the target image is extracted by a convolutional neural network model trained to convergence. A convolutional neural network model given targeted training to convergence can extract the location information of the facial image in the input target image.
In this embodiment, the position coordinate model is the convolutional neural network model. A convolutional neural network model given targeted training to a convergence state can extract the location information of the facial image and, after extracting it, go on to correct the position of the facial image.
S1013, obtaining the location information of the facial image in the target image output by the position coordinate model.
The classification data output by the position coordinate model is the location information of the facial image in the target image.
For example, in this embodiment the convolutional neural network model consists of a position coordinate model and an offset position model, both of which are VGG convolutional neural network models. The position coordinate model and the offset position model are cascaded. The position coordinate model obtains the position coordinates of the facial image in the target image; that is, the classification data it outputs is the position coordinates of the facial image in the target image. The offset position model then obtains the corrected location information. The data input to the offset position model includes, in addition to the facial image, the location information output by the position coordinate model.
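The data flow of the cascade can be sketched with plain callables standing in for the two trained networks. This is a sketch under stated assumptions: `run_cascade` and the two stub functions are hypothetical, the stubs return fixed illustrative values rather than real model outputs, and the box layout (center x, center y, width, height) is an assumption.

```python
from typing import Callable, Tuple

# Assumed box layout for illustration: (center x, center y, width, height).
Box = Tuple[float, float, float, float]


def run_cascade(
    image: object,
    position_model: Callable[[object], Box],
    offset_position_model: Callable[[object, Box], Box],
) -> Box:
    """Feed the image to the position model, then feed image and position to the offset model."""
    located = position_model(image)
    # The second stage sees both the image and the first stage's output.
    return offset_position_model(image, located)


def stub_position_model(image: object) -> Box:
    """Stand-in for the trained position coordinate model (fixed illustrative output)."""
    return (50.0, 60.0, 20.0, 30.0)


def stub_offset_position_model(image: object, box: Box) -> Box:
    """Stand-in for the trained offset position model: shifts the box by half its width."""
    cx, cy, w, h = box
    return (cx + 0.5 * w, cy, w, h)


corrected = run_cascade(None, stub_position_model, stub_offset_position_model)
```

In a real system both callables would wrap trained networks; the point of the sketch is only that the second stage receives the first stage's location output alongside the image.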
In some embodiments, during facial image tracking a tracking box needs to be placed over the facial image, and once the position in the next-frame target image has been determined, the data of the tracking box needs to be adjusted accordingly. Refer to Fig. 3, which is a schematic flowchart of correcting the tracking box in this embodiment.
As shown in Fig. 3, the following steps follow step S1300:
S1311, performing data correction on the preset tracking box according to the offset parameter;
The tracking box placed over the facial image in the current target image is corrected according to the offset parameter. Specifically, the tracking box consists of a preset scalar closed wireframe, and its specific parameters are controlled by a box control, which controls the position and size of the box. The position parameter is the location information of the facial image, and the size parameter is the width and height of the facial image. When different parameter information is set, the position and size of the box change accordingly.
After the parameters of the tracking box are corrected according to the offset parameter, the position and size of the tracking box become the tracking box parameters for the next-frame target image.
S1312, moving the corrected tracking box to the changed position in the next-frame target image, so that the tracking box adapts to the change of the facial image in the next-frame target image.
The position parameter of the corrected tracking box is the changed position in the next-frame target image, so the corrected tracking box can move from its current position to the changed position. To match the continuity of video playback, the tracking box moves linearly: the movement between the current location information and the corrected location information is linear and smooth rather than a direct jump.
Because the parameters of the corrected tracking box change in sync with the corrected location information of the facial image, the tracking box adapts to the change of the facial image in the next-frame target image.
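The linear, non-jumping movement of the tracking box can be sketched as linear interpolation between the current and corrected box parameters. This is a minimal sketch: the function name `interpolate_box`, the tuple layout (center x, center y, width, height) and the idea of a fixed number of intermediate steps are assumptions, since the text only states that the movement is linear and smooth.

```python
from typing import List, Tuple

# Assumed box layout for illustration: (center x, center y, width, height).
Box = Tuple[float, float, float, float]


def interpolate_box(current: Box, target: Box, steps: int) -> List[Box]:
    """Return `steps` boxes moving linearly from `current` to `target` (target included)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    path = []
    for i in range(1, steps + 1):
        t = i / steps  # fraction of the way from current to target
        path.append(tuple(c + (g - c) * t for c, g in zip(current, target)))
    return path
```

Drawing one interpolated box per displayed frame between two sampled target images would keep the box motion continuous even when target images are sampled only every 0.5 s.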
In some embodiments, the offset parameter is extracted by a convolutional neural network model that has been directionally trained to convergence. Refer to Fig. 4, which is a schematic flowchart of obtaining the offset parameter in this embodiment.
As shown in Fig. 4, the facial image tracking method further includes:
S2100, obtaining the actual position information of the facial image in a preset original sample image;
In this embodiment, the convolutional neural network model consists of a position coordinate model and an offset position model, both of which are VGG convolutional neural network models. The position coordinate model and the offset position model are cascaded. The position coordinate model is used to obtain the position coordinates of the facial image in the target image, i.e., the classification data output by the position coordinate model are the position coordinates of the facial image in the target image. The offset position model is used to obtain the corrected location information; in addition to the facial image, the data input to the offset position model include the location information output by the position coordinate model.
Both the extraction of the offset parameter and the calculation of the corrected position are performed by the offset position model. This embodiment therefore describes the training method of the offset position model.
Training the offset position model requires original training samples that contain facial images. Face data from the open-source WIDER FACE database are used as the original training sample images.
The face location information in the obtained original training sample images is extracted using a prior-art neural network that has been trained to convergence for obtaining facial image location information. The extraction is not limited to this; in some embodiments, the location information of the original sample images can be annotated manually.
The actual position information is the actual position of the facial image in the original sample image.
S2200, changing the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;
The original sample image whose actual position information has been obtained is changed. Specifically, the facial image in the original sample image undergoes one or more of a translation change, a rotation change and a scaling change, applied in combination. The specific changes are random: the number of changes to apply is generated by a random algorithm, and the change corresponding to each is likewise selected at random from the three change types. Processing the original sample image with these two random variables derives more training sample images, and makes the offset position model more stable once training is complete.
The adjustment range for changes to the original sample image is [0.8, 1.2]; that is, the translation, rotation and scaling changes take values in the range [0.8, 1.2]. Because the image changes little between frames, values in this range better match practical applications.
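The two-level random selection described in step S2200 can be sketched as below. This is a hedged illustration only: the function name `sample_changes`, the choice to draw change types without repetition, and the use of a uniform distribution over [0.8, 1.2] are assumptions; the text does not state how each factor is applied to the pixels, so the sketch stops at producing the list of changes.

```python
import random

CHANGE_TYPES = ("translation", "rotation", "scaling")
LOW, HIGH = 0.8, 1.2  # adjustment range given in the text


def sample_changes(rng: random.Random) -> list:
    """Return a random list of (change type, factor) pairs for one sample."""
    # First random variable: how many change types to apply (1 to 3).
    count = rng.randint(1, len(CHANGE_TYPES))
    # Second random variable: which change types (assumed: no repetition).
    kinds = rng.sample(CHANGE_TYPES, count)
    # Each selected change gets a factor drawn from the adjustment range.
    return [(kind, rng.uniform(LOW, HIGH)) for kind in kinds]


rng = random.Random(0)  # seeded only so the sketch is repeatable
derived = [sample_changes(rng) for _ in range(5)]
```

Each entry of `derived` would describe one training sample image derived from the same original sample image.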
S2300, inputting the training sample image into the convolutional neural network model, to calculate the offset parameter of the change location information of the facial image after the change in the training sample image relative to the actual position information.
Because the changed training sample image still carries the actual position information from the original sample image, the convolutional neural network model needs to be trained accordingly so that it can obtain the offset parameter. Specifically, for the training of the convolutional neural network model, i.e., the offset position model, refer to Fig. 5, which is a schematic flowchart of the offset position model training method of this embodiment.
As shown in Fig. 5, step S2300 specifically includes the following steps:
S2310, obtaining the training sample image marked with the actual position information;
During training, the input layer of the convolutional neural network model receives the training sample image marked with the actual position information.
S2320, inputting the training sample image into the convolutional neural network model to obtain the change location information and the offset parameter of the facial image;
The training sample image is input to the convolutional neural network model, and the classification information output by the model is the offset parameter. Specifically, the convolutional neural network model is built to be trained on the offset parameter and the change location information of the facial image. Because the model has not yet been trained to convergence, the offset parameter it outputs is somewhat random and widely scattered, so a loss function is needed to verify the output.
S2330, performing a backtracking operation on the change location according to the offset parameter to obtain backtracking location information;
The backtracking location information is calculated as follows: the offset parameter is subtracted from the change location information. Because the offset parameter is the amount of change from the actual position information to the change location information, adding the offset parameter to the actual position information yields the change location information; conversely, subtracting the offset parameter from the change location information yields the backtracking location information.
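The backtracking operation of step S2330 reduces to an element-wise subtraction. The sketch below assumes, for illustration only, that a position is a tuple `(x, y, w, h)` and that the offset has the same four components; neither layout is fixed by the text.

```python
def backtrack(change_location: tuple, offset: tuple) -> tuple:
    """Subtract the offset parameter from the change location information."""
    return tuple(c - o for c, o in zip(change_location, offset))


actual = (100, 80, 50, 60)          # actual position information
true_offset = (10, -4, 2, 2)        # change applied to the sample
changed = tuple(a + o for a, o in zip(actual, true_offset))

# With the exact offset, backtracking recovers the actual position.
exact = backtrack(changed, true_offset)
# With an imperfect predicted offset, a residual error remains.
approx = backtrack(changed, (8, -4, 2, 3))
```

The difference between `approx` and `actual` is what the loss function in the next step measures.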
S2340, comparing, by a loss function, whether the actual position information of the training sample image is consistent with the backtracking location information;
In this embodiment, an L1 loss function is used to verify the classification value of the convolutional neural network model. Specifically, the loss function computes whether the absolute-value distance between the actual position information of the training sample image and the backtracking location information is less than a preset threshold. When the distance is less than the threshold, the gap between the backtracking location information and the actual position information is small and the offset parameter meets the requirement; otherwise, the offset parameter does not meet the requirement.
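The consistency check of step S2340 can be sketched as follows. The sketch assumes positions are `(x, y, w, h)` tuples and that the absolute-value distance is the sum of per-component absolute differences; the text does not say whether a sum or a mean is used, and the threshold values shown are arbitrary.

```python
def l1_distance(actual: tuple, backtracked: tuple) -> float:
    """Sum of absolute differences between two positions (assumed L1 form)."""
    return sum(abs(a - b) for a, b in zip(actual, backtracked))


def is_consistent(actual: tuple, backtracked: tuple, threshold: float) -> bool:
    """The offset meets the requirement when the distance is below the threshold."""
    return l1_distance(actual, backtracked) < threshold


actual = (100, 80, 50, 60)
backtracked = (102, 80, 50, 59)
distance = l1_distance(actual, backtracked)
```

Here `distance` is 3, so the pair counts as consistent for a threshold of 5 and as inconsistent for a threshold of 3, since the comparison is strict.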
S2350, when the actual position information is inconsistent with the backtracking location information, iteratively updating the weights in the convolutional neural network model until the comparison result is consistent.
When the actual position information and the backtracking location information are inconsistent, the weights in the convolutional neural network model need to be corrected using the back-propagation algorithm, so that the output of the convolutional neural network model matches the expected result of the classification judgment information.
Training the convolutional neural network model requires a large number of training sample images; steps S2310 to S2350 are therefore executed repeatedly until the convolutional neural network model is trained to convergence.
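The loop formed by steps S2310 to S2350 can be illustrated with a deliberately tiny stand-in for the network. This is not the VGG model or the back-propagation procedure of this embodiment: the sketch assumes one learnable weight per box coordinate, a scalar feature `s` per sample, a hidden true offset of `true_gain * s`, and a sign-based subgradient step on the L1 loss. All of these names and choices are hypothetical and serve only to show the order of the steps.

```python
import itertools


def train_offset_model(true_gain, lr=0.05, threshold=0.5, max_iter=10_000):
    """Toy version of S2310-S2350; returns (weights, last loss, iterations)."""
    weights = [0.0] * len(true_gain)
    actual = (100.0, 80.0, 50.0, 60.0)            # actual position (x, y, w, h)
    features = itertools.cycle((0.8, 1.0, 1.2))   # sample features in [0.8, 1.2]
    loss = float("inf")

    for iteration in range(1, max_iter + 1):
        s = next(features)
        # S2310: a sample labelled with its actual position; the changed
        # position is the actual position plus the hidden true offset.
        changed = [a + g * s for a, g in zip(actual, true_gain)]
        # S2320: the model outputs a predicted offset.
        predicted = [w * s for w in weights]
        # S2330: backtrack by subtracting the predicted offset.
        backtracked = [c - p for c, p in zip(changed, predicted)]
        # S2340: L1 distance between backtracked and actual positions.
        residual = [b - a for b, a in zip(backtracked, actual)]
        loss = sum(abs(r) for r in residual)
        if loss < threshold:
            return weights, loss, iteration
        # S2350: subgradient descent on the L1 loss.
        # d|r|/dw = -sign(r) * s, so descending moves w by +lr * sign(r) * s.
        for i, r in enumerate(residual):
            sign = (r > 0) - (r < 0)
            weights[i] += lr * sign * s
    return weights, loss, max_iter


gain = (10.0, -4.0, 2.0, 2.0)
weights, loss, iterations = train_offset_model(gain)
```

In this toy setting the loop stops at the first sample whose loss falls below the threshold; a real training run would instead judge convergence over many training sample images, as the paragraph above states.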
To solve the above technical problem, an embodiment of the present invention further provides a facial image tracking device.
Refer to Fig. 6, which is a schematic diagram of the basic structure of the facial image tracking device of this embodiment.
As shown in Fig. 6, a facial image tracking device includes an acquisition module 2100, a processing module 2200 and an execution module 2300. The acquisition module 2100 is used to obtain the location information and the offset parameter of the facial image in the target image; the processing module 2200 is used to correct the location information according to the offset parameter; the execution module 2300 is used to define the corrected location information as the change location of the facial image in the next-frame target image.
Using a convolutional neural network model directionally trained to convergence, the facial image tracking device can infer, from the location information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next-frame target image. Once the current target image is obtained, the location information of the facial image in the next-frame target image can therefore be inferred. This solves the problem of losing the tracked image when the shooting speed and the tracking processing speed are out of sync during facial image tracking, and greatly reduces the processing speed required of the tracking system.
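The division of work among the three modules can be sketched as below. The class and method names are hypothetical, the trained model is replaced by an arbitrary callable, and the correction is assumed to be an element-wise addition of the offset to an `(x, y, w, h)` position; the sketch only mirrors the module roles described for Fig. 6.

```python
from typing import Callable, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h), an assumed layout


class AcquisitionModule:
    """Obtains the location information and offset parameter for a frame."""

    def __init__(self, model: Callable[[object], Tuple[Box, Box]]):
        self.model = model  # stand-in for the trained network (hypothetical)

    def acquire(self, frame) -> Tuple[Box, Box]:
        return self.model(frame)


class ProcessingModule:
    """Corrects the location information according to the offset parameter."""

    def correct(self, location: Box, offset: Box) -> Box:
        return tuple(p + o for p, o in zip(location, offset))


class ExecutionModule:
    """Defines the corrected location as the face position in the next frame."""

    def __init__(self):
        self.next_location = None

    def define_next(self, corrected: Box) -> Box:
        self.next_location = corrected
        return corrected


def track(frame, acquisition, processing, execution) -> Box:
    """Run one frame through the three modules in order."""
    location, offset = acquisition.acquire(frame)
    corrected = processing.correct(location, offset)
    return execution.define_next(corrected)
```

Because the next-frame position is inferred from the current frame alone, a caller could use the returned box without first waiting for the next frame to be processed, which is the benefit the paragraph above attributes to the device.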
In some embodiments, the facial image tracking device further includes a first collection submodule, a first processing submodule and a first execution submodule. The first collection submodule is used to collect the target image in video information; the first processing submodule is used to input the target image into a preset position coordinate model, where the position coordinate model is a convolutional neural network model trained to convergence; the first execution submodule is used to obtain the location information of the facial image in the target image output by the position coordinate model.
In some embodiments, the facial image tracking device further includes a first correction submodule and a second execution submodule. The first correction submodule is used to perform data correction on a preset tracking box according to the offset parameter; the second execution submodule is used to move the corrected tracking box to the change location in the next-frame target image, so that the tracking box adapts to the change of the facial image in the next-frame target image.
In some embodiments, the facial image tracking device further includes a first acquisition submodule, a second processing submodule and a third execution submodule. The first acquisition submodule is used to obtain the actual position information of the facial image in a preset original sample image; the second processing submodule is used to change the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image; the third execution submodule is used to input the training sample image into the convolutional neural network model, to calculate the offset parameter of the change location information of the facial image after the change in the training sample image relative to the actual position information.
In some embodiments, the facial image tracking device further includes a second acquisition submodule, a third processing submodule, a first operation submodule, a first comparison submodule and a fourth execution submodule. The second acquisition submodule is used to obtain the training sample image marked with the actual position information; the third processing submodule is used to input the training sample image into the convolutional neural network model to obtain the change location information and the offset parameter of the facial image; the first operation submodule is used to perform a backtracking operation on the change location according to the offset parameter to obtain backtracking location information; the first comparison submodule is used to compare, by a loss function, whether the actual position information of the training sample image is consistent with the backtracking location information; the fourth execution submodule is used to iteratively update the weights in the convolutional neural network model when the actual position information is inconsistent with the backtracking location information, until the comparison result is consistent.
In some embodiments, the offset parameter includes a coordinate offset parameter and a scale offset parameter.
In some embodiments, the image change method includes applying to the facial image one or more of a translation change, a rotation change and a scaling change, in combination.
To solve the above technical problem, an embodiment of the present invention further provides a computer device. Refer to Fig. 7, which is a block diagram of the basic structure of the computer device of this embodiment.
Fig. 7 is a schematic diagram of the internal structure of the computer device. As shown in Fig. 7, the computer device includes a processor, a non-volatile storage medium, a memory and a network interface connected through a system bus. The non-volatile storage medium of the computer device stores an operating system, a database and computer-readable instructions; the database may store control information sequences, and the computer-readable instructions, when executed by the processor, cause the processor to implement a facial image tracking method. The processor of the computer device provides computing and control capabilities and supports the operation of the entire computer device. The memory of the computer device may store computer-readable instructions which, when executed by the processor, cause the processor to execute a facial image tracking method. The network interface of the computer device is used to connect to and communicate with a terminal. Those skilled in the art will understand that the structure shown in Fig. 7 is only a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In this embodiment, the processor is used to execute the specific functions of the acquisition module 2100, the processing module 2200 and the execution module 2300 in Fig. 6, and the memory stores the program code and the various data required to execute these modules. The network interface is used for data transmission to and from a user terminal or server. The memory in this embodiment stores the program code and data required to execute all the submodules of the facial image tracking device, and the server can invoke the program code and data of the server to execute the functions of all the submodules.
Using a convolutional neural network model directionally trained to convergence, the computer device can infer, from the location information and features of the facial image in the current target image, an accurate offset parameter of the facial image from the current target image to the next-frame target image. Once the current target image is obtained, the location information of the facial image in the next-frame target image can therefore be inferred. This solves the problem of losing the tracked image when the shooting speed and the tracking processing speed are out of sync during facial image tracking, and greatly reduces the processing speed required of the tracking system.
The present invention further provides a storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the steps of the facial image tracking method described in any of the above embodiments.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The aforementioned storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disc or a read-only memory (ROM), or a random access memory (RAM), etc.
It should be understood that although the steps in the flowcharts of the drawings are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless expressly stated herein, there is no strict order restriction on the execution of these steps, and they may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also fall within the protection scope of the present invention.

Claims (8)

1. A facial image tracking method, characterized by comprising the following steps:
obtaining location information and an offset parameter of a facial image in a target image;
correcting the location information according to the offset parameter;
defining the corrected location information as the change location of the facial image in a next-frame target image;
wherein the offset parameter is obtained as follows:
obtaining actual position information of a facial image in a preset original sample image;
changing the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;
inputting the training sample image into a convolutional neural network model, to calculate the offset parameter of the change location information of the facial image after the change in the training sample image relative to the actual position information;
wherein the step of inputting the training sample image into the convolutional neural network model, to calculate the offset parameter of the change location information of the facial image after the change in the training sample image relative to the actual position information, comprises:
obtaining the training sample image marked with the actual position information;
inputting the training sample image into the convolutional neural network model to obtain the change location information and the offset parameter of the facial image;
performing a backtracking operation on the change location according to the offset parameter to obtain backtracking location information;
comparing, by a loss function, whether the actual position information of the training sample image is consistent with the backtracking location information;
when the actual position information is inconsistent with the backtracking location information, iteratively updating the weights in the convolutional neural network model until the comparison result is consistent.
2. The facial image tracking method according to claim 1, characterized in that, before the step of obtaining the location information and the offset parameter of the facial image in the target image, the method further comprises:
collecting the target image in video information;
inputting the target image into a preset position coordinate model, wherein the position coordinate model is a convolutional neural network model trained to convergence;
obtaining the location information of the facial image in the target image output by the position coordinate model.
3. The facial image tracking method according to claim 1, characterized in that, after the step of defining the corrected location information as the change location of the facial image in the next-frame target image, the method further comprises:
performing data correction on a preset tracking box according to the offset parameter;
moving the corrected tracking box to the change location in the next-frame target image, so that the tracking box adapts to the change of the facial image in the next-frame target image.
4. The facial image tracking method according to any one of claims 1 to 3, characterized in that the offset parameter comprises a coordinate offset parameter and a scale offset parameter.
5. The facial image tracking method according to claim 1, characterized in that the image change method comprises applying to the facial image one or more of a translation change, a rotation change and a scaling change, in combination.
6. A facial image tracking device, characterized by comprising:
an acquisition module, for obtaining location information and an offset parameter of a facial image in a target image;
a processing module, for correcting the location information according to the offset parameter;
an execution module, for defining the corrected location information as the change location of the facial image in a next-frame target image;
wherein the device further comprises:
a first acquisition submodule, for obtaining actual position information of a facial image in a preset original sample image;
a second processing submodule, for changing the facial image in the original sample image according to a preset image change method, to obtain at least one training sample image derived from the original sample image;
a third execution submodule, for inputting the training sample image into a convolutional neural network model, to calculate the offset parameter of the change location information of the facial image after the change in the training sample image relative to the actual position information;
wherein the device further comprises:
a second acquisition submodule, for obtaining the training sample image marked with the actual position information;
a third processing submodule, for inputting the training sample image into the convolutional neural network model to obtain the change location information and the offset parameter of the facial image;
a first operation submodule, for performing a backtracking operation on the change location according to the offset parameter to obtain backtracking location information;
a first comparison submodule, for comparing, by a loss function, whether the actual position information of the training sample image is consistent with the backtracking location information;
a fourth execution submodule, for iteratively updating the weights in the convolutional neural network model when the actual position information is inconsistent with the backtracking location information, until the comparison result is consistent.
7. A computer device, comprising a memory and a processor, the memory storing computer-readable instructions which, when executed by the processor, cause the processor to execute the steps of the facial image tracking method according to any one of claims 1 to 5.
8. A storage medium storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to execute the steps of the facial image tracking method according to any one of claims 1 to 5.
CN201810359958.8A 2018-04-20 2018-04-20 Facial image tracking, device, computer equipment and storage medium Active CN108509940B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810359958.8A CN108509940B (en) 2018-04-20 2018-04-20 Facial image tracking, device, computer equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810359958.8A CN108509940B (en) 2018-04-20 2018-04-20 Facial image tracking, device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN108509940A CN108509940A (en) 2018-09-07
CN108509940B true CN108509940B (en) 2019-11-05

Family

ID=63383199

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810359958.8A Active CN108509940B (en) 2018-04-20 2018-04-20 Facial image tracking, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN108509940B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163066B (en) * 2018-12-07 2022-11-08 腾讯科技(深圳)有限公司 Multimedia data recommendation method, device and storage medium
CN109993137A (en) * 2019-04-09 2019-07-09 安徽大学 A kind of fast face antidote based on convolutional neural networks
CN110969097B (en) * 2019-11-18 2023-05-12 浙江大华技术股份有限公司 Method, equipment and storage device for controlling linkage tracking of monitoring target
CN110941730B (en) * 2019-11-29 2020-12-08 南京甄视智能科技有限公司 Retrieval method and device based on human face feature data migration
CN113051010B (en) * 2019-12-28 2023-04-28 Oppo(重庆)智能科技有限公司 Application picture adjustment method and related device in wearable equipment
CN111093077A (en) * 2019-12-31 2020-05-01 深圳云天励飞技术有限公司 Video coding method and device, electronic equipment and storage medium
CN111339936A (en) * 2020-02-25 2020-06-26 杭州涂鸦信息技术有限公司 Face tracking method and system
CN111860440A (en) * 2020-07-31 2020-10-30 广州繁星互娱信息科技有限公司 Position adjusting method and device for human face characteristic point, terminal and storage medium
CN117135451A (en) * 2023-02-27 2023-11-28 荣耀终端有限公司 Focusing processing method, electronic device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875422A (en) * 2017-02-06 2017-06-20 腾讯科技(上海)有限公司 Face tracking method and device
CN107423707A (en) * 2017-07-25 2017-12-01 深圳帕罗人工智能科技有限公司 A kind of face Emotion identification method based under complex environment
CN107818314A (en) * 2017-11-22 2018-03-20 北京达佳互联信息技术有限公司 Face image processing method, device and server

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106846364B (en) * 2016-12-30 2019-09-24 明见(厦门)技术有限公司 A kind of method for tracking target and device based on convolutional neural networks
CN106874868B (en) * 2017-02-14 2020-09-18 北京飞搜科技有限公司 Face detection method and system based on three-level convolutional neural network

Also Published As

Publication number Publication date
CN108509940A (en) 2018-09-07

Similar Documents

Publication Publication Date Title
CN108509940B (en) Facial image tracking, device, computer equipment and storage medium
CN109462776B (en) Video special effect adding method and device, terminal equipment and storage medium
US20230066716A1 (en) Video generation method and apparatus, storage medium, and computer device
US9690981B2 (en) System and method for motion evaluation
CN109815776B (en) Action prompting method and device, storage medium and electronic device
CN111488773B (en) Action recognition method, device, equipment and storage medium
CN110517278A (en) Image segmentation and the training method of image segmentation network, device and computer equipment
CN109525891B (en) Multi-user video special effect adding method and device, terminal equipment and storage medium
US20140355825A1 (en) Method and apparatus for estimating pose
CN103403736B (en) Dynamic template is followed the trail of
WO2021096669A1 (en) Assessing a pose-based sport
CN108833942A (en) Video cover choosing method, device, computer equipment and storage medium
CN108985257A (en) Method and apparatus for generating information
CN110751039B (en) Multi-view 3D human body posture estimation method and related device
CN104111733B (en) A kind of gesture recognition system and method
CN107273857B (en) Motion action recognition method and device and electronic equipment
CN107749987A (en) A kind of digital video digital image stabilization method based on block motion estimation
CN107179839A (en) Information output method, device and equipment for terminal
CN112561973A (en) Method and device for training image registration model and electronic equipment
Li et al. End-to-end feature integration for correlation filter tracking with channel attention
CN111222459B (en) Visual angle independent video three-dimensional human body gesture recognition method
CN105989623B (en) The implementation method of augmented reality application based on handheld mobile device
CN107292295A (en) Hand Gesture Segmentation method and device
CN113723187A (en) Semi-automatic labeling method and system for gesture key points
CN108228069A (en) Hand-written script input method, mobile terminal and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant