CN110059605A - Neural network training method, computing device and storage medium - Google Patents

Neural network training method, computing device and storage medium Download PDF

Info

Publication number
CN110059605A
CN110059605A
Authority
CN
China
Prior art keywords
image
neural network
point
frame image
sequential frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910287068.5A
Other languages
Chinese (zh)
Inventor
齐子铭
李志阳
周子健
张伟
许清泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xiamen Meitu Technology Co Ltd
Original Assignee
Xiamen Meitu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xiamen Meitu Technology Co Ltd filed Critical Xiamen Meitu Technology Co Ltd
Priority to CN201910287068.5A priority Critical patent/CN110059605A/en
Publication of CN110059605A publication Critical patent/CN110059605A/en
Pending legal-status Critical Current

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161Detection; Localisation; Normalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16Human faces, e.g. facial parts, sketches or expressions
    • G06V40/168Feature extraction; Face representation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a neural network training method suitable for execution on a computing device, where the neural network processes an image to output key points characterizing image features. The method includes: extracting a predetermined number of consecutive frame images from video data; feeding the consecutive frame images into the neural network to obtain their image prediction points; for each image prediction point, performing optical-flow computation on the consecutive frame images to obtain a corresponding optical-flow prediction point; computing a first loss value based on the image prediction points and optical-flow prediction points of the consecutive frame images; and adjusting the parameters of the neural network based on the first loss value to obtain the trained neural network. This method improves the stability of the prediction points output by the neural network.

Description

Neural network training method, computing device and storage medium
Technical field
The present invention relates to the field of deep learning, and more particularly to a neural network training method, a key point detection method, a computing device, and a storage medium.
Background technique
Key point/feature point detection is very widely used; the detection of key points on human faces or on cat and dog faces, in particular, is an important basis for applications such as face beautification and cute cat/dog-face stickers. The real-time requirements of short internet videos make face key point detection a challenge: traditional frame-by-frame detection produces severe jitter, and the shaking of the points badly degrades the user experience. How to adjust a neural network to achieve the best application effect remains empirical work.
Therefore, a neural network training method is needed to improve the stability of the prediction points output by the neural network.
Summary of the invention
To this end, the present invention provides a neural network training method and a key point detection method that seek to solve, or at least alleviate, at least one of the problems above.
According to one aspect of the invention, a neural network training method is provided, suitable for execution on a computing device, where the neural network processes an image to output key points characterizing image features. In the method, a predetermined number of consecutive frame images can first be extracted from video data, and the consecutive frame images fed into the neural network to obtain the first image prediction points of the consecutive frame images. Then, for each first image prediction point, optical-flow computation is performed on the consecutive frame images to obtain a corresponding optical-flow prediction point, so that a first loss value can be computed based on the image prediction points and optical-flow prediction points of the consecutive frame images. Finally, the parameters of the neural network are adjusted based on the first loss value to obtain the trained neural network.
Optionally, in the above method, the optical-flow method can first be used to compute the displacement information of each first image prediction point between adjacent frames in the consecutive frame images. Based on the displacement information, the position in the current frame of each first image prediction point of the previous frame is then determined, yielding the optical-flow prediction points of the consecutive frame images.
Optionally, in the above method, the first loss value can be computed by the following formula:

    loss_registration = (1 / (K·T)) Σ_{t=2}^{T} Σ_{i=1}^{K} ‖L_{t,i} − L̂_{t,i}‖²

where loss_registration is the first loss value, L_{t,i} denotes the coordinates of the i-th network prediction point of frame t in the consecutive frame images, L̂_{t,i} denotes the coordinates of the corresponding i-th optical-flow prediction point of frame t, K is the number of point coordinates, and T is the number of video frames.
Optionally, to further optimize the neural network, the above method can also obtain a group of images on which key points have been annotated; this group of images has the same quantity and format as the consecutive frame images. This group of images is likewise fed into the neural network to obtain the second image prediction points of the images. A second loss value can then be computed based on the annotated key points and the second image prediction points. Finally, the parameters of the neural network are adjusted based on the second loss value to optimize the network.
Optionally, in the above method, the second loss value can be computed by the following formula:

    loss_detection = (1 / K) Σ_{i=1}^{K} ‖L_i − L̂_i‖²

where loss_detection is the second loss value, L_i denotes the coordinates of the i-th network prediction point of the image, L̂_i denotes the coordinates of the corresponding i-th annotated key point, and K is the number of point coordinates.
According to a further aspect of the invention, a neural network training method is provided in which a predetermined number of consecutive frame images are first extracted from video data, and a group of images annotated with key points is obtained; this group of images has the same quantity and format as the consecutive frame images. The consecutive frame images and this group of images are then fed into the neural network to obtain the first image prediction points of the consecutive frame images and the second image prediction points of this group of images. For each first image prediction point of the consecutive frame images, optical-flow computation is performed on the consecutive frame images to obtain their optical-flow prediction points. A first loss value can be computed based on the first image prediction points and the optical-flow prediction points, and a second loss value based on the annotated key points and the second image prediction points. Finally, the parameters of the neural network are adjusted based on the first and second loss values to obtain the trained neural network.
The above scheme optimizes the neural network through two loss values, reducing the gap between the image prediction points and the actual target points and thereby improving the stability of the neural network's predictions.
According to another aspect of the invention, a key point detection method is provided that uses a neural network trained by the above neural network training method to perform key point detection on video, outputting key points that characterize image features.
According to a further aspect of the invention, a computing device is provided, including one or more processors, a memory, and one or more programs stored in the memory and configured to be executed by the one or more processors, the one or more programs containing instructions for executing the neural network training method and/or the key point detection method.
According to a further aspect of the invention, a computer-readable storage medium storing one or more programs is provided, the one or more programs including instructions that, when executed by a computing device, cause the computing device to execute the neural network training method and/or the key point detection method.
This scheme proposes an optical-flow-based method for optimizing the training of an already initially trained neural network. The costly optical-flow computation exists only in the network training stage; in the prediction stage the network incurs no additional computational cost compared with the original network, so its running speed is unchanged, while the use of optical flow substantially improves the stability of the output points.
Brief description of the drawings
With the aim of accomplishing the foregoing and related purposes, certain illustrative aspects are described herein in conjunction with the following description and drawings. These aspects indicate various ways in which the principles disclosed herein can be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features and advantages of the disclosure will become more apparent from the following detailed description read in conjunction with the accompanying drawings. Throughout the disclosure, like reference numerals generally refer to like components or elements.
Fig. 1 shows a block diagram of a computing device 100 according to an embodiment of the invention;
Fig. 2 shows a schematic flow chart of a neural network training method 200 according to an embodiment of the invention;
Fig. 3 shows a schematic flow chart of a neural network training method 300 according to an embodiment of the invention;
Fig. 4 shows a schematic diagram of a neural network training process according to an embodiment of the invention;
Fig. 5 shows a schematic diagram of neural network training results according to an embodiment of the invention.
Detailed description of the embodiments
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be realized in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
Optimizing a multi-layer neural network means minimizing a loss function by improving the training procedure. The parameters of the model are used to compute the degree of deviation between predicted values and target values, and the loss function is formed from these parameters. When training a neural network, the deviation can be controlled by searching for the minimum of the loss, the model parameters are updated, and the model finally converges. This scheme optimizes a pre-trained neural network mainly by defining and optimizing loss functions and adjusting the learning rate.
Fig. 1 shows a block diagram of a computing device 100 according to an embodiment of the invention. In a basic configuration 102, the computing device 100 typically includes a system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processors 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level-1 cache 110 and a level-2 cache 112, a processor core 114, and registers 116. An exemplary processor core 114 may include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP core), or any combination thereof. An exemplary memory controller 118 may be used with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, the system memory 106 may be any type of memory, including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM or flash memory), or any combination thereof. The system memory 106 may include an operating system 120, one or more programs 122, and program data 124. In some embodiments, the programs 122 may be arranged to operate with the program data 124 on the operating system.
The computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (e.g., output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via a bus/interface controller 130. Exemplary output devices 142 include a graphics processing unit 148 and an audio processing unit 150, which may be configured to communicate with various external devices such as a display or speakers via one or more A/V ports 152. Exemplary peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to communicate via one or more I/O ports 158 with external devices such as input devices (e.g., keyboard, mouse, pen, voice input device, touch input device) or other peripherals (e.g., printer, scanner). An exemplary communication device 146 may include a network controller 160, which may be arranged to facilitate communication with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied as computer-readable instructions, data structures, or program modules in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" is a signal in which one or more of its characteristics are set or changed in such a way as to encode information in the signal. By way of non-limiting example, communication media may include wired media such as a wired network or a dedicated-line network, and various wireless media such as acoustic, radio-frequency (RF), microwave, infrared (IR), or other wireless media. The term computer-readable media as used here may include both storage media and communication media.
The computing device 100 may be implemented as a server, such as a file server, database server, application server, or web server, or as part of a small portable (or mobile) electronic device, such as a cellular phone, personal digital assistant (PDA), personal media player, wireless web-browsing device, personal wearable device, application-specific device, or a hybrid device including any of the above functions. The computing device 100 may also be implemented as a personal computer, including both desktop and notebook configurations. In some embodiments, the computing device 100 may be configured to execute the neural network training method 200 of the invention, with the one or more programs 122 of the computing device 100 including instructions for executing the neural network training method 200 according to the invention.
Key point detection, also called face point localization or alignment, refers to locating, given a face image or video, the key regions of the face, including eyebrows, eyes, nose, mouth, face contour, and so on. Obtaining high-precision key points has always been a hot research problem in fields such as computer vision, pattern recognition, and image processing. The most widely used and most accurate key point detection methods are based on deep learning, i.e., key point detection using neural networks.
To improve the stability of the prediction points output by a neural network, an initially trained network can be further optimized. Considering the influence of jitter in the input images on the stability of the network's output, this scheme provides a neural network training method that uses the prediction results of optical-flow estimation to optimize the network and improve the stability of its predictions.
Fig. 2 shows a schematic flow chart of a neural network training method 200 according to an embodiment of the invention. As shown in Fig. 2, in step S210 a predetermined number of consecutive frame images can be extracted from video data.
The optical-flow method uses the variation of pixels in an image sequence over time and the correlation between adjacent frames to find the correspondence between the previous frame and the current frame, and thereby compute the motion information of objects between adjacent frames. A certain number of consecutive frame images can therefore be extracted at random from a video using image-processing tools such as OpenCV or MATLAB — for example, converting a video in AVI format into a sequence of frames in JPG format.
Then, in step S220, the consecutive frame images extracted in step S210 can be fed into the neural network to obtain the first image prediction points of the consecutive frame images.
Here, the neural network processes the input images to output image prediction points characterizing image features. It can be, for example, any pre-trained convolutional neural network or regression encoder-decoder network for detecting facial key points, such as cat/dog-face point detection or face alignment. In face key point detection, for instance, each point represents one kind of feature across different faces and thus has a certain generality.
Then, in step S230, for each first image prediction point obtained in step S220, optical-flow computation can be performed on the consecutive frame images extracted in step S210 to obtain the corresponding optical-flow prediction points.
According to an embodiment of the invention, the optical-flow method can first be used to compute the displacement information of each first image prediction point between adjacent frames in the consecutive frame images. Based on the displacement information, the position in the current frame of each first image prediction point of the previous frame is determined, yielding the optical-flow prediction points of the consecutive frame images.
Optical flow is a description of the motion of image brightness: the x and y displacement of each pixel on the image as the picture moves. For example, if the position of a point A is (x1, y1) in frame t and (x2, y2) when it is found again in frame t+1, the motion of A is determined as (u, v) = (x2, y2) − (x1, y1). Common algorithms include gradient-constraint correlation methods and the Lucas-Kanade method; any optical-flow computation method can be used, without limitation here. Since the displacement of pixels between consecutive video frames is small, the optical-flow method can locate the optical-flow prediction points accurately.
For example, when performing optical-flow computation with the Lucas-Kanade method, the Lucas-Kanade implementation provided by the open-source library OpenCV can be used. Assuming the optical flow is constant in a neighborhood of a pixel, the basic optical-flow equation is solved for all pixels in the neighborhood using the least-squares method, which minimizes the sum of squared errors between the fitted and actual data. In this scheme, optical-flow computation can be performed only for each image prediction point in the consecutive frame images, rather than for every pixel.
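The least-squares step described above can be sketched as follows — a minimal, pure-NumPy illustration of single-point Lucas-Kanade on a synthetic pair of frames (in practice one would use a pyramidal implementation such as OpenCV's, which handles larger motions and image noise):

```python
import numpy as np

def lucas_kanade_point(prev, curr, x, y, win=7):
    """Estimate the optical flow (u, v) at integer pixel (x, y).

    Assumes the flow is constant over a win x win window and solves
    the basic optical-flow equation Ix*u + Iy*v = -It by least squares.
    """
    # Central-difference spatial gradients and temporal difference.
    Ix = (np.roll(prev, -1, axis=1) - np.roll(prev, 1, axis=1)) / 2.0
    Iy = (np.roll(prev, -1, axis=0) - np.roll(prev, 1, axis=0)) / 2.0
    It = curr - prev
    h = win // 2
    window = (slice(y - h, y + h + 1), slice(x - h, x + h + 1))
    A = np.stack([Ix[window].ravel(), Iy[window].ravel()], axis=1)
    b = -It[window].ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    return u, v

# Synthetic frames: a quadratic intensity surface translated by (0.2, 0.1).
yy, xx = np.mgrid[0:32, 0:32].astype(float)
prev = xx ** 2 + yy ** 2
curr = (xx - 0.2) ** 2 + (yy - 0.1) ** 2

u, v = lucas_kanade_point(prev, curr, x=16, y=14)
# u is close to 0.2 and v is close to 0.1
```

Because the displacement between consecutive video frames is small, the linearization behind this equation holds, which is why the method recovers sub-pixel motion accurately here.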
Then, in step S240, a first loss value can be computed based on the first image prediction points obtained in step S220 and the optical-flow prediction points obtained in step S230.
In deep learning, loss values are commonly used to tune a neural network. The loss function evaluates the degree of difference between the model's predicted values and the true values; generally, the better the loss function, the better the model's performance. By defining and optimizing the loss function, the neural network can be made to predict well.
According to an embodiment of the invention, the first loss value can be computed by the following formula:

    loss_registration = (1 / (K·T)) Σ_{t=2}^{T} Σ_{i=1}^{K} ‖L_{t,i} − L̂_{t,i}‖²

where loss_registration is the first loss value, L_{t,i} denotes the coordinates of the i-th network prediction point of frame t in the consecutive frame images, L̂_{t,i} denotes the coordinates of the corresponding i-th optical-flow prediction point of frame t, K is the number of point coordinates, and T is the number of video frames. The optical flow (displacement) computed above can be used to displace L_{t−1,i} to time t, the result being denoted L̂_{t,i}, i.e., the optical-flow prediction point. The first loss value strengthens the similarity between the image prediction points and the optical-flow prediction points, reduces jitter between frames, and makes the image prediction points fall in the correct positions.
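A minimal NumPy sketch of this registration loss, assuming a mean-squared form consistent with the variable definitions above, with the optical-flow prediction points formed by displacing the previous frame's predictions:

```python
import numpy as np

def registration_loss(net_points, flow_disp):
    """First (registration) loss between network predictions and
    optical-flow prediction points.

    net_points: (T, K, 2) network-predicted coordinates per frame.
    flow_disp:  (T-1, K, 2) per-point displacement from frame t-1 to frame t.
    """
    # Optical-flow prediction points: previous-frame predictions displaced to t.
    flow_points = net_points[:-1] + flow_disp
    diff = net_points[1:] - flow_points
    T, K, _ = net_points.shape
    return np.sum(diff ** 2) / (T * K)

# Toy check: if the network's points move exactly with the flow, the loss is 0.
pts = np.array([[[0.0, 0.0], [1.0, 1.0]],
                [[0.5, 0.0], [1.5, 1.0]]])   # T=2 frames, K=2 points
disp = np.array([[[0.5, 0.0], [0.5, 0.0]]])  # flow moves each point by (0.5, 0)
loss = registration_loss(pts, disp)          # 0.0
```

Any disagreement between where the network places a point and where the flow says it should have moved contributes quadratically, which is what penalizes frame-to-frame jitter.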
Finally, in step S250, the parameters of the neural network can be adjusted based on the first loss value obtained in step S240 to obtain the trained neural network.
Neural network training generally computes the loss value by forward propagation, then back-propagates according to the loss value and adjusts the relevant parameters. The weight parameters of the network can be adjusted based on the first loss value; the loss function is thus the guide for parameter adjustment. Convergence of the network can be improved by changing the loss function and adjusting the learning rate. A multi-stage learning rate can be set, starting high so as to let the loss iterate down to a sufficiently small value.
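As a loose illustration of loss-directed parameter adjustment, the following uses a toy linear model standing in for the network, with a hand-derived gradient (a real network would of course use backpropagation through all its layers):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))   # toy input features
Y = rng.normal(size=(8, 2))   # toy target point coordinates
W = np.zeros((4, 2))          # "network" parameters

def mse_loss(W):
    # Forward pass: compute the loss value.
    return float(np.mean((X @ W - Y) ** 2))

initial = mse_loss(W)
lr = 0.05
for _ in range(200):
    # Backward pass: gradient of the squared error (up to a constant factor).
    grad = 2.0 * X.T @ (X @ W - Y) / X.shape[0]
    W -= lr * grad            # parameter adjustment directed by the loss
final = mse_loss(W)
# final < initial: the loss has steered the parameters toward the targets
```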
To further optimize the neural network and bring the image prediction points close to the true annotated points, the above method 200 can also obtain a group of images on which key points have been annotated; this group of images has the same quantity and format as the consecutive frame images.
Here, the key points are the real feature points annotated manually, i.e., points in the image that can identify the target features. The images annotated with key points generally have the same quantity and format as the extracted consecutive frame images — for example, 16 frames in JPG compressed format, with the corresponding input images also in JPG compressed format — which reduces the amount of data computation.
This group of images can be fed into the neural network to obtain the second image prediction points of the images. A second loss value can then be computed based on the annotated key points of the images and the second image prediction points.
According to an embodiment of the invention, the second loss value can be computed by the following formula:

    loss_detection = (1 / K) Σ_{i=1}^{K} ‖L_i − L̂_i‖²

where loss_detection is the second loss value, L_i denotes the coordinates of the i-th network prediction point of the image, L̂_i denotes the coordinates of the corresponding i-th annotated key point, and K is the number of point coordinates. The second loss function is a sum-of-squares (least-squares) loss whose objective is to minimize Σ_i ‖L_i − L̂_i‖².
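A minimal sketch of this second (detection) loss, assuming the mean-squared form implied by the variable definitions:

```python
import numpy as np

def detection_loss(pred_points, key_points):
    """Second (detection) loss between the network's predicted points
    and the manually annotated key points.

    pred_points, key_points: arrays of shape (K, 2), one row per point.
    """
    K = pred_points.shape[0]
    return np.sum((pred_points - key_points) ** 2) / K

pred = np.array([[10.0, 12.0], [20.0, 22.0]])
gt = np.array([[10.0, 13.0], [21.0, 22.0]])
loss = detection_loss(pred, gt)  # ((0 + 1) + (1 + 0)) / 2 = 1.0
```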
Finally, the parameters of the neural network can be adjusted based on the second loss value to optimize the network.
For example, the parameter values of the network can be further adjusted based on the second loss value. During training, the learning rate is an important hyperparameter that controls how fast the network weights are adjusted along the loss gradient: the smaller the learning rate, the slower the descent along the gradient; in general, the larger the learning rate, the faster the network learns. If the learning rate is too small, the network may fall into a local optimum; if it is too large, once past the extremum the loss will stop decreasing and oscillate around some position. If a lower learning rate is set first and this value is gradually increased as training iterates, a fairly good learning rate can eventually be found. In practice, the optimal learning rate is the one that keeps the loss curve reaching its minimum. Adjusting the learning rate on a regular schedule helps the model converge better.
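One hypothetical way to realize a staged learning-rate policy of this kind — ramp up from a low value, then step down — is sketched below; the warm-up length, base rate, and decay factor are illustrative assumptions, not values from the patent:

```python
def staged_lr(step, base_lr=1e-3, warmup_steps=100, decay=0.97, decay_every=1000):
    """Staged learning rate: start low and increase during warm-up,
    then decay multiplicatively every `decay_every` steps."""
    if step < warmup_steps:
        return base_lr * (step + 1) / warmup_steps  # gradual increase
    return base_lr * decay ** ((step - warmup_steps) // decay_every)

# The rate grows during warm-up, peaks at base_lr, then steps down slowly.
early, peak, later = staged_lr(0), staged_lr(99), staged_lr(5100)
```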
Both of the above loss functions can be used to optimize the neural network. On the one hand, the image prediction points can be brought close to the true annotated points; on the other hand, the annotation itself also contains errors, which show up as visible small-range jitter across consecutive video frames — through optical flow, the first loss function makes the image prediction points fall in the correct positions. The optical-flow prediction is in turn affected by illumination changes and motion amplitude, and this optical-flow estimation error can be corrected by the second loss function. Fig. 3 shows a schematic flow chart of a neural network training method 300 according to an embodiment of the invention. As shown in Fig. 3, in step S310 a predetermined number of consecutive frame images can be extracted from video data, and a group of images annotated with key points obtained; this group of images has the same quantity and format as the consecutive frame images. Since the video clips have no true target-point annotations, the image data and the video data can together serve as the training set of the neural network.
In step S320, the consecutive frame images and the images annotated with key points can be fed into the neural network to obtain the first image prediction points of the consecutive frame images and the second image prediction points of the acquired images.
Then, in step S330, for each first image prediction point of the consecutive frame images, optical-flow computation is performed on the consecutive frame images to obtain their optical-flow prediction points.
The optical-flow method estimates the position in the next frame of a feature point in the current frame by comparing the grayscale of the current frame and the next. Optical-flow computation can be performed on the consecutive frame images in the initial stage of training the neural network, through which key point detection and tracking on video can be achieved. Performing optical-flow computation only for the image prediction points reduces the amount of computation, and since optical flow is used only in the training stage of the neural network, it does not affect the network's efficiency in practical applications.
In step S340, the first loss value can be computed based on the first image prediction points and the optical-flow prediction points; in step S350, the second loss value can be computed based on the annotated key points and the second image prediction points.
Finally, in step S360, the parameters of the neural network can be adjusted based on the first and second loss values so as to optimize the network.
During training, a dynamically changing learning rate can generally be set according to the number of training epochs. At the start of training, a learning rate of 0.01 to 0.001 is advisable. Then, at certain intervals — for example, 100 values chosen over [0.00001, 0.001] — the training-set loss (the first loss value based on the consecutive frame images) and the validation-set loss (the second loss value based on the annotated images) are observed to select the optimal variable learning rate.
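Choosing candidate rates over an interval like [0.00001, 0.001] is usually done on a logarithmic grid; a small sketch (the validation-loss curve below is a made-up stand-in, purely illustrative):

```python
import numpy as np

# 100 candidate learning rates, logarithmically spaced over [1e-5, 1e-3]
candidates = np.logspace(-5, -3, num=100)

# Stand-in validation-loss curve: pretend the loss is minimised near lr = 1e-4
val_loss = (np.log10(candidates) - np.log10(1e-4)) ** 2
best_lr = candidates[np.argmin(val_loss)]  # close to 1e-4
```

In a real run, `val_loss` would be the observed validation-set loss after training briefly at each candidate rate.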
Fig. 4 shows a schematic diagram of the neural network training process according to an embodiment of the invention. As shown in Fig. 4, the left side is the training data: the upper left shows the images annotated with key points, and the lower left the consecutive video frames without annotated target points. The right side is the neural network training process: both kinds of data are fed into the neural network for facial feature point prediction. Optical-flow computation on the consecutive frame images yields the optical-flow prediction points, and the network is then trained and optimized through the loss function built from the image prediction points and the optical-flow prediction points.
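Putting the two branches together, the overall training objective can be sketched as a weighted sum of the two losses; the weighting scheme is an assumption made for illustration — the patent does not state how the two loss values are combined:

```python
import numpy as np

def combined_loss(seq_pred, flow_pred, img_pred, img_gt, w_reg=1.0, w_det=1.0):
    """Total training loss: registration term on video frames plus
    detection term on annotated images (weights w_reg, w_det are illustrative)."""
    T, K, _ = seq_pred.shape
    reg = np.sum((seq_pred - flow_pred) ** 2) / (T * K)         # first loss value
    det = np.sum((img_pred - img_gt) ** 2) / img_pred.shape[0]  # second loss value
    return w_reg * reg + w_det * det

seq_pred = np.zeros((2, 3, 2)); flow_pred = np.full((2, 3, 2), 0.1)
img_pred = np.zeros((3, 2));    img_gt = np.full((3, 2), 0.2)
total = combined_loss(seq_pred, flow_pred, img_pred, img_gt)
```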
According to one embodiment of present invention, the nerve net after the above-mentioned training of neural network training method 200 can be used Network carries out critical point detection to video to export the key point of characterization characteristics of image.Equipment 100 is calculated to be also configured to execute Critical point detection method of the invention.Wherein, the one or more programs 122 for calculating equipment 100 include for executing according to this The instruction of the critical point detection method of invention.
Fig. 5 shows a schematic diagram of neural network training results according to an embodiment of the invention. As shown in Fig. 5, the left figure is the training result without the optimized training, and the right figure is the training result after the optimized training of this scheme. The results show that the error between the predicted points and the true annotated points is reduced, and that through the tuning training of this scheme, the stability and precision of the points predicted by the neural network are improved.
According to the scheme of the invention, the stability of the predicted points output by the neural network is improved through optimized training, and the continuity between frames in a video can be learned. Optical flow computation is used only in the network optimization training stage; in the prediction stage there is no additional computation cost, so the running speed of the original network is not affected. At the same time, the stability of the output points is greatly improved and the problem of point jitter is solved at the source, so that image key points can be captured accurately in application scenarios such as real-time video beautification, improving the user experience.
It should be appreciated that, in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof in order to streamline the disclosure and aid in the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art should understand that the modules, units, or components of the devices in the examples disclosed herein may be arranged in a device as described in the embodiments, or alternatively may be located in one or more devices different from the devices in the examples. The modules in the foregoing examples may be combined into one module or may furthermore be divided into multiple submodules.
Those skilled in the art will understand that the modules in the devices of the embodiments may be changed adaptively and arranged in one or more devices different from the embodiment. The modules, units, or components in the embodiments may be combined into one module, unit, or component, and may furthermore be divided into multiple submodules, subunits, or subcomponents. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.
In addition, those skilled in the art will appreciate that although some embodiments described herein include certain features included in other embodiments but not other features, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any one of the claimed embodiments may be used in any combination.
The various techniques described herein may be implemented in connection with hardware or software, or a combination of both. Thus, the methods and devices of the invention, or certain aspects or portions thereof, may take the form of program code (i.e., instructions) embedded in a tangible medium such as a floppy disk, CD-ROM, hard drive, or any other machine-readable storage medium, wherein, when the program is loaded into a machine such as a computer and executed by the machine, the machine becomes a device for practicing the invention.
In the case where the program code is executed on programmable computers, the computing device generally includes a processor, a processor-readable storage medium (including volatile and non-volatile memory and/or storage elements), at least one input device, and at least one output device. The memory is configured to store the program code; the processor is configured to execute the method of the invention according to the instructions in the program code stored in the memory.
By way of example and not limitation, computer-readable media include computer storage media and communication media. Computer storage media store information such as computer-readable instructions, data structures, program modules, or other data. Communication media generally embody computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. Combinations of any of the above are also included within the scope of computer-readable media.
In addition, some of the embodiments are described herein as methods, or as combinations of method elements, that can be implemented by a processor of a computer system or by other devices performing the described functions. Thus, a processor having the necessary instructions for implementing the method or method elements forms a device for implementing the method or method elements. Furthermore, the elements of the device embodiments described herein are examples of devices for implementing the functions performed by the elements for the purpose of carrying out the invention.
As used herein, unless otherwise specified, the use of the ordinal words "first", "second", "third", etc. to describe ordinary objects merely indicates that different instances of similar objects are being referred to, and is not intended to imply that the objects so described must have a given order in time, space, ranking, or in any other manner.
Although the invention has been described in terms of a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be envisaged within the scope of the invention thus described. Additionally, it should be noted that the language used in this specification has been selected primarily for readability and instructional purposes, rather than to explain or limit the subject matter of the invention. Therefore, many modifications and changes will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. As for the scope of the invention, the disclosure made of the invention is illustrative rather than restrictive, and the scope of the invention is defined by the appended claims.

Claims (10)

1. A neural network training method, adapted to be executed in a computing device, the neural network being adapted to process an image so as to output key points characterizing image features, wherein the method comprises:
extracting a predetermined number of sequential frame images from video data;
inputting the sequential frame images into the neural network to obtain first image prediction points of the sequential frame images;
for each first image prediction point, carrying out optical flow computation on the sequential frame images to obtain a corresponding optical flow predicted point;
calculating a first loss value based on the first image prediction points and the optical flow predicted points of the sequential frame images; and
adjusting parameters of the neural network based on the first loss value to obtain the trained neural network.
2. The method of claim 1, wherein the step of carrying out optical flow computation on the sequential frame images for each first image prediction point to obtain a corresponding optical flow predicted point comprises:
calculating, based on an optical flow method, displacement information of each first image prediction point between adjacent frames in the sequential frame images; and
determining, based on the displacement information, the position in the current frame image of each first image prediction point of the previous frame image, so as to obtain the optical flow predicted points of the sequential frame images.
3. The method of claim 1, wherein the first loss value is calculated by the following formula:
wherein registration loss is the first loss value, $L_{t,i}$ denotes the coordinate of the i-th network prediction point in frame t of the sequential frame images, $\hat{L}_{t,i}$ is the coordinate of the corresponding i-th optical flow predicted point in frame t, K is the number of point coordinates, and T is the number of video frames.
4. The method of claim 1, wherein the method further comprises:
obtaining images annotated with key points, the images and the sequential frame images having a consistent quantity and format;
inputting the obtained images into the neural network for processing to obtain second image prediction points;
calculating a second loss value based on the annotated key points and the second image prediction points; and
adjusting the parameters of the neural network based on the second loss value to optimize the neural network.
5. The method of claim 4, wherein the second loss value is calculated by the following formula:
wherein detection loss is the second loss value, $L_i$ denotes the coordinate of the i-th network prediction point of the image, $\hat{L}_i$ is the coordinate of the corresponding i-th key point, and K is the number of point coordinates.
6. A neural network training method, adapted to be executed in a computing device, the neural network being adapted to process an input image so as to output key points characterizing image features, wherein the method comprises:
extracting a predetermined number of sequential frame images from video data, and obtaining images annotated with key points;
inputting the sequential frame images and the obtained images into the neural network respectively, to obtain first image prediction points of the sequential frame images and second image prediction points of the images;
for each first image prediction point, carrying out optical flow computation on the sequential frame images to obtain optical flow predicted points of the sequential frame images;
calculating a first loss value based on the optical flow predicted points and the first image prediction points;
calculating a second loss value based on the annotated key points and the second image prediction points; and
adjusting the parameters of the neural network based on the first loss value and the second loss value to obtain the trained neural network.
7. The method of claim 6, wherein the obtained images and the sequential frame images have a consistent quantity and format.
8. A key point detection method, adapted to be executed in a computing device, comprising:
carrying out key point detection on a video using the neural network trained by the method of any one of claims 1-7, so as to output key points characterizing image features.
9. A computing device, comprising:
one or more processors;
a memory; and
one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for executing any one of the methods according to claims 1-8.
10. A computer-readable storage medium storing one or more programs, the one or more programs comprising instructions which, when executed by a computing device, cause the computing device to execute any one of the methods according to claims 1-8.
CN201910287068.5A 2019-04-10 2019-04-10 A kind of neural network training method calculates equipment and storage medium Pending CN110059605A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910287068.5A CN110059605A (en) 2019-04-10 2019-04-10 A kind of neural network training method calculates equipment and storage medium

Publications (1)

Publication Number Publication Date
CN110059605A true CN110059605A (en) 2019-07-26

Family

ID=67318750

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910287068.5A Pending CN110059605A (en) 2019-04-10 2019-04-10 A kind of neural network training method calculates equipment and storage medium

Country Status (1)

Country Link
CN (1) CN110059605A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110826475A (en) * 2019-11-01 2020-02-21 北京齐尔布莱特科技有限公司 Method and device for detecting near-duplicate video and computing equipment
CN110956643A (en) * 2019-12-04 2020-04-03 齐鲁工业大学 Improved vehicle tracking method and system based on MDNet
CN110991365A (en) * 2019-12-09 2020-04-10 中国科学院深圳先进技术研究院 Video motion information acquisition method and system and electronic equipment
CN111311732A (en) * 2020-04-26 2020-06-19 中国人民解放军国防科技大学 3D human body grid obtaining method and device
CN111967382A (en) * 2020-08-14 2020-11-20 北京金山云网络技术有限公司 Age estimation method, and training method and device of age estimation model
CN112377332A (en) * 2020-10-19 2021-02-19 北京宇航***工程研究所 Rocket engine polarity testing method and system based on computer vision
CN112634126A (en) * 2020-12-22 2021-04-09 厦门美图之家科技有限公司 Portrait age reduction processing method, portrait age reduction training device, portrait age reduction equipment and storage medium
CN113255761A (en) * 2021-05-21 2021-08-13 深圳共形咨询企业(有限合伙) Feedback neural network system, training method and device thereof, and computer equipment
CN113469985A (en) * 2021-07-13 2021-10-01 中国科学院深圳先进技术研究院 Method for extracting characteristic points of endoscope image
CN113538325A (en) * 2020-04-21 2021-10-22 通用汽车环球科技运作有限责任公司 System and method for evaluating spot weld integrity
CN113610016A (en) * 2021-08-11 2021-11-05 人民中科(济南)智能技术有限公司 Training method, system, equipment and storage medium of video frame feature extraction model
CN113673439A (en) * 2021-08-23 2021-11-19 平安科技(深圳)有限公司 Pet dog identification method, device, equipment and storage medium based on artificial intelligence
CN113762173A (en) * 2021-09-09 2021-12-07 北京地平线信息技术有限公司 Training method and device for human face light stream estimation and light stream value prediction model

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105976400A (en) * 2016-05-10 2016-09-28 北京旷视科技有限公司 Object tracking method and device based on neural network model
CN108122234A (en) * 2016-11-29 2018-06-05 北京市商汤科技开发有限公司 Convolutional neural networks training and method for processing video frequency, device and electronic equipment
CN108229282A (en) * 2017-05-05 2018-06-29 商汤集团有限公司 Critical point detection method, apparatus, storage medium and electronic equipment
CN108230390A (en) * 2017-06-23 2018-06-29 北京市商汤科技开发有限公司 Training method, critical point detection method, apparatus, storage medium and electronic equipment
CN109089015A (en) * 2018-09-19 2018-12-25 厦门美图之家科技有限公司 Video stabilization display methods and device
CN109376659A (en) * 2018-10-26 2019-02-22 北京陌上花科技有限公司 Training method, face critical point detection method, apparatus for face key spot net detection model
CN109389072A (en) * 2018-09-29 2019-02-26 北京字节跳动网络技术有限公司 Data processing method and device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zhu Zhiyu, "Manifold Particle Filtering Algorithm and Its Application in Video Object Tracking", 31 January 2015, Beijing: National Defense Industry Press *


Similar Documents

Publication Publication Date Title
CN110059605A (en) A kind of neural network training method calculates equipment and storage medium
JP7236545B2 (en) Video target tracking method and apparatus, computer apparatus, program
CN110689109B (en) Neural network method and device
US10497122B2 (en) Image crop suggestion and evaluation using deep-learning
US20200372246A1 (en) Hand pose estimation
WO2018227800A1 (en) Neural network training method and device
US20200090042A1 (en) Data efficient imitation of diverse behaviors
CN110503074A (en) Information labeling method, apparatus, equipment and the storage medium of video frame
US11559887B2 (en) Optimizing policy controllers for robotic agents using image embeddings
CN105981075B (en) Utilize the efficient facial landmark tracking in wire shaped homing method
CN110062934A (en) The structure and movement in image are determined using neural network
CN107808147A (en) A kind of face Confidence method based on the tracking of real-time face point
CN110084313A (en) A method of generating object detection model
CN109948741A (en) A kind of transfer learning method and device
US10943352B2 (en) Object shape regression using wasserstein distance
CN102640168A (en) Method and apparatus for local binary pattern based facial feature localization
CN110751039B (en) Multi-view 3D human body posture estimation method and related device
CN110276289A (en) Generate the method and human face characteristic point method for tracing of Matching Model
CN113095254B (en) Method and system for positioning key points of human body part
WO2019117970A1 (en) Adaptive object tracking policy
Passalis et al. Deep reinforcement learning for controlling frontal person close-up shooting
CN107886516A (en) The method and computing device that hair moves towards in a kind of calculating portrait
CN110020600A (en) Generate the method for training the data set of face alignment model
CN110287857A (en) A kind of training method of characteristic point detection model
CN115346262A (en) Method, device and equipment for determining expression driving parameters and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190726