CN110366029A - Method, system and electronic device for inserting image frames into a video - Google Patents
Method, system and electronic device for inserting image frames into a video
- Publication number
- CN110366029A CN110366029A CN201910600097.2A CN201910600097A CN110366029A CN 110366029 A CN110366029 A CN 110366029A CN 201910600097 A CN201910600097 A CN 201910600097A CN 110366029 A CN110366029 A CN 110366029A
- Authority
- CN
- China
- Prior art keywords
- video
- network
- image
- human body
- lstm
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/60—Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
- H04N21/63—Control signaling related to video distribution between client, server and network components; Network processes for video distribution between server and clients or between remote clients, e.g. transmitting basic layer and enhancement layers over different transmission paths, setting up a peer-to-peer communication via Internet between remote STB's; Communication protocols; Addressing
- H04N21/647—Control signaling between network components and server or clients; Network processes for video distribution between server and clients, e.g. controlling the quality of the video stream, by dropping packets, protecting content from unauthorised alteration within the network, monitoring of network load, bridging between two different networks, e.g. between IP and wireless
- H04N21/64784—Data processing by the network
- H04N21/64792—Controlling the complexity of the content stream, e.g. by dropping packets
Abstract
This application relates to a method, system and electronic device for inserting image frames into a video. The method includes: selecting m feature maps containing a pedestrian from before the time of video loss and m from after the video recovery time, and collecting a set number of human pose keypoints from each feature map; feeding all human pose keypoints into an AlexNet network, which predicts the pedestrian pose of the images to be restored using the method of cubic polynomial fitting combined with cubic spline interpolation; feeding the human pose keypoints of the feature maps from before the time of video loss into an LSTM network to obtain a pedestrian pose prediction for the images to be restored; obtaining the images to be restored from the pedestrian pose predictions of the AlexNet and LSTM networks, computing the insertion position of each restored image in the video, and inserting the restored images at the corresponding positions in the video. The application improves the accuracy of existing algorithms.
Description
Technical field
The application belongs to the technical field of video frame interpolation, and in particular relates to a method, system and electronic device for inserting image frames into a video.
Background technology
With the growing popularity of short entertainment videos, there is currently a large demand in China for the transmission and playback of short videos. A problem that comes with this is frame loss during video transmission caused by network issues and other factors. Such frame loss gives viewers of the video a visually discontinuous experience, which not only harms the viewing experience but may also reduce the usable value of the video when key information is lost.
When facing lost frames, the prior art usually improves the quality of the transmission process so that uploads or downloads lose as few frames as possible, but it is helpless against frames that have already been lost. In recent years, top academic research on this problem has been scarce, mainly because it is a new demand arising from industry; existing related work mostly predicts the next few frames from past video frames, and hardly any of it restores a few lost frames inside a continuous video. The document [Walker J, Marino K, Gupta A, et al. The pose knows: Video forecasting by generating pose futures[C]// Proceedings of the IEEE International Conference on Computer Vision. 2017: 3332-3341.] shows that for predicting possible next-frame behavior, the VAE method [Kingma D P, Welling M. Auto-encoding variational bayes[J]. arXiv preprint arXiv:1312.6114, 2013.] can be used for pedestrian pose prediction, but these pose predictions only forecast future frames; they do not use such predictions to complete the missing frames between two related video segments.
Summary of the invention
This application provides a method, system and electronic device for inserting image frames into a video, intended to solve, at least to a certain extent, one of the above technical problems in the prior art.
To solve the above problems, this application provides the following technical solutions:
A method for inserting image frames into a video, comprising the following steps:
Step a: selecting m feature maps containing a pedestrian from before the time of video loss and m from after the video recovery time, and collecting a set number of human pose keypoints from each feature map;
Step b: feeding all human pose keypoints into an AlexNet network, which predicts the pedestrian pose of the images to be restored using the method of cubic polynomial fitting combined with cubic spline interpolation;
Step c: feeding the human pose keypoints of the feature maps from before the time of video loss into an LSTM network to obtain a pedestrian pose prediction for the images to be restored;
Step d: obtaining the images to be restored from the pedestrian pose predictions of the AlexNet and LSTM networks, computing the insertion position of each restored image in the video, and inserting the restored images at the corresponding positions in the video.
The technical solution adopted by the embodiment of the present application further includes: assuming that 17 human pose keypoints are collected from each feature map, in step b the pedestrian pose prediction method of the AlexNet network specifically includes:
Step b1: treating the 17 human pose keypoints as 17 IDs, a regression curve is determined for each via cubic polynomial fitting. The pose keypoint of each ID has a coordinate in the image, denoted location_ID = (x_i, y_i), i = ID, and substituting the series of (x_i, y_i) formed by each ID into the cubic polynomial gives:
y = ax^3 + bx^2 + cx + d
Computer fitting yields 17 groups of (a_i, b_i, c_i, d_i), forming 17 fitted cubic polynomials y = f(x). Plotting y = f(x) on an image generates 17 curves, whose horizontal and vertical coordinates are the position representations of the respective pose keypoints.
Step b2: recovering, by cubic spline interpolation, the image coordinate points between the two frames before and after the time of video loss, obtaining the pedestrian pose prediction of the images to be restored.
The technical solution adopted by the embodiment of the present application further includes: in step c, the input structure of the LSTM network is:
[h_t, c_t] = LSTM(p_t, h_(t-1), c_(t-1))
and the pose prediction of the next frame to be restored is p_(t+1) = W^T h_t. In the above formulas, W^T denotes the weight learned by neural network training, and h_t, c_t are the internal parameters of the LSTM structure; the loss function of the LSTM network is denoted object2 = Loss(LSTM).
The technical solution adopted by the embodiment of the present application further includes: after step c, defining an objective function and optimizing the AlexNet and LSTM networks according to it:
object_final = object1 + object2 + |object1 - object2|
In the above formula, the term |object1 - object2| pushes the pedestrian pose predictions generated by the AlexNet and LSTM networks to be as close as possible.
The technical solution adopted by the embodiment of the present application further includes: in step d, inserting the images to be restored at the corresponding positions in the video specifically includes: the optimized AlexNet and LSTM networks each yield a group of restored images with the same frame count, each frame containing 17 human pose keypoints; the 17 keypoints of each restored frame are matched to each other by ID, and the average of the (x_i, y_i) passed from the AlexNet network and the corresponding coordinates passed from the LSTM network is taken, giving the insertion position of each restored frame; all restored images are then inserted at the corresponding positions. The position is computed as the per-keypoint coordinate average of the two networks' predictions.
Another technical solution adopted by the embodiment of the present application is: a system for inserting image frames into a video, comprising:
A feature map selection module: for selecting m feature maps containing a pedestrian from before the time of video loss and m from after the video recovery time;
A pose point collection module: for collecting a set number of human pose keypoints from each feature map;
An AlexNet network prediction module: for feeding all human pose keypoints into an AlexNet network, which predicts the pedestrian pose of the images to be restored using the method of cubic polynomial fitting combined with cubic spline interpolation;
An LSTM network prediction module: for feeding the human pose keypoints of the feature maps from before the time of video loss into an LSTM network to obtain a pedestrian pose prediction of the images to be restored;
An image insertion module: for obtaining the images to be restored from the pedestrian pose predictions of the AlexNet and LSTM networks, computing the insertion position of each restored image in the video, and inserting the restored images at the corresponding positions in the video.
The technical solution adopted by the embodiment of the present application further includes: assuming that 17 human pose keypoints are collected from each feature map, the pedestrian pose prediction method of the AlexNet network prediction module is specifically:
Treating the 17 human pose keypoints as 17 IDs, a regression curve is determined for each via cubic polynomial fitting. The pose keypoint of each ID has a coordinate in the image, denoted location_ID = (x_i, y_i), i = ID, and substituting the series of (x_i, y_i) formed by each ID into the cubic polynomial gives:
y = ax^3 + bx^2 + cx + d
Computer fitting yields 17 groups of (a_i, b_i, c_i, d_i), forming 17 fitted cubic polynomials y = f(x). Plotting y = f(x) on an image generates 17 curves, whose horizontal and vertical coordinates are the position representations of the respective pose keypoints.
By cubic spline interpolation, the image coordinate points between the two frames before and after the time of video loss are then recovered, obtaining the pedestrian pose prediction of the images to be restored.
The technical solution adopted by the embodiment of the present application further includes: the input structure of the LSTM network is:
[h_t, c_t] = LSTM(p_t, h_(t-1), c_(t-1))
and the pose prediction of the next frame to be restored is p_(t+1) = W^T h_t. In the above formulas, W^T denotes the weight learned by neural network training, and h_t, c_t are the internal parameters of the LSTM structure; the loss function of the LSTM network is denoted object2 = Loss(LSTM).
The technical solution adopted by the embodiment of the present application further includes a network optimization module, which defines an objective function and optimizes the AlexNet and LSTM networks according to it:
object_final = object1 + object2 + |object1 - object2|
In the above formula, the term |object1 - object2| pushes the pedestrian pose predictions generated by the AlexNet and LSTM networks to be as close as possible.
The technical solution adopted by the embodiment of the present application further includes: the image insertion module inserting the images to be restored at the corresponding positions in the video specifically includes: the optimized AlexNet and LSTM networks each yield a group of restored images with the same frame count, each frame containing 17 human pose keypoints; the 17 keypoints of each restored frame are matched to each other by ID, and the average of the (x_i, y_i) passed from the AlexNet network and the corresponding coordinates passed from the LSTM network is taken, giving the insertion position of each restored frame; all restored images are then inserted at the corresponding positions. The position is computed as the per-keypoint coordinate average of the two networks' predictions.
Yet another technical solution adopted by the embodiment of the present application is: an electronic device, comprising:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor is able to perform the following operations of the above method for inserting image frames into a video:
Step a: selecting m feature maps containing a pedestrian from before the time of video loss and m from after the video recovery time, and collecting a set number of human pose keypoints from each feature map;
Step b: feeding all human pose keypoints into an AlexNet network, which predicts the pedestrian pose of the images to be restored using the method of cubic polynomial fitting combined with cubic spline interpolation;
Step c: feeding the human pose keypoints of the feature maps from before the time of video loss into an LSTM network to obtain a pedestrian pose prediction for the images to be restored;
Step d: obtaining the images to be restored from the pedestrian pose predictions of the AlexNet and LSTM networks, computing the insertion position of each restored image in the video, and inserting the restored images at the corresponding positions in the video.
Compared with the prior art, the beneficial effect produced by the embodiment of the present application is: the method, system and electronic device for inserting image frames into a video combine an AlexNet network with an LSTM network in a scheme where the two predictions promote each other. The cubic-spline prediction based on the video frames before and after the gap compensates for the deficiency that the LSTM can learn only from the known samples before the interruption, and effectively solves the inaccuracy of a single LSTM prediction, bringing video viewers a better viewing experience. Meanwhile, the dual network of the application effectively improves the accuracy of existing algorithms, is highly extensible, and can complete more complex prediction tasks by replacing the convolutional neural network.
Description of the drawings
Fig. 1 is a flowchart of the method for inserting image frames into a video according to an embodiment of the present application;
Fig. 2 is a structural schematic diagram of the system for inserting image frames into a video according to an embodiment of the present application;
Fig. 3 is a structural schematic diagram of the hardware device for the method for inserting image frames into a video provided by an embodiment of the present application.
Specific embodiment
In order to make the objects, technical solutions and advantages of the application more clearly understood, the application is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein only explain the application and are not intended to limit it.
For the technical problems existing in the prior art, the present application proposes a method that uses a long short-term memory network (LSTM) for memory-based prediction and in-video frame insertion. Through convolutional neural networks, the method turns the frame-insertion process into a learning process, so that the network continuously learns the relevant parameter weights and automatically generates image frames for insertion into a video that is continuous but interrupted in the middle, completing the video visually and giving viewers a better viewing experience.
The application only performs target prediction and the insertion process in the case where human poses have been detected (for the human pose detection method see [Fang H S, Xie S, Tai Y W, et al. Rmpe: Regional multi-person pose estimation[C]// Proceedings of the IEEE International Conference on Computer Vision. 2017: 2334-2343.]). Specifically, referring to Fig. 1, the flowchart of the method for inserting image frames into a video of an embodiment of the present application, the method includes the following steps:
Step 100: selecting m frames each from before the time of video loss and after the video recovery time as the feature maps for generating the images to be restored;
In step 100, suppose the video is lost at time t=p and recovers at time t=q, with time measured in seconds (s). The duration of video loss is then miss = |p - q|. Since smooth video generally runs at 24 frames per second, the total number of frames to be restored is Total Frames = miss * 24. Selecting m frames each before the loss time t=p and after the recovery time t=q gives an input of Input Frames = 2 * m.
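As an illustrative sketch of the bookkeeping above (the function names `total_frames` and `input_frames` are not from the patent), the frame counts of step 100 can be computed as:

```python
FPS = 24  # the patent assumes smooth video at 24 frames per second

def total_frames(p: float, q: float) -> int:
    """Number of frames to restore for a gap from t=p to t=q seconds."""
    miss = abs(p - q)          # duration of video loss, in seconds
    return int(miss * FPS)     # Total Frames = miss * 24

def input_frames(m: int) -> int:
    """m context frames before the loss plus m after the recovery."""
    return 2 * m               # Input Frames = 2 * m

# A loss from t=2s to t=5s at 24 fps needs 72 restored frames;
# with m = 5 context frames per side, the networks see 10 input frames.
print(total_frames(2.0, 5.0))  # 72
print(input_frames(5))         # 10
```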
Step 200: collecting a set number of human pose keypoints from each feature map;
In step 200, the application selects 17 human body nodes as the pose keypoints, as shown in Table 1 below (it should be understood that the number and positions of the pose keypoints can be set according to the practical application):
Table 1: Human body node specification
Step 300: feeding the collected human pose keypoints into the AlexNet network [Krizhevsky A, Sutskever I, Hinton G E. Imagenet classification with deep convolutional neural networks[C]// Advances in neural information processing systems. 2012: 1097-1105.]; the AlexNet network predicts the pedestrian pose of the images to be restored using the method of cubic polynomial fitting combined with cubic spline interpolation;
In step 300, since each feature map contains 17 human pose keypoints, there are 2*17*m points in total as the input of the AlexNet network. The pedestrian pose prediction method of the AlexNet network specifically includes:
Step 301: first, treating the 17 human pose keypoints as 17 IDs (ID = 1, 2, 3, ..., 17), a regression curve is determined for each via cubic polynomial fitting. The pose keypoint of each ID has a coordinate in the image, denoted location_ID = (x_i, y_i), i = ID, and substituting the series of (x_i, y_i) formed by each ID into the cubic polynomial gives:
y = ax^3 + bx^2 + cx + d    (1)
In the above formula, a ≠ 0 and b, c, d are constants. Computer fitting (MATLAB) yields 17 groups of (a_i, b_i, c_i, d_i), ultimately forming 17 fitted cubic polynomials y = f(x). Plotting y = f(x) on an image generates 17 curves, whose horizontal and vertical coordinates are the respective position representations.
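A minimal sketch of the fit in step 301 using NumPy rather than the MATLAB fitting the patent mentions (the helper name `fit_keypoint_track` is an assumption): for one keypoint ID, a cubic y = ax^3 + bx^2 + cx + d is fitted to its observed positions.

```python
import numpy as np

def fit_keypoint_track(xs, ys):
    """Fit y = a*x**3 + b*x**2 + c*x + d to one keypoint's (x, y) samples."""
    a, b, c, d = np.polyfit(xs, ys, deg=3)   # np.polyfit returns highest degree first
    return a, b, c, d

# Synthetic track that follows y = 2x^3 - x + 4 exactly, so the fit is checkable.
xs = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
ys = 2 * xs**3 - xs + 4
coeffs = np.array(fit_keypoint_track(xs, ys))
assert np.allclose(coeffs, [2.0, 0.0, -1.0, 4.0], atol=1e-6)
```

In the patent's setting this fit would be repeated once per ID, giving the 17 groups of (a_i, b_i, c_i, d_i).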
Step 302: recovering, by cubic spline interpolation, the image coordinate points between the two frame positions t=p and t=q of the interruption, obtaining the pedestrian pose prediction of the images to be restored. The cubic spline interpolation is as follows:
There are 17 nodes in total: x_0 < x_1 < ... < x_(n-1) < x_n (n = 17),
with function values y_i = f(x_i),
and s(x) satisfies s(x_i) = y_i (i = 0, 1, 2, ..., 16).
In formula (3), s_k(x) is the cubic piece on [x_(k-1), x_k] (k = 1, 2, ..., 17), and at the same time s_k(x_(k-1)) = y_(k-1), s_k(x_k) = y_k.
According to the boundary conditions of the spline interpolation method, it can be obtained that each s_k(x) is a cubic polynomial of the form determined in (1), with four undetermined coefficients a_i, b_i, c_i, d_i, so there are 4n coefficients in total; but only 2n + 2(n-1) = 4n-2 equations have been derived so far, which is not enough to solve the system. Initial and boundary conditions are therefore added to construct 4n equations.
These two equations are respectively:
s'(x_0) = f'_0    (5)
s'(x_n) = f'_n    (6)
The above system is solved by the chasing method (a tridiagonal matrix algorithm). In this way, the curve of the images to be restored is obtained; zooming in on the curve near x_p and x_q, Total Frames points are randomly sampled, and the same operation is applied to each ID point, yielding the 17 sample points of each frame to be restored. These 17 sample points constitute the pedestrian pose prediction result of each restored frame.
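Step 302 can be sketched with SciPy's cubic spline, which solves the same kind of tridiagonal system internally as the chasing method above (the helper name `recover_gap` and the frame indices are illustrative, not from the patent):

```python
import numpy as np
from scipy.interpolate import CubicSpline

def recover_gap(known_t, known_y, gap_t):
    """Interpolate one keypoint coordinate across the missing frame indices."""
    spline = CubicSpline(known_t, known_y)  # builds and solves the spline system
    return spline(gap_t)

# Frames 0-4 and 10-14 are known; frames 5-9 were lost with the video.
known_t = np.array([0, 1, 2, 3, 4, 10, 11, 12, 13, 14], dtype=float)
known_y = 0.5 * known_t + 3.0            # keypoint moving linearly, for checkability
gap_t = np.arange(5, 10, dtype=float)
recovered = recover_gap(known_t, known_y, gap_t)
# A cubic spline reproduces linear motion exactly, so the gap is filled perfectly.
assert np.allclose(recovered, 0.5 * gap_t + 3.0, atol=1e-8)
```

Running this once per keypoint ID yields the 17 sample points of each frame to be restored.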
In the convolutional neural network training, the application uses the mean squared error between the predicted keypoint coordinates and the true coordinates in the training set as the experimental error of the convolutional neural network, as shown in formula (7). Through the continuous training of AlexNet, suitable spline interpolation coefficients are adjusted to achieve a good prediction effect.
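The mean-squared-error training signal of formula (7) can be sketched as follows (the exact normalization in the patent's formula is not visible, so averaging over all keypoint coordinates is an assumption):

```python
import numpy as np

def pose_mse(pred, truth):
    """Mean squared error between predicted and ground-truth keypoints.

    pred, truth: arrays of shape (num_keypoints, 2) holding (x, y) coordinates.
    """
    return float(np.mean((pred - truth) ** 2))

pred = np.array([[1.0, 2.0], [3.0, 4.0]])
truth = np.array([[1.0, 2.0], [3.0, 6.0]])   # one y-coordinate is off by 2
print(pose_mse(pred, truth))  # 1.0  (errors 0,0,0,4 averaged over 4 terms)
```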
Step 400: obtaining the pedestrian pose prediction of the images to be restored using an LSTM (long short-term memory network), based on the human pose keypoints collected from the feature maps before the video loss time t=p;
In step 400, the input structure of the LSTM is:
[h_t, c_t] = LSTM(p_t, h_(t-1), c_(t-1))    (8)
and the pose prediction of the next frame to be restored is:
p_(t+1) = W^T h_t    (9)
In formulas (8) and (9), W^T denotes the weight learned by neural network training, and h_t, c_t are the internal parameters of the LSTM structure. The loss function of the LSTM network can be simply denoted object2 = Loss(LSTM).
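A minimal sketch of the recurrence [h_t, c_t] = LSTM(p_t, h_(t-1), c_(t-1)) with a linear readout for the next pose. All weights here are random placeholders, not the patent's trained parameters, and the hidden size is an arbitrary choice:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(p_t, h_prev, c_prev, W, b):
    """One step of a standard LSTM cell; W maps [p_t, h_prev] to the 4 gates."""
    z = W @ np.concatenate([p_t, h_prev]) + b
    H = h_prev.size
    i = sigmoid(z[:H])          # input gate
    f = sigmoid(z[H:2 * H])     # forget gate
    g = np.tanh(z[2 * H:3 * H]) # candidate cell state
    o = sigmoid(z[3 * H:])      # output gate
    c_t = f * c_prev + i * g
    h_t = o * np.tanh(c_t)
    return h_t, c_t

rng = np.random.default_rng(0)
D, H = 34, 16                         # 17 keypoints * 2 coordinates in; hidden size 16
W = rng.normal(0, 0.1, (4 * H, D + H))
b = np.zeros(4 * H)
W_out = rng.normal(0, 0.1, (D, H))    # readout: p_(t+1) = W_out @ h_t, as in formula (9)

h, c = np.zeros(H), np.zeros(H)
for pose in rng.normal(0, 1, (5, D)): # feed 5 pre-loss pose frames
    h, c = lstm_step(pose, h, c, W, b)
next_pose = W_out @ h                 # predicted pose of the next missing frame
assert next_pose.shape == (D,)
```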
Step 500: defining an objective function and optimizing the AlexNet and LSTM networks according to it;
In step 500, the AlexNet network and the LSTM network each generate pedestrian pose prediction results for the Total Frames frames. The prediction of the AlexNet network is based on the data distribution both before the video loss time and after the video recovery time, while the prediction generated by the LSTM network is based only on features from before the video loss time. To let the two networks complement each other's weaknesses and make use of each other, the application defines the following objective function to optimize the AlexNet and LSTM networks:
object_final = object1 + object2 + |object1 - object2|    (10)
In formula (10), the term |object1 - object2| pushes the pedestrian pose predictions generated by the AlexNet and LSTM networks to be as close as possible, which is what finally produces a relatively good restoration result.
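The combined objective of formula (10) is straightforward to compute once the two per-network losses are known (the function name is illustrative):

```python
def combined_objective(object1: float, object2: float) -> float:
    """object_final = object1 + object2 + |object1 - object2|.

    The |object1 - object2| term penalizes disagreement between the
    AlexNet-based and the LSTM-based pose predictions.
    """
    return object1 + object2 + abs(object1 - object2)

print(combined_objective(0.5, 0.25))  # 1.0  (0.5 + 0.25 + 0.25)
```

Note the combined loss is symmetric in the two networks, so neither branch is privileged during joint optimization.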
Step 600: outputting the respective final prediction results of the images to be restored from the optimized AlexNet and LSTM networks, and inserting the restored images at the corresponding positions in the video;
In step 600, through the optimization of the AlexNet and LSTM networks, two groups of restored images, each with frame count Total Frames, are obtained, and each frame contains 17 human pose keypoints. The 17 pose keypoints in each frame are matched to each other by ID, and the average of the (x_i, y_i) passed from the AlexNet network and the corresponding coordinates passed from the LSTM network is taken, giving the insertion position of each restored frame; all restored images are then inserted at the corresponding positions, completing the insertion of the lost image frames in the whole video. The position is computed as the per-keypoint coordinate average of the two networks' predictions.
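A sketch of this final fusion step: the 17 keypoints of each restored frame are matched by ID and the two networks' coordinates are averaged (array names and shapes are illustrative assumptions):

```python
import numpy as np

def fuse_predictions(alexnet_pts, lstm_pts):
    """Average two (num_frames, 17, 2) keypoint arrays elementwise by ID."""
    assert alexnet_pts.shape == lstm_pts.shape
    return (alexnet_pts + lstm_pts) / 2.0

a = np.full((3, 17, 2), 10.0)   # AlexNet branch: every keypoint at (10, 10)
b = np.full((3, 17, 2), 20.0)   # LSTM branch: every keypoint at (20, 20)
fused = fuse_predictions(a, b)
assert np.all(fused == 15.0)    # each inserted keypoint lands halfway between
```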
Referring to Fig. 2, be the embodiment of the present application video between be inserted into picture frame system structural schematic diagram.The application is real
Applying the system that picture frame is inserted between the video of example includes that characteristic pattern selecting module, posture point acquisition module, Alex Net network are pre-
It surveys module, LSTM neural network forecast module, network optimization module and image and is inserted into module.
Feature-map selection module: for selecting m frames each before the video-loss time and after the video-recovery time as the feature maps used to generate the to-be-restored images. In the embodiment of the present application, assume the video is lost at time t = p and recovers at time t = q, with time measured in seconds (s); the lost duration is then miss = |p − q|. Since smooth video generally runs at 24 frames per second, the total number of to-be-restored frames is Total Frames = miss * 24. Selecting m frames each before the video-loss time t = p and after the video-recovery time t = q gives an input of Input Frames = 2 * m.
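The frame-count bookkeeping above can be sketched as follows (the function and variable names are ours, not the patent's):

```python
def frame_counts(p, q, m, fps=24):
    """Number of frames to restore and number of input feature frames.

    p, q: loss and recovery times in seconds; m: frames picked on each side.
    The patent assumes smooth video at 24 frames per second.
    """
    miss = abs(p - q)              # lost duration in seconds
    total_frames = miss * fps      # Total Frames = miss * 24
    input_frames = 2 * m           # m frames before t=p plus m frames after t=q
    return total_frames, input_frames
```

For example, a loss at t = 10 s recovered at t = 13 s with m = 5 gives 72 frames to restore from 10 input feature frames.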
Pose-point acquisition module: for acquiring a set number of human-pose points from each feature map. In the present application, 17 human-body nodes are selected as the human-pose points, as shown in Table 1 below (it should be understood that the number and positions of the pose points can be set according to the practical application):
Table 1: human-body node specification
Alex Net network prediction module: for inputting the acquired human-pose points into the Alex Net network, which predicts the pedestrian poses of the to-be-restored images using cubic-polynomial fitting combined with cubic-spline interpolation. Since each feature map contains 17 human-pose points, there are 2 * 17 * m points in total as the input of the Alex Net network. The Alex Net network prediction module specifically includes:
Cubic-polynomial fitting unit: for treating the 17 human-pose points as 17 IDs (ID = 1, 2, 3, …, 17) and determining a regression curve for each by cubic-polynomial fitting. The pose point corresponding to each ID has a coordinate in the image, denoted locationID = (xi, yi), i = ID; the series of (xi, yi) formed by each ID is substituted into the cubic polynomial, giving:
y = ax³ + bx² + cx + d (1)
By computer fitting (MATLAB), 17 groups of (ai, bi, ci, di) are obtained, ultimately forming 17 fitted cubic curves y = f(x). Plotting y = f(x) on an image generates 17 curves, whose abscissa and ordinate respectively represent the position of the corresponding pose point.
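A minimal sketch of the per-ID cubic fit, using NumPy's least-squares polynomial fitting in place of the MATLAB fitting mentioned above; the data layout and names are our assumptions:

```python
import numpy as np

def fit_keypoint_curves(tracks):
    """Fit one cubic y = a*x^3 + b*x^2 + c*x + d per keypoint ID (formula (1)).

    tracks: dict mapping ID -> list of (x, y) coordinates observed for that
    keypoint across the input feature frames.
    Returns dict mapping ID -> (a, b, c, d).
    """
    coeffs = {}
    for kp_id, points in tracks.items():
        xs = np.array([p[0] for p in points], dtype=float)
        ys = np.array([p[1] for p in points], dtype=float)
        a, b, c, d = np.polyfit(xs, ys, deg=3)   # least-squares cubic fit
        coeffs[kp_id] = (a, b, c, d)
    return coeffs
```

With 2m input frames per keypoint there are 2m points per curve, so m ≥ 2 is needed for the four coefficients to be determined.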
Cubic-spline interpolation unit: for restoring, by cubic-spline interpolation, the image coordinate points interrupted between the two frame positions t = p and t = q, obtaining the pedestrian-pose prediction results of the to-be-restored images. The cubic-spline interpolation is as follows:
There are 17 nodes x0 < x1 < … < xn−1 < xn (n = 16), with function values yi = f(xi), and the spline s(x) satisfies s(xi) = yi (i = 0, 1, 2, …, 16).
The spline is written piecewise as s(x) = sk(x) for x ∈ [xk−1, xk] (formula (3)), where k = 1, 2, …, n; each piece satisfies sk(xk−1) = yk−1 and sk(xk) = yk.
Each sk(x) is a cubic polynomial of the form determined in (1), with four undetermined coefficients ak, bk, ck, dk, so there are 4n coefficients in all; the interpolation and continuity conditions of the spline method yield only 2n + 2(n − 1) = 4n − 2 equations, which is not enough to solve the system, so initial and boundary conditions are added to construct 4n equations. The two additional equations are:
s′(x0) = f′0 (5)
s′(xn) = f′n (6)
The resulting tridiagonal system is solved by the chasing (Thomas) method of matrix computation. In this way the curve of the to-be-restored image is obtained: the curve between xp and xq is magnified, and Total Frames points are randomly sampled from it. Performing the same operation for each ID point yields the 17 sample points of every to-be-restored frame, and these 17 sample points constitute the pedestrian-pose prediction result of that frame.
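The chasing (Thomas) method for the spline's tridiagonal system can be sketched as follows. For simplicity this sketch uses natural end conditions (second derivative zero at both ends); the clamped conditions (5) and (6) would only change the first and last rows of the system. All names are illustrative:

```python
import numpy as np

def thomas(a, b, c, d):
    """Chasing (Thomas) method for a tridiagonal system.
    a: sub-diagonal (a[0] unused), b: diagonal, c: super-diagonal
    (last entry unused), d: right-hand side."""
    n = len(b)
    cp = np.zeros(n)
    dp = np.zeros(n)
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):                        # forward elimination
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    x = np.zeros(n)
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):               # back substitution
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def natural_cubic_spline(xs, ys):
    """Build a natural cubic spline through the nodes (xs, ys) and return a
    callable s(x). The unknowns are the second derivatives M_i at the nodes;
    the interior equations form the tridiagonal system solved by thomas()."""
    n = len(xs) - 1                  # number of intervals
    h = np.diff(xs)
    a = np.zeros(n - 1)
    b = np.zeros(n - 1)
    c = np.zeros(n - 1)
    d = np.zeros(n - 1)
    for i in range(1, n):            # one equation per interior node
        a[i - 1] = h[i - 1]
        b[i - 1] = 2.0 * (h[i - 1] + h[i])
        c[i - 1] = h[i]
        d[i - 1] = 6.0 * ((ys[i + 1] - ys[i]) / h[i]
                          - (ys[i] - ys[i - 1]) / h[i - 1])
    M = np.zeros(n + 1)              # natural ends: M_0 = M_n = 0
    if n > 1:
        M[1:n] = thomas(a, b, c, d)

    def s(x):
        k = int(np.clip(np.searchsorted(xs, x) - 1, 0, n - 1))
        t = h[k]
        return (M[k] * (xs[k + 1] - x) ** 3 / (6 * t)
                + M[k + 1] * (x - xs[k]) ** 3 / (6 * t)
                + (ys[k] / t - M[k] * t / 6) * (xs[k + 1] - x)
                + (ys[k + 1] / t - M[k + 1] * t / 6) * (x - xs[k]))
    return s
```

With the clamped conditions (5) and (6), the first and last equations of the system would instead fix s′(x0) = f′0 and s′(xn) = f′n; the interior rows are identical.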
In the convolutional-neural-network training, the present application measures the experimental error of the network by the mean squared error between the predicted and true coordinates, as shown in the following formula:
object1 = (1/N) Σi=1..N [(x̂i − xi)² + (ŷi − yi)²] (7)
In formula (7), (xi, yi) denotes a true coordinate in the training set and (x̂i, ŷi) the corresponding prediction; through the continual training of the Alex Net network, suitable spline-interpolation coefficients are adjusted to reach a good prediction effect.
LSTM network prediction module: for obtaining the pedestrian-pose prediction results of the to-be-restored images with an LSTM (long short-term memory) network, based on the human-pose points acquired from the feature maps before the video-loss time t = p. The input structure of the LSTM network is:
[ht, ct] = LSTM(pt, ht−1, ct−1) (8)
The pose prediction of the next to-be-restored frame is then:
p̂t+1 = WT ht (9)
In formulas (8) and (9), WT denotes the weights learned by network training, and ht, ct are the internal parameters of the LSTM structure. The loss function of the LSTM network can be expressed simply as object2 = Loss(LSTM).
Network optimization module: for constructing an objective function and optimizing the Alex Net network and the LSTM network according to the objective function. The Alex Net network and the LSTM network each generate pedestrian-pose predictions for Total Frames frames: the prediction of the Alex Net network is based on the data distribution both before the video-loss time and after the video-recovery time, while the prediction of the LSTM network is based only on the features before the video-loss time. To let the two networks learn from each other's strengths and offset each other's weaknesses, the present application defines the following objective function to optimize the Alex Net network and the LSTM network:
objectfinal = object1 + object2 + |object1−object2| (10)
In formula (10), the term |object1−object2| forces the pedestrian-pose predictions generated by the Alex Net network and the LSTM network to be as close to each other as possible, so that a good restoration result can finally be produced.
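The joint objective of formula (10) is straightforward to compute; a sketch with an assumed mean-squared-error loss for each network (names are illustrative):

```python
import numpy as np

def mse(pred, true):
    """Mean squared coordinate error, in the spirit of formula (7)."""
    pred, true = np.asarray(pred, float), np.asarray(true, float)
    return float(np.mean(np.sum((pred - true) ** 2, axis=-1)))

def joint_objective(object1, object2):
    """Formula (10): object_final = object1 + object2 + |object1 - object2|."""
    return object1 + object2 + abs(object1 - object2)
```

Note that object1 + object2 + |object1 − object2| equals twice the larger of the two losses, so minimizing it always puts pressure on whichever network is currently worse.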
Image insertion module: for outputting the final prediction results for the to-be-restored images from the optimized Alex Net network and LSTM network respectively, and inserting the to-be-restored images into the video at the corresponding positions. Through the optimization of the Alex Net network and the LSTM network, two groups of to-be-restored images, each Total Frames long, are obtained; every frame contains 17 human-pose points. The 17 pose points in each frame are matched to each other by their IDs, and the average of the (xi, yi) passed in from the Alex Net network and the (x′i, y′i) passed in from the LSTM network is taken to obtain the insertion position of each to-be-restored frame; all to-be-restored images are then inserted at the corresponding positions, completing the insertion of the lost image frames in the entire video. The position calculation formula is:
(x̄i, ȳi) = ((xi + x′i)/2, (yi + y′i)/2), i = 1, 2, …, 17 (11)
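The ID-matched averaging of the two networks' outputs can be sketched as (the array layout is our assumption):

```python
import numpy as np

def insertion_positions(alex_pts, lstm_pts):
    """Average the two networks' keypoint predictions, matched by ID.

    alex_pts, lstm_pts: arrays of shape (total_frames, 17, 2), where the
    second axis indexes the keypoint ID and the last axis holds (x, y).
    """
    alex_pts = np.asarray(alex_pts, dtype=float)
    lstm_pts = np.asarray(lstm_pts, dtype=float)
    assert alex_pts.shape == lstm_pts.shape
    return (alex_pts + lstm_pts) / 2.0
```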
Fig. 3 is a schematic structural diagram of the hardware device for the method of inserting image frames into a video provided by the embodiments of the present application. As shown in Fig. 3, the device includes one or more processors and a memory. Taking one processor as an example, the device may also include an input system and an output system.
The processor, memory, input system and output system may be connected by a bus or in other ways; connection by a bus is taken as an example in Fig. 3.
The memory, as a non-transient computer-readable storage medium, can be used to store non-transient software programs and non-transient computer-executable programs and modules. By running the non-transient software programs, instructions and modules stored in the memory, the processor executes the various functional applications and data processing of the electronic device, thereby implementing the processing method of the above method embodiments.
The memory may include a program storage area and a data storage area, wherein the program storage area can store an operating system and an application required by at least one function, and the data storage area can store data, etc. In addition, the memory may include high-speed random-access memory and may also include non-transient memory, for example at least one magnetic-disk memory, flash-memory device, or other non-transient solid-state memory. In some embodiments, the memory optionally includes memory located remotely from the processor; these remote memories can be connected to the processing system through a network. Examples of such networks include, but are not limited to, the Internet, an intranet, a local area network, a mobile communication network, and combinations thereof.
The input system can receive input numeric or character information and generate a signal input. The output system may include a display device such as a display screen.
One or more modules are stored in the memory and, when executed by the one or more processors, perform the following operations of any of the above method embodiments:
Step a: selecting m frames containing pedestrians before the video-loss time and m frames after the video-recovery time as feature maps, and acquiring a set number of human-pose points from each feature map;
Step b: inputting all human-pose points into an Alex Net network, the Alex Net network predicting the pedestrian poses of the to-be-restored images using cubic-polynomial fitting combined with cubic-spline interpolation;
Step c: inputting the human-pose points corresponding to the feature maps before the video-loss time into an LSTM network to obtain its pedestrian-pose prediction results for the to-be-restored images;
Step d: obtaining the to-be-restored images from the pedestrian-pose prediction results of the Alex Net network and the LSTM network, calculating the insertion positions of the to-be-restored images in the video, and inserting the to-be-restored images into the video at the corresponding positions.
The above product can perform the method provided by the embodiments of the present application and has the corresponding functional modules and beneficial effects for performing the method. For technical details not described in detail in this embodiment, refer to the method provided by the embodiments of the present application.
An embodiment of the present application provides a non-transient (non-volatile) computer storage medium storing computer-executable instructions, the computer-executable instructions being able to perform the following operations:
Step a: selecting m frames containing pedestrians before the video-loss time and m frames after the video-recovery time as feature maps, and acquiring a set number of human-pose points from each feature map;
Step b: inputting all human-pose points into an Alex Net network, the Alex Net network predicting the pedestrian poses of the to-be-restored images using cubic-polynomial fitting combined with cubic-spline interpolation;
Step c: inputting the human-pose points corresponding to the feature maps before the video-loss time into an LSTM network to obtain its pedestrian-pose prediction results for the to-be-restored images;
Step d: obtaining the to-be-restored images from the pedestrian-pose prediction results of the Alex Net network and the LSTM network, calculating the insertion positions of the to-be-restored images in the video, and inserting the to-be-restored images into the video at the corresponding positions.
An embodiment of the present application provides a computer program product comprising a computer program stored on a non-transient computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the following operations:
Step a: selecting m frames containing pedestrians before the video-loss time and m frames after the video-recovery time as feature maps, and acquiring a set number of human-pose points from each feature map;
Step b: inputting all human-pose points into an Alex Net network, the Alex Net network predicting the pedestrian poses of the to-be-restored images using cubic-polynomial fitting combined with cubic-spline interpolation;
Step c: inputting the human-pose points corresponding to the feature maps before the video-loss time into an LSTM network to obtain its pedestrian-pose prediction results for the to-be-restored images;
Step d: obtaining the to-be-restored images from the pedestrian-pose prediction results of the Alex Net network and the LSTM network, calculating the insertion positions of the to-be-restored images in the video, and inserting the to-be-restored images into the video at the corresponding positions.
By using the Alex Net network combined with the LSTM network, which first predict separately and then mutually promote each other, the method, system and electronic device for inserting image frames into a video of the embodiments of the present application predict the lost frames with cubic-spline interpolation based on the video frames both before and after the loss, while compensating for the LSTM's shortcoming of not re-learning known samples from the video after the interruption. This effectively solves the problem of inaccurate prediction by a single LSTM and brings a better viewing experience to video viewers. Meanwhile, the dual-network design of the application effectively improves the precision of existing algorithms and is highly extensible: more complex prediction tasks can be completed by replacing the convolutional neural network.
The foregoing description of the disclosed embodiments enables those skilled in the art to make or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the application. Therefore, the present application is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (11)
1. A method for inserting image frames into a video, comprising the following steps:
Step a: selecting m frames containing pedestrians before the video-loss time and m frames after the video-recovery time as feature maps, and acquiring a set number of human-pose points from each feature map;
Step b: inputting all human-pose points into an Alex Net network, the Alex Net network predicting the pedestrian poses of the to-be-restored images using cubic-polynomial fitting combined with cubic-spline interpolation;
Step c: inputting the human-pose points corresponding to the feature maps before the video-loss time into an LSTM network to obtain its pedestrian-pose prediction results for the to-be-restored images;
Step d: obtaining the to-be-restored images from the pedestrian-pose prediction results of the Alex Net network and the LSTM network, calculating the insertion positions of the to-be-restored images in the video, and inserting the to-be-restored images into the video at the corresponding positions.
2. The method for inserting image frames into a video according to claim 1, wherein, assuming 17 human-pose points are acquired from each feature map, the pedestrian-pose prediction method of the Alex Net network in step b specifically comprises:
Step b1: treating the 17 human-pose points as 17 IDs and determining a regression curve for each by cubic-polynomial fitting; the pose point corresponding to each ID has a coordinate in the image, denoted locationID = (xi, yi), i = ID, and the series of (xi, yi) formed by each ID is substituted into the cubic polynomial, giving:
y = ax³ + bx² + cx + d
17 groups of (ai, bi, ci, di) are obtained by computer fitting, forming 17 fitted cubic curves y = f(x); plotting y = f(x) on an image generates 17 curves, whose abscissa and ordinate respectively represent the position of each human-pose point;
Step b2: restoring, by cubic-spline interpolation, the image coordinate points between the two frames before and after the video-loss time, to obtain the pedestrian-pose prediction results of the to-be-restored images.
3. The method for inserting image frames into a video according to claim 1, wherein in step c the input structure of the LSTM network is:
[ht, ct] = LSTM(pt, ht−1, ct−1)
and the pose prediction of the next to-be-restored frame is then:
p̂t+1 = WT ht
In the above formulas, WT denotes the weights learned by network training and ht, ct are the internal parameters of the LSTM structure; the loss function of the LSTM network is expressed as object2 = Loss(LSTM).
4. The method for inserting image frames into a video according to any one of claims 1 to 3, further comprising, after step c: constructing an objective function and optimizing the Alex Net network and the LSTM network according to the objective function:
objectfinal = object1 + object2 + |object1−object2|
In the above formula, the term |object1−object2| forces the pedestrian-pose predictions generated by the Alex Net network and the LSTM network to be as close as possible.
5. The method for inserting image frames into a video according to claim 4, wherein inserting the to-be-restored images into the video at the corresponding positions in step d specifically comprises: obtaining, from the optimized Alex Net network and LSTM network, two groups of to-be-restored images with the same number of frames, each to-be-restored frame containing 17 human-pose points; matching the 17 pose points in each frame to each other by their IDs; taking the average of the (xi, yi) passed in from the Alex Net network and the (x′i, y′i) passed in from the LSTM network to obtain the insertion position of each to-be-restored frame; and inserting all to-be-restored images at the corresponding positions. The position calculation formula is:
(x̄i, ȳi) = ((xi + x′i)/2, (yi + y′i)/2), i = 1, 2, …, 17
6. A system for inserting image frames into a video, comprising:
a feature-map selection module for selecting m frames containing pedestrians before the video-loss time and m frames after the video-recovery time as feature maps;
a pose-point acquisition module for acquiring a set number of human-pose points from each feature map;
an Alex Net network prediction module for inputting all human-pose points into an Alex Net network, the Alex Net network predicting the pedestrian poses of the to-be-restored images using cubic-polynomial fitting combined with cubic-spline interpolation;
an LSTM network prediction module for inputting the human-pose points corresponding to the feature maps before the video-loss time into an LSTM network to obtain its pedestrian-pose prediction results for the to-be-restored images;
an image insertion module for obtaining the to-be-restored images from the pedestrian-pose prediction results of the Alex Net network and the LSTM network, calculating the insertion positions of the to-be-restored images in the video, and inserting the to-be-restored images into the video at the corresponding positions.
7. The system for inserting image frames into a video according to claim 6, wherein, assuming 17 human-pose points are acquired from each feature map, the pedestrian-pose prediction method of the Alex Net network prediction module is specifically:
treating the 17 human-pose points as 17 IDs and determining a regression curve for each by cubic-polynomial fitting; the pose point corresponding to each ID has a coordinate in the image, denoted locationID = (xi, yi), i = ID, and the series of (xi, yi) formed by each ID is substituted into the cubic polynomial, giving:
y = ax³ + bx² + cx + d
17 groups of (ai, bi, ci, di) are obtained by computer fitting, forming 17 fitted cubic curves y = f(x); plotting y = f(x) on an image generates 17 curves, whose abscissa and ordinate respectively represent the position of each human-pose point;
and restoring, by cubic-spline interpolation, the image coordinate points between the two frames before and after the video-loss time, to obtain the pedestrian-pose prediction results of the to-be-restored images.
8. The system for inserting image frames into a video according to claim 6, wherein the input structure of the LSTM network is:
[ht, ct] = LSTM(pt, ht−1, ct−1)
and the pose prediction of the next to-be-restored frame is then:
p̂t+1 = WT ht
In the above formulas, WT denotes the weights learned by network training and ht, ct are the internal parameters of the LSTM structure; the loss function of the LSTM network is expressed as object2 = Loss(LSTM).
9. The system for inserting image frames into a video according to any one of claims 6 to 8, further comprising a network optimization module for constructing an objective function and optimizing the Alex Net network and the LSTM network according to the objective function:
objectfinal = object1 + object2 + |object1−object2|
In the above formula, the term |object1−object2| forces the pedestrian-pose predictions generated by the Alex Net network and the LSTM network to be as close as possible.
10. The system for inserting image frames into a video according to claim 9, wherein the image insertion module inserting the to-be-restored images into the video at the corresponding positions specifically comprises: obtaining, from the optimized Alex Net network and LSTM network, two groups of to-be-restored images with the same number of frames, each to-be-restored frame containing 17 human-pose points; matching the 17 pose points in each frame to each other by their IDs; taking the average of the (xi, yi) passed in from the Alex Net network and the (x′i, y′i) passed in from the LSTM network to obtain the insertion position of each to-be-restored frame; and inserting all to-be-restored images at the corresponding positions. The position calculation formula is:
(x̄i, ȳi) = ((xi + x′i)/2, (yi + y′i)/2), i = 1, 2, …, 17
11. An electronic device, comprising:
at least one processor; and
a memory in communication connection with the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can perform the following operations of the method for inserting image frames into a video according to any one of claims 1 to 5:
Step a: selecting m frames containing pedestrians before the video-loss time and m frames after the video-recovery time as feature maps, and acquiring a set number of human-pose points from each feature map;
Step b: inputting all human-pose points into an Alex Net network, the Alex Net network predicting the pedestrian poses of the to-be-restored images using cubic-polynomial fitting combined with cubic-spline interpolation;
Step c: inputting the human-pose points corresponding to the feature maps before the video-loss time into an LSTM network to obtain its pedestrian-pose prediction results for the to-be-restored images;
Step d: obtaining the to-be-restored images from the pedestrian-pose prediction results of the Alex Net network and the LSTM network, calculating the insertion positions of the to-be-restored images in the video, and inserting the to-be-restored images into the video at the corresponding positions.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910600097.2A CN110366029B (en) | 2019-07-04 | 2019-07-04 | Method and system for inserting image frame between videos and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110366029A true CN110366029A (en) | 2019-10-22 |
CN110366029B CN110366029B (en) | 2021-08-24 |
Family
ID=68217860
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101471073A (en) * | 2007-12-27 | 2009-07-01 | 华为技术有限公司 | Package loss compensation method, apparatus and system based on frequency domain |
KR101452635B1 (en) * | 2013-06-03 | 2014-10-22 | 충북대학교 산학협력단 | Method for packet loss concealment using LMS predictor, and thereof recording medium |
CN104751851A (en) * | 2013-12-30 | 2015-07-01 | 联芯科技有限公司 | Before and after combined estimation based frame loss error hiding method and system |
CN106919360A (en) * | 2017-04-18 | 2017-07-04 | 珠海全志科技股份有限公司 | A kind of head pose compensation method and device |
US20180095076A1 (en) * | 2013-03-15 | 2018-04-05 | Carnegie Mellon University | Linked Peptide Fluorogenic Biosensors |
CN108111860A (en) * | 2018-01-11 | 2018-06-01 | 安徽优思天成智能科技有限公司 | Video sequence lost frames prediction restoration methods based on depth residual error network |
CN108615027A (en) * | 2018-05-11 | 2018-10-02 | 常州大学 | A method of video crowd is counted based on shot and long term memory-Weighted Neural Network |
US20190124403A1 (en) * | 2017-10-20 | 2019-04-25 | Fmr Llc | Integrated Intelligent Overlay for Media Content Streams |
Non-Patent Citations (3)
Title |
---|
JACOB WALKER: "The Pose Knows: Video Forecasting by Generating Pose Futures", IEEE *
ZHANG Shun: "Development of deep convolutional neural networks and their applications in computer vision", Chinese Journal of Computers (《计算机学报》) *
ZHENG Yuanpan: "A survey of the application of deep learning in image recognition", Computer Engineering and Applications (《计算机工程与应用》) *
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112530342A (en) * | 2020-05-26 | 2021-03-19 | 友达光电股份有限公司 | Display method |
TWI729826B (en) * | 2020-05-26 | 2021-06-01 | 友達光電股份有限公司 | Display method |
CN112530342B (en) * | 2020-05-26 | 2023-04-25 | 友达光电股份有限公司 | Display method |
CN112884830A (en) * | 2021-01-21 | 2021-06-01 | 浙江大华技术股份有限公司 | Target frame determining method and device |
CN112884830B (en) * | 2021-01-21 | 2024-03-29 | 浙江大华技术股份有限公司 | Target frame determining method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108416327B (en) | Target detection method and device, computer equipment and readable storage medium | |
Weber et al. | Imagination-augmented agents for deep reinforcement learning | |
Seo et al. | Reinforcement learning with action-free pre-training from videos | |
CN107527091B (en) | Data processing method and device | |
CN107515909B (en) | Video recommendation method and system | |
US10026017B2 (en) | Scene labeling of RGB-D data with interactive option | |
EP3451241A1 (en) | Device and method for performing training of convolutional neural network | |
CN111444878A (en) | Video classification method and device and computer readable storage medium | |
KR20200018283A (en) | Method for training a convolutional recurrent neural network and for semantic segmentation of inputted video using the trained convolutional recurrent neural network | |
CN108111860B (en) | Video sequence lost frame prediction recovery method based on depth residual error network | |
TW202215303A (en) | Processing images using self-attention based neural networks | |
CN110366029A (en) | Method, system and the electronic equipment of picture frame are inserted between a kind of video | |
Voulodimos et al. | Improving multi-camera activity recognition by employing neural network based readjustment | |
US11907335B2 (en) | System and method for facilitating autonomous target selection | |
WO2021090535A1 (en) | Information processing device and information processing method | |
CN113469289A (en) | Video self-supervision characterization learning method and device, computer equipment and medium | |
CN113112536A (en) | Image processing model training method, image processing method and device | |
WO2020225247A1 (en) | Unsupervised learning of object keypoint locations in images through temporal transport or spatio-temporal transport | |
CA3204121A1 (en) | Recommendation with neighbor-aware hyperbolic embedding | |
CN110083761B (en) | Data distribution method, system and storage medium based on content popularity | |
CN110288444A (en) | Realize the method and system of user's associated recommendation | |
CN116189277A (en) | Training method and device, gesture recognition method, electronic equipment and storage medium | |
CN116982080A (en) | Methods, systems, and computer media for scene adaptive future depth prediction in monocular video | |
CN107622498A (en) | Image penetration management method, apparatus and computing device based on scene cut | |
CN113902639A (en) | Image processing method, image processing device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||