CN107704924A - Construction method of a synchronous adaptive spatio-temporal feature representation learning model, and related methods - Google Patents


Info

Publication number
CN107704924A
Authority
CN
China
Prior art keywords
crn
sequence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201610602678.6A
Other languages
Chinese (zh)
Other versions
CN107704924B (en)
Inventor
Wang Liang (王亮)
Du Yong (杜勇)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation, Chinese Academy of Sciences
Original Assignee
Institute of Automation, Chinese Academy of Sciences
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation, Chinese Academy of Sciences
Priority to CN201610602678.6A
Publication of CN107704924A
Application granted
Publication of CN107704924B
Legal status: Active


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Other Investigation Or Analysis Of Materials By Electrical Means (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a construction method of a synchronous adaptive spatio-temporal feature representation learning model for sequences, together with a related model analysis method and action recognition method. The construction method first replaces the fully connected input and three control gates of a long short-term memory (LSTM) neuron with four groups of independent filters, thereby building a convolutional recurrent neuron (CRN). X CRNs are then arranged in parallel to build a convolutional recurrent neural network layer, and a hidden layer is built such that the output of each CRN has a feedback connection only to its own basic units, with no connections between the CRNs. A convolutional layer is then built between the convolutional recurrent layer and the input sequence. Finally, Y convolutional recurrent layers are stacked to form a convolutional recurrent neural network, a single convolutional recurrent layer containing Z sublayers. Embodiments of the invention obtain more discriminative spatio-temporal representations of sequences without complex preprocessing.

Description

Construction method of a synchronous adaptive spatio-temporal feature representation learning model, and related methods
Technical field
Embodiments of the present invention relate to the fields of computer vision, pattern recognition and deep learning, and in particular to a construction method of a synchronous adaptive spatio-temporal feature representation learning model for sequences, to a method of analysing the model built by the construction method, and to a method of action recognition using the analysed model, without being limited thereto.
Background art
In recent years, the revival of neural network techniques has driven the rapid development of artificial intelligence. Much real-world data simultaneously exhibits spatial structure and time-varying dynamics, for example video data, customer purchase histories and meteorological data. Vast amounts of such data are recorded as sequences, containing both spatial structure information and time-varying dynamic information. For instance, the volume of video acquired daily by an urban intelligent surveillance system makes manual analysis impractical; dedicated models are needed so that computers can analyse these data automatically. The development of deep learning has strongly advanced the industrialisation of AI; its models are characterised by powerful data representation learning ability. Convolutional neural networks excel at extracting the spatial structure of static data, whereas recurrent neural networks excel at modelling sequence features. If the two strengths could be combined into a model that synchronously extracts spatio-temporal representations from sequences, fully exploiting the interplay between spatial and temporal information, a useful representation of the raw data would be obtained, which is of significant practical value.
Two kinds of deep learning model currently extract spatio-temporal structure information from sequence data, combining convolutional and recurrent neural networks in parallel and in series respectively. Their common shortcoming is that the extraction of spatial structure and of time-varying dynamics are independent processes: the interaction between the spatial and temporal information in the sequence is not fully considered, and the cascaded model additionally suffers from error propagation. Both defects impair the model's representation learning of spatio-temporal information in sequences.
In view of the above, the present invention is proposed.
Summary of the invention
To solve the above problems in the prior art, an object of the present invention is to propose a construction method of a synchronous adaptive spatio-temporal feature representation learning model for sequences, so as to obtain more discriminative spatio-temporal representations of sequences. On this basis, the invention further proposes a method of analysing the model built by the construction method, and a method of action recognition using the analysed model.
To achieve these objects, the following technical solutions are provided:
A construction method of a synchronous adaptive spatio-temporal feature representation learning model for sequences may include:
replacing the fully connected input and three control gates of a long short-term memory (LSTM) neuron with four groups of independent filters, thereby building a convolutional recurrent neuron (CRN);
arranging X CRNs in parallel to build a convolutional recurrent neural network layer, in which the output of each CRN has a feedback connection only to its own basic units and no connections are established between the CRNs;
building a convolutional layer between the convolutional recurrent layer and the input sequence;
stacking Y convolutional recurrent layers to form a convolutional recurrent neural network, a single convolutional recurrent layer containing Z sublayers, where X, Y and Z are positive integers.
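As a non-authoritative sketch of the structure the steps above describe (the names `build_crn` and `build_crnn` and the dictionary layout are illustrative assumptions, not the patented implementation), the X/Y/Z hyperparameters can be wired up as follows:

```python
# A sketch of the model skeleton described in the steps above, under the
# stated assumptions: the dictionaries only record structure, not computation.

def build_crn():
    """A CRN replaces the LSTM neuron's fully connected input and three
    control gates with four independent filter groups, each with a
    feedback connection only to the CRN's own basic units."""
    return {name: {"filter": None, "feedback_to_self": True}
            for name in ("input", "input_gate", "forget_gate", "output_gate")}

def build_crnn(X, Y, Z):
    """Stack Y convolutional recurrent layers, each containing Z sublayers
    of X CRNs arranged in parallel (no connections between CRNs), behind
    a convolutional front end."""
    return {
        "front_end": {"type": "conv_layer"},
        "recurrent_layers": [
            {"sublayers": [[build_crn() for _ in range(X)] for _ in range(Z)]}
            for _ in range(Y)
        ],
    }

model = build_crnn(X=4, Y=2, Z=3)  # 2 layers, 3 sublayers each, 4 CRNs per sublayer
```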
Preferably, replacing the fully connected input and three control gates of the LSTM neuron with four groups of independent filters to build the convolutional recurrent neuron CRN specifically includes: connecting the input unit, input control gate, forget gate and output control gate of the LSTM neuron to the sequence via weight-sharing convolutional connections.
Preferably, the construction method further includes: using the CRN to perform a spatial convolution on the sequence at a single time point, and iterating the output for the same spatial region of the sequence at each time point along the time axis.
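The computation pattern just stated — a shared spatial filter applied at each time point, with the output at each spatial location iterated along time — can be illustrated with a toy 1-D example (gating omitted; the feedback weight and all names are hypothetical simplifications, not the patented computation):

```python
def run_sequence(frames, kernel, w_feedback):
    """Toy 1-D illustration: at each time step a shared spatial filter is
    applied to the frame ('valid' convolution), and each spatial position
    carries its own previous output forward in time via a feedback term."""
    k = len(kernel)
    h = None
    for frame in frames:
        # spatial convolution at a single time point
        conv = [sum(kernel[j] * frame[p + j] for j in range(k))
                for p in range(len(frame) - k + 1)]
        # temporal iteration: same spatial region, previous output fed back
        h = conv if h is None else [c + w_feedback * prev
                                    for c, prev in zip(conv, h)]
    return h

out = run_sequence([[1, 2, 3, 4], [2, 3, 4, 5]], kernel=[1, 1], w_feedback=0.5)
# each position now mixes the current frame's structure with the previous one's
```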
A method of model analysis using the model built by any of the above construction methods includes:
performing hierarchical filtering on the sequence with the convolutional layer, to determine the feature map sequence corresponding to the sequence;
computing the output of the convolutional recurrent neural network based on the feature map sequence;
based on the output of the convolutional recurrent neural network, reducing the dimension through a fully connected layer and obtaining the class membership probabilities of the sequence through a softmax layer;
based on the class membership probabilities of the sequence, judging the action class of each frame of the sequence, and determining the overall action class of the sequence by voting.
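A minimal sketch of the last analysis step, assuming per-frame class probabilities are already available (function name and data layout are illustrative assumptions):

```python
from collections import Counter

def sequence_label(per_frame_probs):
    """Assign each frame the class with the highest membership probability,
    then decide the sequence label by majority vote, as in the analysis
    method above. per_frame_probs[n][k] stands in for P(C_nk | V_m)."""
    frame_labels = [max(range(len(p)), key=p.__getitem__) for p in per_frame_probs]
    return Counter(frame_labels).most_common(1)[0][0]

# three frames, two of which favour class 1 -> the sequence is class 1
probs = [[0.2, 0.7, 0.1],
         [0.1, 0.6, 0.3],
         [0.5, 0.2, 0.3]]
assert sequence_label(probs) == 1
```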
Preferably, computing the output of the convolutional recurrent neural network based on the feature map sequence specifically includes:
determining the filter outputs inside the CRN according to the following equations:

$$u^{t}_{lj\gamma}(p) = \big(k_{lj\gamma} * x^{t}_{l-1}\big)(p) + \big(w_{ljh\gamma} * h^{t-1}_{lj}\big)(p) + b_{lj\gamma}, \qquad \gamma \in \{c, i, f, o\}$$

where $u^{t}_{ljc}$, $u^{t}_{lji}$, $u^{t}_{ljf}$ and $u^{t}_{ljo}$ denote the filter outputs corresponding to the input unit, input control gate, forget gate and output control gate of the j-th CRN of layer l at time t; $M_{jc}$, $M_{ji}$, $M_{jf}$ and $M_{jo}$ denote the internal filters of the j-th CRN unit; p denotes the spatial position element corresponding to $M_{jc}$, $M_{ji}$, $M_{jf}$ and $M_{jo}$; $x^{t}_{l-1}$ denotes the output of layer l-1 of the convolutional recurrent neural network at time t; $k_{ljc}$, $k_{lji}$, $k_{ljf}$ and $k_{ljo}$ denote the filters of the input unit, input control gate, forget gate and output control gate; $h^{t-1}_{lj}$ denotes the output of the CRN unit at time t-1; $w_{ljhc}$, $w_{ljhi}$, $w_{ljhf}$ and $w_{ljho}$ denote the feedback connection weights from the output to the input unit, input control gate, forget gate and output control gate; and $b_{ljc}$, $b_{lji}$, $b_{ljf}$ and $b_{ljo}$ denote the biases of the basic units inside the CRN;
applying nonlinear mappings to the filter outputs to determine the outputs of the basic units inside the CRN:

$$\tilde{c}^{t}_{lj} = g\big(u^{t}_{ljc}\big), \qquad i^{t}_{lj} = f\big(u^{t}_{lji}\big), \qquad f^{t}_{lj} = f\big(u^{t}_{ljf}\big), \qquad o^{t}_{lj} = f\big(u^{t}_{ljo}\big)$$

where g and f denote nonlinear mapping functions;
determining the output of the CRN internal state unit (Cell) according to:

$$s^{t}_{lj} = i^{t}_{lj} \odot \tilde{c}^{t}_{lj} + f^{t}_{lj} \odot s^{t-1}_{lj}$$

where $i^{t}_{lj} \odot \tilde{c}^{t}_{lj}$ represents the amplitude adjustment of the input signal by the input control gate; $f^{t}_{lj} \odot s^{t-1}_{lj}$ represents the influence of the previous CRN state, amplitude-adjusted by the forget gate output, on the Cell state at the current time; and $s^{t}_{lj}$ denotes the output of the Cell inside the CRN at time t;
applying a nonlinear transformation to the output of the Cell and weighting it by the output control gate, to determine the output of the CRN:

$$h^{t}_{lj} = o^{t}_{lj} \odot \sigma\big(s^{t}_{lj}\big)$$

where $h^{t}_{lj}$ denotes the output of the j-th CRN of layer l at time t; $o^{t}_{lj}$ denotes the output control gate state of the corresponding CRN; and σ denotes the nonlinear mapping function of the Cell state.
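The gate computations described above follow the familiar LSTM update; a scalar sketch for a single spatial position, with one filter tap per gate standing in for a convolution output (the weight layout is an illustrative assumption):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def crn_step(x, h_prev, s_prev, w):
    """One time step at a single spatial position: each filter response
    u = k*x + w_h*h_prev + b stands in for a convolution output (one
    filter tap per gate is a hypothetical simplification). `w` maps
    each gate to a (k, w_h, b) triple."""
    u = {g: k * x + w_h * h_prev + b for g, (k, w_h, b) in w.items()}
    c_tilde = math.tanh(u["c"])      # g(.): candidate input
    i = sigmoid(u["i"])              # input control gate
    f = sigmoid(u["f"])              # forget gate
    o = sigmoid(u["o"])              # output control gate
    s = i * c_tilde + f * s_prev     # Cell state update
    h = o * math.tanh(s)             # CRN output
    return h, s

w = {"c": (0.5, 0.1, 0.0), "i": (0.3, 0.0, 0.0),
     "f": (0.0, 0.0, 2.0), "o": (0.0, 0.0, 2.0)}
h, s = crn_step(x=1.0, h_prev=0.0, s_prev=0.0, w=w)
```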
Preferably, judging the action class of each frame of the sequence based on the class membership probabilities and determining the action class of the sequence by voting further includes:
determining the cross-entropy loss function to be minimised according to:

$$L = -\frac{1}{S} \sum_{V_m \in V} \frac{1}{N} \sum_{n=1}^{N} \sum_{k=1}^{K} \delta(k - r)\, \log P\big(C_{nk} \mid V_m\big)$$

where δ(·) denotes the Kronecker delta function; $V_m$ denotes the sequence; r denotes the ground-truth label of the sequence $V_m$; V denotes the training set; S denotes the total number of samples in the training set V; K denotes the number of action classes; N denotes the number of frames contained in the m-th sample; and $P(C_{nk} \mid V_m)$ denotes the probability that the n-th frame of the sequence $V_m$ belongs to the k-th action class C;
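A toy evaluation of the loss just given, with the Kronecker delta realised by indexing the ground-truth class directly (the data layout is an illustrative assumption):

```python
import math

def cross_entropy_loss(batch):
    """L = -(1/S) * sum_m (1/N) * sum_n log P(C_{n,r} | V_m): the delta
    keeps only the ground-truth class r of each sequence. `batch` is a
    toy stand-in for the training set V: (per_frame_probs, r) pairs."""
    total = 0.0
    for per_frame_probs, r in batch:
        n = len(per_frame_probs)
        total -= sum(math.log(p[r]) for p in per_frame_probs) / n
    return total / len(batch)

# one sequence of two frames, true class 0
loss = cross_entropy_loss([([[0.9, 0.1], [0.8, 0.2]], 0)])
```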
determining the local errors based on the delta learning rule:

$$\epsilon^{t}_{h_{lj}} = \frac{\partial L}{\partial h^{t}_{lj}}, \qquad \epsilon^{t}_{s_{lj}} = \frac{\partial L}{\partial s^{t}_{lj}}$$

where L denotes the cross-entropy loss function; $\epsilon^{t}_{h_{lj}}$ denotes the local error of the output of the j-th CRN of layer l at time t; and $\epsilon^{t}_{s_{lj}}$ denotes the local error of the Cell state inside the CRN;
where $\epsilon^{t}_{h_{lj}}$ is determined according to the delta learning rule of the error backpropagation algorithm:

$$\epsilon^{t}_{h_{lj},k} = \sum_{u \in U_{l+1}} \sum_{p \in P_{l+1}} w_{ljku}\, \delta^{t}_{u,p} + \sum_{u' \in U_{lj}} w_{ljhu'}\, \delta^{t+1}_{u'}$$

where $P_{l+1}$ denotes the set of convolution output elements of layer l+1; $M_{l+1}$ denotes the inputs shared by a single multiply-accumulate operation inside the CRN filters of layer l+1; $U_{l+1}$ denotes the filter units in all CRNs of layer l+1; k denotes an individual element of the CRN input; u denotes a single filter in layer l+1; $w_{ljku}$ denotes the connection weight between the k-th element of the output of the j-th CRN of layer l and the corresponding element of the u-th filter unit of layer l+1; $\delta^{t}_{u,p}$ is the local error of the output of filter u of layer l+1 at time t; $U_{lj}$ denotes the filters in the j-th CRN of layer l; u' denotes a single filter in $U_{lj}$; $w_{ljhu'}$ denotes the connection weight between the output of the j-th CRN of layer l and its internal unit u'; and $\delta^{t+1}_{u'}$ denotes the local error of the single filter u' in $U_{lj}$ at time t+1;
where $\epsilon^{t}_{s_{lj}}$ is determined according to the chain rule:

$$\epsilon^{t}_{s_{lj}} = \epsilon^{t}_{h_{lj}} \odot o^{t}_{lj} \odot \sigma'\big(s^{t}_{lj}\big) + \epsilon^{t+1}_{s_{lj}} \odot f^{t+1}_{lj}$$

where $\sigma'(s^{t}_{lj})$ denotes the first derivative at time t of the nonlinear mapping function of the Cell state inside the CRN; $\epsilon^{t+1}_{s_{lj}}$ denotes the local error of the Cell state inside the j-th CRN of layer l at time t+1; and $f^{t+1}_{lj}$ denotes the state of the forget gate of the CRN internal neuron at time t+1;
determining, according to the chain rule, the local error corresponding to the output control gate inside the j-th CRN of layer l:

$$\delta^{t}_{o_{lj}} = \epsilon^{t}_{h_{lj}} \odot \sigma\big(s^{t}_{lj}\big) \odot f'\big(u^{t}_{ljo}\big)$$

where $\sigma(s^{t}_{lj})$ denotes the output of the Cell state inside the CRN after the nonlinear mapping by the activation function σ; and $f'(u^{t}_{ljo})$ denotes the first derivative of the output control gate mapping function with respect to its input;
determining, according to the delta learning rule and the chain rule, the local errors corresponding to the Cell state unit, the forget gate and the input control gate of the CRN:

$$\delta^{t}_{c_{lj}} = \epsilon^{t}_{s_{lj}} \odot i^{t}_{lj} \odot g'\big(u^{t}_{ljc}\big), \qquad \delta^{t}_{f_{lj}} = \epsilon^{t}_{s_{lj}} \odot s^{t-1}_{lj} \odot f'\big(u^{t}_{ljf}\big), \qquad \delta^{t}_{i_{lj}} = \epsilon^{t}_{s_{lj}} \odot \tilde{c}^{t}_{lj} \odot f'\big(u^{t}_{lji}\big)$$

where $g'(u^{t}_{ljc})$, $f'(u^{t}_{ljf})$ and $f'(u^{t}_{lji})$ denote the first derivatives at time t of the nonlinear mapping functions corresponding to the CRN input unit, forget gate and input control gate respectively.
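The local-error formulas above can be sketched in flattened scalar form for one CRN position (function name, argument layout and the scalar simplification are assumptions; the patented computation is convolutional):

```python
import math

def _dsig(x):
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)

def _dtanh(x):
    return 1.0 - math.tanh(x) ** 2

def crn_local_errors(eps_h, eps_s_next, f_next, gates, u):
    """Local errors at a single CRN position for one time step. `gates`
    holds the activated quantities at time t (i, o, c_tilde, s, s_prev);
    `u` holds the pre-activation filter responses per gate; eps_s_next
    and f_next come from time t+1 (both 0 at the last step)."""
    i, o, c_tilde, s, s_prev = (gates[k] for k in ("i", "o", "c_tilde", "s", "s_prev"))
    # Cell-state error: output nonlinearity at t plus forget-gated recurrence from t+1
    eps_s = eps_h * o * _dtanh(s) + eps_s_next * f_next
    return eps_s, {
        "o": eps_h * math.tanh(s) * _dsig(u["o"]),   # output control gate
        "c": eps_s * i * _dtanh(u["c"]),             # input unit (candidate)
        "f": eps_s * s_prev * _dsig(u["f"]),         # forget gate
        "i": eps_s * c_tilde * _dsig(u["i"]),        # input control gate
    }
```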
Preferably, the model analysis method may further include: fine-tuning the filters by the error backpropagation algorithm.
A method of action recognition using the model analysed by the above model analysis method includes:
performing hierarchical filtering on a sequence to be recognised with the convolutional layer of the analysed model, to determine the feature map sequence corresponding to the sequence to be recognised;
synchronously extracting the spatial structure and temporal dynamics representation with the analysed model, based on the feature map sequence;
determining, based on the spatial structure and temporal dynamics representation, the probability distribution of the sequence to be recognised over the action classes;
taking the action class with the highest membership probability as the recognition result of the sequence to be recognised.
Compared with the prior art, the above technical solution has at least the following beneficial effects:
By adopting the above technical solution, embodiments of the present invention overcome the prior-art defects of insufficiently considering the interaction between the spatial and temporal information in sequences and of error propagation in the model, and can obtain more discriminative spatio-temporal representations of sequences without complex preprocessing.
Embodiments of the present invention are applicable to various sequence analysis tasks, such as video-based action recognition, person re-identification, and action recognition in large-scale surveillance scenes.
Brief description of the drawings
A part of the accompanying drawing as the present invention, for providing further understanding of the invention, of the invention is schematic Embodiment and its illustrate to be used to explain the present invention, but do not form inappropriate limitation of the present invention.Obviously, drawings in the following description Only some embodiments, to those skilled in the art, on the premise of not paying creative work, can be with Other accompanying drawings are obtained according to these accompanying drawings.In the accompanying drawings:
Fig. 1 is that the synchronous self-adapting space-time characteristic for sequence according to an exemplary embodiment expresses learning model Construction method schematic flow sheet;
Fig. 2 is length according to another exemplary embodiment memory neuron structural representation in short-term;
Fig. 3 is the structural representation of the convolution recurrent neural member according to an exemplary embodiment;
Fig. 4 is the structural representation of the convolutional neural networks according to an exemplary embodiment;
Fig. 5 is the side of the model progress model analysis to the structure of method shown in Fig. 1 according to an exemplary embodiment The schematic flow sheet of method;
Fig. 6 is that the multiple dimensioned space-time feature representation according to an exemplary embodiment learns schematic diagram;
Fig. 7 is the convolution recurrent neural networks model schematic diagram after adjusted according to an exemplary embodiment;
Fig. 8 is to utilize the model after the analysis of model analysis method shown in above-mentioned Fig. 5 according to an exemplary embodiment Carry out the method flow schematic diagram of Activity recognition;
Fig. 9 is to utilize the mould after the analysis of model analysis method shown in above-mentioned Fig. 5 according to another exemplary embodiment Type carries out the method flow schematic diagram of Activity recognition.
These accompanying drawings and word description are not intended as the concept limiting the invention in any way, but by reference to Specific embodiment is that those skilled in the art illustrate idea of the invention.
Detailed description of the embodiments
The technical problems solved, technical solutions adopted and technical effects achieved by the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are only some, not all, embodiments of the application. Based on the embodiments herein, all other equivalent or obviously modified embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention. The embodiments of the invention can be embodied in many different ways as defined and covered by the claims.
It should be noted that many specific details are given in the following description for ease of understanding; it is evident, however, that the invention can be realised without these details.
It should further be noted that, unless explicitly limited or in conflict, the embodiments of the invention and the technical features therein can be combined with one another to form technical solutions.
The main idea of the embodiments of the present invention is to combine the respective strengths of convolutional neural networks and recurrent neural networks in extracting static spatial structure and dynamic varying information, while addressing the gradient vanishing and error amplification problems of conventional recurrent neural network training. Based on the gating structure and the feedforward and feedback mechanisms in the design of the long short-term memory neuron, a convolutional recurrent neural network is constructed with the convolutional recurrent neuron as its basic unit, so as to learn synchronously and adaptively the spatial structure and temporal representation of sequence data and to complete related tasks based on the learnt representation.
The spatial structure characteristics and time-varying dynamic characteristics of data are closely connected; fully considering their interaction in order to extract the spatio-temporal variation information in the data is significant for analysing such data. The present invention therefore proposes a construction method of a synchronous adaptive spatio-temporal feature representation learning model for sequences. As shown in Fig. 1, the method may include steps S100 to S140.
S100: Replace the fully connected input and three control gates of the long short-term memory neuron with four groups of independent filters, thereby building the convolutional recurrent neuron CRN.
Fig. 2 schematically illustrates the gating structure and the feedforward and feedback mechanisms of the long short-term memory neuron.
Fig. 3 schematically illustrates the structure of the convolutional recurrent neuron.
S110: Arrange X CRNs in parallel to build a convolutional recurrent neural network layer, where X is a positive integer.
The CRNs arranged in parallel form a filter bank, the filters remaining independent of one another.
S120: Establish a feedback connection from the output of each convolutional recurrent neuron only to its own basic units, with no connections established between the convolutional recurrent neurons.
S130: Build a convolutional layer between the convolutional recurrent layer and the input sequence.
Taking video frame sequences as an example, to reduce the interference of background variation on the model and suppress overfitting on the training set, a convolutional layer pretrained on a large-scale image dataset is inserted between the convolutional recurrent layer and the input video frame sequence.
S140: Stack Y convolutional recurrent layers to form the convolutional recurrent neural network, where a single convolutional recurrent layer contains Z sublayers, Y and Z being positive integers.
In this step, several convolutional recurrent layers are stacked to form the convolutional recurrent neural network; each convolutional recurrent layer supports synchronous extraction of multi-scale spatio-temporal information from the sequence.
Convolutional recurrent layers can be seamlessly connected with pooling layers, regularisation layers and other layer types in the convolutional recurrent neural network. Overall, the front end of the model consists of convolutional layers and the back end of convolutional recurrent layers; the model is a supervised model.
Fig. 4 schematically illustrates the structure of the convolutional neural network, where the Softmax layer denotes a soft maximisation layer. A single convolutional recurrent layer contains several sublayers, which analyse sequences such as video data at different convolution scales.
In an optional embodiment, replacing the fully connected input and three control gates of the LSTM neuron with four groups of independent filters to build the convolutional recurrent neuron CRN may specifically include: connecting the input unit, input control gate, forget gate and output control gate of the LSTM neuron to the sequence via weight-sharing convolutional connections.
In a practical implementation, the LSTM neuron can be configured with several mutually independent groups of filters, and the output of the LSTM neuron is fed back to the input unit, input control gate, forget gate and output control gate.
In an optional embodiment, the method of the embodiments of the present invention may further include:
performing a spatial convolution on the sequence at a single time point using the convolutional recurrent neuron, and iterating the output for the same spatial region of the sequence at each time point along the time axis.
Here, the independent filters perform convolution on the same input region at the same time; the output of the previous time step for the same input region influences the current output for that region through the feedback connections from the neuron output to the basic units, so that the output at a given time is jointly determined by the spatial structure of the current input and the structural variation at previous times.
Since, while the above convolution is performed, the output for the same spatial region of the sequence at each time point is iterated along the time axis, the mutual influence between different time points of the sequence is taken into account at the same time as the spatial feature extraction.
Fig. 5 schematically illustrates the flow of a method of analysing the model built by the construction method embodiment of Fig. 1. As shown in Fig. 5, the model analysis method may include steps S500 to S530.
S500: Perform hierarchical filtering on the sequence with the convolutional layer, to determine the feature map sequence corresponding to the sequence.
In this step, the sequence includes but is not limited to a video frame sequence. For example, the sequence may come from the KTH action recognition database published by the KTH Royal Institute of Technology of Sweden or from the YouTube action recognition database published by the University of Central Florida. The KTH database contains 6 typical action types — walking, jogging, running, boxing, hand waving and hand clapping — with 600 video sequences in total sampled at 25 FPS. The YouTube database contains 11 action classes — basketball shooting, biking/cycling, diving, golf swinging, horseback riding, soccer juggling, swinging, tennis swinging, trampoline jumping, volleyball spiking, and walking with a dog — with 1168 action sequences in total.
In practical applications, the convolutional layer can be pretrained on an image dataset before hierarchically filtering the sequence; hierarchical filtering reduces the interference of background variation on the action recognition task.
S510: Compute the output of the convolutional recurrent neural network based on the feature map sequence.
A single convolutional recurrent layer may contain several sublayers, which analyse the sequence data (for example, video data) at different convolution scales; their outputs are stacked as the input of the next layer (see Fig. 6), thereby achieving synchronous extraction of multi-scale spatio-temporal information from the sequence.
The embodiments of the present invention take the convolutional recurrent neuron (Convolutional Recurrent Neuron, CRN) as the basic unit, merging the ideas of convolution and recurrence into the same neuron, while addressing the gradient vanishing and error amplification problems introduced by recurrent connections.
Specifically, based on the gating structure of the long short-term memory neuron, the input unit, input control gate (Input gate), forget gate (Forget gate) and output control gate (Output gate) are connected to the same input sequence data through weight-sharing convolutional connections; four mutually independent groups of filters are provided inside the CRN, and the output of the convolutional recurrent neuron is fed back to the input unit, input control gate, forget gate and output control gate.
As an example, as shown in Fig. 6, in the first convolutional recurrent layer two channels use 4x4 and 3x3 convolution kernels respectively to extract features at different scales from the input feature map sequence; in the second layer, 2x2 and 3x3 convolution kernels are then used for multi-scale analysis of the previous layer's output. To facilitate subsequent processing, the outputs at different convolution scales are required here to have the same spatial dimension. A single convolutional recurrent neuron (CRN) performs 2D filtering in space on the input frame of a single time point, performs recurrent analysis in time on the outputs corresponding to the same spatial location of different frames, and its four internal groups of filters jointly and adaptively determine the internal signal transmission and output according to the input at the same time.
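The requirement above — identical spatial dimensions for outputs at different convolution scales — can be checked with the usual output-size formula; note that an even kernel such as 4x4 needs an odd total padding, split asymmetrically, to match a 3x3 kernel's output (illustrative helper, not from the patent):

```python
def conv_out(size, kernel, stride=1, pad_total=0):
    """Output spatial size of a convolution along one axis:
    floor((size + pad_total - kernel) / stride) + 1."""
    return (size + pad_total - kernel) // stride + 1

# 'valid' convolution: a 4x4 and a 3x3 kernel disagree on output size
assert conv_out(32, 4) == 29
assert conv_out(32, 3) == 30
# matching sizes requires padding; the even 4x4 kernel needs an odd total
# padding (e.g. 1 pixel on one side, 2 on the other)
assert conv_out(32, 3, pad_total=2) == 32
assert conv_out(32, 4, pad_total=3) == 32
```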
For the forward process of the convolutional recurrent network model analysis, assume that the output of layer l of the convolutional recurrent neural network (Convolutional Recurrent Neural Network, CRNN) is $x_l$; the j-th CRN unit of layer l is $M_j$; the four groups of filters inside the CRN unit are the input unit $k_{ljc}$, input control gate $k_{lji}$, forget gate $k_{ljf}$ and output control gate $k_{ljo}$; the output of the CRN unit is $h_{lj}$; the feedback connection weights from the output to the input unit, input control gate, forget gate and output control gate are $w_{ljhc}$, $w_{ljhi}$, $w_{ljhf}$ and $w_{ljho}$ respectively; the state of the CRN internal state unit (Cell) is $s_{lj}$; and the biases of the four basic units inside the CRN are $b_{ljc}$, $b_{lji}$, $b_{ljf}$ and $b_{ljo}$ respectively.
Specifically, step S510 may be implemented by steps S511 to S514.
S511: Determine the filter outputs inside the CRN unit according to the following equations:

$$a_{ljc}^t = \sum_{p\in M_{jc}} x_{(l-1)p}^t\,k_{ljcp} + w_{ljhc}\,h_{lj}^{t-1} + b_{ljc}$$

$$a_{lji}^t = \sum_{p\in M_{ji}} x_{(l-1)p}^t\,k_{ljip} + w_{ljhi}\,h_{lj}^{t-1} + b_{lji}$$

$$a_{ljf}^t = \sum_{p\in M_{jf}} x_{(l-1)p}^t\,k_{ljfp} + w_{ljhf}\,h_{lj}^{t-1} + b_{ljf}$$

$$a_{ljo}^t = \sum_{p\in M_{jo}} x_{(l-1)p}^t\,k_{ljop} + w_{ljho}\,h_{lj}^{t-1} + b_{ljo}$$

where $a_{ljc}^t$, $a_{lji}^t$, $a_{ljf}^t$ and $a_{ljo}^t$ respectively denote the filter outputs corresponding to the input unit, the input gate, the forget gate and the output gate of the j-th CRN of layer l at time t; p denotes the spatial-position elements corresponding to the four groups of filters $M_{jc}$, $M_{ji}$, $M_{jf}$ and $M_{jo}$ inside the j-th CRN unit; $x_{(l-1)p}^t$ denotes the output of layer l-1 of the convolutional recurrent network at time t; and $h_{lj}^{t-1}$ denotes the output of the CRN unit at time t-1.
It will be apparent to those skilled in the art that the above assumptions are for illustration only and should not be construed as unduly limiting the scope of the present invention.
S512: Pass the filter outputs through nonlinear mappings according to the following equations, to determine the outputs of the basic units inside the CRN:

$$u_{ljc}^t = g(a_{ljc}^t);\quad u_{lji}^t = f(a_{lji}^t);\quad u_{ljf}^t = f(a_{ljf}^t);\quad u_{ljo}^t = f(a_{ljo}^t)$$

where g and f respectively denote nonlinear mapping functions.
Preferably, g is the hyperbolic tangent (Tanh) mapping function and f is the Sigmoid mapping function.
S513: Determine the output of the Cell inside the CRN according to the following equation:

$$s_{lj}^t = u_{lji}^t\,u_{ljc}^t + u_{ljf}^t\,s_{lj}^{t-1}$$

where $u_{lji}^t u_{ljc}^t$ denotes the input signal after amplitude adjustment by the input gate; $u_{ljf}^t s_{lj}^{t-1}$ denotes the influence of the previous Cell state, amplitude-adjusted by the forget gate, on the Cell at the current time; and $s_{lj}^t$ denotes the Cell state inside the CRN at time t.
S514: Apply a nonlinear transformation to the output of the Cell and weight it by the output gate according to the following equation, to determine the output of the convolutional recurrent neuron CRN:

$$h_{lj}^t = u_{ljo}^t\,\theta(s_{lj}^t)$$

where $h_{lj}^t$ denotes the output of the j-th CRN of layer l at time t; $u_{ljo}^t$ denotes the corresponding CRN output-gate state; and θ denotes the nonlinear mapping function of the Cell state, preferably the Tanh hyperbolic tangent function.
The output of the last convolutional recurrent neural network layer of the model is the spatiotemporal information expression contained in the extracted sequence. The CRN output described above uses the multi-scale convolutional recurrent neural network, based on the feature-map sequence corresponding to the sequence, to synchronously extract the spatial-structure and temporal-dynamics expression therein. This spatial-structure and temporal-dynamics expression is used for the subsequent recognition task.
S520: Based on the output of the convolutional recurrent neural network, reduced in dimension through a fully connected layer, obtain the class-membership probabilities of the sequence through a soft-max layer.

In this step, the output of the last convolutional recurrent neural network layer is the output of the convolutional recurrent neural network; it is the spatiotemporal information expression corresponding to the extracted sequence. After dimensionality reduction through a fully connected layer, the class-membership probabilities of the sequence are obtained through the soft-max layer.
Specifically, let the output of the last CRNN layer after the mapping of the fully connected layer be $o_t$, and let the number of behavior classes in the database be K; then the dimension of $o_t$ is K, and after the Softmax-layer mapping it gives the probabilities that video frame t belongs to each class.
S530: Based on the class-membership probabilities of the sequence, judge the behavior class to which each frame of the sequence belongs, and determine the overall behavior class of the sequence by voting.

For example, when the sequence length is 20 frames, the behavior class of each frame is judged, the judgments over the 20 frames are then counted, and the class to which the most frames belong is taken as the behavior class of the sequence.
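The soft-max mapping of step S520 and the frame-wise voting of step S530 can be sketched as follows (a hypothetical NumPy illustration; `vote` and the example logits are not from the patent):

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax over per-frame class scores (step S520)."""
    z = logits - logits.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def vote(frame_logits):
    """Per-frame argmax followed by majority voting (step S530)."""
    probs = softmax(frame_logits)
    frame_labels = probs.argmax(axis=1)
    counts = np.bincount(frame_labels, minlength=frame_logits.shape[1])
    return int(counts.argmax())

# 5 frames, 3 classes: class 2 wins on 3 of the 5 frames
logits = np.array([[0.1, 0.2, 2.0],
                   [1.5, 0.0, 0.1],
                   [0.0, 0.3, 1.2],
                   [0.2, 2.1, 0.4],
                   [0.1, 0.0, 0.9]])
print(vote(logits))  # → 2
```

The per-frame decisions are independent, so a few misclassified frames (here frames 1 and 3) do not change the sequence-level result.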
Specifically, in practical applications, this step may include steps S531 to S534.
S531: Determine and minimize the cross-entropy (Cross Entropy) loss function according to the following equation. The cross-entropy loss function is:

$$L(V) = -\sum_{m=0}^{S-1}\sum_{n=0}^{N-1}\ln\sum_{k=0}^{K-1}\delta(k-r)\,p(C_{nk}\mid V_m)$$

where δ(·) denotes the Kronecker function; r denotes the ground-truth label of sequence $V_m$; S denotes the total number of samples in the training set V; K denotes the number of behavior classes; N denotes the number of frames contained in the m-th sample; and $p(C_{nk}\mid V_m)$ denotes the probability that the n-th frame of sequence $V_m$ belongs to the k-th behavior class C.
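Since the Kronecker delta selects only the true-class term, the loss above reduces, for each frame, to the negative log-probability of the ground-truth class. A hypothetical NumPy sketch (function and variable names are illustrative):

```python
import numpy as np

def cross_entropy(frame_probs, labels):
    """Sequence-level cross-entropy loss of step S531: sum over samples and
    frames of -ln p(true class | sequence).  frame_probs[m] is an (N, K)
    array of per-frame class probabilities; labels[m] is the ground truth r."""
    loss = 0.0
    for probs, r in zip(frame_probs, labels):
        # delta(k - r) selects the true-class probability for every frame
        loss -= np.log(probs[:, r]).sum()
    return loss

probs = [np.array([[0.7, 0.2, 0.1],
                   [0.6, 0.3, 0.1]])]   # one 2-frame sample, 3 classes, true class 0
print(round(cross_entropy(probs, [0]), 4))  # → 0.8675  (= -ln 0.7 - ln 0.6)
```

Minimizing this quantity drives every frame's predicted probability of the ground-truth class toward 1.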
For the backward process, the Back Propagation Through Time (BPTT) algorithm, i.e., propagation of the error backward along time, is used.
S532: Based on the delta learning rule, determine the local errors according to the following equations:

$$\xi_{ljh}^t = \frac{\partial L}{\partial h_{lj}^t};\qquad \xi_{ljs}^t = \frac{\partial L}{\partial s_{lj}^t}$$

where L denotes the cross-entropy loss function; $\xi_{ljh}^t$ denotes the local error of the output of the j-th CRN of layer l at time t; and $\xi_{ljs}^t$ denotes the local error of the Cell state inside the CRN.
Here, $\xi_{ljh}^t$ is determined according to the delta learning rule of the error back-propagation algorithm by the following equation:

$$\xi_{ljh}^t = \sum_{P_{l+1}}\Big(\sum_{k\in M_{l+1}}\sum_{u\in U_{l+1}} w_{ljku}\,\delta_u^t + \sum_{u'\in U_{lj}} w_{ljhu'}\,\delta_{u'}^{t+1}\Big)$$

where $P_{l+1}$ denotes the set of output elements of the layer-(l+1) convolution operation; $M_{l+1}$ denotes the inputs shared by a single multiply-accumulate operation of the filters inside the CRN units of layer l+1; $U_{l+1}$ denotes the filter units in all CRNs of layer l+1; k denotes an individual element of the CRN input; u denotes a single filter of layer l+1; $w_{ljku}$ denotes the connection weight between the k-th element of the input of the j-th CRN of layer l and the corresponding element of the u-th filter unit of layer l+1; $\delta_u^t$ is the local error of the output of filter u of layer l+1 at time t; $U_{lj}$ denotes the filters in the j-th CRN of layer l; u' denotes a single filter in $U_{lj}$; $w_{ljhu'}$ denotes the connection weight between the output of the j-th CRN of layer l and its internal unit u'; and $\delta_{u'}^{t+1}$ denotes the local error of the single filter u' in $U_{lj}$ at time t+1.
The local error $\xi_{ljs}^t$ of the Cell state inside the j-th CRN of layer l at time t is determined by the chain rule according to the following equation:

$$\xi_{ljs}^t = \xi_{ljh}^t\,u_{ljo}^t\,\theta'(s_{lj}^t) + \xi_{ljs}^{t+1}\,u_{ljf}^{t+1}$$

where $\theta'(s_{lj}^t)$ denotes the first derivative at time t of the nonlinear mapping function corresponding to the Cell state inside the CRN; $\xi_{ljs}^{t+1}$ denotes the local error at time t+1 of the Cell state inside the j-th CRN of layer l; and $u_{ljf}^{t+1}$ denotes the state of the forget gate of the neuron inside the CRN at time t+1.
S533: Determine the local error corresponding to the output gate inside the j-th CRN of layer l by the chain rule according to the following equation:

$$\delta_{ljo}^t = \xi_{ljh}^t\,\theta(s_{lj}^t)\,f'(a_{ljo}^t)$$

where $\theta(s_{lj}^t)$ denotes the output of the Cell state inside the CRN after the nonlinear mapping by the activation function θ, and $f'(a_{ljo}^t)$ denotes the first derivative of the output-gate mapping function with respect to its input.
S534: According to the delta learning rule and the chain rule, determine the local errors corresponding to the Cell state unit, the forget gate and the input gate of the CRN according to the following equations:

$$\delta_{ljc}^t = \xi_{ljs}^t\,u_{lji}^t\,g'(a_{ljc}^t);\quad \delta_{ljf}^t = \xi_{ljs}^t\,s_{lj}^{t-1}\,f'(a_{ljf}^t);\quad \delta_{lji}^t = \xi_{ljs}^t\,u_{ljc}^t\,f'(a_{lji}^t)$$

where $g'(a_{ljc}^t)$, $f'(a_{ljf}^t)$ and $f'(a_{lji}^t)$ denote the first derivatives at time t of the nonlinear mapping functions corresponding to the CRN input unit, the forget gate and the input gate, respectively.
The model analysis process uses the error Back Propagation Through Time (BPTT) algorithm. For a single convolutional recurrent neural network layer, the computation need only iterate over the original sequence; the process is affected by the input states of the convolutional recurrent layer and the local errors of its outputs, and given those input states and output local errors, the computation is independent of all other layers. Therefore, a convolutional recurrent neural network layer can be seamlessly connected with all types of layers in a convolutional neural network (e.g., functional layers such as pooling, regularization layers and dropout layers).
In an optional embodiment, the above model analysis method embodiment may further include: fine-tuning the front-end filters by the error back-propagation algorithm.

That is, the present embodiment can fine-tune the parameters of the front-end filters according to the specific task. Fig. 7 exemplarily shows a schematic diagram of the adjusted convolutional recurrent neural network model.
In addition, an embodiment of the present invention further proposes a method for behavior recognition using the model analyzed by the above model analysis method embodiment. As shown in Fig. 8, the method may include:

S800: Perform hierarchical filtering on the sequence to be recognized using the convolutional layers of the model analyzed by the model analysis method, to determine the feature-map sequence corresponding to the sequence to be recognized.

S810: Based on the feature-map sequence, synchronously extract the spatial-structure and temporal-dynamics expression using the model analyzed by the model analysis method.

S820: Based on the spatial-structure and temporal-dynamics expression, determine the probability distribution of the sequence to be recognized over the behavior classes.

S830: Determine the behavior class with the largest membership probability as the recognition result of the sequence to be recognized.
The process of behavior recognition using the model analysis model is illustrated below with video sequence data as an example. As shown in Fig. 9, the method of behavior recognition using the model analysis model includes:

S900: Perform hierarchical filtering on each frame of the video sequence using the convolutional layers of the analyzed model, to obtain the feature-map sequence corresponding to the original video frame sequence.

S910: Based on the feature-map sequence, synchronously extract the spatial-structure and temporal-dynamics expression therein using the multi-scale convolutional recurrent neural network.

S920: Complete the behavior recognition task based on the acquired spatial-structure and temporal-dynamics expression.
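The data flow of S900 to S920 (frames to features, features to per-frame scores, scores to a voted label) can be caricatured end to end as follows. This is a toy stand-in, not the patent's model: fixed pooling plus a first-order recurrence replaces the trained CRNN, and random weights replace the learned classifier; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

def recognize(video, W_out, n_classes):
    """Toy S900-S920 pipeline: per-frame features, a crude recurrence,
    per-frame class scores, and majority voting for the sequence label."""
    h = 0.0
    votes = np.zeros(n_classes, dtype=int)
    for frame in video:                             # S900/S910: features + recurrence
        feat = np.tanh(frame.mean(axis=(0, 1)))     # crude spatial pooling per channel
        h = 0.5 * h + 0.5 * feat                    # crude temporal recurrence
        logits = W_out @ h                          # S920: map features to class scores
        votes[int(np.argmax(logits))] += 1          # per-frame vote
    return int(np.argmax(votes))

video = rng.random((20, 8, 8, 4))    # 20 frames, 8x8 spatial, 4 channels
W_out = rng.standard_normal((6, 4))  # 6 behavior classes
label = recognize(video, W_out, 6)
print(0 <= label < 6)                # → True
```

Only the structure of the computation mirrors the method; every learned component has been replaced by a fixed or random placeholder.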
The embodiment of the present invention performs model analysis on the model constructed by the above construction method of the synchronous adaptive spatiotemporal feature expression learning model for sequences, to obtain a model analysis model. This model analysis model is then used to perform hierarchical filtering on the sequence to be recognized, to determine the corresponding feature-map sequence; then, based on the feature-map sequence, the model analysis model synchronously extracts the spatial-structure and temporal-dynamics expression; then, based on that expression, the probability distribution of the sequence to be recognized over the behavior classes is determined; finally, the behavior class with the largest membership probability is determined as the recognition result of the sequence to be recognized. Thus, the embodiment of the present invention can, according to the demands of the specific task, synchronously extract the spatiotemporal information expression of a sequence while fully considering the relation between its spatial structure and its dynamic variation, so as to better express the spatiotemporal information structure in the sequence; moreover, without complicated preprocessing, the original sequence data can be taken directly as input to extract its spatiotemporal information expression.
Next, the effectiveness of the embodiment of the present invention is verified by behavior recognition experimental results.
The verification experiments are carried out on two standard public databases: the KTH behavior recognition database released by KTH of Sweden, and the YouTube behavior recognition database released by the University of Central Florida. The former is one of the most classical databases in behavior recognition research, containing 6 typical behavior types: Walking, Jogging, Running, Boxing, Hand waving and Hand clapping, performed by 25 people under 4 different scenes, 600 video sequences in total, with a video sampling rate of 25 FPS. The latter is a challenging real-scene database containing 11 behavior classes: basketball shooting, biking/cycling, diving, golf swinging, horse back riding, soccer juggling, swinging, tennis swinging, trampoline jumping, volleyball spiking, and walking with a dog, 1168 behavior sequences in total. Each behavior class is divided by sample into 25 groups, with at least 4 samples per group; samples in the same group share the same background, viewpoint and actor. The database is affected by factors such as camera motion, target pose and scale variation, background and viewpoint changes, and variable illumination. All experimental settings are kept consistent with those of the corresponding comparison methods.
Table one schematically shows the experimental results of the convolutional recurrent neural network on the KTH database.
Table one:
Table two schematically shows the experimental results of the convolutional recurrent neural network on the YouTube database.
Table two:
In the experimental results, the model is compared with the most accurate trajectory-based methods among traditional behavior recognition approaches (the method of Wang et al. (2011) and the method of Wang et al. (2013)), demonstrating that the convolutional recurrent neural network model can better extract the spatiotemporal information expression in sequences. The model is versatile and simple to operate, has high computational efficiency, and is convenient for practical application.
Although the steps are described in the above order in the above embodiments, those skilled in the art will appreciate that, to achieve the effect of the present embodiment, different steps need not be performed in that order; they may be performed simultaneously (in parallel) or in reverse order, and such simple variations all fall within the protection scope of the present invention.
The technical solutions provided by the embodiments of the present invention have been described in detail above. Although specific examples are applied herein to set forth the principles and implementations of the present invention, the description of the above embodiments is only intended to help understand the principles of the embodiments of the present invention; meanwhile, for those skilled in the art, changes may be made to the specific implementations and application scope according to the embodiments of the present invention.
It should be noted that the flowcharts referred to herein are not limited to the forms shown herein; other divisions and/or combinations may be made.
It should be noted that the reference signs and words in the accompanying drawings are merely intended to illustrate the present invention more clearly and are not intended as undue limitations on the scope of the present invention.
The terms "comprising", "including" or any other similar term are intended to cover a non-exclusive inclusion, such that a process, method, article or device/apparatus comprising a series of elements includes not only those elements but also other elements not expressly listed, or further includes elements inherent to such a process, method, article or device/apparatus.
Each step of the present invention may be implemented with general-purpose computing devices; for example, the steps may be centralized on a single computing device, such as a personal computer, a server computer, a handheld or portable device, a laptop device or a multi-processor device, or may be distributed over a network formed by multiple computing devices. They may be performed in an order different from that shown or described herein, or each may be fabricated as an individual integrated circuit module, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Therefore, the present invention is not limited to any specific combination of hardware and software.
The method provided by the present invention may be implemented with a programmable logic device, or may be implemented as computer program software or program modules (including routines, programs, objects, components or data structures that perform particular tasks or implement particular abstract data types), for example as a computer program product according to an embodiment of the present invention, the running of which causes a computer to perform the demonstrated method. The computer program product comprises a computer-readable storage medium containing computer program logic or code sections for implementing the method. The computer-readable storage medium may be a built-in medium installed in a computer or a removable medium detachable from the computer (e.g., a storage device employing hot-plug technology). The built-in medium includes, but is not limited to, rewritable non-volatile memory, such as RAM, ROM, flash memory and hard disk. The removable medium includes, but is not limited to: optical storage media (e.g., CD-ROM and DVD), magneto-optical storage media (e.g., MO), magnetic storage media (e.g., magnetic tape or mobile hard disk), media with built-in rewritable non-volatile memory (e.g., memory cards) and media with built-in ROM (e.g., ROM cartridges).
The specific embodiments described above further explain the objectives, technical solutions and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent substitution, improvement, etc. made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (8)

1. A construction method of a synchronous adaptive spatiotemporal feature expression learning model for sequences, the method at least comprising:

replacing the full connections of the input and the three control gates of the long short-term memory neuron with four groups of mutually independent filters, to build the convolutional recurrent neuron CRN;

arranging X of the CRNs in parallel, to build a convolutional recurrent neural network layer;

characterized in that:

the output of each CRN establishes a feedback connection only to its own basic units, and no connection is established between the CRNs;

a convolutional layer is built between the convolutional recurrent neural network layer and the input sequence;

Y of the convolutional recurrent neural network layers are stacked to form the convolutional recurrent neural network, wherein a single convolutional recurrent neural network layer comprises Z sublayers, and X, Y and Z are positive integers.
2. The method according to claim 1, characterized in that replacing the full connections of the input and the three control gates of the long short-term memory neuron with four groups of mutually independent filters to build the convolutional recurrent neuron CRN specifically comprises:

connecting the input unit, the input gate, the forget gate and the output gate of the long short-term memory neuron to the sequence by a weight-sharing convolutional connection.
3. The method according to claim 1, characterized in that the method further comprises:

performing, with the convolutional recurrent neuron, the convolution operation spatially on the sequence at a single time point, and iterating the computation along time on the outputs of the same spatial region of the sequence at each moment.
4. A method of performing model analysis using a model constructed by the construction method of any one of claims 1 to 3, characterized in that the model analysis method comprises:

performing hierarchical filtering on the sequence using the convolutional layer, to determine the feature-map sequence corresponding to the sequence;

calculating the output of the convolutional recurrent neural network based on the feature-map sequence;

based on the output of the convolutional recurrent neural network, after dimensionality reduction through a fully connected layer, obtaining the class-membership probabilities of the sequence through a soft-max layer;

based on the class-membership probabilities of the sequence, judging the behavior class to which each frame of the sequence belongs, and determining the overall behavior class of the sequence by voting.
5. The model analysis method according to claim 4, characterized in that calculating the output of the convolutional recurrent neural network based on the feature-map sequence specifically comprises:

determining the outputs of the filters in the CRN according to the following equations:

$$a_{ljc}^t = \sum_{p\in M_{jc}} x_{(l-1)p}^t\,k_{ljcp} + w_{ljhc}\,h_{lj}^{t-1} + b_{ljc}$$

$$a_{lji}^t = \sum_{p\in M_{ji}} x_{(l-1)p}^t\,k_{ljip} + w_{ljhi}\,h_{lj}^{t-1} + b_{lji}$$

$$a_{ljf}^t = \sum_{p\in M_{jf}} x_{(l-1)p}^t\,k_{ljfp} + w_{ljhf}\,h_{lj}^{t-1} + b_{ljf}$$

$$a_{ljo}^t = \sum_{p\in M_{jo}} x_{(l-1)p}^t\,k_{ljop} + w_{ljho}\,h_{lj}^{t-1} + b_{ljo}$$

wherein $a_{ljc}^t$, $a_{lji}^t$, $a_{ljf}^t$ and $a_{ljo}^t$ respectively denote the filter outputs corresponding to the input unit, the input gate, the forget gate and the output gate of the j-th CRN of layer l at time t; $M_{jc}$, $M_{ji}$, $M_{jf}$ and $M_{jo}$ respectively denote the internal filters of the j-th CRN unit; p denotes the spatial-position elements corresponding to $M_{jc}$, $M_{ji}$, $M_{jf}$ and $M_{jo}$; $x_{(l-1)p}^t$ denotes the output of layer l-1 of the convolutional recurrent network at time t; $k_{ljc}$, $k_{lji}$, $k_{ljf}$ and $k_{ljo}$ respectively denote the input unit, the input gate, the forget gate and the output gate; $h_{lj}^{t-1}$ denotes the output of the CRN unit at time t-1; $w_{ljhc}$, $w_{ljhi}$, $w_{ljhf}$ and $w_{ljho}$ respectively denote the feedback connection weights from the output to the input unit, the input gate, the forget gate and the output gate; and $b_{ljc}$, $b_{lji}$, $b_{ljf}$ and $b_{ljo}$ respectively denote the biases of the basic units inside the CRN;

passing the filter outputs through nonlinear mappings according to the following equation, to determine the outputs of the basic units inside the CRN:

$$u_{ljc}^t = g(a_{ljc}^t);\quad u_{lji}^t = f(a_{lji}^t);\quad u_{ljf}^t = f(a_{ljf}^t);\quad u_{ljo}^t = f(a_{ljo}^t)$$

wherein g and f respectively denote nonlinear mapping functions;

determining the output of the internal state unit of the CRN according to the following equation:

$$s_{lj}^t = u_{lji}^t\,u_{ljc}^t + u_{ljf}^t\,s_{lj}^{t-1}$$

wherein $u_{lji}^t u_{ljc}^t$ denotes the input signal after amplitude adjustment by the input gate; $u_{ljf}^t s_{lj}^{t-1}$ denotes the influence on the current Cell state of the previous Cell state after amplitude adjustment by the forget gate; and $s_{lj}^t$ denotes the Cell output inside the CRN at time t;

performing a nonlinear transformation on the output of the Cell and weighting it by the output gate according to the following equation, to determine the output of the CRN:

$$h_{lj}^t = u_{ljo}^t\,\theta(s_{lj}^t)$$

wherein $h_{lj}^t$ denotes the output of the j-th CRN of layer l at time t; $u_{ljo}^t$ denotes the corresponding CRN output-gate state; and θ denotes the nonlinear mapping function of the Cell state.
6. model analysis method according to claim 5, it is characterised in that the generic based on the sequence is general Rate, the behavior classification being subordinate to each frame of the sequence judge, and determine the behavior of the sequence according to ballot Classification, in addition to:
Determined to minimize cross entropy loss function according to below equation:
<mrow> <mi>L</mi> <mrow> <mo>(</mo> <mi>V</mi> <mo>)</mo> </mrow> <mo>=</mo> <mo>-</mo> <munderover> <mo>&amp;Sigma;</mo> <mrow> <mi>m</mi> <mo>=</mo> <mn>0</mn> </mrow> <mrow> <mi>S</mi> <mo>-</mo> <mn>1</mn> </mrow> </munderover> <munderover> <mo>&amp;Sigma;</mo> <mrow> <mi>n</mi> <mo>=</mo> <mn>0</mn> </mrow> <mrow> <mi>N</mi> <mo>-</mo> <mn>1</mn> </mrow> </munderover> <mi>l</mi> <mi>n</mi> <munderover> <mo>&amp;Sigma;</mo> <mrow> <mi>k</mi> <mo>=</mo> <mn>0</mn> </mrow> <mrow> <mi>K</mi> <mo>-</mo> <mn>1</mn> </mrow> </munderover> <mi>&amp;delta;</mi> <mrow> <mo>(</mo> <mi>k</mi> <mo>-</mo> <mi>r</mi> <mo>)</mo> </mrow> <mi>p</mi> <mrow> <mo>(</mo> <msub> <mi>C</mi> <mrow> <mi>n</mi> <mi>k</mi> </mrow> </msub> <mo>|</mo> <msub> <mi>V</mi> <mi>m</mi> </msub> <mo>)</mo> </mrow> </mrow>
Wherein, the δ () represents Kronecker functions;The VmRepresent the sequence;The r represents the sequence VmIt is true It is worth label (groundtruth);The V represents training set;The S represents the sample total in the training set V;The K tables Show behavior classification;The N represents the frame number that m-th of sample is included;P (the Cnk|Vm) represent the sequence VmMiddle n-th frame is subordinate to Belong to k-th of behavior classification C probability;
Based on delta learning rule, local error is determined according to below equation:
<mrow> <msubsup> <mi>&amp;xi;</mi> <mrow> <mi>l</mi> <mi>j</mi> <mi>h</mi> </mrow> <mi>t</mi> </msubsup> <mo>=</mo> <mfrac> <mrow> <mo>&amp;part;</mo> <mi>L</mi> </mrow> <mrow> <mo>&amp;part;</mo> <msubsup> <mi>h</mi> <mrow> <mi>l</mi> <mi>j</mi> </mrow> <mi>t</mi> </msubsup> </mrow> </mfrac> <mo>;</mo> <msubsup> <mi>&amp;xi;</mi> <mrow> <mi>l</mi> <mi>j</mi> <mi>s</mi> </mrow> <mi>t</mi> </msubsup> <mo>=</mo> <mfrac> <mrow> <mo>&amp;part;</mo> <mi>L</mi> </mrow> <mrow> <mo>&amp;part;</mo> <msubsup> <mi>s</mi> <mrow> <mi>l</mi> <mi>j</mi> </mrow> <mi>t</mi> </msubsup> </mrow> </mfrac> </mrow>
Wherein, the L represents the cross entropy loss function;It is describedRepresent j-th of the CRN output of t l layers Local error;It is describedRepresent the local error of the Cell states inside the CRN;
Wherein, it is describedDetermined according to the delta learning rule of error backpropagation algorithm according to below equation:
<mrow> <msubsup> <mi>&amp;xi;</mi> <mrow> <mi>l</mi> <mi>j</mi> <mi>h</mi> </mrow> <mi>t</mi> </msubsup> <mo>=</mo> <munder> <mo>&amp;Sigma;</mo> <msub> <mi>P</mi> <mrow> <mi>l</mi> <mo>+</mo> <mn>1</mn> </mrow> </msub> </munder> <mrow> <mo>(</mo> <mrow> <munder> <mo>&amp;Sigma;</mo> <mrow> <mi>k</mi> <mo>&amp;Element;</mo> <msub> <mi>M</mi> <mrow> <mi>l</mi> <mo>+</mo> <mn>1</mn> </mrow> </msub> </mrow> </munder> <munder> <mo>&amp;Sigma;</mo> <mrow> <mi>u</mi> <mo>&amp;Element;</mo> <msub> <mi>U</mi> <mrow> <mi>l</mi> <mo>+</mo> <mn>1</mn> </mrow> </msub> </mrow> </munder> <msub> <mi>w</mi> <mrow> <mi>l</mi> <mi>j</mi> <mi>k</mi> <mi>u</mi> </mrow> </msub> <msubsup> <mi>&amp;delta;</mi> <mi>u</mi> <mi>t</mi> </msubsup> <mo>+</mo> <munder> <mo>&amp;Sigma;</mo> <mrow> <msup> <mi>u</mi> <mo>&amp;prime;</mo> </msup> <mo>&amp;Element;</mo> <msub> <mi>U</mi> <mrow> <mi>l</mi> <mi>j</mi> </mrow> </msub> </mrow> </munder> <msub> <mi>w</mi> <mrow> <msup> <mi>ljhu</mi> <mo>&amp;prime;</mo> </msup> </mrow> </msub> <msubsup> <mi>&amp;delta;</mi> <msup> <mi>u</mi> <mo>&amp;prime;</mo> </msup> <mrow> <mi>t</mi> <mo>+</mo> <mn>1</mn> </mrow> </msubsup> </mrow> <mo>)</mo> </mrow> </mrow>
Wherein, the Pl+1Represent that l+1 layers convolution algorithm exports element set;The Ml+1Represent in the CRN units of l+1 layers Portion's wave filter single multiplies accumulating input common during computing;The Ul+1Represent the wave filter list in all CRN of l+1 layers Member;The k represents the individual element in the CRN inputs;The u represents the single filter in l+1 layers;The wljkuTable Show that k-th of element is the same as the connection between corresponding element in u-th of filter cell in l+1 layers in j-th of CRN of l layers input Weight;It is describedThe local error exported for l+1 layer median filter u in t;The UljRepresent in j-th of CRN of l layers Wave filter;The u ' then represents the UljIn single filter;The wljhu′Represent j-th of CRN of l layers output and The connection weight of inside unit u ';It is describedRepresent UljLocal errors of the middle single filter u ' at the t+1 moment;
Wherein the local error ξ_{ljs}^t of the Cell state inside the j-th CRN of layer l is determined by the chain rule according to the following equation:
Wherein the first term denotes the first derivative at time t of the nonlinear mapping function applied to the Cell state inside the CRN; ξ_{ljs}^{t+1} denotes the local error of the Cell state inside the j-th CRN of layer l at time t+1; and u_{ljf}^{t+1} denotes the state of the forget gate among the CRN internal neurons at time t+1;
The local error corresponding to the output control gate inside the j-th CRN of layer l is determined by the chain rule according to the following equation:
Wherein the former denotes the output obtained by nonlinearly mapping the Cell state inside the CRN through the activation function; and the latter denotes the first derivative of the output control gate's mapping function with respect to its input;
According to the delta learning rule and the chain rule, the local errors corresponding to the Cell state unit, the forget gate, and the input control gate of the CRN are determined by the following equations:
$$\delta_{ljc}^{t}=\frac{\partial L}{\partial a_{ljc}^{t}}=\frac{\partial L}{\partial s_{lj}^{t}}\frac{\partial s_{lj}^{t}}{\partial a_{ljc}^{t}}=\xi_{ljs}^{t}\,u_{lji}^{t}\,g'\!\left(a_{ljc}^{t}\right)$$
$$\delta_{ljf}^{t}=\frac{\partial L}{\partial a_{ljf}^{t}}=\frac{\partial L}{\partial s_{lj}^{t}}\frac{\partial s_{lj}^{t}}{\partial u_{ljf}^{t}}\frac{\partial u_{ljf}^{t}}{\partial a_{ljf}^{t}}=\xi_{ljs}^{t}\,s_{lj}^{t-1}\,f'\!\left(a_{ljf}^{t}\right)$$
$$\delta_{lji}^{t}=\frac{\partial L}{\partial a_{lji}^{t}}=\frac{\partial L}{\partial s_{lj}^{t}}\frac{\partial s_{lj}^{t}}{\partial u_{lji}^{t}}\frac{\partial u_{lji}^{t}}{\partial a_{lji}^{t}}=\xi_{ljs}^{t}\,g\!\left(a_{ljc}^{t}\right)f'\!\left(a_{lji}^{t}\right)$$
Wherein g'(a_{ljc}^t), f'(a_{ljf}^t), and f'(a_{lji}^t) denote, respectively, the first derivatives at time t of the nonlinear mapping functions corresponding to the CRN input unit, the forget gate, and the input control gate.
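For illustration only (not part of the claims), the three local errors above can be evaluated element-wise once concrete nonlinearities are fixed; the sketch below assumes g = tanh for the cell input and f = sigmoid for the gates, and all array names are hypothetical.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gate_local_errors(xi_s, u_i, s_prev, a_c, a_f, a_i):
    """Local errors of the cell input, forget gate, and input gate
    per the delta rule + chain rule, assuming g = tanh, f = sigmoid.

    xi_s   : local error of the cell state at time t   (xi_{ljs}^t)
    u_i    : input-gate activation at time t           (u_{lji}^t)
    s_prev : cell state at time t-1                    (s_{lj}^{t-1})
    a_c, a_f, a_i : pre-activations of cell input, forget gate, input gate
    """
    # delta_c = xi_s * u_i * g'(a_c),  g'(x) = 1 - tanh(x)^2
    delta_c = xi_s * u_i * (1.0 - np.tanh(a_c) ** 2)
    # delta_f = xi_s * s^{t-1} * f'(a_f),  f'(x) = sigma(x)(1 - sigma(x))
    sf = sigmoid(a_f)
    delta_f = xi_s * s_prev * sf * (1.0 - sf)
    # delta_i = xi_s * g(a_c) * f'(a_i)
    si = sigmoid(a_i)
    delta_i = xi_s * np.tanh(a_c) * si * (1.0 - si)
    return delta_c, delta_f, delta_i
```

The three expressions share the common factor ξ_{ljs}^t, so in practice it is computed once and reused, as above.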
7. The model analysis method according to any one of claims 4 to 6, characterized in that the analysis method further comprises:
fine-tuning the filters by means of the error back-propagation algorithm.
8. A behavior recognition method using a model analyzed by the model analysis method according to any one of claims 4 to 6, characterized in that the behavior recognition method comprises:
performing hierarchical filtering on a sequence to be recognized with the convolutional layers of the model analyzed by the model analysis method, to determine the feature-map sequence corresponding to the sequence to be recognized;
based on the feature-map sequence, synchronously extracting the spatial-structure and temporal-dynamics expression with the model analyzed by the model analysis method;
based on the spatial-structure and temporal-dynamics expression, determining the probability distribution of the sequence to be recognized over the behavior categories;
determining the behavior category with the maximum membership probability as the recognition result of the sequence to be recognized.
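The four steps of the recognition claim can be sketched as a simple pipeline; everything below (the model objects, their call signatures, the softmax head) is a hypothetical stand-in, not the patented implementation.

```python
import numpy as np

def recognize(sequence, conv_layers, crn_model, classifier):
    """Hypothetical sketch of the claim-8 flow:
    hierarchical filtering -> spatio-temporal expression ->
    probability distribution over behavior categories -> argmax."""
    # Step 1: hierarchical filtering through the convolutional layers
    feature_maps = sequence
    for layer in conv_layers:
        feature_maps = layer(feature_maps)
    # Step 2: synchronously extract the spatial/temporal expression
    expression = crn_model(feature_maps)
    # Step 3: class scores -> probability distribution (stable softmax)
    scores = classifier(expression)
    exp_s = np.exp(scores - np.max(scores))
    probs = exp_s / exp_s.sum()
    # Step 4: the maximum-probability category is the recognition result
    return int(np.argmax(probs)), probs
```

With identity stand-ins for the model parts, the pipeline simply returns the index of the largest input score.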
CN201610602678.6A 2016-07-27 2016-07-27 Construction method of synchronous self-adaptive space-time feature expression learning model and related method Active CN107704924B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610602678.6A CN107704924B (en) 2016-07-27 2016-07-27 Construction method of synchronous self-adaptive space-time feature expression learning model and related method

Publications (2)

Publication Number Publication Date
CN107704924A true CN107704924A (en) 2018-02-16
CN107704924B CN107704924B (en) 2020-05-19

Family

ID=61169004

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610602678.6A Active CN107704924B (en) 2016-07-27 2016-07-27 Construction method of synchronous self-adaptive space-time feature expression learning model and related method

Country Status (1)

Country Link
CN (1) CN107704924B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140358546A1 (en) * 2013-05-28 2014-12-04 International Business Machines Corporation Hybrid predictive model for enhancing prosodic expressiveness
CN104615983A (en) * 2015-01-28 2015-05-13 中国科学院自动化研究所 Behavior identification method based on recurrent neural network and human skeleton movement sequences
CN105243398A (en) * 2015-09-08 2016-01-13 西安交通大学 Method of improving performance of convolutional neural network based on linear discriminant analysis criterion
CN105678292A (en) * 2015-12-30 2016-06-15 成都数联铭品科技有限公司 Complex optical text sequence identification system based on convolution and recurrent neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
宣森炎 et al.: "Traffic Sign Recognition Based on Joint Convolutional and Recurrent Neural Networks", Sensors and Microsystems *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108510280A (en) * 2018-03-23 2018-09-07 上海氪信信息技术有限公司 A kind of financial fraud behavior prediction method based on mobile device behavioral data
CN108510280B (en) * 2018-03-23 2020-07-31 上海氪信信息技术有限公司 Financial fraud behavior prediction method based on mobile equipment behavior data
CN109063829A (en) * 2018-06-22 2018-12-21 泰康保险集团股份有限公司 Neural network construction method, device, computer equipment and storage medium
CN109063829B (en) * 2018-06-22 2021-03-16 泰康保险集团股份有限公司 Neural network construction method and device, computer equipment and storage medium
CN111656412A (en) * 2018-06-28 2020-09-11 株式会社小松制作所 System and method for determining work performed by work vehicle, and method for manufacturing learned model
CN109656134A (en) * 2018-12-07 2019-04-19 电子科技大学 A kind of end-to-end decision-making technique of intelligent vehicle based on space-time joint recurrent neural network
CN110210581A (en) * 2019-04-28 2019-09-06 平安科技(深圳)有限公司 A kind of handwritten text recognition methods and device, electronic equipment
CN110210581B (en) * 2019-04-28 2023-11-24 平安科技(深圳)有限公司 Handwriting text recognition method and device and electronic equipment
WO2021134519A1 (en) * 2019-12-31 2021-07-08 华为技术有限公司 Device and method for realizing data synchronization in neural network inference
CN113169989A (en) * 2019-12-31 2021-07-23 华为技术有限公司 Device and method for realizing data synchronization in neural network inference
CN113671031A (en) * 2021-08-20 2021-11-19 北京房江湖科技有限公司 Wall hollowing detection method and device

Similar Documents

Publication Publication Date Title
CN107704924A (en) Synchronous self-adapting space-time characteristic expresses the construction method and correlation technique of learning model
CN107506740B (en) Human body behavior identification method based on three-dimensional convolutional neural network and transfer learning model
CN108681712B (en) Basketball game semantic event recognition method fusing domain knowledge and multi-order depth features
Ghazi et al. Plant identification using deep neural networks via optimization of transfer learning parameters
Verma et al. Application of convolutional neural networks for evaluation of disease severity in tomato plant
CN106709461B (en) Activity recognition method and device based on video
CN109443382A (en) Vision SLAM closed loop detection method based on feature extraction Yu dimensionality reduction neural network
Ullah et al. One-shot learning for surveillance anomaly recognition using siamese 3d cnn
CN107463919A (en) A kind of method that human facial expression recognition is carried out based on depth 3D convolutional neural networks
CN108399435B (en) Video classification method based on dynamic and static characteristics
Verma et al. Deep learning-based multi-modal approach using RGB and skeleton sequences for human activity recognition
CN106203363A (en) Human skeleton motion sequence Activity recognition method
Li et al. Pedestrian detection based on deep learning model
CN106462797A (en) Customized classifier over common features
CN111652903A (en) Pedestrian target tracking method based on convolution correlation network in automatic driving scene
CN104537684A (en) Real-time moving object extraction method in static scene
CN111696137A (en) Target tracking method based on multilayer feature mixing and attention mechanism
CN109979161A (en) A kind of tumble detection method for human body based on convolution loop neural network
Xiao et al. Overview: Video recognition from handcrafted method to deep learning method
Tan et al. Bidirectional long short-term memory with temporal dense sampling for human action recognition
CN109858496A (en) A kind of image characteristic extracting method based on weighting depth characteristic
CN103093247A (en) Automatic classification method for plant images
Abdelrazik et al. Efficient hybrid algorithm for human action recognition
Zhao et al. Human action recognition based on improved fusion attention CNN and RNN
Xu et al. Enhancing adaptive history reserving by spiking convolutional block attention module in recurrent neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant