CN110163095A - Loop closure detection method, loop closure detection device and terminal device - Google Patents
Loop closure detection method, loop closure detection device and terminal device
- Publication number
- Publication number: CN110163095A (application number CN201910303060.3A)
- Authority
- CN
- China
- Prior art keywords
- image
- historical frames
- current frame
- feature descriptor
- loop closure detection
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/52—Surveillance or monitoring of activities, e.g. for recognising suspicious objects
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Biomedical Technology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The application belongs to the technical field of loop closure detection and provides a loop closure detection method, a loop closure detection device, a terminal device and a computer-readable storage medium. The method comprises: obtaining a current frame and a plurality of historical frames corresponding to the current frame; inputting the current frame and the plurality of historical frames into a trained convolutional autoencoder, which outputs a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames; calculating, according to these feature descriptors, the Euclidean distance between the current frame and each of the plurality of historical frames; and determining the historical frame with the shortest Euclidean distance to the current frame to be a loop closure. Because the application performs loop closure detection with an unsupervised convolutional autoencoder, it can improve the success rate of loop closure detection and its robustness in complex environments.
Description
Technical field
The application belongs to the technical field of loop closure detection, and in particular relates to a loop closure detection method, a loop closure detection device, a terminal device and a computer-readable storage medium.
Background technique
At present, loop closure detection mainly faces two problems. The first is perceptual aliasing, also referred to as false positives: different scenes look similar and are wrongly judged to be a loop closure. The second is perceptual variability, also referred to as false negatives: the same scene looks different because of illumination, viewpoint, moving objects and so on, and is not judged to be a loop closure. A good loop closure detection algorithm should be able to overcome both problems. Many appearance-based loop closure detection algorithms use a bag-of-words model and achieve good results, but the image features they rely on are hand-designed; when the illumination in the environment changes noticeably, such methods fail easily. Deep neural networks can automatically learn feature representations of images from massive data, and a large number of studies show that the features learned by convolutional neural networks are robust to illumination changes in the environment. However, convolutional neural networks extract global features, so the success rate of loop closure detection is low when the viewpoint of the image changes greatly; moreover, convolutional neural networks are trained by supervised learning and need a large number of labels before they can be trained.
Summary of the invention
In view of this, embodiments of the present application provide a loop closure detection method, a loop closure detection device, a terminal device and a computer-readable storage medium, which perform loop closure detection with an unsupervised convolutional autoencoder so as to improve the success rate of loop closure detection and its robustness in complex environments.
A first aspect of the embodiments of the present application provides a loop closure detection method, comprising:
obtaining a current frame and a plurality of historical frames corresponding to the current frame;
inputting the current frame and the plurality of historical frames into a trained convolutional autoencoder, and outputting a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames;
calculating, according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames, the Euclidean distance between the current frame and each of the plurality of historical frames;
determining the historical frame with the shortest Euclidean distance to the current frame to be a loop closure.
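The four steps above reduce to a nearest-neighbour search in descriptor space. A minimal sketch, with toy descriptors standing in for the autoencoder output:

```python
import math

def euclidean(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def closest_frame(current_desc, history_descs):
    """Index of the historical frame whose descriptor is nearest to
    the current frame's descriptor, plus that distance."""
    dists = [euclidean(current_desc, h) for h in history_descs]
    best = min(range(len(dists)), key=dists.__getitem__)
    return best, dists[best]

# Toy 3-dimensional descriptors; real descriptors come from the autoencoder.
cur = [0.9, 0.1, 0.4]
hist = [[0.0, 1.0, 0.0], [0.8, 0.2, 0.5], [0.3, 0.3, 0.3]]
idx, d = closest_frame(cur, hist)
```

Here the second historical frame (index 1) is the nearest neighbour and would be reported as the loop closure candidate.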
A second aspect of the embodiments of the present application provides a loop closure detection device, comprising:
a frame obtaining module, configured to obtain a current frame and a plurality of historical frames corresponding to the current frame;
a feature output module, configured to input the current frame and the plurality of historical frames into a trained convolutional autoencoder and output a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames;
a distance calculation module, configured to calculate, according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames, the Euclidean distance between the current frame and each of the plurality of historical frames;
a loop closure determining module, configured to determine the historical frame with the shortest Euclidean distance to the current frame to be a loop closure.
A third aspect of the embodiments of the present application provides a terminal device, comprising a memory, a processor, and a computer program stored in the memory and runnable on the processor, wherein the processor implements the steps of the loop closure detection method of the first aspect when executing the computer program.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the loop closure detection method of the first aspect.
A fifth aspect of the present application provides a computer program product comprising a computer program, wherein the computer program, when executed by one or more processors, implements the steps of the loop closure detection method of the first aspect.
Therefore, after obtaining a current frame and its corresponding plurality of historical frames, the application inputs the current frame and the historical frames into a trained convolutional autoencoder, outputs a feature descriptor of the current frame and of each historical frame, and calculates the Euclidean distance between the current frame and each historical frame according to these descriptors, so that a frame representing the same scene or the same place as the current frame can be selected according to the Euclidean distance, completing loop closure detection. Because the unsupervised convolutional autoencoder of the application can extract feature descriptors that adapt to complex environmental changes and are relatively robust, performing loop closure detection with these descriptors can improve the success rate of loop closure detection and its robustness in complex environments.
Description of the drawings
In order to explain the technical solutions in the embodiments of the present application more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the application; those of ordinary skill in the art can obtain other drawings from them without any creative labor.
Fig. 1 is a schematic flow chart of the loop closure detection method provided by Embodiment 1 of the present application;
Fig. 2 is an example diagram of a convolutional autoencoder structure;
Fig. 3 is an example diagram of a four-scale pyramid pooling structure;
Fig. 4 is a schematic flow chart of the loop closure detection method provided by Embodiment 2 of the present application;
Fig. 5 is an example diagram of a random projective transformation;
Fig. 6 is a schematic diagram of the loop closure detection device provided by Embodiment 3 of the present application;
Fig. 7 is a schematic diagram of the terminal device provided by Embodiment 4 of the present application.
Specific embodiment
In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and techniques are set forth in order to provide a thorough understanding of the embodiments of the present application. However, it will be clear to those skilled in the art that the application may also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits and methods are omitted so as not to obscure the description of the application with unnecessary details.
It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, wholes, steps, operations, elements and/or components, but does not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components and/or sets thereof.
In specific implementations, the terminal device described in the embodiments of the present application includes, but is not limited to, portable devices such as mobile phones, laptop computers or tablet computers with a touch-sensitive surface (for example, a touch-screen display and/or a touch pad). It should be further understood that, in certain embodiments, the device is not a portable communication device but a desktop computer with a touch-sensitive surface (for example, a touch-screen display and/or a touch pad).
In the following discussion, a terminal device including a display and a touch-sensitive surface is described. However, it should be understood that the terminal device may include one or more other physical user-interface devices, such as a physical keyboard, a mouse and/or a joystick.
The terminal device supports various application programs, such as one or more of the following: a drawing application, a presentation application, a word-processing application, a website-creation application, a disc-burning application, a spreadsheet application, a game application, a telephone application, a video-conference application, an e-mail application, an instant-messaging application, a fitness-support application, a photo-management application, a digital camera application, a digital video camera application, a web-browsing application, a digital music player application and/or a video player application.
The various application programs that can be executed on the terminal device may use at least one common physical user-interface device, such as the touch-sensitive surface. One or more functions of the touch-sensitive surface and the corresponding information displayed on the terminal may be adjusted and/or changed among the applications and/or within a corresponding application. In this way, a common physical architecture of the terminal (for example, the touch-sensitive surface) can support various application programs with user interfaces that are intuitive and transparent to the user.
It should be understood that the sequence numbers of the steps in the present embodiments do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
In order to explain the technical solutions described herein, specific embodiments are described below.
Referring to Fig. 1, which is a schematic flow chart of the loop closure detection method provided by Embodiment 1 of the present application, the loop closure detection method is applied to a terminal device and may comprise the following steps.
Step S101: obtain a current frame and a plurality of historical frames corresponding to the current frame.
In the embodiment of the present application, loop closure detection, also known as closed-loop detection, refers to the ability of the terminal device to recognize a scene it has reached before; a successful detection can significantly reduce accumulated error. The current frame may refer to the frame on which loop closure detection is to be performed; the plurality of historical frames corresponding to the current frame may refer to frames on which loop closure detection has already been performed, which occur before the current frame. For example, given five video frames of which the fifth is the current frame, the first four video frames are the historical frames of the current frame.
Step S102: input the current frame and the plurality of historical frames into a trained convolutional autoencoder, and output a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames.
In the embodiment of the present application, a feature descriptor is a representation of an image that extracts useful information and discards irrelevant information. For example, to detect the buttons on clothes in an image, note that a button is usually round and has several holes; the edge information obtained by running edge detection on the image is useful for this task, while the color information is useless, and a good feature can usually distinguish a button from other round objects. A feature descriptor converts an image of size w*h*3 (width * height * 3, for 3 channels) into a vector or matrix of length n. For example, for a 64*128*3 image, the length of the vector output after conversion can be 3780.
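The figure of 3780 matches the dimensionality of the classic HOG descriptor for a 64*128 window under commonly assumed parameters (8x8-pixel cells, 2x2-cell blocks, one-cell block stride, 9 orientation bins); the patent does not state these parameters, but the arithmetic works out as:

```python
def hog_length(img_w, img_h, cell=8, block=2, stride=1, bins=9):
    """Descriptor length: number of block positions times the number
    of histogram values (cells per block * bins) in each block."""
    cells_x, cells_y = img_w // cell, img_h // cell
    blocks_x = (cells_x - block) // stride + 1
    blocks_y = (cells_y - block) // stride + 1
    return blocks_x * blocks_y * block * block * bins

length = hog_length(64, 128)  # 7 * 15 * (2 * 2 * 9) = 3780
```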
In the embodiment of the present application, the current frame and the plurality of historical frames may be input into the convolutional autoencoder one after another, or input together; this is not limited here. When outputting the feature descriptors, the convolutional autoencoder may output the feature descriptor of the current frame and the feature descriptor of each historical frame separately.
A convolutional autoencoder is an unsupervised learning algorithm whose output reproduces the input data; it is a kind of data compression algorithm. An unsupervised deep learning structure based on autoencoding takes the local spatial characteristics of the image into account and performs well in terms of generalization. Compared with a convolutional neural network, a convolutional autoencoder can be trained without labeled data, which effectively reduces the labeling workload, simplifies the complexity of training the model, and improves the training efficiency of the convolutional autoencoder.
Optionally, the convolutional autoencoder includes a plurality of convolutional layers, a pyramid pooling structure and a plurality of fully connected layers, the pyramid pooling structure being connected to the last convolutional layer and to the first fully connected layer respectively.
In the embodiment of the present application, the plurality of convolutional layers and the pyramid pooling structure play the role of the encoder of the autoencoder and realize data compression; the plurality of fully connected layers play the role of the decoder and realize decompression. The encoding stage maps high-dimensional data to low-dimensional data and reduces the amount of data; the decoding stage does the opposite and can reproduce the input data. Fig. 2 is an example diagram of a convolutional autoencoder structure; the structure in Fig. 2 includes four convolutional layers, a pyramid pooling structure (i.e. a spatial pyramid structure) and three fully connected layers, where the last convolutional layer is connected to the input of the pyramid pooling structure and the output of the pyramid pooling structure is connected to the first fully connected layer.
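The patent does not fix kernel sizes, strides or channel counts, so the figures below are illustrative assumptions (3x3 kernels, stride 2, padding 1, 64 output channels, a 1x1..4x4 pyramid); they show how four convolutional layers followed by pyramid pooling turn inputs of different sizes into a code of one fixed length:

```python
def conv_out(size, kernel=3, stride=2, pad=1):
    """Spatial size after one strided convolution layer."""
    return (size + 2 * pad - kernel) // stride + 1

def encoder_code_length(h, w, channels=64, pyramid=(1, 2, 3, 4)):
    """Feature-map size after four conv layers, and the fixed vector
    length produced by pyramid pooling over 1x1..4x4 grids."""
    for _ in range(4):
        h, w = conv_out(h), conv_out(w)
    cells = sum(s * s for s in pyramid)   # 1 + 4 + 9 + 16 = 30 pooled cells
    return (h, w), cells * channels       # length is independent of input size

shape_a, len_a = encoder_code_length(128, 128)
shape_b, len_b = encoder_code_length(192, 256)
```

Although the two inputs produce feature maps of different sizes, the pooled code has the same length, which is what lets the network accept input of arbitrary size.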
Pooling is an important operation in convolutional networks: it reduces the complexity of the network by reducing the dimensionality of the features, and it preserves certain invariances, such as rotation invariance, translation invariance and scale invariance, thereby creating features that are invariant to small changes and distortions of the image. However, the pooling operation easily loses local feature information, so in the embodiment of the present application the pyramid pooling structure of the convolutional autoencoder uses pyramid pooling operations at multiple different scales, which can extract more comprehensive information. Fig. 3 is an example diagram of a four-scale pyramid pooling structure; the structure in Fig. 3 uses pyramid pooling operations at four different scales.
The pyramid pooling structure can extract feature vectors of fixed size at multiple scales. Unlike conventional pooling, pyramid pooling produces a fixed number of output feature values. In use, pyramid pooling replaces the pooling layer after the last convolutional layer of a convolutional neural network. In this way the network can accept input of arbitrary size, and it is also somewhat robust to target deformation. Since multi-scale feature extraction is performed on each picture after the convolutional layers, the accuracy of the task can be improved; training pictures of different sizes in the same network substantially increases the accuracy of the model.
Due to the parameter-sharing mechanism of the convolution operation, a convolutional feature map can be interpreted as the detection scores obtained by applying a convolutional filter over the input image, and positions with high activation values indicate the presence of the visual pattern that the filter searches for around them. It is observed that a convolutional feature map is usually sparse, because only a few locations have high activations and contain certain visual patterns. This shows that convolutional filters are highly selective for certain visual patterns. When the same place is observed from different viewpoints, some of its visual patterns are still retained and can be detected by the same convolutional filters. From this observation, a multi-scale merging method can be used to search for the most salient visual patterns at multiple positions of the image, so that images can be matched across different viewpoints. For each convolutional feature map, pyramid pooling operations at multiple different scales can be used: each convolutional feature map is first divided into H x H cells (i.e. the m*m, n*n, p*p and q*q grids in Fig. 3), and in each spatial cell max pooling is used to collect a feature value. After this merging operation, a feature map of any size can be reduced to a low-dimensional vector, further reducing computational complexity. Finally, the features obtained by the pyramid pooling operations at the multiple different scales are concatenated to form the feature descriptor of the image.
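A sketch of the per-map pooling just described, assuming max pooling over square grids (the grid sizes m, n, p, q of Fig. 3 are stood in for by scales 1, 2 and 4 here):

```python
def spatial_pyramid_pool(fmap, scales=(1, 2, 4)):
    """Max-pool a 2-D feature map over an s x s grid for each scale
    and concatenate the pooled values into one fixed-length vector."""
    h, w = len(fmap), len(fmap[0])
    out = []
    for s in scales:
        for i in range(s):
            for j in range(s):
                r0, r1 = i * h // s, (i + 1) * h // s
                c0, c1 = j * w // s, (j + 1) * w // s
                out.append(max(fmap[r][c] for r in range(r0, r1)
                                          for c in range(c0, c1)))
    return out

fmap = [[0, 1, 2, 3],
        [4, 5, 6, 7],
        [8, 9, 10, 11],
        [12, 13, 14, 15]]
vec = spatial_pyramid_pool(fmap)   # length 1 + 4 + 16 = 21
```

Whatever the size of `fmap`, the output vector always has 21 entries per channel, which is what allows descriptors of different images to be compared directly.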
Optionally, the outputting of the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames includes:
outputting, by the pyramid pooling structure, the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames.
Step S103: calculate, according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames, the Euclidean distance between the current frame and each of the plurality of historical frames.
Step S104: determine the historical frame with the shortest Euclidean distance to the current frame to be a loop closure.
In the embodiment of the present application, the plurality of historical frames correspond to a plurality of Euclidean distances. After the Euclidean distance between the current frame and each historical frame is calculated, the Euclidean distances can be compared and the shortest one selected; the historical frame corresponding to the shortest Euclidean distance is the frame that forms a loop closure with the current frame. Optionally, in order to further improve the accuracy of loop closure detection, a distance threshold can be preset and used to judge whether a historical frame forms a loop closure with the current frame. For example, after the shortest Euclidean distance is selected, it is judged whether the shortest Euclidean distance is less than the distance threshold: if the shortest Euclidean distance is less than the distance threshold, it is determined that the current frame and the corresponding historical frame form a loop closure; if the shortest Euclidean distance is not less than the distance threshold, it is determined that they do not form a loop closure, i.e. none of the plurality of historical frames forms a loop closure with the current frame.
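The argmin-plus-threshold decision described above can be sketched as follows (the descriptors and threshold values are toy stand-ins, not values from the patent):

```python
def loop_closure_with_threshold(current_desc, history_descs, threshold):
    """Return the index of the closest historical frame if its distance
    is below the threshold, otherwise None (no loop closure)."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
    dists = [dist(current_desc, h) for h in history_descs]
    best = min(range(len(dists)), key=dists.__getitem__)
    return best if dists[best] < threshold else None

cur = [1.0, 0.0]
hist = [[0.0, 1.0], [0.9, 0.1]]
hit = loop_closure_with_threshold(cur, hist, threshold=0.5)   # frame 1 is close enough
miss = loop_closure_with_threshold(cur, hist, threshold=0.1)  # too far: no loop closure
```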
Because the unsupervised convolutional autoencoder of the embodiment of the present application can extract feature descriptors that adapt to complex environmental changes and are relatively robust, performing loop closure detection with these descriptors can improve the success rate of loop closure detection and its robustness in complex environments.
Referring to Fig. 4, which is a schematic flow chart of the loop closure detection method provided by Embodiment 2 of the present application, the loop closure detection method is applied to a terminal device and may comprise the following steps.
Step S401: train the convolutional autoencoder.
In the embodiment of the present application, when the convolutional autoencoder is used to obtain the feature descriptors of the current frame and the historical frames, the convolutional autoencoder needs to be trained in advance, so that the trained convolutional autoencoder extracts feature descriptors (i.e. image representations or feature representations) that satisfy both invariance to illumination changes and invariance to viewpoint changes, that is, relatively robust feature descriptors.
Optionally, training the convolutional autoencoder includes:
obtaining an image training set;
generating an image pair from each image in the image training set;
calculating a histogram-of-oriented-gradients (HOG) descriptor of one image of each image pair;
inputting the other image of each image pair into the convolutional autoencoder, and outputting a feature descriptor of the other image of each image pair;
calculating a loss function between the HOG descriptor of the one image and the feature descriptor of the other image of each image pair;
training the convolutional autoencoder according to the loss function.
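The loss in the last two steps can be sketched as a squared L2 distance between the two descriptors. The vectors below are invented stand-ins; a real implementation would backpropagate this value through the autoencoder:

```python
def l2_loss(hog_desc, ae_desc):
    """Squared L2 distance comparing the HOG descriptor of one image
    of a pair with the autoencoder descriptor of the other."""
    assert len(hog_desc) == len(ae_desc), "descriptors must share a length"
    return sum((h - a) ** 2 for h, a in zip(hog_desc, ae_desc))

hog = [0.2, 0.5, 0.3]       # HOG of the untouched image (stand-in values)
pred = [0.1, 0.5, 0.1]      # autoencoder output for the warped image
loss = l2_loss(hog, pred)   # 0.01 + 0 + 0.04 = 0.05
```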
In the embodiment of the present application, the image training set may refer to a plurality of images used for training the autoencoder, i.e. an image set for training; the user can select the image set according to actual needs, which is not limited here. The image generated from each image represents the same scene, possibly from a different viewpoint; training the convolutional autoencoder with the image pairs generated from the images enables it to output, or extract, feature descriptors that satisfy both invariance to illumination changes and invariance to viewpoint changes.
In the embodiment of the present application, after each image generates an image pair, one image can be randomly selected from the pair and input into the convolutional autoencoder to learn the image representation automatically, and the histogram of oriented gradients (HOG) descriptor of the other image is calculated. The HOG descriptor forms a feature by calculating and counting the gradient orientation histograms of local regions of the image; it keeps good invariance to geometric and optical deformations, i.e. it is very robust to changes in the environment. The main idea of this feature is that the appearance and shape of a local target in an image can be well described by the density of gradient directions or edges; its essence is the statistical information of gradients, and gradients mainly exist at edges. In practice, the image is divided into small cell units, and each cell unit computes a histogram of gradient directions (or edge directions). The magnitude and direction of the gradient can be found with the formulas g = √(gx² + gy²) and θ = arctan(gy/gx), where gx denotes the gradient of the cell unit in the x direction, gy the gradient in the y direction, and (x, y) the coordinates of the cell unit.
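Assuming simple central differences for gx and gy (the patent does not specify the derivative filter), the per-pixel gradient magnitude and direction work out as:

```python
import math

def gradient(img, x, y):
    """Central-difference gradient magnitude and direction (degrees)
    at pixel (x, y) of a grayscale image stored as rows of values."""
    gx = img[y][x + 1] - img[y][x - 1]      # horizontal difference
    gy = img[y + 1][x] - img[y - 1][x]      # vertical difference
    magnitude = math.sqrt(gx * gx + gy * gy)
    direction = math.degrees(math.atan2(gy, gx))
    return magnitude, direction

# A vertical edge: intensity rises left to right, constant vertically.
img = [[0, 10, 20],
       [0, 10, 20],
       [0, 10, 20]]
mag, ang = gradient(img, 1, 1)   # gx = 20, gy = 0
```

Note that `atan2` is used instead of a bare arctan so that the sign of gx does not collapse opposite directions; HOG histograms then bin these directions per cell.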
In order to have better invariance to illumination and shadow, the gradient orientation histograms need to be contrast-normalized. This can be achieved by grouping cell units into larger blocks and normalizing all the cell units within each block; the HOG descriptors of all the blocks are combined to form the final HOG descriptor.
In the embodiment of the present application, the loss function measures the difference between the two images of each image pair. During back-propagation, the parameters of the convolutional autoencoder can be updated according to the loss function, thereby training the convolutional autoencoder. When the convolutional autoencoder becomes stable, training is complete; after training is complete, the output of the pyramid pooling structure is used as the image representation (i.e. the feature descriptor).
Optionally, generating an image pair from each image in the image training set includes:
applying a random projective transformation to each image in the image training set to generate an image pair.
In the embodiment of the present application, a projective transformation is a non-singular linear transformation in homogeneous coordinates. Its purpose is to model the geometric distortion produced when a plane is imaged by a perspective camera. It has nine degrees of freedom but is only meaningful up to scale, so it can be defined by eight parameters; the projective transformation between two planes can therefore be determined by four pairs of matched points, of which no three points in either plane are collinear. "Random" projective transformation means that the four points selected from the original image are random. Fig. 5 is an example diagram of a random projective transformation; the left image in Fig. 5 is one image of an image pair, and applying a random projective transformation to it yields the right image. Warping images by random projections better simulates the natural viewpoint changes caused by the motion of the terminal device (for example, a robot). A projective transformation can be decomposed into the cascade of a similarity transformation, an affine transformation and a projective transformation.
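A small sketch of the up-to-scale property mentioned above: a 3x3 homography matrix and any nonzero multiple of it map points identically, which is why only eight parameters (four point pairs) are needed. The matrix values here are invented for illustration:

```python
def apply_homography(H, x, y):
    """Map (x, y) through the 3x3 homography H in homogeneous coordinates."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w  = H[2][0] * x + H[2][1] * y + H[2][2]
    return xh / w, yh / w

H = [[1.0, 0.2, 5.0],
     [0.1, 1.0, -3.0],
     [0.001, 0.002, 1.0]]
H2 = [[2 * v for v in row] for row in H]   # same transformation, scaled by 2

p = apply_homography(H, 10.0, 20.0)
q = apply_homography(H2, 10.0, 20.0)       # identical result
```

Estimating H from four chosen point pairs (as the random warp requires) amounts to solving an 8-equation linear system, which image libraries provide ready-made.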
Illustratively, the training process of the convolutional autoencoder may be as follows. For any input image, a random projective transformation is first used to generate an image pair, obtaining the appearance of the same image under a different viewpoint. Then one image is randomly selected and its HOG descriptor is calculated (the HOG descriptor provides good illumination invariance), while the other image is input into the convolutional autoencoder to learn image features automatically: multi-scale feature extraction is performed on the last convolutional layer with the pyramid pooling operation, and a feature vector of fixed dimension is then built with the plurality of fully connected layers. Finally, the two computed descriptors are compared. Since the HOG descriptor is a vector of fixed length, the comparison can be made by Euclidean distance, and Euclidean distance is easily integrated into a neural network with an L2 loss, so an L2 loss function can be used to compare the HOG descriptor with the feature descriptor reconstructed by the convolutional autoencoder. Through repeated reconstruction, the convolutional autoencoder can finally learn relatively robust feature descriptors.
Optionally, every image in described image training set is being generated into an image to before, further includes:
Every image in described image training set is converted into grayscale image.
In this embodiment of the present application, in order to reduce the raw data volume of every image in the image training set and lower the computational cost of subsequent processing, every image may be converted into a grayscale image; a random projective transformation is then applied, so that an image pair consisting of the original image (i.e., the image in the image training set) and its deformed version is obtained.
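As an illustrative sketch (the patent does not mandate a particular conversion formula), a standard luma-weighted grayscale conversion could look like the following; the BT.601 weights are an assumption, not taken from the source:

```python
import numpy as np

def to_grayscale(rgb):
    """Collapse an H x W x 3 image to H x W using ITU-R BT.601 luma weights,
    cutting the raw data volume to one third before further processing."""
    return rgb @ np.array([0.299, 0.587, 0.114])

white = np.ones((4, 4, 3))      # a tiny all-white RGB image
gray = to_grayscale(white)      # grayscale value stays at full brightness
```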
Step S402: acquire a current frame and multiple historical frames corresponding to the current frame.
This step is the same as step S101; for details, refer to the related description of step S101, which is not repeated here.
Step S403: input the current frame and the multiple historical frames into the trained convolutional auto-encoding structure, and output the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames.
This step is the same as step S102; for details, refer to the related description of step S102, which is not repeated here.
Step S404: calculate the Euclidean distance between the current frame and each of the multiple historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames.
This step is the same as step S103; for details, refer to the related description of step S103, which is not repeated here.
Step S405: determine that, among the multiple historical frames, the frame with the shortest Euclidean distance to the current frame is a winding.
This step is the same as step S104; for details, refer to the related description of step S104, which is not repeated here.
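The detection steps above can be sketched as follows (a simplified illustration assuming the descriptors are already available as NumPy arrays; a real system would typically also apply a distance threshold and temporal-consistency checks before declaring a loop closure):

```python
import numpy as np

def detect_loop(cur_desc, hist_descs):
    """Return the index of the historical frame whose feature descriptor has the
    shortest Euclidean distance to the current frame's descriptor, plus that distance."""
    dists = np.linalg.norm(hist_descs - cur_desc, axis=1)   # one distance per frame
    best = int(np.argmin(dists))
    return best, float(dists[best])

# Three historical-frame descriptors; the current frame matches frame 1 exactly.
hist = np.array([[0.0, 0.0], [1.0, 1.0], [0.9, 1.1]])
idx, d = detect_loop(np.array([1.0, 1.0]), hist)
```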
On the basis of Embodiment One, this embodiment of the present application adds training of the convolutional auto-encoding structure. An unsupervised convolutional auto-encoding structure is designed from the histogram of oriented gradients and an auto-encoding neural network: on the one hand, the image representation is learned through the histogram of oriented gradients; on the other hand, the original image is learned and reconstructed automatically by the convolutional auto-encoder network. By combining the advantages of the two methods, the finally extracted features satisfy both invariance to illumination changes and invariance to viewpoint changes, so that a relatively robust feature descriptor is extracted.
Referring to Fig. 6, which is a schematic diagram of the winding detection device provided by Embodiment Three of the present application, only the parts relevant to this embodiment are shown for ease of description.
The winding detection device includes:
a frame acquisition module 61, configured to acquire a current frame and multiple historical frames corresponding to the current frame;
a feature output module 62, configured to input the current frame and the multiple historical frames into a trained convolutional auto-encoding structure, and output the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames;
a distance calculation module 63, configured to calculate the Euclidean distance between the current frame and each of the multiple historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames;
a winding determining module 64, configured to determine that, among the multiple historical frames, the frame with the shortest Euclidean distance to the current frame is a winding.
Optionally, the convolutional auto-encoding structure includes multiple convolutional layers, one pyramid pooling structure, and multiple fully connected layers, the pyramid pooling structure being connected to the last convolutional layer and the first fully connected layer respectively.
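A minimal sketch of such a pyramid pooling step (spatial pyramid pooling over the last convolutional feature map; the grid levels and the choice of max-pooling are illustrative assumptions, not taken from the patent) could be:

```python
import numpy as np

def pyramid_pool(fmap, levels=(1, 2, 4)):
    """Pool a C x H x W feature map into a fixed-length vector regardless of H and W:
    each level n splits the map into an n x n grid and max-pools every cell."""
    c, h, w = fmap.shape
    parts = []
    for n in levels:
        ys = np.linspace(0, h, n + 1).astype(int)
        xs = np.linspace(0, w, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                cell = fmap[:, ys[i]:ys[i + 1], xs[j]:xs[j + 1]]
                parts.append(cell.max(axis=(1, 2)))     # one C-vector per grid cell
    return np.concatenate(parts)    # length = C * (1 + 4 + 16), independent of H, W

# Two feature maps of different spatial sizes yield the same descriptor length,
# which is what lets the following fully connected layer have a fixed input size.
v1 = pyramid_pool(np.random.default_rng(0).random((8, 16, 16)))
v2 = pyramid_pool(np.random.default_rng(1).random((8, 20, 12)))
```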
Optionally, the feature output module 62 is specifically configured to:
output, through the pyramid pooling structure, the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames.
Optionally, the winding detection device further includes:
a structure training module 65, configured to train the convolutional auto-encoding structure.
Optionally, the structure training module 65 includes:
an acquiring unit, configured to acquire an image training set;
a generation unit, configured to generate an image pair from every image in the image training set;
a descriptor computing unit, configured to compute the histogram of oriented gradients (HOG) descriptor of one image of each image pair;
an output unit, configured to input the other image of each image pair into the convolutional auto-encoding structure, and output the feature descriptor of the other image of each image pair;
a loss computing unit, configured to compute the loss function between the HOG descriptor of the one image of each image pair and the feature descriptor of the other image of each image pair;
a training unit, configured to train the convolutional auto-encoding structure according to the loss function.
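Tying these units together, a toy end-to-end training step might look like the following. This is a deliberate simplification: a single linear layer stands in for the convolutional auto-encoding structure, and the learning rate, dimensions, and random targets are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in "network": one linear layer mapping a flattened 64x64 image
# to a 576-dimensional descriptor (the HOG descriptor's length in our sketch).
W = rng.normal(0.0, 0.01, (576, 64 * 64))

def net_descriptor(img):
    """Placeholder for the convolutional auto-encoding structure's forward pass."""
    return W @ img.ravel()

def train_step(hog_target, other_img, lr=1e-4):
    """One gradient step on the L2 loss || net(other_img) - hog_target ||^2."""
    global W
    pred = net_descriptor(other_img)
    residual = pred - hog_target
    W -= lr * 2.0 * np.outer(residual, other_img.ravel())   # dL/dW for the L2 loss
    return float(residual @ residual)

img = rng.random((64, 64))
target = rng.random(576)    # stands in for the HOG descriptor of the paired image
losses = [train_step(target, img) for _ in range(3)]   # loss shrinks step by step
```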
Optionally, the generation unit is specifically configured to:
apply a random projective transformation to every image in the image training set to generate an image pair.
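A minimal random projective transformation (a perturbed 3x3 homography applied by inverse mapping with nearest-neighbour sampling) might be sketched as follows; the perturbation magnitudes and the zero-fill for out-of-bounds pixels are assumptions for illustration:

```python
import numpy as np

def random_homography(rng, affine_scale=0.05, proj_scale=1e-4):
    """Identity homography with small random perturbations of its affine part
    and its projective (bottom) row."""
    H = np.eye(3)
    H[:2, :] += rng.uniform(-affine_scale, affine_scale, (2, 3))
    H[2, :2] += rng.uniform(-proj_scale, proj_scale, 2)
    return H

def warp(img, H):
    """Apply homography H to a grayscale image by inverse mapping each output
    pixel through H^-1; pixels mapping outside the source are filled with zeros."""
    h, w = img.shape
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])
    src = np.linalg.inv(H) @ pts
    sx = np.round(src[0] / src[2]).astype(int)      # nearest-neighbour source coords
    sy = np.round(src[1] / src[2]).astype(int)
    ok = (sx >= 0) & (sx < w) & (sy >= 0) & (sy < h)
    out = np.zeros(h * w)
    out[ok] = img[sy[ok], sx[ok]]
    return out.reshape(h, w)

img = np.random.default_rng(0).random((32, 32))
pair = (img, warp(img, random_homography(np.random.default_rng(1))))
```

The two elements of `pair` show the same scene under slightly different viewpoints, which is exactly the supervision signal the training process needs.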
Optionally, the structure training module 65 further includes:
a converting unit, configured to convert every image in the image training set into a grayscale image.
The winding detection device provided by the embodiments of the present application can be applied in the foregoing method Embodiment One and Embodiment Two; for details, refer to the descriptions of the above method Embodiment One and Embodiment Two, which are not repeated here.
Fig. 7 is a schematic diagram of the terminal device provided by Embodiment Four of the present application. As shown in Fig. 7, the terminal device 7 of this embodiment includes: a processor 70, a memory 71, and a computer program 72 stored in the memory 71 and executable on the processor 70. When executing the computer program 72, the processor 70 implements the steps in each of the above winding detection method embodiments, such as steps S101 to S104 shown in Fig. 1; alternatively, when executing the computer program 72, the processor 70 implements the functions of the modules/units in each of the above device embodiments, such as the functions of modules 61 to 65 shown in Fig. 6.
Illustratively, the computer program 72 may be divided into one or more modules/units, and the one or more modules/units are stored in the memory 71 and executed by the processor 70 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 72 in the terminal device 7. For example, the computer program 72 may be divided into a frame acquisition module, a feature output module, a distance calculation module, a winding determining module, and a structure training module, and the specific functions of each module are as follows:
a frame acquisition module, configured to acquire a current frame and multiple historical frames corresponding to the current frame;
a feature output module, configured to input the current frame and the multiple historical frames into a trained convolutional auto-encoding structure, and output the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames;
a distance calculation module, configured to calculate the Euclidean distance between the current frame and each of the multiple historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames;
a winding determining module, configured to determine that, among the multiple historical frames, the frame with the shortest Euclidean distance to the current frame is a winding.
Optionally, the convolutional auto-encoding structure includes multiple convolutional layers, one pyramid pooling structure, and multiple fully connected layers, the pyramid pooling structure being connected to the last convolutional layer and the first fully connected layer respectively.
Optionally, the feature output module is specifically configured to:
output, through the pyramid pooling structure, the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames.
Optionally, the structure training module is configured to train the convolutional auto-encoding structure.
Optionally, the structure training module includes:
an acquiring unit, configured to acquire an image training set;
a generation unit, configured to generate an image pair from every image in the image training set;
a descriptor computing unit, configured to compute the histogram of oriented gradients (HOG) descriptor of one image of each image pair;
an output unit, configured to input the other image of each image pair into the convolutional auto-encoding structure, and output the feature descriptor of the other image of each image pair;
a loss computing unit, configured to compute the loss function between the HOG descriptor of the one image of each image pair and the feature descriptor of the other image of each image pair;
a training unit, configured to train the convolutional auto-encoding structure according to the loss function.
Optionally, the generation unit is specifically configured to:
apply a random projective transformation to every image in the image training set to generate an image pair.
Optionally, the structure training module further includes:
a converting unit, configured to convert every image in the image training set into a grayscale image.
The terminal device 7 may be a device that needs to perform winding detection, such as a robot or an unmanned aerial vehicle. The terminal device may include, but is not limited to, the processor 70 and the memory 71. Those skilled in the art will understand that Fig. 7 is merely an example of the terminal device 7 and does not constitute a limitation on the terminal device 7, which may include more or fewer components than shown, or combine certain components, or have different components; for example, the terminal device may also include input and output devices, network access devices, a bus, and the like.
The processor 70 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or internal memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card equipped on the terminal device 7. Further, the memory 71 may include both an internal storage unit of the terminal device 7 and an external storage device. The memory 71 is used to store the computer program and other programs and data required by the terminal device. The memory 71 may also be used to temporarily store data that has been output or is to be output.
It is apparent to those skilled in the art that, for convenience and brevity of description, the division into the above functional units and modules is used only as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above. The functional units in the embodiments may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other, and are not intended to limit the protection scope of the present application. For the specific working processes of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, the description of each embodiment has its own emphasis. For a part that is not detailed or recorded in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are implemented in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are merely schematic; for example, the division of the modules or units is only a logical function division, and there may be other division manners in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; they may be located in one place, or may be distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the present application may implement all or part of the processes in the methods of the above embodiments by instructing the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can implement the steps of each of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electric carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electric carrier signals and telecommunication signals.
The embodiments described above are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments, or make equivalent replacements of some of the technical features therein; and such modifications or replacements do not depart the essence of the corresponding technical solutions from the spirit and scope of the technical solutions of the embodiments of the present application, and should all be included within the protection scope of the present application.
Claims (10)
1. A winding detection method, characterized in that the winding detection method comprises:
acquiring a current frame and multiple historical frames corresponding to the current frame;
inputting the current frame and the multiple historical frames into a trained convolutional auto-encoding structure, and outputting the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames;
calculating the Euclidean distance between the current frame and each of the multiple historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames;
determining that, among the multiple historical frames, the frame with the shortest Euclidean distance to the current frame is a winding.
2. The winding detection method according to claim 1, characterized in that the convolutional auto-encoding structure comprises multiple convolutional layers, one pyramid pooling structure, and multiple fully connected layers, the pyramid pooling structure being connected to the last convolutional layer and the first fully connected layer respectively.
3. The winding detection method according to claim 2, characterized in that outputting the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames comprises:
outputting, through the pyramid pooling structure, the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames.
4. The winding detection method according to claim 1, characterized in that the winding detection method further comprises:
training the convolutional auto-encoding structure.
5. The winding detection method according to claim 4, characterized in that training the convolutional auto-encoding structure comprises:
acquiring an image training set;
generating an image pair from every image in the image training set;
computing the histogram of oriented gradients (HOG) descriptor of one image of each image pair;
inputting the other image of each image pair into the convolutional auto-encoding structure, and outputting the feature descriptor of the other image of each image pair;
computing the loss function between the HOG descriptor of the one image of each image pair and the feature descriptor of the other image of each image pair;
training the convolutional auto-encoding structure according to the loss function.
6. The winding detection method according to claim 5, characterized in that generating an image pair from every image in the image training set comprises:
applying a random projective transformation to every image in the image training set to generate an image pair.
7. The winding detection method according to claim 5, characterized in that before generating an image pair from every image in the image training set, the method further comprises:
converting every image in the image training set into a grayscale image.
8. A winding detection device, characterized in that the winding detection device comprises:
a frame acquisition module, configured to acquire a current frame and multiple historical frames corresponding to the current frame;
a feature output module, configured to input the current frame and the multiple historical frames into a trained convolutional auto-encoding structure, and output the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames;
a distance calculation module, configured to calculate the Euclidean distance between the current frame and each of the multiple historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the multiple historical frames;
a winding determining module, configured to determine that, among the multiple historical frames, the frame with the shortest Euclidean distance to the current frame is a winding.
9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the winding detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the winding detection method according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910303060.3A CN110163095B (en) | 2019-04-16 | 2019-04-16 | Loop detection method, loop detection device and terminal equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110163095A true CN110163095A (en) | 2019-08-23 |
CN110163095B CN110163095B (en) | 2022-11-29 |
Family
ID=67639413
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910303060.3A Active CN110163095B (en) | 2019-04-16 | 2019-04-16 | Loop detection method, loop detection device and terminal equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110163095B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111598149A (en) * | 2020-05-09 | 2020-08-28 | 鹏城实验室 | Loop detection method based on attention mechanism |
CN111862162A (en) * | 2020-07-31 | 2020-10-30 | 湖北亿咖通科技有限公司 | Loop detection method and system, readable storage medium and electronic device |
WO2021036699A1 (en) * | 2019-08-29 | 2021-03-04 | 腾讯科技(深圳)有限公司 | Video frame information labeling method, device and apparatus, and storage medium |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104198752A (en) * | 2014-08-18 | 2014-12-10 | 浙江大学 | High temperature steel billet motion state multi-rate detection method based on machine vision |
US20160104058A1 (en) * | 2014-10-09 | 2016-04-14 | Microsoft Technology Licensing, Llc | Generic object detection in images |
US20170178331A1 (en) * | 2015-12-17 | 2017-06-22 | Casio Computer Co., Ltd. | Autonomous movement device, autonomous movement method and non-transitory recording medium |
CN107292949A (en) * | 2017-05-25 | 2017-10-24 | 深圳先进技术研究院 | Three-dimensional rebuilding method, device and the terminal device of scene |
CN107330357A (en) * | 2017-05-18 | 2017-11-07 | 东北大学 | Vision SLAM closed loop detection methods based on deep neural network |
CN109040747A (en) * | 2018-08-06 | 2018-12-18 | 上海交通大学 | Stereo-picture comfort level quality evaluating method and system based on convolution self-encoding encoder |
CN109443382A (en) * | 2018-10-22 | 2019-03-08 | 北京工业大学 | Vision SLAM closed loop detection method based on feature extraction Yu dimensionality reduction neural network |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2021036699A1 (en) * | 2019-08-29 | 2021-03-04 | 腾讯科技(深圳)有限公司 | Video frame information labeling method, device and apparatus, and storage medium |
US11727688B2 (en) | 2019-08-29 | 2023-08-15 | Tencent Technology (Shenzhen) Company Limited | Method and apparatus for labelling information of video frame, device, and storage medium |
CN111598149A (en) * | 2020-05-09 | 2020-08-28 | 鹏城实验室 | Loop detection method based on attention mechanism |
CN111598149B (en) * | 2020-05-09 | 2023-10-24 | 鹏城实验室 | Loop detection method based on attention mechanism |
CN111862162A (en) * | 2020-07-31 | 2020-10-30 | 湖北亿咖通科技有限公司 | Loop detection method and system, readable storage medium and electronic device |
Also Published As
Publication number | Publication date |
---|---|
CN110163095B (en) | 2022-11-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||