CN110163095A - Loop detection method, loop detection device and terminal device - Google Patents

Loop detection method, loop detection device and terminal device

Info

Publication number
CN110163095A
CN110163095A (application CN201910303060.3A)
Authority
CN
China
Prior art keywords
image
historical frames
current frame
feature descriptor
loop detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910303060.3A
Other languages
Chinese (zh)
Other versions
CN110163095B (en)
Inventor
张锲石
刘袁
程俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Institute of Advanced Technology of CAS
Original Assignee
Shenzhen Institute of Advanced Technology of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Institute of Advanced Technology of CAS filed Critical Shenzhen Institute of Advanced Technology of CAS
Priority to CN201910303060.3A priority Critical patent/CN110163095B/en
Publication of CN110163095A publication Critical patent/CN110163095A/en
Application granted granted Critical
Publication of CN110163095B publication Critical patent/CN110163095B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 - Computing arrangements based on biological models
    • G06N3/02 - Neural networks
    • G06N3/04 - Architecture, e.g. interconnection topology
    • G06N3/045 - Combinations of networks
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 - Scenes; Scene-specific elements
    • G06V20/50 - Context or environment of the image
    • G06V20/52 - Surveillance or monitoring of activities, e.g. for recognising suspicious objects

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The application belongs to the technical field of loop detection and provides a loop detection method, a loop detection device, a terminal device and a computer-readable storage medium. The method comprises: obtaining a current frame and a plurality of historical frames corresponding to the current frame; inputting the current frame and the plurality of historical frames into a trained convolutional autoencoder, and outputting a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames; calculating, according to the feature descriptor of the current frame and the feature descriptor of each historical frame, the Euclidean distance between the current frame and each historical frame; and determining the historical frame with the shortest Euclidean distance to the current frame as a loop closure. The application performs loop detection with an unsupervised convolutional autoencoder, which can improve the success rate of loop detection and its robustness in complex environments.

Description

Loop detection method, loop detection device and terminal device
Technical field
The application belongs to the technical field of loop detection, and in particular relates to a loop detection method, a loop detection device, a terminal device and a computer-readable storage medium.
Background technique
Currently, loop detection mainly faces two problems. The first is perceptual aliasing, also known as false positives: different scenes look similar and are wrongly judged as loop closures. The second is perceptual variability, also known as false negatives: the same scene looks different because of illumination, viewpoint, moving objects and so on, and is not judged as a loop closure. A good loop detection algorithm should be able to overcome both problems. Many appearance-based loop detection algorithms use the bag-of-words model and achieve good results, but the image features they rely on are hand-designed, so these methods easily fail when illumination in the environment changes markedly. Deep neural networks can automatically learn image feature representations from large amounts of data, and many studies show that the features learned by convolutional neural networks are very robust to illumination changes in the environment. However, convolutional neural networks extract global features, so the success rate of loop detection is not high when the image viewpoint changes greatly, and convolutional neural networks belong to supervised learning and require large amounts of labelled data for training.
Summary of the invention
In view of this, the embodiments of the present application provide a loop detection method, a loop detection device, a terminal device and a computer-readable storage medium, which perform loop detection with an unsupervised convolutional autoencoder so as to improve the success rate of loop detection and its robustness in complex environments.
A first aspect of the embodiments of the present application provides a loop detection method, the loop detection method comprising:
obtaining a current frame and a plurality of historical frames corresponding to the current frame;
inputting the current frame and the plurality of historical frames into a trained convolutional autoencoder, and outputting a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames;
calculating, according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames, a Euclidean distance between the current frame and each of the plurality of historical frames; and
determining the historical frame with the shortest Euclidean distance to the current frame as a loop closure.
A second aspect of the embodiments of the present application provides a loop detection device, the loop detection device comprising:
a frame obtaining module, configured to obtain a current frame and a plurality of historical frames corresponding to the current frame;
a feature output module, configured to input the current frame and the plurality of historical frames into a trained convolutional autoencoder and output a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames;
a distance calculation module, configured to calculate a Euclidean distance between the current frame and each of the plurality of historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames; and
a loop determining module, configured to determine the historical frame with the shortest Euclidean distance to the current frame as a loop closure.
A third aspect of the embodiments of the present application provides a terminal device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the loop detection method according to the first aspect.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the loop detection method according to the first aspect.
A fifth aspect of the present application provides a computer program product comprising a computer program which, when executed by one or more processors, implements the steps of the loop detection method according to the first aspect.
Therefore, after obtaining the current frame and the plurality of historical frames corresponding to the current frame, the application inputs the current frame and the historical frames into a trained convolutional autoencoder, outputs the feature descriptor of the current frame and the feature descriptor of each historical frame, and calculates the Euclidean distance between the current frame and each historical frame from these descriptors, so that the frame representing the same scene or the same place as the current frame can be selected according to the Euclidean distances, completing loop detection. Because the unsupervised convolutional autoencoder can extract feature descriptors that adapt to complex environmental changes and are more robust, performing loop detection with these descriptors can improve the success rate of loop detection and its robustness in complex environments.
Detailed description of the invention
In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the loop detection method provided by Embodiment 1 of the present application;
Fig. 2 is an example diagram of a convolutional autoencoder structure;
Fig. 3 is an example diagram of a four-scale pyramid pooling structure;
Fig. 4 is a schematic flowchart of the loop detection method provided by Embodiment 2 of the present application;
Fig. 5 is an example diagram of a random projective transformation;
Fig. 6 is a schematic diagram of the loop detection device provided by Embodiment 3 of the present application;
Fig. 7 is a schematic diagram of the terminal device provided by Embodiment 4 of the present application.
Specific embodiment
In the following description, specific details such as particular system structures and techniques are set forth for the purpose of illustration rather than limitation, so as to provide a thorough understanding of the embodiments of the present application. However, it will be apparent to those skilled in the art that the present application may also be implemented in other embodiments without these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that, as used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements and/or components, but does not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or combinations thereof.
In specific implementations, the terminal device described in the embodiments of the present application includes, but is not limited to, portable devices such as mobile phones, laptop computers or tablet computers with touch-sensitive surfaces (for example, touch-screen displays and/or touch pads). It should also be understood that, in certain embodiments, the device is not a portable communication device but a desktop computer with a touch-sensitive surface (for example, a touch-screen display and/or touch pad).
In the following discussion, a terminal device including a display and a touch-sensitive surface is described. It should be understood, however, that the terminal device may include one or more other physical user-interface devices such as a physical keyboard, a mouse and/or a joystick.
The terminal device supports various applications, such as one or more of the following: a drawing application, a presentation application, a word-processing application, a website creation application, a disc burning application, a spreadsheet application, a gaming application, a telephone application, a video conferencing application, an email application, an instant messaging application, an exercise support application, a photo management application, a digital camera application, a digital video camera application, a web browsing application, a digital music player application and/or a video player application.
The various applications executable on the terminal device may use at least one common physical user-interface device such as the touch-sensitive surface. One or more functions of the touch-sensitive surface and the corresponding information displayed on the terminal may be adjusted and/or changed between applications and/or within a corresponding application. In this way, the common physical architecture of the terminal (for example, the touch-sensitive surface) can support the various applications with user interfaces that are intuitive and transparent to the user.
It should be understood that the serial numbers of the steps in this embodiment do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
In order to illustrate the technical solutions described in this application, specific embodiments are described below.
Referring to Fig. 1, which is a schematic flowchart of the loop detection method provided by Embodiment 1 of the present application, the loop detection method is applied to a terminal device and may include the following steps:
Step S101: obtain a current frame and a plurality of historical frames corresponding to the current frame.
In the embodiments of the present application, loop detection, also known as loop closure detection, refers to the ability of a terminal device to recognize a scene it has reached before; a successful detection can significantly reduce accumulated error. The current frame is the frame on which loop detection is to be performed; the historical frames corresponding to the current frame are frames for which loop detection has already been performed and which occur before the current frame. For example, given five video frames where the fifth frame is the current frame, the first four frames are the historical frames of the current frame.
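As a minimal illustration (not code from the patent), maintaining the current frame together with its historical frames can be sketched as a simple frame buffer; the buffer length of 4 is an arbitrary choice for the example.

```python
from collections import deque

history = deque(maxlen=4)              # historical frames already processed

def on_new_frame(frame):
    """Return (current_frame, historical_frames) for loop detection."""
    current, past = frame, list(history)
    history.append(frame)              # the current frame becomes history afterwards
    return current, past
```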
Step S102: input the current frame and the plurality of historical frames into a trained convolutional autoencoder, and output the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames.
In the embodiments of the present application, a feature descriptor is a representation of an image that extracts the useful information and discards the irrelevant information. For example, to detect buttons on clothing in an image, buttons are usually round and have several holes, so the edge map obtained by edge detection is useful image information, whereas colour information is useless; a good feature should also distinguish buttons from other round objects. A feature descriptor converts an image of size w*h*3 (width * height * 3 channels) into a vector or matrix of length n. For example, a 64*128*3 image may be converted into an output vector of length 3780.
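The 3780-dimensional figure quoted above matches the default HOG descriptor length for a 64x128 window; a quick check with the standard OpenCV API (not the patent's own code) is:

```python
import cv2
import numpy as np

img = np.random.randint(0, 256, (128, 64), dtype=np.uint8)  # height 128, width 64
hog = cv2.HOGDescriptor()   # defaults: 64x128 window, 16x16 blocks, 8x8 cells, 9 bins
print(hog.compute(img).size)                # 3780 = 105 blocks * 4 cells * 9 bins
```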
In the embodiments of the present application, the current frame and the plurality of historical frames may be input into the convolutional autoencoder one after another, or they may be input together; this is not limited here. When outputting the feature descriptors, the convolutional autoencoder outputs the feature descriptor of the current frame and the feature descriptor of each historical frame separately.
A convolutional autoencoder is an unsupervised learning algorithm whose output reproduces the input data; it is a kind of data compression algorithm. An unsupervised deep learning structure based on auto-encoding takes the local spatial characteristics of images into account and performs well in terms of generalization and related aspects. Compared with a convolutional neural network, a convolutional autoencoder can be trained without labelled data, which effectively reduces the labelling workload, simplifies the complexity of the training model and improves the training efficiency of the convolutional autoencoder.
Optionally, the convolutional autoencoder includes a plurality of convolutional layers, a pyramid pooling structure and a plurality of fully connected layers, the pyramid pooling structure being connected to the last convolutional layer and to the first fully connected layer respectively.
In the embodiments of the present application, the plurality of convolutional layers and the pyramid pooling structure of the convolutional autoencoder simulate the encoding process of an autoencoder and realize data compression; the plurality of fully connected layers simulate the decoding process and realize decompression. The encoding stage maps high-dimensional data to low-dimensional data and reduces the data volume, while the decoding stage does the opposite and can reproduce the input data. Fig. 2 shows an example of a convolutional autoencoder structure: the convolutional autoencoder in Fig. 2 includes four convolutional layers, one pyramid pooling structure (i.e. a spatial pyramid structure) and three fully connected layers, the last convolutional layer being connected to the input of the pyramid pooling structure and the output of the pyramid pooling structure being connected to the first fully connected layer.
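A minimal PyTorch sketch of the Fig. 2 structure is given below. The channel counts, kernel sizes, fully connected widths, pooling scales and the 3780-dimensional reconstruction target (the HOG length used later) are illustrative assumptions, not values stated in the patent.

```python
import torch
import torch.nn as nn

class ConvAutoEncoder(nn.Module):
    def __init__(self, pool_scales=(1, 2, 3, 6), hog_dim=3780):
        super().__init__()
        # encoder: four convolutional layers
        self.convs = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(128, 128, 3, stride=2, padding=1), nn.ReLU(),
        )
        # pyramid pooling: max-pool the last feature map at several scales
        self.pools = nn.ModuleList([nn.AdaptiveMaxPool2d(s) for s in pool_scales])
        feat_dim = 128 * sum(s * s for s in pool_scales)   # fixed-length descriptor
        # decoder: three fully connected layers producing a HOG-sized vector
        self.decoder = nn.Sequential(
            nn.Linear(feat_dim, 2048), nn.ReLU(),
            nn.Linear(2048, 2048), nn.ReLU(),
            nn.Linear(2048, hog_dim),
        )

    def forward(self, x):                        # x: (N, 1, H, W), any H and W
        fmap = self.convs(x)
        feat = torch.cat([p(fmap).flatten(1) for p in self.pools], dim=1)
        return feat, self.decoder(feat)          # feature descriptor, reconstruction
```

After training, only the encoder path (convolutions plus pyramid pooling) is used to produce the feature descriptor, as described in Embodiment 2 below.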
Pooling is an important operation in convolutional networks: by reducing the dimensionality of the features it lowers the complexity of the network while preserving certain invariances, such as rotation invariance, translation invariance and scale invariance, thereby creating features that are invariant to small changes and distortions of the image. However, pooling easily loses local feature information, so in the embodiments of the present application the pyramid pooling structure of the convolutional autoencoder uses pyramid pooling operations at multiple different scales and can therefore extract more complete information. Fig. 3 is an example diagram of a four-scale pyramid pooling structure, in which pyramid pooling operations at four different scales are used.
The pyramid pooling structure can produce a feature vector of fixed size through multi-scale feature extraction. Unlike conventional pooling, pyramid pooling produces a fixed number of output feature maps. In use, pyramid pooling replaces the pooling layer after the last convolutional layer of a convolutional neural network, so that the network can accept inputs of arbitrary size and also has some robustness to object deformation. Because multi-scale feature extraction is performed after the convolutional layers for each image, the accuracy of the task can be improved; training with image samples of different sizes, much like training in different networks, greatly improves the accuracy of the model.
Because of the parameter-sharing mechanism of convolution, a convolutional feature map can be interpreted as the detection scores obtained by applying a convolutional filter over the input image, and positions with high activation values indicate that the filter has found the visual pattern it searches for around them. Convolutional feature maps are usually observed to be sparse, since only a few positions have high activations and contain certain visual patterns; this shows that convolutional filters are highly selective for particular visual patterns. When the same place is observed from different viewpoints, some of its visual patterns still remain and can be detected by the same convolutional filters. Based on this observation, a multi-scale merging method can be used to search for the most salient visual patterns at multiple positions of the image, so that images can be matched across different viewpoints. For each convolutional feature map, pyramid pooling operations at multiple different scales can be applied: each feature map is first divided into H x H cells (i.e. m*m, n*n, p*p and q*q in Fig. 3), and within each spatial cell the feature descriptor is collected by max pooling. After this merging operation, a feature map of any size can be reduced to a low-dimensional vector, further reducing computational complexity. Finally, the feature descriptors obtained by the pyramid pooling operations at the different scales are concatenated to form the feature descriptor of the image.
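The multi-scale pooling of Fig. 3 can be sketched on its own as follows; the scales (1, 2, 3, 6) stand in for the m, n, p and q of the figure and are an illustrative choice.

```python
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(fmap, scales=(1, 2, 3, 6)):
    """fmap: (N, C, H, W) -> (N, C * sum(s*s for s in scales)) fixed-length vector."""
    pooled = [F.adaptive_max_pool2d(fmap, s).flatten(1) for s in scales]
    return torch.cat(pooled, dim=1)

fmap = torch.randn(2, 128, 13, 17)           # any spatial size is accepted
print(spatial_pyramid_pool(fmap).shape)      # torch.Size([2, 6400])
```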
Optionally, outputting the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames includes:
outputting, by the pyramid pooling structure, the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames.
Step S103: calculate, according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames, the Euclidean distance between the current frame and each of the plurality of historical frames.
Step S104: determine the historical frame with the shortest Euclidean distance to the current frame as a loop closure.
In the embodiments of the present application, the plurality of historical frames correspond to a plurality of Euclidean distances. After the Euclidean distance between the current frame and each historical frame has been calculated, the distances can be compared and the shortest one selected; the historical frame corresponding to the shortest Euclidean distance is the frame that forms a loop closure with the current frame, i.e. the current frame and that historical frame are a loop closure. Optionally, in order to further improve the accuracy of the loop closure, a distance threshold may be preset and used to judge whether a historical frame forms a loop closure with the current frame. For example, after the shortest Euclidean distance has been selected, it is compared with the distance threshold: if the shortest Euclidean distance is less than the threshold, the current frame and the historical frame corresponding to the shortest Euclidean distance are determined to be a loop closure; if the shortest Euclidean distance is not less than the threshold, they are determined not to be a loop closure, i.e. none of the historical frames forms a loop closure with the current frame.
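A compact sketch of steps S103 and S104 (the threshold of 0.5 is illustrative, not a value from the patent):

```python
import numpy as np

def detect_loop(current_desc, history_descs, threshold=0.5):
    """Return (index, distance) of the loop-closure frame, or (None, distance)."""
    dists = np.linalg.norm(history_descs - current_desc, axis=1)
    best = int(np.argmin(dists))
    if dists[best] < threshold:
        return best, float(dists[best])
    return None, float(dists[best])

history = np.random.rand(10, 6400).astype(np.float32)   # descriptors of 10 historical frames
current = history[3] + 0.001 * np.random.rand(6400).astype(np.float32)
print(detect_loop(current, history))                    # frame 3 is the nearest
```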
In the embodiments of the present application, the unsupervised convolutional autoencoder can extract feature descriptors that adapt to complex environmental changes and are more robust; performing loop detection with these descriptors can improve the success rate of loop detection and its robustness in complex environments.
Referring to Fig. 4, which is a schematic flowchart of the loop detection method provided by Embodiment 2 of the present application, the loop detection method is applied to a terminal device and may include the following steps:
Step S401: train the convolutional autoencoder.
In the embodiments of the present application, when the convolutional autoencoder is used to obtain the feature descriptors of the current frame and the historical frames, the convolutional autoencoder needs to be trained in advance, so that the trained convolutional autoencoder extracts more robust feature descriptors (i.e. image representations) that satisfy both invariance to illumination changes and invariance to viewpoint changes.
Optionally, training the convolutional autoencoder includes:
obtaining an image training set;
generating an image pair from each image in the image training set;
calculating a histogram-of-oriented-gradients (HOG) descriptor of one image of each image pair;
inputting the other image of each image pair into the convolutional autoencoder and outputting a feature descriptor of the other image of each image pair;
calculating a loss function between the HOG descriptor of the one image and the feature descriptor of the other image of each image pair; and
training the convolutional autoencoder according to the loss function.
In the embodiments of the present application, the image training set refers to the set of images used to train the autoencoder; the user may select the image set according to actual needs, and this is not limited here. The two images of the pair generated from each image represent the same scene, possibly seen from different viewpoints, so that training the convolutional autoencoder with the image pairs generated from each image enables it to output or extract feature descriptors that satisfy both invariance to illumination changes and invariance to viewpoint changes.
In the embodiments of the present application, after each image has generated an image pair, one image of the pair can be randomly selected and input into the convolutional autoencoder to learn an image representation automatically, while the histogram of oriented gradients (HOG) descriptor of the other image is calculated. A HOG descriptor forms a feature by computing and accumulating gradient orientation histograms over local regions of the image, and maintains good invariance to geometric and photometric deformations, i.e. it is very robust to changes in the environment. The main idea of this feature is that the appearance and shape of a local object in an image can be well described by the density of gradient or edge directions; in essence it is gradient statistics, and gradients mainly exist at edges. In practice, the image is divided into small cell units and a histogram of gradient directions (or edge directions) is computed for each cell unit. The gradient magnitude and direction can be found with the formulas g = sqrt(gx^2 + gy^2) and theta = arctan(gy / gx), where gx denotes the gradient of the cell unit in the x direction, gy denotes the gradient of the cell unit in the y direction, and (x, y) denotes the coordinates of the cell unit.
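A small numerical illustration of these formulas (standard gradient computation, not code from the patent):

```python
import numpy as np

img = np.random.rand(128, 64).astype(np.float32)   # stand-in for an image or cell patch
gx = np.gradient(img, axis=1)                      # gradient in the x direction
gy = np.gradient(img, axis=0)                      # gradient in the y direction
magnitude = np.sqrt(gx ** 2 + gy ** 2)             # g = sqrt(gx^2 + gy^2)
orientation = np.arctan2(gy, gx)                   # theta = arctan(gy / gx), quadrant-aware
```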
In order to have better invariance to illumination and shadow, the gradient orientation histograms need to be contrast-normalized. This can be done by grouping cell units into larger blocks and normalizing all cell units within each block; the HOG descriptors of all blocks are then concatenated to form the final HOG descriptor.
In the embodiments of the present application, the loss function measures the difference between the two images of each image pair. During back-propagation the parameters of the convolutional autoencoder can be updated according to the loss function, so that the convolutional autoencoder is trained; when the convolutional autoencoder becomes stable, its training is complete. After training is complete, the output of the pyramid pooling structure is used as the image representation (i.e. the feature descriptor).
Optionally, generating an image pair from each image in the image training set includes:
applying a random projective transformation to each image in the image training set to generate an image pair.
In the embodiments of the present application, a projective transformation is a non-singular linear transformation in homogeneous coordinates; its purpose is to model the geometric distortion that occurs when a plane is imaged by a perspective camera. It has nine degrees of freedom but is only defined up to scale, so it can be defined by eight parameters, and the projective transformation between two planes can therefore be determined by four pairs of matching points, no three of which in one plane are collinear. "Random" means that the four points selected in the original image are chosen randomly. Fig. 5 shows an example of a random projective transformation: the two images in Fig. 5 form an image pair, the right image being obtained from the left image by a random projective transformation. Warping images with random projections better simulates the natural viewpoint changes caused by the motion of a terminal device (such as a robot). A projective transformation can be decomposed into a cascade of a similarity transformation, an affine transformation and a projective transformation.
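A sketch of such a random projective warp, assuming OpenCV; the 15% corner-jitter range and the synthetic input image are illustrative choices:

```python
import cv2
import numpy as np

def random_projective_warp(img, max_shift=0.15):
    """Warp an image with a homography defined by randomly jittered corner points."""
    h, w = img.shape[:2]
    src = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
    jitter = (np.random.uniform(-max_shift, max_shift, (4, 2)) * [w, h]).astype(np.float32)
    H = cv2.getPerspectiveTransform(src, src + jitter)   # homography from 4 point pairs
    return cv2.warpPerspective(img, H, (w, h))

img = np.random.randint(0, 256, (128, 64), dtype=np.uint8)  # stand-in for a training image
pair = (img, random_projective_warp(img))                   # two views of the same scene
```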
Illustratively, the training process of the convolutional autoencoder may be as follows: for any input image, an image pair is first generated by a random projective transformation, giving the appearance of the same image under different viewpoints; one image is then randomly selected and its HOG descriptor is calculated (HOG descriptors provide good illumination invariance), while the other image is input into the convolutional autoencoder to learn image features automatically, the last convolutional layer together with the pyramid pooling operation performing multi-scale feature extraction on the image and the multiple fully connected layers then building a feature vector of fixed dimension; finally, the descriptors obtained by the two methods are compared. Since the HOG descriptor is a vector of fixed length, it can be compared by Euclidean distance, and Euclidean distance is easily integrated into a neural network with an L2 loss, so an L2 loss function can be used to compare the HOG descriptor with the feature descriptor reconstructed by the convolutional autoencoder. Through repeated reconstruction learning, the convolutional autoencoder can finally learn a more robust feature descriptor.
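A condensed sketch of this training loop, reusing the ConvAutoEncoder and random_projective_warp sketches above together with OpenCV's default 3780-dimensional HOG; the 128x64 image size, learning rate and single-sample step are illustrative assumptions:

```python
import cv2
import numpy as np
import torch
import torch.nn.functional as F

model = ConvAutoEncoder()                          # from the sketch after Fig. 2
hog = cv2.HOGDescriptor()                          # 64x128 window -> 3780-dim target
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def training_step(gray_img):                       # gray_img: uint8 array, 128x64
    warped = random_projective_warp(gray_img)      # second view of the same scene
    target = torch.from_numpy(hog.compute(warped).reshape(1, -1)).float()
    x = torch.from_numpy(gray_img).float().div(255).view(1, 1, 128, 64)
    _, reconstructed = model(x)                    # descriptor and HOG-like output
    loss = F.mse_loss(reconstructed, target)       # L2 loss between the two descriptions
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

print(training_step(np.random.randint(0, 256, (128, 64), dtype=np.uint8)))
```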
Optionally, before generating an image pair from each image in the image training set, the method further includes:
converting each image in the image training set into a grayscale image.
In the embodiments of the present application, in order to reduce the original data volume of each image in the image training set and lower the computation required for subsequent processing of each image, each image can be converted into a grayscale image before the random projective transformation is applied, which yields the image pair consisting of the original image (i.e. each image in the image training set) and its deformed version.
Step S402: obtain a current frame and a plurality of historical frames corresponding to the current frame.
This step is the same as step S101; for details, refer to the description of step S101, which is not repeated here.
Step S403: input the current frame and the plurality of historical frames into the trained convolutional autoencoder, and output the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames.
This step is the same as step S102; for details, refer to the description of step S102, which is not repeated here.
Step S404: calculate, according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames, the Euclidean distance between the current frame and each of the plurality of historical frames.
This step is the same as step S103; for details, refer to the description of step S103, which is not repeated here.
Step S405: determine the historical frame with the shortest Euclidean distance to the current frame as a loop closure.
This step is the same as step S104; for details, refer to the description of step S104, which is not repeated here.
On the basis of Embodiment 1, this embodiment adds training of the convolutional autoencoder: an unsupervised convolutional autoencoder is designed from the histogram of oriented gradients and a neural network autoencoder, which on the one hand learns an image representation through the histogram of oriented gradients and on the other hand automatically learns and reconstructs the original image through the convolutional autoencoder network. Combining the advantages of the two methods makes the finally extracted features satisfy both invariance to illumination changes and invariance to viewpoint changes, so that more robust feature descriptors are extracted.
Referring to Fig. 6, which is a schematic diagram of the loop detection device provided by Embodiment 3 of the present application, only the parts relevant to the embodiments of the present application are shown for ease of description.
The loop detection device includes:
a frame obtaining module 61, configured to obtain a current frame and a plurality of historical frames corresponding to the current frame;
a feature output module 62, configured to input the current frame and the plurality of historical frames into a trained convolutional autoencoder and output a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames;
a distance calculation module 63, configured to calculate a Euclidean distance between the current frame and each of the plurality of historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames;
a loop determining module 64, configured to determine the historical frame with the shortest Euclidean distance to the current frame as a loop closure.
Optionally, the convolutional autoencoder includes a plurality of convolutional layers, a pyramid pooling structure and a plurality of fully connected layers, the pyramid pooling structure being connected to the last convolutional layer and to the first fully connected layer respectively.
Optionally, the feature output module 62 is specifically configured to:
output, by the pyramid pooling structure, the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames.
Optionally, the loop detection device further includes:
a structure training module 65, configured to train the convolutional autoencoder.
Optionally, the structure training module 65 includes:
an obtaining unit, configured to obtain an image training set;
a generating unit, configured to generate an image pair from each image in the image training set;
a descriptor calculation unit, configured to calculate a histogram-of-oriented-gradients (HOG) descriptor of one image of each image pair;
an output unit, configured to input the other image of each image pair into the convolutional autoencoder and output a feature descriptor of the other image of each image pair;
a loss calculation unit, configured to calculate a loss function between the HOG descriptor of the one image and the feature descriptor of the other image of each image pair;
a training unit, configured to train the convolutional autoencoder according to the loss function.
Optionally, the generating unit is specifically configured to:
apply a random projective transformation to each image in the image training set to generate an image pair.
Optionally, the structure training module 65 further includes:
a converting unit, configured to convert each image in the image training set into a grayscale image.
The loop detection device provided by the embodiments of the present application can be applied in the foregoing method Embodiments 1 and 2; for details, refer to the descriptions of method Embodiments 1 and 2, which are not repeated here.
Fig. 7 is a schematic diagram of the terminal device provided by Embodiment 4 of the present application. As shown in Fig. 7, the terminal device 7 of this embodiment includes a processor 70, a memory 71 and a computer program 72 stored in the memory 71 and executable on the processor 70. When executing the computer program 72, the processor 70 implements the steps in each of the above loop detection method embodiments, such as steps S101 to S104 shown in Fig. 1; alternatively, when executing the computer program 72, the processor 70 implements the functions of each module/unit in each of the above device embodiments, such as the functions of modules 61 to 65 shown in Fig. 6.
Illustratively, the computer program 72 may be divided into one or more modules/units, which are stored in the memory 71 and executed by the processor 70 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 72 in the terminal device 7. For example, the computer program 72 may be divided into a frame obtaining module, a feature output module, a distance calculation module, a loop determining module and a structure training module, the specific functions of each module being as follows:
a frame obtaining module, configured to obtain a current frame and a plurality of historical frames corresponding to the current frame;
a feature output module, configured to input the current frame and the plurality of historical frames into a trained convolutional autoencoder and output a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames;
a distance calculation module, configured to calculate a Euclidean distance between the current frame and each of the plurality of historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames;
a loop determining module, configured to determine the historical frame with the shortest Euclidean distance to the current frame as a loop closure.
Optionally, the convolutional autoencoder includes a plurality of convolutional layers, a pyramid pooling structure and a plurality of fully connected layers, the pyramid pooling structure being connected to the last convolutional layer and to the first fully connected layer respectively.
Optionally, the feature output module is specifically configured to:
output, by the pyramid pooling structure, the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames.
Optionally, a structure training module is configured to train the convolutional autoencoder.
Optionally, the structure training module includes:
an obtaining unit, configured to obtain an image training set;
a generating unit, configured to generate an image pair from each image in the image training set;
a descriptor calculation unit, configured to calculate a histogram-of-oriented-gradients (HOG) descriptor of one image of each image pair;
an output unit, configured to input the other image of each image pair into the convolutional autoencoder and output a feature descriptor of the other image of each image pair;
a loss calculation unit, configured to calculate a loss function between the HOG descriptor of the one image and the feature descriptor of the other image of each image pair;
a training unit, configured to train the convolutional autoencoder according to the loss function.
Optionally, the generating unit is specifically configured to:
apply a random projective transformation to each image in the image training set to generate an image pair.
Optionally, the structure training module further includes:
a converting unit, configured to convert each image in the image training set into a grayscale image.
The terminal device 7 may be a device that needs to perform loop detection, such as a robot or an unmanned aerial vehicle. The terminal device may include, but is not limited to, the processor 70 and the memory 71. Those skilled in the art will understand that Fig. 7 is only an example of the terminal device 7 and does not constitute a limitation on the terminal device 7; it may include more or fewer components than illustrated, a combination of certain components, or different components; for example, the terminal device may also include input and output devices, network access devices, buses, and the like.
The processor 70 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card equipped on the terminal device 7. Further, the memory 71 may include both an internal storage unit and an external storage device of the terminal device 7. The memory 71 is used to store the computer program and the other programs and data required by the terminal device, and may also be used to temporarily store data that has been output or is to be output.
Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is only used as an example. In practical applications, the above functions may be assigned to different functional units and modules as needed; that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for the convenience of distinguishing them from each other and are not intended to limit the protection scope of the present application. For the specific working process of the units and modules in the above system, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the above embodiments, each embodiment is described with its own emphasis; for parts that are not detailed or recorded in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of the present application.
In the embodiments provided by the present application, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are merely illustrative; for example, the division of the modules or units is only a logical functional division, and there may be other divisions in actual implementation, e.g. multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated module/unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the above embodiments of the present application may also be completed by instructing relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can implement the steps of each of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file or some intermediate form. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, the computer-readable medium does not include electrical carrier signals and telecommunication signals.
The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or replace some of the technical features with equivalents; such modifications or replacements do not make the essence of the corresponding technical solutions depart from the spirit and scope of the technical solutions of the embodiments of the present application, and shall all be included within the protection scope of the present application.

Claims (10)

1. A loop detection method, characterized in that the loop detection method comprises:
obtaining a current frame and a plurality of historical frames corresponding to the current frame;
inputting the current frame and the plurality of historical frames into a trained convolutional autoencoder, and outputting a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames;
calculating, according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames, a Euclidean distance between the current frame and each of the plurality of historical frames; and
determining the historical frame with the shortest Euclidean distance to the current frame as a loop closure.
2. The loop detection method according to claim 1, characterized in that the convolutional autoencoder comprises a plurality of convolutional layers, a pyramid pooling structure and a plurality of fully connected layers, the pyramid pooling structure being connected to the last convolutional layer and to the first fully connected layer respectively.
3. The loop detection method according to claim 2, characterized in that outputting the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames comprises:
outputting, by the pyramid pooling structure, the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames.
4. The loop detection method according to claim 1, characterized in that the loop detection method further comprises:
training the convolutional autoencoder.
5. The loop detection method according to claim 4, characterized in that training the convolutional autoencoder comprises:
obtaining an image training set;
generating an image pair from each image in the image training set;
calculating a histogram-of-oriented-gradients (HOG) descriptor of one image of each image pair;
inputting the other image of each image pair into the convolutional autoencoder and outputting a feature descriptor of the other image of each image pair;
calculating a loss function between the HOG descriptor of the one image and the feature descriptor of the other image of each image pair; and
training the convolutional autoencoder according to the loss function.
6. The loop detection method according to claim 5, characterized in that generating an image pair from each image in the image training set comprises:
applying a random projective transformation to each image in the image training set to generate an image pair.
7. The loop detection method according to claim 5, characterized in that, before generating an image pair from each image in the image training set, the method further comprises:
converting each image in the image training set into a grayscale image.
8. A loop detection device, characterized in that the loop detection device comprises:
a frame obtaining module, configured to obtain a current frame and a plurality of historical frames corresponding to the current frame;
a feature output module, configured to input the current frame and the plurality of historical frames into a trained convolutional autoencoder and output a feature descriptor of the current frame and a feature descriptor of each of the plurality of historical frames;
a distance calculation module, configured to calculate a Euclidean distance between the current frame and each of the plurality of historical frames according to the feature descriptor of the current frame and the feature descriptor of each of the plurality of historical frames; and
a loop determining module, configured to determine the historical frame with the shortest Euclidean distance to the current frame as a loop closure.
9. A terminal device, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the loop detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the loop detection method according to any one of claims 1 to 7.
CN201910303060.3A 2019-04-16 2019-04-16 Loop detection method, loop detection device and terminal equipment Active CN110163095B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910303060.3A CN110163095B (en) 2019-04-16 2019-04-16 Loop detection method, loop detection device and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910303060.3A CN110163095B (en) 2019-04-16 2019-04-16 Loop detection method, loop detection device and terminal equipment

Publications (2)

Publication Number Publication Date
CN110163095A true CN110163095A (en) 2019-08-23
CN110163095B CN110163095B (en) 2022-11-29

Family

ID=67639413

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910303060.3A Active CN110163095B (en) 2019-04-16 2019-04-16 Loop detection method, loop detection device and terminal equipment

Country Status (1)

Country Link
CN (1) CN110163095B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104198752A (en) * 2014-08-18 2014-12-10 浙江大学 High temperature steel billet motion state multi-rate detection method based on machine vision
US20160104058A1 (en) * 2014-10-09 2016-04-14 Microsoft Technology Licensing, Llc Generic object detection in images
US20170178331A1 (en) * 2015-12-17 2017-06-22 Casio Computer Co., Ltd. Autonomous movement device, autonomous movement method and non-transitory recording medium
CN107292949A (en) * 2017-05-25 2017-10-24 深圳先进技术研究院 Three-dimensional reconstruction method and device for a scene, and terminal device
CN107330357A (en) * 2017-05-18 2017-11-07 东北大学 Vision SLAM closed loop detection methods based on deep neural network
CN109040747A (en) * 2018-08-06 2018-12-18 上海交通大学 Stereo-picture comfort level quality evaluating method and system based on convolution self-encoding encoder
CN109443382A (en) * 2018-10-22 2019-03-08 北京工业大学 Visual SLAM closed-loop detection method based on feature extraction and dimensionality-reduction neural network

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021036699A1 (en) * 2019-08-29 2021-03-04 腾讯科技(深圳)有限公司 Video frame information labeling method, device and apparatus, and storage medium
US11727688B2 (en) 2019-08-29 2023-08-15 Tencent Technology (Shenzhen) Company Limited Method and apparatus for labelling information of video frame, device, and storage medium
CN111598149A (en) * 2020-05-09 2020-08-28 鹏城实验室 Loop detection method based on attention mechanism
CN111598149B (en) * 2020-05-09 2023-10-24 鹏城实验室 Loop detection method based on attention mechanism
CN111862162A (en) * 2020-07-31 2020-10-30 湖北亿咖通科技有限公司 Loop detection method and system, readable storage medium and electronic device

Also Published As

Publication number Publication date
CN110163095B (en) 2022-11-29

Similar Documents

Publication Publication Date Title
Beddiar et al. Vision-based human activity recognition: a survey
Guo et al. Learning to measure change: Fully convolutional siamese metric networks for scene change detection
CN108052896B (en) Human body behavior identification method based on convolutional neural network and support vector machine
Shao et al. Performance evaluation of deep feature learning for RGB-D image/video classification
Hu et al. SAC-Net: Spatial attenuation context for salient object detection
CN111819568B (en) Face rotation image generation method and device
CN110147743A (en) Real-time online pedestrian analysis and number system and method under a kind of complex scene
CN109255352A (en) Object detection method, apparatus and system
CN109359538A (en) Training method, gesture identification method, device and the equipment of convolutional neural networks
Chen et al. Research on recognition of fly species based on improved RetinaNet and CBAM
CN109559300A (en) Image processing method, electronic equipment and computer readable storage medium
Yu et al. A discriminative deep model with feature fusion and temporal attention for human action recognition
Li et al. LPSNet: a novel log path signature feature based hand gesture recognition framework
CN110163111A (en) Method, apparatus of calling out the numbers, electronic equipment and storage medium based on recognition of face
CN113128424B (en) Method for identifying action of graph convolution neural network based on attention mechanism
CN110163095A Loop detection method, loop detection device and terminal device
CN112132739A (en) 3D reconstruction and human face posture normalization method, device, storage medium and equipment
Yang et al. RGB-depth feature for 3D human activity recognition
Banzi et al. Learning a deep predictive coding network for a semi-supervised 3D-hand pose estimation
Cai et al. A robust interclass and intraclass loss function for deep learning based tongue segmentation
CN109934183A (en) Image processing method and device, detection device and storage medium
Cao et al. Real-time gesture recognition based on feature recalibration network with multi-scale information
CN111291713B (en) Gesture recognition method and system based on skeleton
CN112906520A (en) Gesture coding-based action recognition method and device
Liu et al. Double Mask R‐CNN for Pedestrian Detection in a Crowd

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant