CN113052110B - Three-dimensional interest point extraction method based on multi-view projection and deep learning - Google Patents

Three-dimensional interest point extraction method based on multi-view projection and deep learning

Info

Publication number
CN113052110B
CN113052110B (application CN202110359551.7A)
Authority
CN
China
Prior art keywords
dimensional
interest
probability distribution
training
virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110359551.7A
Other languages
Chinese (zh)
Other versions
CN113052110A (en)
Inventor
舒振宇
杨思鹏
辛士庆
庞超逸
金小刚
刘利刚
吴皓钰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University of Science and Technology ZUST
Original Assignee
Zhejiang University of Science and Technology ZUST
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University of Science and Technology ZUST filed Critical Zhejiang University of Science and Technology ZUST
Priority to CN202110359551.7A priority Critical patent/CN113052110B/en
Publication of CN113052110A publication Critical patent/CN113052110A/en
Application granted granted Critical
Publication of CN113052110B publication Critical patent/CN113052110B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/64Three-dimensional objects
    • G06V20/647Three-dimensional objects by matching two-dimensional images to three-dimensional objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/60Analysis of geometric attributes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Multimedia (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a three-dimensional interest point extraction method based on multi-view projection and deep learning, which comprises the steps of projecting a labeled 3D object into a plurality of 2D views to collect training data, constructing an interest point training probability distribution, training a neural network with the 2D image data and the training probability distribution, obtaining a probability distribution Q from the trained neural network, and extracting the three-dimensional interest points of the 3D object with an improved density peak clustering algorithm. Automatic detection of 3D object interest points is achieved with a small amount of data, and satisfactory results can be obtained without relying on manually designed feature descriptors or a large amount of expensive 3D training data.

Description

Three-dimensional interest point extraction method based on multi-view projection and deep learning
Technical Field
The invention relates to the technical field of image processing, in particular to a three-dimensional interest point extraction method based on multi-view projection and deep learning.
Background
Points of interest (POIs), also referred to as feature points, are generally defined as unique features on the surface of a 3D object, and POIs play a crucial role in many geometric processing tasks, such as viewpoint selection, shape enhancement, shape retrieval, mesh registration, and mesh segmentation.
POIs can be easily distinguished from other points on a 3D shape by human visual perception. However, it is not easy to define POIs accurately from a geometric point of view, although they do relate to geometric features, so automatic detection of POIs that conform to human visual perception remains a challenging problem.
It is generally accepted that determining whether a point is a POI is subjective because different people may have different opinions about the POI. Based on the above observations, a data-driven approach is applied to efficiently detect POIs on 3D shapes. With recent advances in relevant research areas, deep learning has been introduced to detect POIs on 3D shapes to achieve satisfactory results by learning complex mappings between geometric features and POI probability values for each point on the surface.
However, the scarcity of 3D training data forces learning-based methods to rely heavily on artificially set geometric features rather than learning features directly from raw data, since acquiring high-quality training data of 3D shapes is much more expensive than acquiring 2D images. This greatly limits the ability of this approach to achieve better detection performance. Furthermore, the difference between the subjective visual perception of a person and the geometric features on the 3D shape also limits its performance.
Therefore, overcoming the scarcity of 3D training data while keeping the approach data-driven is a problem that those skilled in the art need to solve.
Disclosure of Invention
In view of the above, the present invention provides a three-dimensional interest point extraction method based on multi-view projection and deep learning, which projects a labeled 3D object into multiple 2D views, learns the required features from the 2D views in an end-to-end manner, and automatically detects interest points on a test 3D object by applying the trained neural network and an improved density peak clustering algorithm, so that satisfactory results can be obtained without depending on manually designed feature descriptors or a large amount of expensive 3D training data.
In order to achieve the purpose, the invention adopts the following technical scheme:
a three-dimensional interest point extraction method based on multi-view projection and deep learning is characterized by comprising the following steps:
S1, acquiring training image data: projecting the 3D object into a plurality of 2D views in different directions, and recording the corresponding relation between each pixel in the 2D views and each vertex on the surface of the 3D object;
S2, constructing the training probability distribution of the artificially marked 3D object interest points: constructing a training probability distribution P of the artificially marked interest points on the surface of each 3D object based on the normal probability density;
S3, training a neural network: training a neural network according to the training image data and the artificially marked training probability distribution P to obtain a neural network model capable of automatically generating the 3D object interest point probability distribution;
S4, obtaining the probability distribution Q of interest points on the surface of the test 3D object: projecting the 3D model whose interest points are to be extracted to obtain 2D views, inputting the images into the trained neural network model to obtain the probability distribution of the interest points on the 2D views, and then back-projecting the probability distribution to the surface of the 3D object to obtain the three-dimensional interest point probability distribution Q on the surface of the test 3D object;
S5, extracting three-dimensional interest points: according to the probability distribution Q of step S4, extracting the three-dimensional interest points on the 3D object using an improved density peak clustering algorithm.
Preferably, the step S1 of projecting the 3D object into the 2D views in a plurality of different directions includes:
S11, constructing a virtual three-dimensional boundary sphere by taking the 3D object as a sphere center;
S12, taking any point on the surface of the virtual three-dimensional boundary sphere as an initial position and selecting any direction to construct longitude and latitude, wherein the initial position point is contained on the equator of the constructed longitude and latitude;
S13, placing a first virtual camera at the initial position of the virtual three-dimensional boundary sphere, and uniformly arranging the virtual cameras along the equator by using the initial position as a starting point and using the same longitude angle;
S14, taking the positions of virtual cameras arranged at different longitudes on the equator of the virtual three-dimensional boundary sphere as a reference, correspondingly and uniformly arranging virtual cameras on latitude lines with the same latitude angle on two sides of the equator, and respectively arranging one virtual camera at each of the two polar positions of the virtual three-dimensional boundary sphere;
and S15, shooting a 3D object located at the sphere center position of the virtual three-dimensional boundary sphere through the virtual camera to acquire a 2D image, wherein the 2D image comprises a shadow image and positive and negative depth images of the 3D object.
Preferably, the same longitude angle comprises a 45° longitude angle and the same latitude angle comprises a 45° latitude angle.
Preferably, the virtual camera is rotated 4 times at angular intervals of 90 degrees to increase the amount of training data when capturing the image of the object.
Preferably, the interest point training probability distribution P constructed on the surface of the 3D object in step S2 is generated according to the attenuation of the geodesic distance to the nearest interest point, and P at any vertex v_i is defined as:

P(v_i) = exp(−d(v_i, p_i)^2 / (2σ^2))

where d(v_i, p_i) represents the geodesic distance from vertex v_i to its nearest interest point p_i, and σ is a parameter used to control the decay rate.
Preferably, training the neural network according to the training image data and the artificially marked training probability distribution P in step S3 comprises the following steps:
S31, using an encoder network composed of Conv+BN+ReLU layers and pooling layers as the convolutional part of the neural network to serve as a feature extractor for the 2D views, extracting a feature value for each pixel of the 2D image, and forming a decoder network from upsampling layers and Conv+BN+ReLU layers;
S32, the upsampling layers receive the corresponding pooling indices from the pooling layers, each pixel feature value is placed at its position according to the pooling indices, and a dense feature map is generated by the convolution layers;
S33, feeding the feature map to the softmax layer to classify each pixel independently.
Preferably, the probability distribution Q of the interest points on the surface of the test 3D object in step S4 is obtained with the following strategy (the explicit formula is given as an image in the original): Q_i, the true probability of whether the surface vertex v_i of the 3D object is an interest point, is computed from the predicted values q_ij of the pixels corresponding to vertex v_i, where Q_i is 0 when no pixel corresponds to the vertex and is determined from the n predicted values when n pixels correspond to the vertex.
Preferably, the step S5 uses an improved density peak clustering algorithm, and specifically includes the following steps:
S51, mapping all the vertices v_i on the 3D object onto a two-dimensional decision graph, with the horizontal and vertical axes representing ρ and δ respectively, and with δ_i defined as the influence radius of each vertex v_i of the 3D object:

δ_i = min_{j: ρ_j > ρ_i} d(v_i, v_j)

where d(v_i, v_j) is the geodesic distance between vertices v_i and v_j;
S52, defining a curve function on the two-dimensional decision graph (the explicit formulas are given as images in the original), which serves as the separation curve distinguishing the three-dimensional interest points from the other ordinary vertices, where C_1 and C_2 are variables controlling the up-and-down movement of the curve, thereby defining the separation curve curve_3;
S53, determining the three-dimensional interest points according to the separation curve curve_3 on the two-dimensional decision graph, and mapping the obtained three-dimensional interest points back to the 3D object to obtain the final three-dimensional interest points.
According to the above technical scheme, compared with the prior art, the three-dimensional interest point extraction method based on multi-view projection and deep learning disclosed by the invention has the following beneficial effects:
1. The method for extracting the three-dimensional interest points is end-to-end and robust, and can achieve better performance than traditional methods: good performance is obtained even if only a small number of training samples are provided.
2. The invention is completely driven by data, and can further improve the extraction performance of the three-dimensional interest points as long as enough labeled samples are provided.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the provided drawings without creative efforts.
FIG. 1 is a schematic flow diagram of a process provided by the present invention;
FIG. 2 is a schematic diagram of the projection position of a 3D object provided by the present invention;
fig. 3 is a schematic diagram of different curves of a two-dimensional decision diagram provided by the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention shown in fig. 1 discloses a three-dimensional interest point extraction method based on multi-view projection and deep learning, which comprises the following steps:
S1, training data acquisition: marking the vertices of the 3D object and projecting it into a plurality of two-dimensional views in different directions, acquiring 2D image data of the 3D object, and recording the correspondence between the pixels of the 2D views and the vertices of the 3D object:
For a 3D object of each shape, a virtual three-dimensional bounding sphere and its longitude and latitude are constructed with the 3D object at the sphere center. As shown in fig. 2, 26 virtual cameras are defined at different longitude and latitude positions of the virtual three-dimensional bounding sphere, placed on the surface of the sphere. An arbitrary position on the sphere is selected and a direction is set at random as the initial azimuth, with the selected position lying on the equator of the constructed longitude and latitude. The first virtual camera (camera 1) is then placed at this position, and another 7 virtual cameras (cameras 2 to 8) are fixed uniformly along the equator at every 45 degrees of longitude. At elevation latitudes of 45 degrees and -45 degrees, another 16 virtual cameras (cameras 9 to 16 and cameras 17 to 24) are placed at the same longitude angles as cameras 1 to 8. The last two cameras are placed at the two poles. Further, each camera is rotated 4 times at intervals of 90 degrees to enlarge the training data set. After obtaining the shadow and depth images of the 3D objects, the single-channel images are converted into a three-channel image used as the input of the projection neural network: the positive depth image is set as the first channel, the shadow image as the second channel, and the negative depth image as the third channel.
In other embodiments, the angle at which the virtual cameras are uniformly arranged along the equator with the initial position as the starting point may be other values, such as 30 ° and 60 °, and the like, and the elevation angle of the latitude may also be other angles, and different angles may be selected according to different 3D objects, so that different numbers of virtual cameras may be set according to different 3D objects.
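To make the camera layout concrete, the following Python sketch (using NumPy) computes the 26 camera positions on a unit bounding sphere and stacks the three projection channels; the unit radius, function names, and the channel-stacking helper are illustrative assumptions, not part of the patented method itself.

```python
import numpy as np

def camera_positions(radius=1.0, lon_step_deg=45, lat_deg=45):
    """26 virtual cameras on the bounding sphere: 8 on the equator,
    8 on each of the +45 and -45 degree latitude circles, 1 at each pole.
    Each camera is additionally rolled 4 times by 90 degrees during capture
    to enlarge the training set (not shown here)."""
    positions = []
    lons = np.deg2rad(np.arange(0, 360, lon_step_deg))          # 8 longitudes
    for lat in np.deg2rad([0.0, lat_deg, -lat_deg]):            # equator, +45, -45
        for lon in lons:
            positions.append([radius * np.cos(lat) * np.cos(lon),
                              radius * np.cos(lat) * np.sin(lon),
                              radius * np.sin(lat)])
    positions.append([0.0, 0.0,  radius])                        # north pole
    positions.append([0.0, 0.0, -radius])                        # south pole
    return np.array(positions)                                    # shape (26, 3)

def stack_channels(pos_depth, shadow, neg_depth):
    """Convert the three single-channel renderings into one 3-channel input image:
    channel 1 = positive depth, channel 2 = shadow, channel 3 = negative depth."""
    return np.stack([pos_depth, shadow, neg_depth], axis=-1)

assert camera_positions().shape == (26, 3)
```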
S2, constructing the training probability distribution of the artificially marked 3D object interest points: constructing an interest point training probability distribution P on the surface of each 3D object based on normal probability density;
The manually labeled POIs are usually only a few vertices on the mesh, so if the projection images of those labeled shapes were used directly in the neural network training process, the distributions of the positive and negative samples would be greatly unbalanced. For this reason, a probability distribution of the POIs of the labeled 3D object is constructed before training starts; that is, the POI training probability distribution P constructed on the 3D object surface in step S2 is generated according to the attenuation of the geodesic distance to the nearest POI, and P at any surface vertex v_i of the 3D object is defined as:

P(v_i) = exp(−d(v_i, p_i)^2 / (2σ^2))

where d(v_i, p_i) represents the geodesic distance from vertex v_i to its nearest interest point p_i, and σ is a parameter used to control the decay rate. In the experiments σ is set to one fifth of the maximum geodesic distance. The probability distribution P is then mapped onto all 2D images projected from the 3D object.
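The construction of P can be sketched as follows, assuming a precomputed all-pairs geodesic distance matrix for the mesh; the Gaussian decay mirrors the formula above, and the default σ of one fifth of the maximum geodesic distance follows the experimental setting. Function and parameter names are illustrative.

```python
import numpy as np

def poi_training_distribution(geo_dist, poi_indices, sigma=None):
    """Training probability P for every vertex: Gaussian decay of the geodesic
    distance to the nearest manually labelled interest point.
    geo_dist:    (V, V) matrix of pairwise geodesic distances (precomputed).
    poi_indices: indices of the labelled interest points on the mesh."""
    d_nearest = geo_dist[:, poi_indices].min(axis=1)   # d(v_i, p_i)
    if sigma is None:
        sigma = geo_dist.max() / 5.0                   # 1/5 of max geodesic distance
    return np.exp(-d_nearest ** 2 / (2.0 * sigma ** 2))
```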
S3, training a neural network: training a neural network according to training image data and training probability distribution P of the artificial marker to obtain a neural network model capable of automatically generating 3D object interest point probability distribution;
The convolutional neural network is trained with the prepared data to predict the probability distribution of the POIs of the 3D object on the 2D views; during training, the projection neural network takes the 2D images of the 3D shape as input and outputs the corresponding labeled images. The neural network attempts to learn a mapping from each pixel of the input image (containing shadow and depth information of the 3D shape) to the corresponding pixel of the output 2D image (containing probability distribution information of whether the vertex is a POI).
The neural network training process specifically comprises the following steps:
S31, designing an encoder network consisting of Conv+BN+ReLU layers and pooling layers as the convolutional part of the neural network to serve as a feature extractor for the 2D views, extracting a feature value for each pixel of the 2D image, and forming a decoder network from upsampling layers and Conv+BN+ReLU layers;
S32, the upsampling layers receive the corresponding pooling indices from the pooling layers, each pixel feature value is placed at its position according to the pooling indices, and a dense feature map is generated by the convolution layers;
S33, feeding the feature map to the softmax layer to classify each pixel independently, and adding a weighted pixel classification layer after the softmax layer to achieve class balancing.
It is worth noting that, even though the probability distribution is constructed instead of directly using the (usually binary) projection images of the marked shapes, an unbalanced distribution is still faced in the training samples: in the experiments the positive samples (pixels) account for only 7.5% of the output image, which may negatively affect the prediction performance of the neural network. Therefore, class weighting is used to balance the samples of the different classes, in which the weight w_k of the k-th class of samples is defined in terms of the class size (the explicit formula is given as an image in the original):
where W_k represents the number of samples (pixels) of the k-th class in the images. The weights w_k are then used to establish a weighted pixel classification layer with a cross-entropy loss, where the loss function is defined as:

L = − Σ_m Σ_k w_k · t_km · log(y_km)

where m is a pixel on the output image, t_km is the ground-truth probability that pixel m belongs to class k, and y_km is the predicted probability that pixel m belongs to class k. The last layer of the neural network, the weighted pixel classification layer, is composed of the class weights w_k of each pixel.
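A minimal PyTorch sketch of this encoder/decoder idea is given below; the layer sizes are illustrative and do not reproduce the exact architecture of the invention. It only shows how pooling indices saved by the encoder are reused by the unpooling decoder and how the class weights w_k enter a weighted cross-entropy loss.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MiniEncoderDecoder(nn.Module):
    """Illustrative Conv+BN+ReLU encoder, max-pooling with saved indices,
    and an unpooling decoder that reuses those indices."""
    def __init__(self, in_ch=3, num_classes=2):
        super().__init__()
        self.enc = nn.Sequential(nn.Conv2d(in_ch, 64, 3, padding=1),
                                 nn.BatchNorm2d(64), nn.ReLU(inplace=True))
        self.pool = nn.MaxPool2d(2, stride=2, return_indices=True)
        self.unpool = nn.MaxUnpool2d(2, stride=2)
        self.dec = nn.Sequential(nn.Conv2d(64, 64, 3, padding=1),
                                 nn.BatchNorm2d(64), nn.ReLU(inplace=True),
                                 nn.Conv2d(64, num_classes, 1))

    def forward(self, x):
        f = self.enc(x)
        p, idx = self.pool(f)                          # keep pooling indices
        u = self.unpool(p, idx, output_size=f.size())  # upsample with the indices
        return self.dec(u)                             # per-pixel class scores

def weighted_pixel_loss(logits, target, class_weights):
    """Class-weighted cross-entropy over all pixels; class_weights holds w_k,
    larger for the rare positive (POI) class to balance the samples."""
    return F.cross_entropy(logits, target, weight=class_weights)
```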
S4, obtaining the probability distribution Q of interest points on the surface of the test 3D object: according to the correspondence between the pixels of the 2D images and the vertices of the 3D object recorded during projection, the POI probability distribution obtained on the 2D views is mapped directly back to the surface of the 3D object, giving the three-dimensional interest point probability distribution Q of the 3D object surface. Since one vertex may appear in several projected 2D images, step S4 adopts the following strategy to determine the true probability Q_i of whether a vertex v_i on the 3D object is a POI:
(the explicit formula is given as an image in the original)
where q_ij is the predicted value of the j-th pixel corresponding to vertex v_i; the case in which no pixel corresponds to vertex v_i and the case in which n pixels correspond to it are considered separately.
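A sketch of this back-projection step follows, assuming the per-vertex lists of predicted pixel probabilities have already been gathered from the recorded pixel-to-vertex correspondence; averaging the predictions is an assumed fusion rule, since the exact formula appears only as an image in the original.

```python
import numpy as np

def fuse_vertex_probability(pixel_preds_per_vertex):
    """Back-project 2D predictions to the mesh: Q_i is the mean of the predicted
    probabilities q_ij of all pixels that correspond to vertex v_i, and 0 for
    vertices to which no pixel corresponds.
    pixel_preds_per_vertex: list of lists, one list of q_ij values per vertex."""
    Q = np.zeros(len(pixel_preds_per_vertex))
    for i, preds in enumerate(pixel_preds_per_vertex):
        if len(preds) > 0:                 # n pixels correspond to vertex v_i
            Q[i] = float(np.mean(preds))
    return Q
```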
S5, extracting three-dimensional interest points: extracting specific POIs on the 3D shape from the three-dimensional interest point probability distribution Q of the 3D object surface can be understood as extracting the points with locally highest probability values on the surface of the 3D shape. Specifically, a new density peak clustering algorithm is adopted, which remedies the lack of an automatic peak extraction mechanism in the traditional method and establishes a new way of automatically extracting the required peak points. For each vertex v_i, let ρ_i denote the probability value of v_i. After the two-dimensional decision graph is obtained, the POIs are distributed in the upper right corner while the other points are distributed near the coordinate axes; at this point the interest point extraction problem becomes the simplest binary classification problem in machine learning, so the values of the variable parameters C_1 and C_2 are adjusted according to the results on the training data to obtain a better separation curve curve_3. For different interest point extraction tasks, the position of the curve changes with C_1 and C_2.
The method specifically comprises the following steps:
S51, as shown in FIG. 3, mapping all the vertices v_i of the 3D object onto a two-dimensional decision graph, in which the horizontal and vertical axes represent ρ and δ respectively, and δ_i is defined as the influence radius of each vertex v_i of the 3D object:

δ_i = min_{j: ρ_j > ρ_i} d(v_i, v_j)

where d(v_i, v_j) is the geodesic distance between vertices v_i and v_j; on the decision graph, points in the upper right corner with large δ and large ρ are considered good candidates for POIs;
S52, defining a curve function on the two-dimensional decision graph (the explicit formulas are given as images in the original) for finding the separation curve that distinguishes the three-dimensional interest points from the other ordinary vertices (POIs always appear in the upper right corner, while the other ordinary vertices appear around the two axes), where C_1 and C_2 are variables controlling the up-and-down movement of the curve, thereby defining the separation curve curve_3;
S53, according to the separation curve curve_3 on the two-dimensional decision graph of FIG. 3, determining the three-dimensional interest points: the vertices located to the upper right of curve_3 are regarded as POIs, and the obtained three-dimensional interest points are mapped back to the 3D object to give the final three-dimensional interest points.
On the basis of the original algorithm in the step S51, the density peak value clustering algorithm is improved and used for extracting the interest points.
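The decision-graph step can be sketched as follows, assuming ρ_i is taken to be the predicted probability Q_i and δ_i is the standard density-peak quantity (geodesic distance to the nearest vertex of higher ρ); the hyperbolic separation curve δ = C1/ρ + C2 is only an assumed form, since the patent gives the curve as a formula image and states only that C1 and C2 move it up and down.

```python
import numpy as np

def extract_pois(Q, geo_dist, C1=0.05, C2=0.1):
    """Improved density-peak selection on the (rho, delta) decision graph.
    Q:        per-vertex interest-point probabilities (used as rho).
    geo_dist: (V, V) matrix of pairwise geodesic distances.
    Returns the indices of vertices lying above the separation curve."""
    rho = np.asarray(Q, dtype=float)
    V = len(rho)
    delta = np.empty(V)
    for i in range(V):
        higher = np.where(rho > rho[i])[0]            # vertices with larger rho
        delta[i] = geo_dist[i, higher].min() if higher.size else geo_dist[i].max()
    curve = C1 / np.maximum(rho, 1e-12) + C2          # assumed separation curve
    return np.where(delta > curve)[0]                 # POIs sit above the curve
```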
The embodiments in the present description are described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments are referred to each other. The device disclosed by the embodiment corresponds to the method disclosed by the embodiment, so that the description is simple, and the relevant points can be referred to the method part for description.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (6)

1. A three-dimensional interest point extraction method based on multi-view projection and deep learning is characterized by comprising the following steps:
S1, collecting training image data: projecting the 3D object into a plurality of 2D views in different directions, and recording the corresponding relation between each pixel in the 2D views and each vertex on the surface of the 3D object;
S2, constructing the training probability distribution of the artificially marked 3D object interest points: constructing a training probability distribution P of the artificially marked interest points on the surface of each 3D object based on the normal probability density;
S3, training a neural network: training a neural network according to the training image data and the artificially marked training probability distribution P to obtain a neural network model capable of automatically generating the 3D object interest point probability distribution;
The method comprises the following specific steps:
S31, using an encoder network composed of Conv+BN+ReLU layers and pooling layers as the convolutional part of the neural network to serve as a feature extractor for the 2D views, extracting a feature value for each pixel of the 2D image, and forming a decoder network from upsampling layers and Conv+BN+ReLU layers;
S32, the upsampling layers receive the corresponding pooling indices from the pooling layers, each pixel feature value is placed at its position according to the pooling indices, and a dense feature map is generated by the convolution layers;
S33, feeding the feature map to a softmax layer to classify each pixel independently;
S4, obtaining the probability distribution Q of interest points on the surface of the test 3D object: projecting the 3D model whose interest points are to be extracted to obtain 2D views, inputting the images into the trained neural network model to obtain the probability distribution of the interest points on the 2D views, and then back-projecting the probability distribution to the surface of the 3D object to obtain the three-dimensional interest point probability distribution Q on the surface of the test 3D object;
S5, extracting three-dimensional interest points: extracting the three-dimensional interest points on the 3D object by using an improved density peak clustering algorithm according to the probability distribution Q of step S4;
the improved density peak clustering algorithm specifically comprises the following steps:
S51, mapping all the vertices v_i of the 3D object onto a two-dimensional decision graph, in which the horizontal and vertical axes represent ρ and δ respectively, and δ_i is defined as the influence radius of each vertex v_i of the 3D object:

δ_i = min_{j: ρ_j > ρ_i} d(v_i, v_j)

where d(v_i, v_j) is the geodesic distance between vertices v_i and v_j;
S52, defining a curve function on the two-dimensional decision graph (the explicit formulas are given as images in the original), which serves as the separation curve distinguishing the three-dimensional interest points from the other ordinary vertices, where C_1 and C_2 are variables controlling the up-and-down movement of the curve, thereby defining the separation curve curve_3;
S53, determining the three-dimensional interest points according to the separation curve curve_3 on the two-dimensional decision graph, and mapping the obtained three-dimensional interest points back to the 3D object to obtain the final three-dimensional interest points.
2. The multi-view projection and deep learning three-dimensional interest point extraction method according to claim 1, wherein the step S1 of projecting the 3D object into the 2D views in a plurality of different directions comprises:
S11, constructing a virtual three-dimensional boundary sphere by taking the 3D object as a sphere center;
S12, taking any point on the surface of the virtual three-dimensional boundary sphere as an initial position and selecting any direction to construct longitude and latitude, wherein the initial position point is contained on the equator of the constructed longitude and latitude;
S13, placing a first virtual camera at the initial position of the virtual three-dimensional boundary sphere, and uniformly arranging the virtual cameras along the equator by using the initial position as a starting point and using the same longitude angle;
S14, correspondingly and uniformly arranging virtual cameras on latitude lines with the same latitude angle on two sides of the equator by taking the positions of the virtual cameras arranged at different longitudes on the equator of the virtual three-dimensional boundary sphere as a reference, wherein two virtual cameras are respectively arranged at two polar positions of the virtual three-dimensional boundary sphere;
and S15, shooting a 3D object located at the sphere center position of the virtual three-dimensional boundary sphere through the virtual camera to acquire a 2D image, wherein the 2D image comprises a shadow image and a positive and negative depth image of the 3D object.
3. The multi-view projection and deep learning three-dimensional point of interest extraction method of claim 2, wherein the same longitude angle comprises a 45 ° longitude angle and the same latitude angle comprises a 45 ° latitude angle.
4. The multi-view projection and deep learning three-dimensional interest point extracting method according to claim 2, wherein the virtual camera rotates 4 times at an angle interval of 90 degrees to increase the amount of training data when acquiring the object image.
5. The method for extracting three-dimensional interest points through multi-view projection and deep learning according to claim 1, wherein the manually labeled training probability distribution P of the 3D object interest points constructed in step S2 is generated according to the attenuation of the geodesic distance to the nearest interest point, and P at any vertex v_i is defined as:

P(v_i) = exp(−d(v_i, p_i)^2 / (2σ^2))

where d(v_i, p_i) represents the geodesic distance from vertex v_i to its nearest interest point p_i, and σ is a parameter used to control the decay rate.
6. The multi-view projection and deep learning three-dimensional interest point extraction method according to claim 1, wherein the probability distribution Q of the interest points on the surface of the test 3D object obtained in step S4 adopts the following strategy (the explicit formula is given as an image in the original): Q_i, the true probability of whether the surface vertex v_i of the 3D object is an interest point, is computed from the predicted values q_ij of the pixels corresponding to vertex v_i, where Q_i is 0 when no pixel corresponds to the vertex and is determined from the n predicted values when n pixels correspond to the vertex.
CN202110359551.7A 2021-04-02 2021-04-02 Three-dimensional interest point extraction method based on multi-view projection and deep learning Active CN113052110B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110359551.7A CN113052110B (en) 2021-04-02 2021-04-02 Three-dimensional interest point extraction method based on multi-view projection and deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110359551.7A CN113052110B (en) 2021-04-02 2021-04-02 Three-dimensional interest point extraction method based on multi-view projection and deep learning

Publications (2)

Publication Number Publication Date
CN113052110A CN113052110A (en) 2021-06-29
CN113052110B true CN113052110B (en) 2022-07-29

Family

ID=76517480

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110359551.7A Active CN113052110B (en) 2021-04-02 2021-04-02 Three-dimensional interest point extraction method based on multi-view projection and deep learning

Country Status (1)

Country Link
CN (1) CN113052110B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972958B (en) * 2022-07-27 2022-10-04 北京百度网讯科技有限公司 Key point detection method, neural network training method, device and equipment
CN117670911B (en) * 2023-11-23 2024-06-14 中航通飞华南飞机工业有限公司 Quantitative description method of sand paper ice

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108257139A (en) * 2018-02-26 2018-07-06 中国科学院大学 RGB-D three-dimension object detection methods based on deep learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070052700A1 (en) * 2005-09-07 2007-03-08 Wheeler Frederick W System and method for 3D CAD using projection images
CN105005755B (en) * 2014-04-25 2019-03-29 北京邮电大学 Three-dimensional face identification method and system
US10417533B2 (en) * 2016-08-09 2019-09-17 Cognex Corporation Selection of balanced-probe sites for 3-D alignment algorithms
CN110334704B (en) * 2019-06-21 2022-10-21 浙江大学宁波理工学院 Three-dimensional model interest point extraction method and system based on layered learning

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108257139A (en) * 2018-02-26 2018-07-06 中国科学院大学 RGB-D three-dimension object detection methods based on deep learning

Also Published As

Publication number Publication date
CN113052110A (en) 2021-06-29

Similar Documents

Publication Publication Date Title
CN108648161B (en) Binocular vision obstacle detection system and method of asymmetric kernel convolution neural network
CN104484648B (en) Robot variable visual angle obstacle detection method based on outline identification
CN109559310B (en) Power transmission and transformation inspection image quality evaluation method and system based on significance detection
CN104063702B (en) Three-dimensional gait recognition based on shielding recovery and partial similarity matching
CN105279372B (en) A kind of method and apparatus of determining depth of building
CN106909902B (en) Remote sensing target detection method based on improved hierarchical significant model
CN104850850B (en) A kind of binocular stereo vision image characteristic extracting method of combination shape and color
CN104134200B (en) Mobile scene image splicing method based on improved weighted fusion
CN108596108B (en) Aerial remote sensing image change detection method based on triple semantic relation learning
CN111899172A (en) Vehicle target detection method oriented to remote sensing application scene
CN114758252B (en) Image-based distributed photovoltaic roof resource segmentation and extraction method and system
CN108960404B (en) Image-based crowd counting method and device
CN106778659B (en) License plate recognition method and device
CN105678806B (en) A kind of live pig action trail automatic tracking method differentiated based on Fisher
CN113052110B (en) Three-dimensional interest point extraction method based on multi-view projection and deep learning
CN112801074B (en) Depth map estimation method based on traffic camera
CN105869178A (en) Method for unsupervised segmentation of complex targets from dynamic scene based on multi-scale combination feature convex optimization
CN103632167B (en) Monocular vision space recognition method under class ground gravitational field environment
CN113160062B (en) Infrared image target detection method, device, equipment and storage medium
CN109376641B (en) Moving vehicle detection method based on unmanned aerial vehicle aerial video
CN112950780B (en) Intelligent network map generation method and system based on remote sensing image
CN110263716B (en) Remote sensing image super-resolution land cover mapping method based on street view image
CN107944437B (en) A kind of Face detection method based on neural network and integral image
CN114943893B (en) Feature enhancement method for land coverage classification
CN113901874A (en) Tea tender shoot identification and picking point positioning method based on improved R3Det rotating target detection algorithm

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant