CN113191361A - Shape recognition method - Google Patents
Shape recognition method
- Publication number
- CN113191361A CN113191361A CN202110418108.2A CN202110418108A CN113191361A CN 113191361 A CN113191361 A CN 113191361A CN 202110418108 A CN202110418108 A CN 202110418108A CN 113191361 A CN113191361 A CN 113191361A
- Authority
- CN
- China
- Prior art keywords
- shape
- segmentation
- layer
- points
- sub
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/44—Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Molecular Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Health & Medical Sciences (AREA)
- Evolutionary Biology (AREA)
- Multimedia (AREA)
- Image Analysis (AREA)
Abstract
The invention provides a shape recognition method, which comprises the steps of extracting contour key points of a shape sample; defining an approximate bias curvature value at each key point and judging the concavity and convexity at the key points to obtain candidate segmentation points; adjusting a curvature screening threshold to obtain shape segmentation points; performing shape segmentation at the minimum segmentation cost to obtain a plurality of sub-shape parts; constructing a topological structure of the shape sample; obtaining a feature expression image of each sub-shape part by using a full-scale visual representation of the shape; inputting each feature expression image into a convolutional neural network for training, and learning a feature vector for each sub-shape part; constructing a feature matrix of the shape sample; constructing a graph convolutional neural network; and training the graph convolutional neural network, obtaining the feature matrix and adjacency matrix of a test sample, and inputting them into the trained graph convolution network model to realize shape classification and recognition.
Description
Technical Field
The invention relates to a shape recognition method, and belongs to the technical field of shape recognition.
Background
Contour shape recognition is an important research direction in the field of machine vision. Object recognition using object shape features is a main research subject of machine vision, and the main line of this research is to fully extract object shape features for better similarity measurement, by improving shape matching algorithms or designing effective shape descriptors. It is widely applied in engineering across many fields, such as radar, infrared imaging detection, image and video matching and retrieval, automatic robot navigation, scene semantic segmentation, texture recognition, and data mining.
In general, the expression and retrieval of contour shapes is based on hand-designed shape descriptors that extract target contour features, such as Shape Contexts, Shape Vocabulary, and Bag of Contour Fragments. However, the shape information extracted by manual descriptors is usually incomplete, and such descriptors cannot be guaranteed to be robust to local changes, occlusion, overall deformation and other variations of the target shape. Moreover, designing too many descriptors leads to redundant feature extraction and higher computational complexity, so recognition accuracy and efficiency are low. In recent years, convolutional neural networks have been applied to shape recognition tasks, as they have achieved strong performance in image recognition tasks. However, because a contour shape lacks surface texture, color and similar information, directly applying a convolutional neural network yields a poor recognition effect.
In view of the above problems of shape recognition algorithms, how to provide a target recognition method capable of accurately classifying target contour shapes is a problem to be solved by those skilled in the art.
Disclosure of Invention
The invention is provided to solve the above problems in the prior art; the technical solution is as follows.
a method of shape recognition, the method comprising the steps of:
firstly, extracting outline key points of a shape sample;
step two, defining approximate bias curvature values at each key point and judging the curve concavity and convexity at the key points to obtain candidate shape segmentation points;
step three, adjusting a curvature screening threshold value to obtain shape division points;
fourthly, segmenting the shape based on the principle that the segmentation line segments are positioned in the shape and do not intersect with each other, and segmenting at the minimum segmentation cost to obtain a plurality of sub-shape parts;
step five, constructing a topological structure of the shape sample;
step six, obtaining a feature expression image of a corresponding sub-shape part by using a full-scale visual representation method of the shape;
inputting each feature expression image into a convolutional neural network for training, and learning to obtain a feature vector of each sub-shape part;
step eight, constructing a feature matrix of the shape sample;
constructing a graph convolution neural network;
step ten, training a graph convolution neural network, carrying out shape segmentation on the test sample, obtaining the feature vector of each sub-shape part, calculating the feature matrix and the adjacency matrix of the test sample, and inputting the feature matrix and the adjacency matrix into the trained graph convolution network model to realize shape classification and recognition.
Preferably, in the first step, the method for extracting the key points of the contour includes:
the contour of each shape sample is composed of a series of sampling points, and for any shape sample S, sampling n points on the contour results in:
S = {(p_x(i), p_y(i)) | i ∈ [1, n]},
wherein p_x(i), p_y(i) are the horizontal and vertical coordinates of the contour sampling point p(i) in the two-dimensional plane, and n is the contour length, i.e., the number of contour sampling points;
evolving the contour curve of the shape sample to extract key points, and deleting the point with the minimum contribution to target recognition in each evolution process, wherein the contribution of each point p(i) is defined as:
Con(i) = H_1(i) · H(i, i-1) · H(i, i+1) / (H(i, i-1) + H(i, i+1)),
wherein H(i, i-1) is the length of the curve between points p(i) and p(i-1), H(i, i+1) is the length of the curve between points p(i) and p(i+1), H_1(i) is the angle between segment p(i)p(i-1) and segment p(i)p(i+1), and the lengths H are normalized by the contour perimeter; a larger value of Con(i) indicates a larger contribution of the point p(i) to the shape feature;
the method introduces an adaptive termination function F (t) based on a region to overcome the problem of excessive or insufficient extraction of the key points of the contour:
wherein S0Is the area of the original shape, SiFor the shape area after i evolutions, n0The total number of points on the outline of the original shape; when the end function value F (t) exceeds the set threshold, the extraction of the contour key points is ended and n is obtained*And (4) contour key points.
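As a rough illustration, the contour-evolution key point extraction described above can be sketched in NumPy. The contribution measure follows the definition of Con(i) (angle at p(i) weighted by the normalized adjacent curve lengths); a fixed point-count stopping rule stands in for the adaptive termination function F(t), and all function names are illustrative:

```python
import numpy as np

def contribution(pts, i):
    """Con(i): relevance of contour point i, as in discrete curve evolution.

    Turn angle at p(i) weighted by the perimeter-normalized lengths of the
    two adjacent segments; collinear points get contribution 0.
    """
    n = len(pts)
    a = pts[i] - pts[(i - 1) % n]
    b = pts[(i + 1) % n] - pts[i]
    perim = np.sum(np.linalg.norm(np.roll(pts, -1, axis=0) - pts, axis=1))
    la = np.linalg.norm(a) / perim          # normalized by contour perimeter
    lb = np.linalg.norm(b) / perim
    cos_ang = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    ang = np.arccos(np.clip(cos_ang, -1.0, 1.0))
    return ang * la * lb / (la + lb + 1e-12)

def evolve(pts, keep):
    """Repeatedly delete the minimum-contribution point until `keep` remain."""
    pts = np.asarray(pts, dtype=float)
    while len(pts) > keep:
        cons = [contribution(pts, i) for i in range(len(pts))]
        pts = np.delete(pts, int(np.argmin(cons)), axis=0)
    return pts
```

A collinear midpoint on a polygon edge is deleted first, since its turn angle (and hence contribution) is zero.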
Further, in the second step, defining an approximate bias curvature value at each key point and judging the concavity and convexity of the curve at the key point to obtain candidate segmentation points comprises:
in order to calculate an approximate bias curvature value for any key point p(i) in the shape sample S, take the adjacent contour points p(i-ε) and p(i+ε) before and after p(i), where ε is an empirical value; because
cos H_ε(i) ∝ cur(p(i)),
where H_ε(i) is the angle between segment p(i)p(i-ε) and segment p(i)p(i+ε) and cur(p(i)) is the curvature at point p(i), the approximate bias curvature value cur~(p(i)) at point p(i) is defined as:
cur~(p(i)) = cos H_ε(i) + 1,
wherein cos H_ε(i) ranges from -1 to 1, so cur~(p(i)) ranges from 0 to 2;
according to a shape segmentation method conforming to visual naturalness, shape segmentation points are all located at concave parts of the contour curve; therefore, when screening candidate segmentation points for shape segmentation, a method for judging the concavity and convexity of the curve at a key point p(i) is defined:
for the binarized shape image, the pixel values inside the contour of shape sample S are all 255 and the pixel values outside the contour are all 0; sample the line segment p(i-ε)p(i+ε) at equal intervals to obtain R discrete points; if the pixel values at the R discrete points are all 255, the segment p(i-ε)p(i+ε) lies entirely inside the shape contour, indicating that the curve at p(i) is convex; if the pixel values at the R discrete points are all 0, the segment p(i-ε)p(i+ε) lies entirely outside the shape contour, indicating that the curve at p(i) is concave; record the key points p(i) at which the curve is concave as the candidate segmentation points P(j).
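The approximate bias curvature and the chord-based concavity test might be implemented as follows; `mask` is assumed to be the binarized shape image with interior pixels 255, and the names are illustrative:

```python
import numpy as np

def approx_bias_curvature(pts, i, eps):
    """cur~(p(i)) = cos H_eps(i) + 1, valued in [0, 2]."""
    n = len(pts)
    a = pts[(i - eps) % n] - pts[i]
    b = pts[(i + eps) % n] - pts[i]
    cos_h = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return cos_h + 1.0

def is_concave(mask, p_prev, p_next, R=10):
    """Sample the chord p(i-eps)p(i+eps) at R points; the curve at p(i) is
    deemed concave when the whole chord lies outside the shape (mask == 0)."""
    ts = np.linspace(0.0, 1.0, R)
    samples = [(1 - t) * np.asarray(p_prev, float) + t * np.asarray(p_next, float)
               for t in ts]
    vals = [mask[int(round(y)), int(round(x))] for x, y in samples]
    return all(v == 0 for v in vals)
```

On a straight contour run the angle H_ε(i) is π, so cur~ is 0; sharper concavities give larger values.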
Further, in the third step, the step of adjusting the curvature screening threshold Th and obtaining the shape segmentation point is as follows:
(1) for all candidate segmentation points P(j) obtained in step two, take the average approximate bias curvature value of the candidate segmentation points P(j) as the initial threshold Th_0:
Th_0 = (1/J) Σ_{j=1..J} cur~(P(j)),
wherein J is the total number of candidate segmentation points;
(2) for the threshold Th_τ at the τ-th adjustment, the candidate segmentation points can be divided into two classes according to the relation between each approximate bias curvature value and Th_τ: candidate segmentation points whose approximate bias curvature value is greater than Th_τ, and candidate segmentation points whose approximate bias curvature value is less than or equal to Th_τ; calculate and record the segmentation division degree D_τ under the current threshold:
D_τ = min_j d+_τ(j) - max_j d-_τ(j),
wherein d+_τ(j) and d-_τ(j) respectively denote the positive and negative curvature deviations of the candidate segmentation points P(j) under threshold Th_τ, min_j d+_τ(j) is the minimum of the positive curvature deviations of all candidate segmentation points, and max_j d-_τ(j) is the maximum of the negative curvature deviations of all candidate segmentation points;
judge whether there exists a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th_τ; if no such point exists, stop adjusting and go to step (4); if such a candidate segmentation point exists, go to step (3) to continue adjusting the threshold;
(3) continue adjusting the threshold; the new threshold Th_{τ+1} is the minimum of the positive curvature deviations of all candidate segmentation points in the previous threshold adjustment, expressed by the formula:
Th_{τ+1} = min_j d+_τ(j);
according to the threshold Th_{τ+1}, calculate the positive and negative curvature deviations d+_{τ+1}(j), d-_{τ+1}(j) of each candidate segmentation point under the (τ+1)-th adjustment together with the division degree D_{τ+1}, and record them; judge whether there exists a candidate segmentation point whose approximate bias curvature value is greater than the threshold Th_{τ+1}; if no such point exists, stop adjusting and go to step (4); if such a candidate segmentation point exists, let τ = τ+1, repeat the current step and continue adjusting the threshold;
(4) the multiple threshold adjustments yield multiple division degrees; the threshold corresponding to the maximum division degree is the final curvature screening threshold Th, and the points whose approximate bias curvature values are smaller than the threshold Th are the final shape segmentation points.
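One possible reading of the threshold adjustment loop in steps (1) to (4), treating the "positive/negative curvature deviations" as the curvature values above/below the current threshold, is the sketch below; the function name is illustrative:

```python
import numpy as np

def screen_threshold(curvs):
    """Iteratively raise the curvature screening threshold.

    curvs: approximate bias curvature values of the candidate segmentation
    points. Returns the threshold whose division degree D is largest.
    """
    th = float(np.mean(curvs))               # Th_0: mean curvature value
    history = []                             # (D_tau, Th_tau) pairs
    while True:
        above = [c for c in curvs if c > th]
        below = [c for c in curvs if c <= th]
        if not above:                        # no point exceeds Th: stop
            break
        d = min(above) - max(below) if below else 0.0   # division degree
        history.append((d, th))
        th = min(above)                      # Th_{tau+1}: smallest value above
    if not history:
        return th
    return max(history)[1]                   # threshold with maximal D
```

Points whose curvature value falls below the returned threshold would then be kept as the final shape segmentation points.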
Further, in the fourth step, the specific method of segmenting the shape based on the principle that the segmentation line segments are located in the shape and do not intersect with each other to obtain a plurality of sub-shape portions by segmentation with the minimum segmentation cost is as follows:
(1) for any two shape segmentation points P(e_1), P(e_2), sample the segmentation line segment P(e_1)P(e_2) at equal intervals to obtain C discrete points; if any of the C discrete points has a pixel value of 0, part of the line segment P(e_1)P(e_2) lies outside the shape contour, and it is not selected as a segmentation line segment;
(2) for any two shape segmentation points P(e_3), P(e_4), if there is an existing segmentation line segment P(e_5)P(e_6) such that the line segment P(e_3)P(e_4) intersects the existing segment P(e_5)P(e_6), the line segment P(e_3)P(e_4) is not selected as a segmentation line segment;
(3) the set of segmentation line segments satisfying the above two principles is further screened, and three measurement indexes for evaluating the quality of a segmentation line segment are defined to realize segmentation at the minimum segmentation cost, wherein D*(u,v), L*(u,v), S*(u,v) are the three normalized segmentation measurement indexes of segmentation length, segmentation arc length and segmentation residual area, and u and v are the serial numbers of any two shape segmentation points;
for any shape segmentation line segment P(u)P(v), the three segmentation evaluation indexes are calculated as follows:
D*(u,v) = |P(u)P(v)| / D_max,
wherein D_max is the length of the longest segment among all segmentation line segments; D*(u,v) should range between 0 and 1, and a smaller value indicates a more significant segmentation effect;
L*(u,v) is computed from the length of the contour curve between the two points P(u) and P(v); L*(u,v) should range between 0 and 1, and a smaller value indicates a more significant segmentation effect;
S*(u,v) is computed from S_d, the shape area divided off by the line segment P(u)P(v), i.e., the area of the enclosed region formed by the line segment P(u)P(v) and the contour curve; S*(u,v) should range between 0 and 1, and a smaller value indicates a more significant segmentation effect;
calculate the segmentation cost Cost of the segmentation line segment P(u)P(v) from the above indexes:
Cost = αD*(u,v) + βL*(u,v) + γS*(u,v),
wherein α, β and γ are the weights of the measurement indexes;
calculate the segmentation cost of each segmentation line segment in the screened set; sort all the calculated costs from small to large, and finally, according to the number N of sub-shape parts set for the category of shape sample S, select the N-1 segmentation line segments with the smallest cost, realizing optimal segmentation and obtaining N sub-shape parts; the number N of divided sub-shape parts depends on the category to which the current shape sample S belongs, and the corresponding number of sub-shape parts is manually set for shapes of different categories.
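The cost-based selection of segmentation line segments could be sketched as follows; equal weights α = β = γ are assumed for illustration, and the function names are ours:

```python
import numpy as np

def seg_cost(d_star, l_star, s_star, alpha=1/3, beta=1/3, gamma=1/3):
    """Cost = alpha*D* + beta*L* + gamma*S* for one candidate segment.
    The patent leaves alpha, beta, gamma as tunable weights."""
    return alpha * d_star + beta * l_star + gamma * s_star

def select_segments(candidates, costs, n_parts):
    """Pick the N-1 cheapest segments.  `candidates` is a list of (u, v)
    index pairs that already satisfy the inside-the-shape and
    non-crossing principles; `costs` holds their segmentation costs."""
    order = np.argsort(costs)                 # ascending cost
    return [candidates[j] for j in order[: n_parts - 1]]
```

Cutting with the N-1 selected segments then yields the N sub-shape parts.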
Further, in step five, the specific method of constructing the topological structure of the shape sample is as follows: for the N sub-shape parts obtained by dividing any shape sample S, record the central shape part as the start vertex v_1, and sort the other adjacent shape parts in clockwise order as vertices {v_o | o ∈ [2, N]}; record the edge connecting v_1 to each remaining vertex v_o as (v_1, v_o), thereby forming a shape directed graph satisfying the topological order:
G_1 = (V_1, E_1),
wherein V_1 = {v_o | o ∈ [1, N]} and E_1 = {(v_1, v_o) | o ∈ [2, N]};
after all the training shape samples are optimally segmented, record the maximum number of sub-shape parts obtained by segmenting any training shape sample as N_max; for any shape sample S, its N_max × N_max adjacency matrix A is calculated by setting the entry corresponding to each directed edge (v_1, v_o) to 1 and all other entries to 0, with the rows and columns beyond the N actual vertices zero-padded.
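Under one reading of the construction above, the padded adjacency matrix of the star-shaped directed graph can be built as below; the zero-padding convention and function name are our assumptions:

```python
import numpy as np

def star_adjacency(n_parts, n_max):
    """Adjacency matrix of G1 = (V1, E1): directed edges from the central
    start vertex v1 to each of the other sub-shape vertices, zero-padded
    to the maximum part count n_max over the training set."""
    A = np.zeros((n_max, n_max), dtype=int)
    A[0, 1:n_parts] = 1          # edges (v1, vo) for o = 2..N
    return A
```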
further, in the sixth step, a specific method for obtaining the color feature expression image corresponding to the sub-shape part by using a full-scale visual representation method of the shape is as follows:
for a sub-shape part S_1 of any shape sample S:
S_1 = {(p¹_x(i), p¹_y(i)) | i ∈ [1, n_1]},
wherein p¹_x(i), p¹_y(i) are the abscissa and ordinate of the contour sampling point p_1(i) of the sub-shape part in the two-dimensional plane, and n_1 is the contour length, i.e., the number of contour sampling points;
the profile of the sub-shape portion S1 is first described using a feature function M composed of three shape descriptors:
M={sk(i),lk(i),ck(i)|k∈[1,m],i∈[1,n1]},
wherein s isk,lk,ckThree invariant parameters of normalized area s, arc length l and gravity center distance c in a scale k, wherein k is a scale label, and m is a total scale degree; these three shape invariant descriptors are defined separately:
taking the contour sampling point p_1(i) as the center, draw a preset circle C_1(i) with the initial radius r_1; the preset circle is the initial semi-global scale for calculating the parameters of the corresponding contour point; after obtaining the preset circle C_1(i) as above, the three shape descriptors at scale k = 1 are calculated as follows:
when calculating the s_1(i) descriptor, denote by Z_1(i) the region inside the preset circle C_1(i) that has a direct connection relationship with the target contour point p_1(i), where B(Z_1(i), z) is an indicator function marking whether a pixel z belongs to Z_1(i); the ratio of the area of Z_1(i) to the area of the preset circle C_1(i) is taken as the area parameter s_1(i) of the descriptor of the target contour point:
s_1(i) = Area(Z_1(i)) / Area(C_1(i)),
and s_1(i) should range between 0 and 1;
when calculating the c_1(i) descriptor, first calculate the center of gravity of the region having a direct connection relationship with the target contour point p_1(i), specifically the average of the coordinate values of all pixel points in the region, the result being the coordinate of the region's center of gravity, denoted w_1(i); then calculate the distance d_1(i) between the target contour point p_1(i) and the center of gravity w_1(i):
d_1(i) = ||p_1(i) - w_1(i)||;
finally, take the ratio of d_1(i) to the radius of the preset circle C_1(i) of the target contour point p_1(i) as the center-of-gravity distance parameter c_1(i) of the descriptor:
c_1(i) = d_1(i) / r_1,
wherein c_1(i) should range between 0 and 1;
when calculating the l_1(i) descriptor, record the length of the arc segment inside the preset circle C_1(i) that has a direct connection relationship with the target contour point p_1(i), and take the ratio of this arc length to the circumference of the preset circle C_1(i) as the arc length parameter l_1(i) of the descriptor of the target contour point, wherein l_1(i) should range between 0 and 1;
calculating according to the above steps gives the feature function M_1 of the sub-shape part S_1 of shape sample S at the semi-global scale with scale label k = 1 and initial radius r_1:
M_1 = {s_1(i), l_1(i), c_1(i) | i ∈ [1, n_1]};
because a digital image takes one pixel as its minimum unit, a single pixel is selected as the continuous scale-change interval over the full-scale space; that is, for the k-th scale label, the radius r_k of circle C_k is set to
r_k = r_1 - (k - 1),
in pixels; when the initial scale is k = 1 the radius is r_1, after which the radius r_k is reduced uniformly, one pixel at a time, m - 1 times until the minimum scale k = m; following the method of calculating the feature function M_1 at scale k = 1, calculate the feature functions at the other scales, finally obtaining the feature function of the sub-shape part S_1 of shape sample S over the full scale:
M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, m], i ∈ [1, n_1]};
store the feature functions at all scales into matrices S_M, L_M, C_M respectively: S_M stores s_k(i), its k-th row and i-th column storing the area parameter s_k(i) of point p_1(i) at the k-th scale; L_M stores l_k(i), its k-th row and i-th column storing the arc length parameter l_k(i) of point p_1(i) at the k-th scale; C_M stores c_k(i), its k-th row and i-th column storing the center-of-gravity distance parameter c_k(i) of point p_1(i) at the k-th scale; S_M, L_M, C_M finally serve as the grayscale-map representation of the three shape features of the sub-shape part S_1 of shape sample S in the full-scale space:
GM_1 = {S_M, L_M, C_M},
wherein S_M, L_M, C_M are all matrices of size m × n_1, each representing a grayscale image; then, for the sub-shape part S_1, the three grayscale images are used as the three RGB channels to obtain a color image as the feature expression image T_1 of the sub-shape part S_1.
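Assembling the three grayscale matrices into the color feature expression image might look like this; the descriptor values are assumed already normalized to [0, 1], and the function name is illustrative:

```python
import numpy as np

def feature_expression_image(S_M, L_M, C_M):
    """Stack the three m x n1 grayscale feature matrices (area, arc length,
    center-of-gravity distance) as the R, G, B channels of the color
    feature expression image."""
    img = np.stack([S_M, L_M, C_M], axis=-1)      # shape (m, n1, 3)
    return (img * 255).astype(np.uint8)
```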
Further, in step seven, the feature expression image samples of the sub-shape parts of all training shape samples are input into a convolutional neural network, and the convolutional neural network model is trained; different sub-shape parts of each type of shape have different class labels; after the convolutional neural network is trained to convergence, for any shape sample S, the feature expression images {T_num | num ∈ [1, N]} corresponding to the N sub-shape parts formed by dividing the shape sample S are respectively input into the trained convolutional neural network, and the feature vector of the corresponding sub-shape part is output from the second fully-connected layer of the network, wherein Vec is the number of neurons in the second fully-connected layer;
the structure of the convolutional neural network comprises an input layer, a pre-training layer and fully-connected layers; the pre-training layer consists of the first 4 modules of the VGG16 network model, the parameters obtained by training these 4 modules on the ImageNet data set are used as initialization parameters, and three fully-connected layers follow the pre-training layer;
the 1st module in the pre-training layer comprises 2 convolutional layers and 1 max-pooling layer, with 64 convolution kernels of size 3 × 3 per convolutional layer and a pooling size of 2 × 2; the 2nd module comprises 2 convolutional layers and 1 max-pooling layer, with 128 convolution kernels of size 3 × 3 and a pooling size of 2 × 2; the 3rd module comprises 3 convolutional layers and 1 max-pooling layer, with 256 convolution kernels of size 3 × 3 and a pooling size of 2 × 2; the 4th module comprises 3 convolutional layers and 1 max-pooling layer, with 512 convolution kernels of size 3 × 3 and a pooling size of 2 × 2; the calculation formula of each convolutional layer is:
C_O = φ_relu(W_C · C_I + θ_C),
wherein θ_C is the bias vector of the convolutional layer, W_C is the weight of the convolutional layer, C_I is the input of the convolutional layer, and C_O is the output of the convolutional layer;
the fully-connected module comprises 3 fully-connected layers: the 1st fully-connected layer comprises 512 nodes, the 2nd comprises Vec nodes, and the 3rd comprises N_T nodes, where N_T is the sum of the numbers of split sub-shape parts over all shape classes; the calculation formula of the first 2 fully-connected layers is:
F_O = φ_tanh(W_F · F_I + θ_F),
wherein φ_tanh is the tanh activation function, θ_F is the bias vector of the fully-connected layer, W_F is the weight of the fully-connected layer, F_I is the input of the fully-connected layer, and F_O is the output of the fully-connected layer;
the last fully-connected layer is the output layer, whose output is calculated as:
Y_O = φ_softmax(W_Y · Y_I + θ_Y),
wherein φ_softmax is the softmax activation function, θ_Y is the bias vector of the output layer, each neuron of the output layer represents a corresponding sub-shape part class, W_Y is the weight of the output layer, Y_I is the input of the output layer, and Y_O is the output of the output layer.
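The fully-connected and output-layer formulas above can be written directly in NumPy; this is a minimal sketch of the layer equations, not the trained VGG16-based network itself:

```python
import numpy as np

def fc_tanh(W, x, theta):
    """F_O = tanh(W_F . F_I + theta_F): the first two fully-connected layers."""
    return np.tanh(W @ x + theta)

def out_softmax(W, x, theta):
    """Y_O = softmax(W_Y . Y_I + theta_Y): the classification output layer."""
    z = W @ x + theta
    z = z - z.max()                  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()
```

In the full network, the second `fc_tanh` layer's Vec-dimensional output is what serves as the sub-shape part's feature vector.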
Further, the specific method for constructing the feature matrix of the shape sample in the step eight is as follows:
for any shape sample S, the N sub-shape parts formed by dividing sample S are expressed by the corresponding feature matrix F, calculated as:
F_a = f_a for a ∈ [1, N], and F_a = 0_Vec for a ∈ (N, N_max],
wherein F_a is the a-th row vector of the matrix F, f_a is the feature vector of the a-th sub-shape part output in step seven, and 0_Vec is a zero vector of dimension Vec.
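A sketch of building the zero-padded feature matrix from the per-part CNN feature vectors; `n_max` stands for the maximum part count over the training set, and the names are ours:

```python
import numpy as np

def feature_matrix(part_vecs, n_max):
    """Stack the feature vectors of the N sub-shape parts row-wise and
    zero-pad the remaining n_max - N rows (one Vec-dimensional row each)."""
    vec = len(part_vecs[0])
    F = np.zeros((n_max, vec))
    F[: len(part_vecs)] = np.asarray(part_vecs, dtype=float)
    return F
```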
Further, in step nine, the structure of the graph convolutional neural network is constructed, comprising a preprocessing input layer, a hidden layer and a classification output layer; the preprocessing input layer applies normalization preprocessing to the adjacency matrix A, specifically:
Â = D̃^(-1/2) (A + I_N) D̃^(-1/2),
wherein I_N is an identity matrix, D̃ is the degree matrix of A + I_N, and Â is the adjacency matrix after normalization preprocessing;
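The preprocessing step is the standard symmetric normalization used by graph convolutional networks; a minimal NumPy sketch of Â = D̃^(-1/2)(A + I)D̃^(-1/2):

```python
import numpy as np

def normalize_adjacency(A):
    """Add self-loops via the identity, then apply symmetric degree
    normalization: A_hat = D~^(-1/2) (A + I) D~^(-1/2)."""
    A_tilde = A + np.eye(A.shape[0])
    deg = A_tilde.sum(axis=1)                   # degree matrix diagonal
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ A_tilde @ d_inv_sqrt
```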
The hidden layer comprises 2 graph convolution layers, and the calculation formula of each graph convolution layer is as follows:
wherein the content of the first and second substances,is the weight of the graph convolution layer; hIIs the input of the graph convolution layer, the input of the 1 st convolution layer being the feature matrix of the shape sampleHOIs the output of the graph convolution layer;
the calculation formula of the classification output layer is:
G_O = φ_softmax(Â · G_I · G_W),
wherein φ_softmax is the softmax activation function, G_I is the input of the output layer, i.e., the output of the second graph convolution layer, G_W is the weight of the output layer, and G_O is the output of the output layer; each neuron of the output layer represents a corresponding shape class.
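Putting the graph convolution layers and the softmax output together, a toy forward pass might read as follows; the ReLU activations, the weight shapes, and the flattening before the output layer are our assumptions:

```python
import numpy as np

def gcn_forward(A_hat, F, W1, W2, W_out):
    """Two graph convolution layers followed by a softmax classifier.

    A_hat: normalized adjacency matrix; F: node feature matrix;
    W1, W2: graph-convolution weights; W_out: output-layer weight.
    """
    relu = lambda x: np.maximum(x, 0.0)
    H1 = relu(A_hat @ F @ W1)            # 1st graph convolution layer
    H2 = relu(A_hat @ H1 @ W2)           # 2nd graph convolution layer
    z = H2.reshape(-1) @ W_out           # flatten node features for output
    e = np.exp(z - z.max())              # softmax over shape classes
    return e / e.sum()
```

The index of the maximum entry of the returned vector would be taken as the predicted shape class.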
Further, the specific method of realizing contour shape classification and recognition in step ten is as follows: train the graph convolutional neural network model until convergence; for any test shape sample, first extract the shape contour key points, calculate the curvature values of the key points and judge their concavity and convexity to obtain candidate segmentation points, then adjust the curvature screening threshold to obtain the shape segmentation points; obtain the set of segmentation line segments according to principles (1) and (2) of step four, and calculate the segmentation cost of each segmentation line segment in the set; if the number of segmentation line segments is less than N - 1, all the segmentation line segments are used to segment the shape; otherwise, the N - 1 segmentation line segments with the minimum segmentation cost are used to segment the shape; calculate the color feature expression image of each sub-shape part and input it into the trained convolutional neural network, taking the output of the second fully-connected layer of the convolutional neural network as the feature vector of the sub-shape part; construct the shape directed graph of the test shape sample, calculate its adjacency matrix and feature matrix, input them into the trained graph convolutional neural network model, and judge the shape class corresponding to the maximum value in the output vector as the shape class of the test sample, realizing shape classification and recognition.
The invention provides a new shape recognition method and designs a new shape classification scheme using a graph convolutional neural network. The proposed topological-graph expression of shape features is a directed graph structure constructed based on shape segmentation, which not only distinguishes shape hierarchies but also makes full use of the stable topological feature relationships among the hierarchical parts of a shape in place of geometric position relationships. Compared with the methods in the background art, which only calculate and compare corresponding salient-point features for matching, the method is more robust to interferences such as shape articulation, partial occlusion and rigid-body transformation. The full-scale visual representation of the shape can comprehensively express all the information of each sub-shape part, after which the features of each part in the full-scale space are extracted by the successive convolution calculations of the neural network. Compared with directly applying a convolutional neural network, the designed graph convolutional neural network greatly reduces the training parameters and has higher computational efficiency.
Drawings
FIG. 1 is a flow chart of the operation of a shape recognition method of the present invention.
FIG. 2 is a partial sample schematic of a target shape in a shape sample set.
Fig. 3 is a schematic diagram of the segmentation of a shape sample.
Fig. 4 is a schematic diagram of a full scale space.
Fig. 5 is a schematic diagram of the target shape after being cut by a preset scale.
Fig. 6 is a schematic diagram of the target shape after being segmented by the preset scale.
FIG. 7 is a schematic diagram of a feature function of a sub-shape portion of a target shape at a single scale.
FIG. 8 is a schematic diagram of a feature matrix of a sub-shape portion of an object shape in full scale space.
Fig. 9 is a schematic diagram of three gray-scale images calculated from a sub-shape portion of the target shape and a synthesized color image.
FIG. 10 is a diagram of a convolutional neural network structure for training each of the sub-shape portion feature representation images.
Fig. 11 is a characteristic configuration diagram of each sub-shape portion of the target shape.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in fig. 1, a shape recognition method includes the following steps:
1. The shape sample set totals 1400 samples over 70 shape classes, with 20 shape samples per class. Fig. 2 shows partial samples of target shapes in the shape sample set. Half of the samples in each shape class are randomly assigned to a training set and the remaining half to a test set, giving 700 training samples and 700 test samples. Each shape sample is sampled to obtain 100 contour points; taking a shape sample S as an example:
S={px(i),py(i)|i∈[1,100]},
where px(i), py(i) are the abscissa and ordinate of the contour sampling point p(i) in the two-dimensional plane.
The contour curve of the shape sample is evolved to extract key points, deleting at each evolution step the point that contributes least to target identification. The contribution of each point p(i) is defined (following the standard discrete-curve-evolution relevance measure) as:
Con(i) = H1(i) · H(i, i-1) · H(i, i+1) / (H(i, i-1) + H(i, i+1)),
where H(i, i-1) is the length of the curve between points p(i) and p(i-1), H(i, i+1) is the length of the curve between points p(i) and p(i+1), and H1(i) is the angle between segment p(i)p(i-1) and segment p(i)p(i+1); the lengths H are normalized by the contour perimeter. A larger value of Con(i) indicates a larger contribution of the point p(i) to the shape feature.
To overcome the problem of over- or under-extraction of contour key points, the method introduces a region-based adaptive termination function F(t):
where S0 is the area of the original shape, Si is the area after i evolutions, and n0 is the total number of points on the contour of the original shape. When the termination function value F(t) exceeds the set threshold, the extraction of contour key points ends. For the shape sample S shown in Fig. 3, 24 contour key points are obtained.
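The key-point extraction step above can be sketched in Python. The exact contribution formula appears only as a figure in the source, so the standard discrete-curve-evolution relevance measure is assumed, with chord lengths standing in for the curve lengths H, and a fixed point count standing in for the area-based termination function F(t):

```python
import math

def contribution(prev, cur, nxt, perimeter):
    # chord lengths normalized by the contour perimeter (stand-in for the curve lengths H)
    h1 = math.dist(cur, prev) / perimeter
    h2 = math.dist(cur, nxt) / perimeter
    a = (prev[0] - cur[0], prev[1] - cur[1])
    b = (nxt[0] - cur[0], nxt[1] - cur[1])
    dot = a[0] * b[0] + a[1] * b[1]
    norm = (math.hypot(*a) * math.hypot(*b)) or 1e-12
    # turning angle at p(i): 0 for collinear points, larger for sharper corners
    turn = math.pi - math.acos(max(-1.0, min(1.0, dot / norm)))
    return turn * h1 * h2 / (h1 + h2 + 1e-12)

def evolve(points, keep):
    # repeatedly delete the point with the smallest contribution Con(i)
    pts = list(points)
    while len(pts) > keep:
        per = sum(math.dist(pts[i], pts[(i + 1) % len(pts)]) for i in range(len(pts)))
        scores = [contribution(pts[i - 1], pts[i], pts[(i + 1) % len(pts)], per)
                  for i in range(len(pts))]
        del pts[scores.index(min(scores))]
    return pts
```

Collinear contour points contribute nothing and are removed first, so a square sampled with edge midpoints evolves back to its four corners.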
2. The approximate bias curvature value and the curve concavity/convexity at each key point of the shape sample are calculated. Taking the shape sample S as an example, the approximate bias curvature value cur~(p(i)) of a contour key point p(i) is calculated as:
cur~(p(i))=cosHε(i)+1,
where Hε(i) is the angle between line segment p(i)p(i-ε) and line segment p(i)p(i+ε), and ε = 3.
The concavity/convexity of the curve at a contour key point p(i) is judged as follows:
Sample the line segment p(i-ε)p(i+ε) at equal intervals to obtain R discrete points. If the pixel values of all R discrete points are 255, the segment p(i-ε)p(i+ε) lies entirely inside the shape contour, i.e. the curve at p(i) is convex; if the pixel values of all R discrete points are 0, the segment lies entirely outside the shape contour, i.e. the curve at p(i) is concave. The key points p(i) at which the curve is concave are taken as candidate segmentation points p(j). For the shape sample S, 11 candidate segmentation points are extracted in total.
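A minimal sketch of the curvature and concavity tests, assuming the binarized image is exposed as a callable `inside(x, y)` (truthy where the pixel value is 255); the name and signature are illustrative:

```python
import math

def approx_curvature(pts, i, eps=3):
    # cur~(p(i)) = cos(H_eps(i)) + 1, where H_eps(i) is the angle between
    # segments p(i)p(i-eps) and p(i)p(i+eps); the result lies in [0, 2]
    n = len(pts)
    p, a, b = pts[i], pts[(i - eps) % n], pts[(i + eps) % n]
    u = (a[0] - p[0], a[1] - p[1])
    v = (b[0] - p[0], b[1] - p[1])
    cosang = (u[0] * v[0] + u[1] * v[1]) / ((math.hypot(*u) * math.hypot(*v)) or 1e-12)
    return max(-1.0, min(1.0, cosang)) + 1.0

def is_concave(pts, i, inside, eps=3, R=8):
    # sample the chord p(i-eps)p(i+eps) at R points; the curve at p(i) is
    # concave when every sample falls outside the shape
    n = len(pts)
    a, b = pts[(i - eps) % n], pts[(i + eps) % n]
    samples = ((a[0] + (b[0] - a[0]) * t / (R + 1), a[1] + (b[1] - a[1]) * t / (R + 1))
               for t in range(1, R + 1))
    return all(not inside(x, y) for x, y in samples)
```

For a point on a convex contour such as a circle, the chord lies inside the shape, so the test correctly reports it as non-concave.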
3. The curvature screening threshold Th is adjusted to obtain the shape segmentation points. For the 11 candidate segmentation points p(j) of the shape sample S, their average approximate bias curvature value is taken as the initial threshold Th0:
where the approximate bias curvature values cur~(P(j)) of the 11 candidate segmentation points P(j) are 0.1, 0.2, 0.25, 0.35, 0.4, 0.5, 0.5, 0.64, 0.7, 0.7 and 0.8, respectively.
The threshold is sequentially increased according to the following method:
(1) For the threshold Thτ at the τ-th adjustment, the candidate segmentation points P(j) are divided into two classes according to the relationship between their approximate bias curvature values and Thτ: those whose approximate bias curvature value is greater than Thτ, and those whose approximate bias curvature value is less than or equal to Thτ. The segmentation division degree Dτ under the current threshold is calculated and recorded:
where, under the threshold Thτ, each candidate segmentation point P(j) has a positive and a negative curvature deviation; the minimum of the positive curvature deviations over all candidate segmentation points and the maximum of the negative curvature deviations over all candidate segmentation points enter the computation of Dτ.
It is then judged whether any candidate segmentation point has an approximate bias curvature value greater than the threshold Thτ. If none exists, the threshold is no longer adjusted and the procedure goes to step (3); if such a point exists, go to step (2) to continue adjusting the threshold.
(2) Continue adjusting the threshold: the new threshold Thτ+1 is determined by the minimum of the positive curvature deviations of all candidate segmentation points in the previous threshold adjustment, formulated as follows:
According to the threshold Thτ+1, the positive and negative curvature deviations of each candidate segmentation point under the (τ+1)-th adjustment and the division degree Dτ+1 are calculated and recorded. It is then judged whether any candidate segmentation point has an approximate bias curvature value greater than the threshold Thτ+1. If none exists, the threshold is no longer adjusted and the procedure goes to step (3); if such a point exists, let τ = τ+1, repeat the current step and continue adjusting the threshold.
(3) The successive threshold adjustments yield a series of division degrees. The threshold corresponding to the maximum division degree is the final curvature screening threshold Th, and the points whose approximate bias curvature values are smaller than Th are the final shape segmentation points.
For the shape sample S, the division degrees and thresholds recorded during the four threshold adjustments are, respectively:
therefore, the maximum segmentation degree D1Corresponding threshold Th1A threshold value was selected for the final curvature, i.e., Th ═ 0.5. The 5 candidate segmentation points with the approximate bias curvature values smaller than Th are the final shape segmentation points, and the corresponding approximate bias curvatures are 0.1,0.2,0.25,0.35 and 0.4 respectively.
4. For the shape sample S, the 5 shape segmentation points are connected pairwise to form 10 line segments, and the 7 segments that lie inside the shape and do not intersect one another are retained as the segmentation line segment set. The segmentation Cost of each segmentation line segment is calculated from the metric indices as follows:
where D*(u,v), L*(u,v), S*(u,v) are the three normalized segmentation metrics of segmentation length, segmentation arc length and segmentation residual area, and u and v are the serial numbers of any two shape segmentation points, indexed over the total number of segmentation points.
For any shape segmentation line segment P(u)P(v), the three segmentation evaluation metrics are calculated as follows:
where Dmax is the length of the longest of all the segmentation line segments; D*(u,v) ranges between 0 and 1, and a smaller value indicates a more significant segmentation effect.
where the arc length is that of the contour curve between the two points P(u) and P(v); L*(u,v) ranges between 0 and 1, and a smaller value indicates a more significant segmentation effect.
where Sd is the shape area cut off by the segment P(u)P(v), i.e. the area of the closed region formed by the segment P(u)P(v) and the contour curve between the two points; S*(u,v) ranges between 0 and 1, and a smaller value indicates a more significant segmentation effect.
The segmentation Cost of the segmentation line segment P(u)P(v) is then calculated according to the above steps:
Cost=αD*(u,v)+βL*(u,v)+γS*(u,v),
where α, β, γ are the weights of the metrics.
As shown in Fig. 3, for the shape sample S, the 2 segmentation line segments with the smallest and second-smallest segmentation costs are selected as the final optimal segmentation line segments, yielding 3 sub-shape parts.
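The individual metric formulas are given only as images, so this sketch assumes straightforward normalizations: segment length over the longest candidate, contour arc length over the perimeter, and cut-off area over the total shape area; the function name and signature are illustrative:

```python
import math

def segmentation_cost(seg, segments, arc_len, perimeter, cut_area, area,
                      alpha=1.0, beta=1.0, gamma=1.0):
    # D*: length of this segment over the longest candidate segment
    d_max = max(math.dist(a, b) for a, b in segments)
    d_star = math.dist(*seg) / d_max
    # L*: contour arc length between the two split points over the perimeter (assumed)
    l_star = arc_len / perimeter
    # S*: area cut off by the segment over the total shape area (assumed)
    s_star = cut_area / area
    # Cost = alpha*D* + beta*L* + gamma*S*
    return alpha * d_star + beta * l_star + gamma * s_star
```

All three terms lie in [0, 1], so with unit weights the cost lies in [0, 3] and the cheapest segments are selected first.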
5. For the shape sample S, the central shape part is taken as the starting vertex v1, and the other 2 adjacent shape parts are sorted clockwise and recorded as vertices {v2, v3}. The directed edges connecting v1 to the vertices v2 and v3 are denoted (v1, v2) and (v1, v3), respectively, forming a shape directed graph that satisfies the topological order:
G1=(V1,E1),
where V1 = {v1, v2, v3}, E1 = {(v1, v2), (v1, v3)}.
Since the maximum number of sub-shape parts obtained by segmenting the training samples of the contour shape set is 11, the adjacency matrix of the shape sample S is expressed as:
where a ∈ [1, 11], b ∈ [1, 11].
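For the simple star topology of the example (directed edges from the central part v1 to each adjacent part), the zero-padded 11 × 11 adjacency matrix can be built as follows; deeper part hierarchies would add further edges:

```python
def shape_digraph_adjacency(num_parts, max_parts=11):
    # A[a][b] = 1 iff there is a directed edge from vertex a+1 to vertex b+1;
    # padded with zeros up to the largest part count in the training set
    A = [[0] * max_parts for _ in range(max_parts)]
    for b in range(1, num_parts):
        A[0][b] = 1  # edge (v1, v_{b+1}) from the central part
    return A
```

Padding every sample to the same 11 × 11 size lets samples with different part counts share one graph-convolutional network input shape.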
6. Full-scale visual representation is performed on each of the 3 sub-shape parts obtained by segmentation. The specific method of full-scale visual representation is as follows:
(1) For any sub-shape part contour, sampling the contour yields 100 contour sampling points. As shown in Fig. 4, the total number of scales in the full-scale space is set to 100, and the normalized area, arc length and center-of-gravity distance of each contour point at each scale layer are calculated from the coordinates of the 100 contour sampling points. Taking the sub-shape part S1 as an example, the specific calculation is as follows:
With a contour sampling point p1(i) of the sub-shape part S1 as the center and the initial radius, a preset circle C1(i) is drawn; this preset circle is the initial semi-global scale for calculating the parameters of the corresponding contour point. Once the preset circle C1(i) is obtained, some portion of the target shape necessarily falls within it, as shown schematically in Fig. 5. If the portion of the target shape falling within the preset circle is a single region, that region is the region having a direct connection relationship with the target contour point p1(i), denoted Z1(i). If the portion of the target shape falling within the preset circle is divided into several mutually disconnected regions, such as region A and region B in Fig. 5, then the region on whose contour the target contour point p1(i) lies is the region having a direct connection relationship with p1(i), i.e. region A in Fig. 5, denoted Z1(i). On this basis, the area of the region Z1(i) having a direct connection relationship with the target contour point p1(i) within the preset circle C1(i) satisfies:
where B(Z1(i), z) is an indicator function defined as
The ratio of the area of Z1(i) to the area of the preset circle C1(i) is taken as the area parameter s1(i) of the descriptor of the target contour point p1(i):
s1(i) should range between 0 and 1.
When calculating the center of gravity of the region having a direct connection relationship with the target contour point p1(i), the coordinate values of all pixel points in the region are averaged; the result is the coordinate of the center of gravity of the region, which can be expressed as:
where w1(i) is the center of gravity of the above region.
The distance between the target contour point and the center of gravity w1(i) is then calculated, which can be expressed as:
The ratio of this distance to the radius of the preset circle of the target contour point is taken as the center-of-gravity distance parameter c1(i) of the descriptor of the target contour point p1(i):
c1(i) should range between 0 and 1.
After the contour of the target shape is cut by the preset circle, one or more arc segments are inevitably formed within the preset circle, as shown in Fig. 6. If only one arc segment of the target shape falls within the preset circle, that arc segment is determined to be the arc segment having a direct connection relationship with the target contour point p1(i). If several arc segments of the target shape fall within the preset circle, such as arc segments A, B and C in Fig. 6, then the arc segment on which the target contour point p1(i) lies is the one having a direct connection relationship with p1(i), i.e. arc segment A in Fig. 6. On this basis, the length of the arc segment within the preset circle C1(i) having a direct connection relationship with the target contour point p1(i) is recorded, and its ratio to the circumference of the preset circle C1(i) is taken as the arc length parameter l1(i) of the descriptor of the target contour point:
where l1(i) should range between 0 and 1.
Calculating according to the above steps with the initial radius at scale label k = 1 gives the feature function M1 of the sub-shape part S1 of the shape sample S at this semi-global scale:
M1={s1(i),l1(i),c1(i)|i∈[1,100]},
As shown in Fig. 7, the feature functions at each of the 100 scales in the full-scale space are calculated separately, where for the k-th scale label the radius rk of the preset circle Ck is set as:
That is, starting from the initial scale k = 1, the radius rk is then reduced in equal steps of one pixel 99 times, down to the minimum scale k = 100. The feature function of the sub-shape part S1 of the shape sample S over the whole scale space is thus obtained:
M={sk(i),lk(i),ck(i)|k∈[1,100],i∈[1,100]},
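A simplified sketch of the area parameter s and the center-of-gravity distance parameter c at one scale, with the binarized shape exposed as a callable `inside(x, y)` (an assumed interface); for brevity it grid-samples the preset circle and skips the connected-region test and the arc-length parameter l:

```python
import math

def area_and_centroid_params(inside, center, radius, step=0.05):
    # fraction of the preset circle covered by the shape (parameter s) and the
    # normalized distance from the contour point to the covered region's
    # center of gravity (parameter c); both lie in [0, 1]
    covered, total = 0, 0
    sx = sy = 0.0
    r2 = radius * radius
    steps = int(2 * radius / step) + 1
    for ix in range(steps):
        x = center[0] - radius + ix * step
        for iy in range(steps):
            y = center[1] - radius + iy * step
            if (x - center[0]) ** 2 + (y - center[1]) ** 2 <= r2:
                total += 1
                if inside(x, y):
                    covered += 1
                    sx += x
                    sy += y
    s = covered / total
    c = math.dist(center, (sx / covered, sy / covered)) / radius if covered else 0.0
    return s, c
```

For a contour point of a half-plane shape, the circle is half covered and the centroid of the covered half-disk sits about 4/(3π) ≈ 0.424 radii from the center, which the grid approximation recovers.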
(2) As shown in Fig. 8, the feature functions of the sub-shape part S1 at the 100 scales in the full-scale space are combined, in scale order, into three feature matrices in the full-scale space:
GM1={SM,LM,CM},
where SM, LM and CM each represent a grayscale image, i.e. a grayscale matrix of size m × n.
(3) As shown in Fig. 9, the three grayscale images of the sub-shape part S1 are used as the three RGB channels to synthesize a color image, which serves as the feature expression image of the sub-shape part S1.
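Synthesizing the color feature expression image amounts to stacking the three full-scale grayscale matrices as the R, G and B channels; a minimal sketch, assuming matrix values in [0, 1] mapped to 8-bit intensities:

```python
def synthesize_feature_image(SM, LM, CM):
    # one (R, G, B) triple per pixel: R from the area matrix, G from the
    # arc-length matrix, B from the centroid-distance matrix
    rows, cols = len(SM), len(SM[0])
    return [[(round(255 * SM[i][j]), round(255 * LM[i][j]), round(255 * CM[i][j]))
             for j in range(cols)] for i in range(rows)]
```

The resulting 100 × 100 three-channel image is what each sub-shape part feeds to the convolutional network in the next step.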
7. A convolutional neural network is constructed, comprising an input layer, a pre-training layer and fully connected layers. The feature expression image samples of all sub-shape parts of the training shape samples are input into the convolutional neural network to train the model. Different sub-shape parts of each shape class have different class labels. After the convolutional neural network is trained to convergence, taking the shape sample S as an example, the feature expression images of size 100 × 100 corresponding to the 3 sub-shape parts of S, {Tnum | num ∈ [1, 3]}, are each input into the trained network, and the feature vector of the corresponding sub-shape part is output from the second fully connected layer of the network, where Vec, the number of neurons in the second fully connected layer, is set to 200.
The invention uses the SGD optimizer with the learning rate set to 0.001 and the decay rate set to 1e-6; cross entropy is chosen as the loss function, and the batch size is 128. As shown in Fig. 10, the pre-training layer consists of the first 4 modules of the VGG16 network model, with the parameters obtained by training these 4 modules on the ImageNet dataset used as initialization parameters; three fully connected layers follow the pre-training layer.
The 1st module of the pre-training layer comprises 2 convolutional layers and 1 max-pooling layer, the convolutional layers having 64 kernels of size 3 × 3 and the pooling layer a size of 2 × 2. The 2nd module comprises 2 convolutional layers and 1 max-pooling layer, with 128 kernels of size 3 × 3 and a 2 × 2 pooling layer. The 3rd module comprises 3 convolutional layers and 1 max-pooling layer, with 256 kernels of size 3 × 3 and a 2 × 2 pooling layer. The 4th module comprises 3 convolutional layers and 1 max-pooling layer, with 512 kernels of size 3 × 3 and a 2 × 2 pooling layer. Each convolutional layer is calculated as:
CO=φrelu(WC·CI+θC),
where θC is the bias vector of the convolutional layer, WC is the weight of the convolutional layer, CI is the input of the convolutional layer, and CO is the output of the convolutional layer.
The fully connected module comprises 3 fully connected layers: the 1st fully connected layer has 512 nodes, the 2nd has 200 nodes, and the 3rd has 770 nodes. The first 2 fully connected layers are calculated as:
FO=φtanh(WF·FI+θF),
where φtanh is the tanh activation function, θF is the bias vector of the fully connected layer, WF is its weight, FI is its input, and FO is its output.
the last full-connection layer is an output layer, and the output calculation formula is as follows:
YO=φsoftmax(WY·YI+θY),
where φsoftmax is the softmax activation function, θY is the bias vector of the output layer, each neuron of which represents a corresponding sub-shape part class, WY is the weight of the output layer, YI is its input, and YO is its output.
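The fully connected and output-layer formulas above each reduce to an affine map followed by an activation; a dependency-free sketch (weights as plain nested lists, not the trained VGG16 parameters):

```python
import math

def dense(W, theta, x, activation):
    # F_O = activation(W . F_I + theta)
    z = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W, theta)]
    return activation(z)

def tanh_act(z):
    return [math.tanh(v) for v in z]

def softmax(z):
    # numerically stable softmax for the output layer
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]
```

The first two fully connected layers would use `tanh_act` and the 770-way output layer `softmax`; the feature vector taken in step 7 is the 200-dimensional `tanh` output of the second layer.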
8. As shown in Fig. 11, the feature matrix of the shape sample is constructed from its 3 sub-shape feature vectors:
where Fa, the a-th row vector of the matrix F, is the feature vector of the a-th sub-shape part output by the above step; the remaining rows are zero vectors of dimension 200.
9. A graph convolutional neural network is constructed, comprising a preprocessing input layer, a hidden layer and a classification output layer. The adjacency matrix and the feature matrix of the shape sample topological graph are input into the graph convolutional network model for training. The invention uses the SGD optimizer with the learning rate set to 0.001 and the decay rate set to 1e-6; cross entropy is chosen as the loss function, and the batch size is 128.
In the preprocessing input layer the adjacency matrix is normalized; the normalization preprocessing is specifically:
where the two matrices are, respectively, the identity matrix and the degree matrix; the normalized adjacency matrix is obtained after this preprocessing.
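The normalization step is the standard graph-convolutional preprocessing; assuming the usual symmetric form D^(-1/2)(A + I)D^(-1/2) (the source shows it only as an image), it can be sketched as:

```python
import math

def normalize_adjacency(A):
    # add self-loops, then symmetrically normalize by the degree matrix:
    # A_hat = D^{-1/2} (A + I) D^{-1/2}
    n = len(A)
    At = [[A[i][j] + (1 if i == j else 0) for j in range(n)] for i in range(n)]
    d_inv_sqrt = [1.0 / math.sqrt(sum(row)) for row in At]
    return [[d_inv_sqrt[i] * At[i][j] * d_inv_sqrt[j] for j in range(n)]
            for i in range(n)]
```

The self-loops keep each vertex's own features in the aggregation, and the symmetric scaling prevents high-degree vertices from dominating.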
The hidden layer comprises 2 graph convolution layers, and the calculation formula of each graph convolution layer is as follows:
where the weight matrix is that of the graph convolutional layer; HI is the input of the graph convolutional layer, the input of the 1st graph convolutional layer being the feature matrix of the shape sample; HO is the output of the graph convolutional layer.
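One graph convolutional layer then multiplies the normalized adjacency matrix, the input features and the layer weight, followed by an activation (ReLU is assumed here, as the source does not name it):

```python
def graph_conv(A_hat, H, W):
    # H_O = relu(A_hat . H_I . W)
    def matmul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
                 for j in range(len(Y[0]))] for i in range(len(X))]
    return [[max(0.0, v) for v in row] for row in matmul(matmul(A_hat, H), W)]
```

Each row of the output mixes a vertex's features with those of its neighbors, weighted by the normalized adjacency entries.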
the calculation formula of the classification output layer is as follows:
where φsoftmax is the softmax activation function, GI is the input of the output layer, i.e. the output of the second graph convolutional layer, GW is the weight of the output layer, and GO is the output of the output layer; each neuron of the output layer represents a corresponding shape class.
10. All training samples are input into the graph convolutional neural network to train the model. For any test shape sample, first the shape contour key points are extracted, their curvature values calculated and their concavity/convexity judged to obtain candidate segmentation points; the curvature screening threshold is then adjusted to obtain the shape segmentation points. The shape segmentation points are connected pairwise to form segmentation line segments, those lying inside the shape and not crossing one another are retained as the segmentation line segment set, and the segmentation cost of each segment in the set is calculated. If the number of segmentation line segments is less than 10, all of them are used to segment the shape; otherwise the shape is segmented by the 10 segments with the minimum segmentation cost. The color feature expression image of each sub-shape part is calculated and input into the trained convolutional neural network, and the output of the second fully connected layer is taken as the feature vector of the sub-shape part. The shape directed graph of the test sample is constructed, its adjacency matrix and feature matrix are calculated and input into the trained graph convolutional network model, and the shape class corresponding to the maximum value of the output vector is judged to be the class of the test sample, realizing shape classification and identification.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may still be made to the embodiments, and that some features may be replaced by equivalents, without departing from the spirit and scope of the invention.
Claims (11)
1. A shape recognition method, characterized in that the method comprises the following steps:
firstly, extracting outline key points of a shape sample;
step two, defining approximate bias curvature values at each key point and judging the curve concavity and convexity at the key points to obtain candidate shape segmentation points;
step three, adjusting a curvature screening threshold value to obtain shape division points;
fourthly, segmenting the shape based on the principle that the segmentation line segments are positioned in the shape and do not intersect with each other, and segmenting at the minimum segmentation cost to obtain a plurality of sub-shape parts;
step five, constructing a topological structure of the shape sample;
step six, obtaining a feature expression image of a corresponding sub-shape part by using a full-scale visual representation method of the shape;
inputting each feature expression image into a convolutional neural network for training, and learning to obtain a feature vector of each sub-shape part;
step eight, constructing a feature matrix of the shape sample;
constructing a graph convolution neural network;
step ten, training a graph convolution neural network, carrying out shape segmentation on the test sample, obtaining the feature vector of each sub-shape part, calculating the feature matrix and the adjacency matrix of the test sample, and inputting the feature matrix and the adjacency matrix into the trained graph convolution network model to realize shape classification and recognition.
2. A shape recognition method according to claim 1, characterized in that: in the first step, the method for extracting the key points of the contour comprises the following steps:
the contour of each shape sample is composed of a series of sampling points, and for any shape sample S, sampling n points on the contour results in:
S={(px(i),py(i))|i∈[1,n]},
where px(i), py(i) are the horizontal and vertical coordinates of the contour sampling point p(i) in the two-dimensional plane, and n is the contour length, i.e. the number of contour sampling points;
evolving the contour curve of the shape sample to extract key points, deleting in each evolution step the point contributing least to target identification, wherein the contribution of each point p(i) is defined (following the standard discrete-curve-evolution relevance measure) as:
Con(i) = H1(i) · H(i, i-1) · H(i, i+1) / (H(i, i-1) + H(i, i+1)),
where H(i, i-1) is the length of the curve between points p(i) and p(i-1), H(i, i+1) is the length of the curve between points p(i) and p(i+1), and H1(i) is the angle between segment p(i)p(i-1) and segment p(i)p(i+1); the lengths H are normalized by the contour perimeter; a larger value of Con(i) indicates a larger contribution of the point p(i) to the shape feature;
the method introduces an adaptive termination function F (t) based on a region to overcome the problem of excessive or insufficient extraction of the key points of the contour:
where S0 is the area of the original shape, Si is the shape area after i evolutions, and n0 is the total number of points on the contour of the original shape; when the termination function value F(t) exceeds the set threshold, extraction of contour key points ends and n* contour key points are obtained.
3. A shape recognition method according to claim 2, wherein: in step two, the approximate bias curvature value at each key point is defined and the curve concavity/convexity at the key point is judged to obtain the candidate segmentation points; the specific method is as follows:
in order to calculate the approximate bias curvature value of any key point p(i) in the shape sample S, the adjacent contour points p(i-ε) and p(i+ε) before and after p(i) are taken, where ε is an empirical value; because:
cosHε(i)∝cur(p(i)),
where Hε(i) is the angle between line segment p(i)p(i-ε) and line segment p(i)p(i+ε), and cur(p(i)) is the curvature at point p(i);
the approximate bias curvature value cur~(p(i)) at point p(i) is defined as:
cur~(p(i))=cosHε(i)+1,
where Hε(i) is the angle between line segment p(i)p(i-ε) and line segment p(i)p(i+ε); cosHε(i) ranges between -1 and 1, so cur~(p(i)) ranges between 0 and 2;
according to a shape segmentation approach conforming to visual naturalness, shape segmentation points all lie at concave parts of the contour; therefore, when screening candidate segmentation points for shape segmentation, a method for judging the concavity/convexity of the curve at a key point p(i) is defined:
for the binarized shape image, the pixel values inside the contour of the shape sample S are all 255 and those outside are all 0; the line segment p(i-ε)p(i+ε) is sampled at equal intervals to obtain R discrete points; if the pixel values of all R discrete points are 255, the segment p(i-ε)p(i+ε) lies entirely inside the shape contour, i.e. the curve at p(i) is convex; if the pixel values of all R discrete points are 0, the segment lies entirely outside the shape contour, i.e. the curve at p(i) is concave; the key points p(i) at which the curve is concave are taken as the candidate segmentation points p(j).
4. A shape recognition method according to claim 3, wherein: in the third step, the step of adjusting the curvature screening threshold Th and obtaining the shape segmentation point is as follows:
(1) for all candidate segmentation points P(j) obtained in step two, their average approximate bias curvature value is taken as the initial threshold Th0:
Wherein J is the total number of the candidate segmentation points;
(2) for the threshold Thτ at the τ-th adjustment, the candidate segmentation points P(j) are divided into two classes according to the relationship between their approximate bias curvature values and Thτ: those whose approximate bias curvature value is greater than Thτ and those whose approximate bias curvature value is less than or equal to Thτ; the segmentation division degree Dτ under the current threshold is calculated and recorded:
where, under the threshold Thτ, each candidate segmentation point P(j) has a positive and a negative curvature deviation; the minimum of the positive curvature deviations over all candidate segmentation points and the maximum of the negative curvature deviations over all candidate segmentation points enter the computation of Dτ;
it is judged whether any candidate segmentation point has an approximate bias curvature value greater than the threshold Thτ; if none exists, the threshold is no longer adjusted and the procedure goes to step (4); if such a point exists, go to step (3) to continue adjusting the threshold;
(3) continuing to adjust the threshold, the new threshold Thτ+1 is the minimum of the positive curvature deviations of all candidate segmentation points in the previous threshold adjustment, formulated as follows:
according to the threshold Thτ+1, the positive and negative curvature deviations of each candidate segmentation point under the (τ+1)-th adjustment and the division degree Dτ+1 are calculated and recorded; it is judged whether any candidate segmentation point has an approximate bias curvature value greater than the threshold Thτ+1; if none exists, the threshold is no longer adjusted and the procedure goes to step (4); if such a point exists, let τ = τ+1, repeat the current step and continue adjusting the threshold;
(4) the successive threshold adjustments yield several division degrees; the threshold corresponding to the maximum division degree is the final curvature screening threshold Th, and the points whose approximate bias curvature values are smaller than Th are the final shape segmentation points.
5. A shape recognition method according to claim 4, wherein in step four, the specific method of segmenting the shape, based on the principles that segmentation line segments lie inside the shape and do not intersect each other, and obtaining a plurality of sub-shape parts at minimum segmentation cost, is as follows:
(1) For any two shape segmentation points P(e_1), P(e_2), sample the segmentation line segment P(e_1)P(e_2) at equal intervals to obtain C discrete points; if any of the C discrete points has a pixel value of 0, part of the line segment P(e_1)P(e_2) lies outside the shape contour, and it is not selected as a segmentation line segment;
(2) For any two shape segmentation points P(e_3), P(e_4), if there is an existing segmentation line segment P(e_5)P(e_6) such that the line segment P(e_3)P(e_4) intersects the existing segment P(e_5)P(e_6), then the line segment P(e_3)P(e_4) is not selected as a segmentation line segment;
(3) The set of segmentation line segments satisfying the above two principles is screened further; three metrics for evaluating the quality of a segmentation line segment are defined to achieve segmentation at minimum segmentation cost:
wherein D*(u,v), L*(u,v), S*(u,v) are the three normalized segmentation metrics of segmentation length, segmentation arc length and segmentation residual area, and u and v are the serial numbers of any two shape segmentation points out of the total number of segmentation points;
For any segmentation line segment P(u)P(v), the three segmentation evaluation metrics are calculated as follows:
wherein D_max is the length of the longest of all segmentation line segments; D*(u,v) ranges between 0 and 1, and the smaller its value, the more significant the segmentation effect;
wherein the length of the contour curve between the two points P(u) and P(v) is used; L*(u,v) ranges between 0 and 1, and the smaller its value, the more significant the segmentation effect;
wherein S_d is the shape area cut off by the line segment P(u)P(v), i.e. the area of the closed region formed by the line segment P(u)P(v) and the contour curve; S*(u,v) ranges between 0 and 1, and the smaller its value, the more significant the segmentation effect;
The segmentation cost Cost of the segmentation line segment P(u)P(v) is then calculated as:
Cost = αD*(u,v) + βL*(u,v) + γS*(u,v),
wherein α, β and γ are the weights of the respective metrics;
Calculate the segmentation cost Cost of every segmentation line segment in the screened set; sort all the computed costs from smallest to largest and, according to the number N of sub-shape parts set for the class of shape sample S, select the N-1 segmentation line segments with the smallest cost, thereby achieving optimal segmentation and obtaining N sub-shape parts. The number N of sub-shape parts depends on the class to which the current shape sample S belongs; the corresponding number of sub-shape parts is set manually for shapes of different classes.
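The two validity principles and the weighted cost above can be sketched as follows. The helper names, the mask convention (nonzero pixels inside the shape) and the cross-product intersection test are assumptions for illustration; the claim does not specify them.

```python
import numpy as np

def inside_shape(p1, p2, mask, C=50):
    """Principle (1): sample C equally spaced points on segment p1-p2;
    reject the segment if any sample falls on a zero pixel of the mask."""
    ts = np.linspace(0.0, 1.0, C)
    pts = np.rint(np.outer(1 - ts, p1) + np.outer(ts, p2)).astype(int)
    return all(mask[y, x] > 0 for x, y in pts)

def cross(o, a, b):
    # z-component of the cross product (a - o) x (b - o)
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def intersects(a, b, c, d):
    """Principle (2): proper intersection test for segments ab and cd."""
    return (cross(a, b, c) * cross(a, b, d) < 0 and
            cross(c, d, a) * cross(c, d, b) < 0)

def cost(D, L, S, alpha=1.0, beta=1.0, gamma=1.0):
    """Weighted segmentation cost over the three normalized metrics."""
    return alpha * D + beta * L + gamma * S
```

A candidate segment would be kept only if `inside_shape` is true and `intersects` is false against every already selected segment; the surviving candidates are then sorted by `cost` and the N-1 cheapest chosen.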
6. A shape recognition method according to claim 5, wherein in step five, the specific method of constructing the topological structure of a shape sample is as follows: for the N sub-shape parts obtained by segmenting any shape sample S, record the central shape part as the start vertex v_1, and sort the other adjacent shape parts clockwise as vertices {v_o | o ∈ [2, N]}; record the edge connecting v_1 to each remaining vertex v_o as (v_1, v_o), thereby forming a shape directed graph satisfying topological order:
G_1 = (V_1, E_1),
wherein V_1 = {v_o | o ∈ [1, N]}, E_1 = {(v_1, v_o) | o ∈ [2, N]};
After all training shape samples have been optimally segmented, record the maximum number of sub-shape parts obtained over all training shape samples; for any shape sample S, its adjacency matrix is then calculated as follows:
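Under a star-graph reading of G_1 = (V_1, E_1), the adjacency matrix padded to the maximum sub-shape count over the training set can be sketched as below; the zero-padding convention is an assumption, since the formula itself is not legible in the source.

```python
import numpy as np

def star_adjacency(n_parts, n_max):
    """Adjacency matrix of the directed star graph G_1 = (V_1, E_1):
    edges run from the central start vertex v_1 to each remaining
    vertex v_o, o in [2, N]; rows and columns beyond N are zero padding
    up to the maximum sub-shape count n_max."""
    A = np.zeros((n_max, n_max), dtype=int)
    A[0, 1:n_parts] = 1   # edges (v_1, v_o) for o = 2..N
    return A
```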
7. A shape recognition method according to claim 6, wherein in step six, the specific method of obtaining the color feature expression image corresponding to a sub-shape part by the full-scale visual representation of the shape is as follows:
For a sub-shape part S_1 of any shape sample S:
wherein the abscissa and ordinate of each contour sampling point p_1(i) of the sub-shape part in the two-dimensional plane are given, and n_1 is the contour length, i.e. the number of contour sampling points;
First, the contour of the sub-shape part S_1 is described by a feature function M composed of three shape descriptors:
M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, m], i ∈ [1, n_1]},
wherein s_k, l_k, c_k are the three invariant parameters of normalized area s, arc length l and center-of-gravity distance c at scale k, k is the scale label, and m is the total number of scales; the three shape-invariant descriptors are defined separately as follows:
Taking the contour sampling point p_1(i) as the center and using the initial radius, draw a preset circle C_1(i); this preset circle is the initial semi-global scale for calculating the parameters of the corresponding contour point. After obtaining the preset circle C_1(i) as above, the three shape descriptors at scale k = 1 are calculated as follows:
When calculating the s_1(i) descriptor, the region inside the preset circle C_1(i) that is directly connected with the target contour point p_1(i) is denoted Z_1(i); then:
wherein B(Z_1(i), z) is an indicator function defined as follows:
The ratio of the area of Z_1(i) to the area of the preset circle C_1(i) is taken as the area parameter s_1(i) of the descriptor of the target contour point:
s_1(i) ranges between 0 and 1;
When calculating the c_1(i) descriptor, first calculate the center of gravity of the region directly connected with the target contour point p_1(i), specifically the average of the coordinate values of all pixel points in the region; the result is the coordinate of the region's center of gravity, which can be expressed as:
wherein w_1(i) is the center of gravity of the region;
Then calculate the distance between the target contour point p_1(i) and the center of gravity w_1(i), which can be expressed as:
Finally, take the ratio of this distance to the radius of the preset circle C_1(i) of the target contour point p_1(i) as the center-of-gravity distance parameter c_1(i) of the descriptor of the target contour point:
c_1(i) ranges between 0 and 1;
When calculating the l_1(i) descriptor, record the length of the arc segment inside the preset circle C_1(i) that is directly connected with the target contour point p_1(i), and take its ratio to the circumference of the preset circle C_1(i) as the arc length parameter l_1(i) of the descriptor of the target contour point:
wherein l_1(i) ranges between 0 and 1;
Calculating as above yields the feature function M_1 of the sub-shape part S_1 of shape sample S at the initial semi-global scale with scale label k = 1:
M_1 = {s_1(i), l_1(i), c_1(i) | i ∈ [1, n_1]},
Since a digital image takes one pixel as its minimum unit, a single pixel is selected as the continuous scale-change interval in the full-scale space; i.e. for the k-th scale label, the radius r_k of the circle C_k is set as:
That is, at the initial scale k = 1 the radius takes its initial value; thereafter the radius r_k is reduced in constant steps of one pixel, m-1 times in total, until the minimum scale k = m. Following the method used to calculate the feature function M_1 at scale k = 1, the feature functions at the other scales are calculated, finally yielding the feature function of the sub-shape part S_1 of shape sample S over the full scale space:
M = {s_k(i), l_k(i), c_k(i) | k ∈ [1, m], i ∈ [1, n_1]},
The feature functions at all scales are stored in the matrices S_M, L_M and C_M respectively. S_M stores s_k(i): the element in row k, column i of S_M is the area parameter s_k(i) of point p_1(i) at the k-th scale. L_M stores l_k(i): the element in row k, column i of L_M is the arc length parameter l_k(i) of point p_1(i) at the k-th scale. C_M stores c_k(i): the element in row k, column i of C_M is the center-of-gravity distance parameter c_k(i) of point p_1(i) at the k-th scale. S_M, L_M and C_M finally serve as the grayscale-image representation of the three shape features of the sub-shape part S_1 of shape sample S in the full-scale space:
GM_1 = {S_M, L_M, C_M},
wherein S_M, L_M and C_M are all matrices of size m × n_1, each representing a grayscale image;
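A simplified sketch of the full-scale computation follows, covering only the area and center-of-gravity descriptors (the arc-length descriptor additionally needs the contour parametrization and is omitted here). The 4-connected flood fill used for the "directly connected" region and the default initial radius are assumptions for illustration.

```python
import numpy as np
from collections import deque

def region_connected(shape_mask, disk_mask, seed):
    """Pixels of shape ∩ disk that are 4-connected to the seed point."""
    inside = shape_mask & disk_mask
    h, w = inside.shape
    out = np.zeros_like(inside)
    q = deque([seed])
    while q:
        y, x = q.popleft()
        if 0 <= y < h and 0 <= x < w and inside[y, x] and not out[y, x]:
            out[y, x] = True
            q.extend([(y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)])
    return out

def descriptors(shape_mask, contour, m=3, r0=None):
    """Full-scale matrices S_M and C_M for one sub-shape part (sketch).

    contour: list of (y, x) contour sampling points; row k of each
    matrix holds the descriptor at the k-th scale, column i the point."""
    n1 = len(contour)
    ys, xs = np.mgrid[0:shape_mask.shape[0], 0:shape_mask.shape[1]]
    r0 = r0 if r0 is not None else m  # initial radius: assumed default
    S_M = np.zeros((m, n1))
    C_M = np.zeros((m, n1))
    for k in range(m):
        r = r0 - k  # radius shrinks by one pixel per scale
        for i, (y, x) in enumerate(contour):
            disk = (ys - y) ** 2 + (xs - x) ** 2 <= r * r
            Z = region_connected(shape_mask.astype(bool), disk, (y, x))
            S_M[k, i] = Z.sum() / (np.pi * r * r)       # area ratio s_k(i)
            cy, cx = ys[Z].mean(), xs[Z].mean()         # center of gravity w
            C_M[k, i] = np.hypot(cy - y, cx - x) / r    # distance ratio c_k(i)
    return S_M, C_M
```

Stacking S_M, L_M and C_M then gives the grayscale maps that form the feature expression image of the sub-shape part.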
8. A shape recognition method according to claim 7, wherein in step seven, the feature expression image samples of all sub-shape parts of the training shape samples are input into a convolutional neural network, and a convolutional neural network model is trained; different sub-shape parts of each shape class have different class labels. After the convolutional neural network has been trained to convergence, for any shape sample S, the feature expression images {T_num | num ∈ [1, N]} corresponding to the N sub-shape parts obtained by segmenting it are input into the trained convolutional neural network, and the feature vector of each corresponding sub-shape part is output from the second fully-connected layer of the network, where Vec is the number of neurons in the second fully-connected layer;
The structure of the convolutional neural network comprises an input layer, a pre-training layer and fully-connected layers; the pre-training layer consists of the first 4 modules of the VGG16 network model, the parameters obtained by training these 4 modules on the ImageNet dataset are used as initialization parameters, and three fully-connected layers follow the pre-training layer;
The 1st module of the pre-training layer comprises 2 convolutional layers and 1 max-pooling layer, the convolutional layers having 64 convolution kernels of size 3 × 3 and the pooling layer having size 2 × 2; the 2nd module comprises 2 convolutional layers and 1 max-pooling layer, the convolutional layers having 128 convolution kernels of size 3 × 3 and the pooling layer having size 2 × 2; the 3rd module comprises 3 convolutional layers and 1 max-pooling layer, the convolutional layers having 256 convolution kernels of size 3 × 3 and the pooling layer having size 2 × 2; the 4th module comprises 3 convolutional layers and 1 max-pooling layer, the convolutional layers having 512 convolution kernels of size 3 × 3 and the pooling layer having size 2 × 2. The calculation formula of each convolutional layer is:
C_O = φ_relu(W_C · C_I + θ_C),
wherein θ_C is the bias vector of the convolutional layer, W_C is the weight of the convolutional layer, C_I is the input of the convolutional layer, and C_O is the output of the convolutional layer;
The fully-connected module comprises 3 fully-connected layers: the 1st fully-connected layer contains 512 nodes, the 2nd contains Vec nodes, and the 3rd contains N_T nodes, where N_T is the sum of the numbers of segmented sub-shape parts over all shape classes. The calculation formula of the first 2 fully-connected layers is:
F_O = φ_tanh(W_F · F_I + θ_F),
wherein φ_tanh is the tanh activation function, θ_F is the bias vector of the fully-connected layer, W_F is the weight of the fully-connected layer, F_I is the input of the fully-connected layer, and F_O is the output of the fully-connected layer;
The last fully-connected layer is the output layer, whose output is calculated as:
Y_O = φ_softmax(W_Y · Y_I + θ_Y),
wherein φ_softmax is the softmax activation function, θ_Y is the bias vector of the output layer, each neuron of the output layer represents a corresponding sub-shape-part class, W_Y is the weight of the output layer, Y_I is the input of the output layer, and Y_O is the output of the output layer.
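The two fully-connected layer equations can be sketched directly in numpy; the toy layer sizes below and the numerically stabilized softmax are illustrative assumptions, not part of the claim.

```python
import numpy as np

def fc_tanh(W, x, theta):
    """First two fully-connected layers: F_O = tanh(W · F_I + θ_F)."""
    return np.tanh(W @ x + theta)

def fc_softmax(W, x, theta):
    """Output layer: Y_O = softmax(W · Y_I + θ_Y), one neuron per
    sub-shape-part class."""
    z = W @ x + theta
    e = np.exp(z - z.max())   # numerically stable softmax
    return e / e.sum()

# toy usage with assumed sizes: 4 inputs -> 3 hidden units -> 2 classes
rng = np.random.default_rng(0)
x = rng.standard_normal(4)
h = fc_tanh(rng.standard_normal((3, 4)), x, np.zeros(3))
y = fc_softmax(rng.standard_normal((2, 3)), h, np.zeros(2))
assert abs(y.sum() - 1.0) < 1e-9 and (y > 0).all()
```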
9. A shape recognition method according to claim 8, wherein the specific method of constructing the feature matrix of a shape sample in step eight is as follows: for any shape sample S, the N sub-shape parts obtained by segmenting it are expressed by the corresponding feature matrix, which is calculated as follows:
10. A shape recognition method according to claim 9, wherein in step nine, the structure of the graph convolutional neural network is constructed, comprising a preprocessing input layer, a hidden layer and a classification output layer; the preprocessing input layer performs normalization preprocessing on the adjacency matrix, specifically:
wherein I_N is the identity matrix and the degree matrix is computed from the self-loop-augmented adjacency matrix; the normalized adjacency matrix is obtained after the preprocessing.
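Reading the preprocessing as the standard graph-convolutional normalization (add self-loops via I_N, then scale symmetrically by the degree matrix), a minimal sketch under that assumption:

```python
import numpy as np

def normalize_adjacency(A):
    """GCN preprocessing: D^{-1/2} (A + I) D^{-1/2}, where D is the
    degree matrix of the self-loop-augmented adjacency matrix A + I."""
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)                 # degrees with self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt
```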
The hidden layer comprises 2 graph convolution layers, and the calculation formula of each graph convolution layer is as follows:
wherein the weight of the graph convolution layer is used; H_I is the input of the graph convolution layer, the input of the 1st graph convolution layer being the feature matrix of the shape sample; H_O is the output of the graph convolution layer;
The calculation formula of the classification output layer is:
wherein φ_softmax is the softmax activation function, G_I is the input of the output layer, i.e. the output of the second graph convolution layer, G_W is the weight of the output layer, and G_O is the output of the output layer; each neuron of the output layer represents a corresponding shape class.
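A sketch of the two-layer forward pass through the hidden and output layers; the ReLU activation of the graph convolution layers and the mean pooling over vertices before the softmax output are assumptions, since the claim does not fix them.

```python
import numpy as np

def gcn_forward(A_hat, X, W1, W2, W_out):
    """Two graph convolution layers on the normalized adjacency A_hat
    and feature matrix X, followed by a softmax classification layer
    with one output neuron per shape class."""
    H1 = np.maximum(A_hat @ X @ W1, 0)    # graph conv layer 1 (ReLU assumed)
    H2 = np.maximum(A_hat @ H1 @ W2, 0)   # graph conv layer 2
    g = H2.mean(axis=0)                   # pool vertex features (assumed)
    z = g @ W_out                         # classification output layer
    e = np.exp(z - z.max())
    return e / e.sum()                    # class probabilities
```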
11. A shape recognition method according to claim 10, wherein the specific method of achieving contour shape classification and recognition in step ten is as follows: train the graph convolutional neural network model until convergence. For any test shape sample, first extract the shape contour key points, calculate the curvature values of the key points, judge their concavity and convexity to obtain candidate segmentation points, and then adjust the curvature screening threshold to obtain the shape segmentation points. Obtain the set of segmentation line segments according to the two principles (1) and (2) of step five, and calculate the segmentation cost of the segmentation line segments in the set; if the number of segmentation line segments is less than the required number, use all the segmentation line segments to segment the shape; otherwise, segment the shape with the segmentation line segments of minimum segmentation cost. Calculate the color feature expression image of each sub-shape part, input it into the trained convolutional neural network, and take the output of the second fully-connected layer of the convolutional neural network as the feature vector of the sub-shape part. Construct the shape directed graph of the test shape sample, calculate its adjacency matrix and feature matrix, input them into the trained graph convolutional neural network model, and judge the shape class corresponding to the maximum value in the output vector as the shape class of the test sample, thereby achieving shape classification and recognition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110418108.2A CN113191361B (en) | 2021-04-19 | 2021-04-19 | Shape recognition method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113191361A true CN113191361A (en) | 2021-07-30 |
CN113191361B CN113191361B (en) | 2023-08-01 |
Family
ID=76977535
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110418108.2A Active CN113191361B (en) | 2021-04-19 | 2021-04-19 | Shape recognition method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113191361B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104834922A (en) * | 2015-05-27 | 2015-08-12 | 电子科技大学 | Hybrid neural network-based gesture recognition method |
CN106934419A (en) * | 2017-03-09 | 2017-07-07 | 西安电子科技大学 | Classification of Polarimetric SAR Image method based on plural profile ripple convolutional neural networks |
CN108139334A (en) * | 2015-08-28 | 2018-06-08 | 株式会社佐竹 | Has the device of optical unit |
WO2020199468A1 (en) * | 2019-04-04 | 2020-10-08 | 平安科技(深圳)有限公司 | Image classification method and device, and computer readable storage medium |
CN111898621A (en) * | 2020-08-05 | 2020-11-06 | 苏州大学 | Outline shape recognition method |
CN112464942A (en) * | 2020-10-27 | 2021-03-09 | 南京理工大学 | Computer vision-based overlapped tobacco leaf intelligent grading method |
Non-Patent Citations (2)
Title |
---|
YANG Bing et al.: "3D palmprint recognition fusing local features and deep learning", Journal of Zhejiang University (Engineering Science), vol. 54, no. 03, pages 540-545 *
YANG Jianyu et al.: "Chord angle feature description for occluded shape matching", Optics and Precision Engineering, vol. 23, no. 06, pages 1758-1767 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113392819A (en) * | 2021-08-17 | 2021-09-14 | 北京航空航天大学 | Batch academic image automatic segmentation and labeling device and method |
CN113392819B (en) * | 2021-08-17 | 2022-03-08 | 北京航空航天大学 | Batch academic image automatic segmentation and labeling device and method |
CN116486265A (en) * | 2023-04-26 | 2023-07-25 | 北京卫星信息工程研究所 | Airplane fine granularity identification method based on target segmentation and graph classification |
CN116486265B (en) * | 2023-04-26 | 2023-12-19 | 北京卫星信息工程研究所 | Airplane fine granularity identification method based on target segmentation and graph classification |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107506761B (en) | Brain image segmentation method and system based on significance learning convolutional neural network | |
CN108154192B (en) | High-resolution SAR terrain classification method based on multi-scale convolution and feature fusion | |
CN110334765B (en) | Remote sensing image classification method based on attention mechanism multi-scale deep learning | |
CN106529447B (en) | Method for identifying face of thumbnail | |
US20190228268A1 (en) | Method and system for cell image segmentation using multi-stage convolutional neural networks | |
CN106951825B (en) | Face image quality evaluation system and implementation method | |
CN109063724B (en) | Enhanced generation type countermeasure network and target sample identification method | |
CN111898621B (en) | Contour shape recognition method | |
CN109784204B (en) | Method for identifying and extracting main fruit stalks of stacked cluster fruits for parallel robot | |
CN110321967B (en) | Image classification improvement method based on convolutional neural network | |
CN110032925B (en) | Gesture image segmentation and recognition method based on improved capsule network and algorithm | |
CN109033994B (en) | Facial expression recognition method based on convolutional neural network | |
JP2017157138A (en) | Image recognition device, image recognition method and program | |
CN110598692B (en) | Ellipse identification method based on deep learning | |
CN111986125A (en) | Method for multi-target task instance segmentation | |
CN110414616B (en) | Remote sensing image dictionary learning and classifying method utilizing spatial relationship | |
JP2010134957A (en) | Pattern recognition method | |
CN110223310B (en) | Line structure light center line and box edge detection method based on deep learning | |
CN113191361A (en) | Shape recognition method | |
CN113221956B (en) | Target identification method and device based on improved multi-scale depth model | |
CN112488128A (en) | Bezier curve-based detection method for any distorted image line segment | |
CN111968124B (en) | Shoulder musculoskeletal ultrasonic structure segmentation method based on semi-supervised semantic segmentation | |
Lin et al. | Determination of the varieties of rice kernels based on machine vision and deep learning technology | |
CN115170805A (en) | Image segmentation method combining super-pixel and multi-scale hierarchical feature recognition | |
CN111310820A (en) | Foundation meteorological cloud chart classification method based on cross validation depth CNN feature integration |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||