CN112465024A - Image pattern mining method based on feature clustering - Google Patents

Image pattern mining method based on feature clustering

Info

Publication number
CN112465024A
CN112465024A (application CN202011353678.XA)
Authority
CN
China
Prior art keywords
network
layer
clustering
feature
pictures
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011353678.XA
Other languages
Chinese (zh)
Inventor
梁雪峰
王倩楠
朱照延
石惠文
周颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xidian University
Original Assignee
Xidian University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xidian University filed Critical Xidian University
Priority to CN202011353678.XA priority Critical patent/CN112465024A/en
Publication of CN112465024A publication Critical patent/CN112465024A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an image pattern mining method based on feature clustering, which mainly solves the problem that the visual patterns mined by the prior art cannot be both discriminative and frequent. The scheme is: acquire pictures and divide them into a training set and a test set; train an AlexNet network with the training set; screen pictures from the trained network for mining visual representations; obtain high-level relevance features by layer-wise relevance propagation through the network; cluster the high-level relevance features to obtain frequent relevance features; and back-propagate the relevance features to obtain visual representations that are both discriminative and frequent. The method converts the pattern mining task into a classification task to improve discriminability, improves frequency by density clustering of the relevance features, and back-propagates the layer-wise relevance to the original image to locate its representative regions and obtain the visual patterns, improving both the discriminability and the frequency of the mined visual patterns. It can be used for representing image patterns in natural scenes and tourism.

Description

Image pattern mining method based on feature clustering
Technical Field
The invention belongs to the technical field of image processing, and further relates to an image pattern mining method that can be used for representing image patterns in natural scenes and tourism.
Background
Feature clustering refers to clustering the features learned by a neural network. As an unsupervised learning algorithm, clustering ensures that similar samples are close together in feature space while dissimilar samples are far apart. Clustering algorithms automatically group similar samples into one category: samples are divided into different categories according to their pairwise similarity, different similarity measures yield different clustering results, and a commonly used measure is the Euclidean distance.
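As an illustration of the Euclidean-distance similarity underlying such clustering, the following minimal numpy sketch computes a pairwise distance matrix; the sample count, dimensionality, and random data are illustrative assumptions, not values from the invention.

```python
import numpy as np

rng = np.random.default_rng(0)
features = rng.random((100, 256))              # 100 samples, 256-D feature vectors (toy data)
diff = features[:, None, :] - features[None, :, :]
distances = np.sqrt((diff ** 2).sum(axis=-1))  # (100, 100) pairwise Euclidean distance matrix
nearest = distances.argsort(axis=1)[:, 1]      # nearest neighbor of each sample (column 0 is the sample itself)
```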
Pattern mining is an important topic in data mining research and underlies many important data mining tasks such as association rules, correlation analysis, sequential patterns, causality, episodes, partial periodicity, and emerging patterns. Frequent patterns therefore have wide applications, such as shopping-basket data analysis, cross-marketing, web page prefetching, and personalized websites.
In recent years, how to mine visual patterns from large numbers of photos has become an important problem, and several works have studied it. Early work mainly used hand-crafted methods to extract image features. For example, David G. Lowe proposed the SIFT feature in the 1999 article "Object recognition from local scale-invariant features", and Carl Doersch et al. used the HOG descriptor in the 2015 article "What makes Paris look like Paris?". The local features extracted by these methods have limited ability to represent the semantic information of an image. Later, researchers used convolutional neural networks, whose features capture hierarchical and high-level semantic representations of images, and exploited these features to mine visual patterns in photographs.
Et al, in its published paper "Mining mid-level visual patterns with deep CNN activities" (2017 IJCV conference paper), propose a pattern Mining method based on convolutional neural network CNN and association rules. The method uses a convolutional neural network for feature representation and association rules for mining visual patterns. Because the discriminative information is at the position where the CNN activation value response is large, the discriminative performance of the mode is ensured by extracting the top K activation value index, and then association rule mining is carried out after the discrete indexes are converted into transactions, so that the visual mode which is frequently discriminated is obtained. The method has the disadvantages that some judgment information is lost in the mode of dividing the image into the image blocks, and the occupied memory is excessive.
Wei Zhang et al., in the paper "Binarized mode seeking for scalable visual pattern discovery" (CVPR, 2017), propose a method based on binary mode seeking. Images are fed into a VGG19 network to extract the 4096-dimensional FC7 features, the features are mapped from Euclidean space to a binary space to reduce feature storage, the images are clustered with a mean-shift algorithm to establish frequency, and a contrast set is introduced to find frequent and discriminative images. Its disadvantage is that it can only find frequent and discriminative images; it cannot localize the patterns within them.
Yang L et al, in their published paper "Learning discrete visual elements using part-based connected neural network" (2018 neuro-computing conference paper), propose to use the hierarchical abstraction principle of convolutional neural networks and maximum threshold analysis to add part-level structure in the network, where the structure is composed of conv, SPP, and Relu, and locate patterns with discriminant in the image by using unsupervised maximum threshold analysis method. The disadvantage of this method is that the frequency of the found patterns cannot be guaranteed.
Hongzhi Li et al, in their published paper "Pattern: Visual Pattern mining with deep neural network" (the 2018 ICMR conference paper), propose to use the filter of the last convolutional layer of the convolutional neural network to find the Visual pattern. The implementation scheme is as follows: firstly, the last convolutional layer of a pre-trained Alexnet network is connected with a global maximum pooling layer, 256-dimensional vectors are generated, 20 output neurons of a full connection layer are set, the previous parameters are fixed, only the full connection layer is trained, and the number of the output neurons is also the number of visual patterns; and then finding the first three convolution kernels with the largest contribution corresponding to each visual mode, and performing deconvolution on feature maps of the three convolution kernels to find the position of the visual mode corresponding to the original image. The method has the disadvantages that without theoretical basis, the found visual mode only comes from the maximum pooling layer, and the visual mode cannot be guaranteed to frequently appear in the image data set.
Disclosure of Invention
Aiming at the shortcomings of the prior art, the object of the invention is to provide an image pattern mining method based on feature clustering that finds visual patterns in tourism data that are both discriminative and frequent.
The technical idea for achieving this object is: find discriminative pictures by designing an image classification task; find visual representations with frequency by a density clustering algorithm; and locate the visual patterns in an image by layer-wise relevance propagation (LRP).
According to the technical idea, the specific implementation of the invention comprises the following steps:
(1) acquiring pictures and dividing the pictures into a training set and a test set:
(1a) acquire 20 classes of picture data, about 100,000 pictures in total;
(1b) select 1,000 pictures from each class, 20,000 pictures in total, as the test set for mining visual representations, and use the remaining pictures as the training set for training the convolutional neural network.
(2) Training a classification network AlexNet:
(2a) scaling the pictures in the training set to 227 x 227;
(2b) input the scaled pictures into an AlexNet network and train it until convergence, obtaining a fine-tuned classification network, i.e. the AlexNet model, which comprises a feature extraction part and a classification part; the feature extraction part comprises 5 convolutional layers, and the classification part comprises a first fully connected layer, a second fully connected layer, and an output layer.
(3) The screening pictures are used to mine the visual representation:
(3a) set a screening threshold T_p;
(3b) input the test set into the trained AlexNet classification model, and take the pictures whose output-layer value exceeds the threshold T_p as discriminative pictures.
(4) Acquiring related characteristics of a high layer:
(4a) using layer-wise relevance propagation, back-propagate the network output to the second fully connected layer of the AlexNet model, obtaining high-level relevance features of the discriminative pictures for feature clustering.
(5) Obtaining the related characteristics with frequency:
(5a) cluster the high-level relevance features with a density-based clustering algorithm;
(5b) select the 20 highest-density relevance feature vectors in each cluster of the clustering result, thereby obtaining high-level relevance features with frequency.
(6) Obtaining a visual representation with discriminability and frequency:
(6a) following layer-wise relevance propagation, continue back-propagating through the 5 convolutional layers of the AlexNet model until the original image at the input layer is reached;
(6b) the regions of the original image corresponding to these high-level relevance features are the visual patterns mined by the invention, which are both discriminative and frequent.
Compared with the prior art, the invention has the following advantages:
First, by using layer-wise relevance propagation, the invention can locate discriminative visual patterns in images;
second, by clustering the relevance features in feature space with density-based clustering, it finds frequent visual patterns;
third, by combining layer-wise relevance propagation with density-based relevance feature clustering, it finds visual patterns that are both discriminative and frequent, overcoming the prior-art limitation of finding only discriminative or only frequent visual patterns.
Experiments show that both the discriminability and the frequency of the visual patterns mined by the invention are higher than those of other state-of-the-art methods.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
Fig. 2 shows ten visual representations mined from five classes of pictures using the present invention;
Fig. 3 shows visual representations mined from one class of pictures using the present invention and four other state-of-the-art methods.
Detailed Description
The embodiment and effects of the present invention will be further described with reference to fig. 1.
Referring to fig. 1, the specific steps of this embodiment are as follows.
Step 1, obtaining pictures and dividing the pictures into a training set and a testing set.
1.1) acquire 20 classes of picture data from the TripAdvisor website, about 100,000 pictures in total;
1.2) select 1,000 pictures from each class, 20,000 pictures in total, as the test set for mining visual representations, and use the remaining pictures as the training set for training the convolutional neural network.
Step 2, training the classification network AlexNet.
2.1) scaling the pictures in the training set to 227 x 227;
2.2) input the scaled pictures into the AlexNet network for iterative training:
2.2.1) choose the cross-entropy loss as the loss function, choose Adam as the optimizer, and set the number of neurons in the network output layer to 20;
2.2.2) initialize the AlexNet network parameters, setting the initial iteration count K = 1 and the learning rate L = 1e-3;
2.2.3) compute the network loss L_CE using the cross-entropy loss function:

$$L_{CE} = -\frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{m} y_{ji}\,\log \hat{y}_{ji}$$

where m is the number of classes, n is the number of images per class, y_ji is the image label, and ŷ_ji is the output of the network;
2.2.4) judge whether the loss L_CE is still decreasing: if so, add 1 to K and return to 2.2.3); otherwise, when the loss begins to oscillate and no longer decreases, stop training and save the AlexNet network model at that point.
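A minimal PyTorch sketch of this training step is given below, assuming torchvision's AlexNet with its output layer sized to 20 classes, Adam with learning rate 1e-3, and cross-entropy loss as stated above. The FakeData stand-in for the TripAdvisor training set, the batch size, the iteration cap, and the simple loss-based stopping test are illustrative assumptions, not the invention's exact settings.

```python
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# FakeData is a stand-in for the 80,000-picture training set (assumption).
train_set = datasets.FakeData(size=64, image_size=(3, 227, 227), num_classes=20,
                              transform=transforms.ToTensor())
train_loader = torch.utils.data.DataLoader(train_set, batch_size=16, shuffle=True)

model = models.alexnet(num_classes=20).to(device)   # output layer sized to 20 classes
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()                   # the L_CE of step 2.2.3

prev_loss = float("inf")
for k in range(100):                                # iteration counter K (cap is an assumption)
    epoch_loss = 0.0
    for images, labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(model(images.to(device)), labels.to(device))
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
    if epoch_loss >= prev_loss:                     # loss no longer decreasing: stop (simplified test)
        break
    prev_loss = epoch_loss

torch.save(model.state_dict(), "alexnet_finetuned.pth")
```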
Step 3, screening pictures for mining visual representations.
3.1) set the screening threshold T_p;
3.2) input the test set into the trained AlexNet classification model, and take the pictures whose output-layer value exceeds the threshold T_p as discriminative pictures.
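A hedged sketch of this screening step, continuing the training sketch above (it reuses model and device from there): pictures whose maximum softmax output exceeds T_p are kept as discriminative. The value T_p = 0.9 and the FakeData test loader are illustrative assumptions; the patent does not fix T_p numerically.

```python
import torch
import torch.nn.functional as F
from torchvision import datasets, transforms

# FakeData is a stand-in for the 20,000-picture test set (assumption).
test_set = datasets.FakeData(size=64, image_size=(3, 227, 227), num_classes=20,
                             transform=transforms.ToTensor())
test_loader = torch.utils.data.DataLoader(test_set, batch_size=16)

T_p = 0.9                                        # illustrative threshold (assumption)
model.eval()
kept = []
with torch.no_grad():
    for images, _ in test_loader:
        probs = F.softmax(model(images.to(device)), dim=1)
        confidence = probs.max(dim=1).values     # output-layer response per picture
        kept.extend(img for img, c in zip(images, confidence) if c.item() > T_p)
```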
Step 4, acquiring high-level relevance features.
Using layer-wise relevance propagation, the test-set pictures whose outputs exceed the threshold T_p continue to propagate backwards through the model:
The i-th neuron of layer l in the network maps a set of inputs x_i to an output x_j through the weights w_ij and an activation function h(·):

$$x_j = h\Big(\sum_i x_i w_{ij} + b_j\Big)$$

and the relevance R_i of an input neuron x_i is computed from all the relevances R_j of the outputs x_j; back-propagation proceeds according to the following formula:

$$R_i^{(l)} = \sum_j \frac{x_i^{(l)}\, w_{ij}^{(l,l+1)}}{\sum_k x_k^{(l)}\, w_{kj}^{(l,l+1)}}\, R_j^{(l+1)}$$

where R_i^(l) is the relevance of the i-th neuron in layer l, R_j^(l+1) is the relevance of the j-th neuron in layer l+1, x_k^(l) is the activation of the k-th neuron in layer l, w_kj^(l,l+1) is the weight between the j-th neuron in layer l+1 and the k-th neuron in layer l, and w_ij^(l,l+1) is the weight between the j-th neuron in layer l+1 and the i-th neuron in layer l.
Following this back-propagation rule, the test-set pictures whose network output exceeds the threshold T_p are back-propagated to the second fully connected layer of the AlexNet model, and the relevance features of that layer are saved for density clustering.
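The back-propagation rule above can be sketched for a single fully connected layer as follows; the numpy implementation, the epsilon stabiliser, and the toy shapes are assumptions added for illustration and numerical safety.

```python
import numpy as np

def lrp_linear(x, w, relevance_upper, eps=1e-9):
    """Redistribute layer-(l+1) relevances R_j onto layer-l inputs x_i.

    x: (d_in,) activations of layer l
    w: (d_in, d_out) weights from layer l to layer l+1
    relevance_upper: (d_out,) relevances R_j of layer l+1
    """
    z = x[:, None] * w                           # contributions x_i * w_ij
    s = z.sum(axis=0)                            # denominators sum_k x_k * w_kj
    s = s + eps * np.where(s >= 0, 1.0, -1.0)    # stabiliser to avoid division by zero (assumption)
    return (z / s * relevance_upper[None, :]).sum(axis=1)

rng = np.random.default_rng(0)
x = rng.random(4096)                             # e.g. second-fully-connected-layer activations (toy)
w = rng.standard_normal((4096, 20))
R_out = np.zeros(20)
R_out[3] = 1.0                                   # start relevance at one output class
R_in = lrp_linear(x, w, R_out)                   # relevances of the 4096 input neurons
```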
Step 5, obtaining relevance features with frequency.
5.1) clustering the high-level related features by using a density-based clustering algorithm:
5.1.1) set the parameter radius r = 0.35 and the minimum number of points m = 20;
5.1.2) in the relevance feature space, mark each relevance feature vector:
if the number of points within the radius of a feature point is greater than the minimum number of points m, the feature point is marked as a core point;
if a feature point contains fewer than the minimum number of points within its radius but contains a core point, it is marked as a boundary point;
if a feature point contains fewer than the minimum number of points within its radius and contains no core point, it is marked as a noise point.
5.1.3) connect the core points with the boundary points belonging to them to form clusters; each cluster is one class of mined visual patterns;
5.2) select the 20 highest-density relevance feature vectors in each cluster of the clustering result, thereby obtaining high-level relevance features with frequency.
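A sketch of this clustering step using scikit-learn's DBSCAN, whose eps and min_samples match the stated radius r = 0.35 and minimum point count m = 20; the synthetic feature matrix and the use of the within-radius neighbor count as the density measure are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.neighbors import NearestNeighbors

rng = np.random.default_rng(0)
features = rng.random((2000, 64))                # toy stand-in for the relevance features

labels = DBSCAN(eps=0.35, min_samples=20).fit_predict(features)

# Density of each point: number of neighbors within radius r (assumed density measure).
nn = NearestNeighbors(radius=0.35).fit(features)
neighbor_lists = nn.radius_neighbors(features, return_distance=False)
density = np.array([len(n) for n in neighbor_lists])

top20 = {}
for c in set(labels) - {-1}:                     # label -1 marks noise points
    idx = np.where(labels == c)[0]
    top20[c] = idx[np.argsort(-density[idx])[:20]]   # 20 densest members per cluster
```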
Step 6, obtaining visual representations that are both discriminative and frequent.
Following layer-wise relevance propagation, continue back-propagating through the 5 convolutional layers of the AlexNet model until the original image at the input layer is reached;
the regions of the original image corresponding to these high-level relevance features are the mined visual patterns, which are both discriminative and frequent.
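As an illustration of this final localization, the sketch below thresholds an input-layer relevance map and returns the bounding box of the high-relevance region; the percentile threshold and the bounding-box heuristic are assumptions, since the patent does not specify how relevance values are converted into regions.

```python
import numpy as np

def locate_pattern(relevance_map, percentile=90):
    """Bounding box (y0, y1, x0, x1) of the high-relevance region of an LRP map."""
    mask = relevance_map > np.percentile(relevance_map, percentile)
    ys, xs = np.nonzero(mask)
    return ys.min(), ys.max(), xs.min(), xs.max()

heatmap = np.random.default_rng(0).random((227, 227))  # toy stand-in for input-layer relevance
print(locate_pattern(heatmap))
```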
The effect of the present invention is further illustrated by the following simulations:
the four existing methods used in the simulation experiment are:
Class 1: the pattern mining method proposed by Yao Li et al. in "Mining mid-level visual patterns with deep CNN activations", IJCV, vol. 121, no. 3, pp. 344-364, 2017, abbreviated MDPM.
Class 2: the pattern mining method proposed by Wei Zhang et al. in "Binarized mode seeking for scalable visual pattern discovery", CVPR, 2017, pp. 3864-3872, abbreviated CBMS.
Class 3: the pattern mining method proposed by Lingxiao Yang et al. in "Learning discriminative visual elements using part-based convolutional neural network", Neurocomputing, vol. 316, pp. 135-143, 2018, abbreviated P-CNN.
Class 4: the pattern mining method proposed by Hongzhi Li et al. in "PatternNet: Visual pattern mining with deep neural network", ICMR, 2018, pp. 291-299, abbreviated PatternNet.
1. Simulation experiment conditions are as follows:
the hardware platform of the simulation experiment of the invention is as follows: the Dall computer has the CPU model of Intel (R) E5-2603, the frequency of 1.60GHz, the GPU model of GeForce GTX 2080 and the video memory 11G.
The software platform of the simulation experiment of the invention is as follows: ubuntu 18.0 system, Python 3.6, pyrroch 1.2.0.
The simulation experiment of the invention uses 20 types of picture data of input images with more than 10 ten thousand pictures, wherein each type of data exceeds 3500 pictures, and the pictures are divided into two parts for experiment: firstly, 20000 pictures in the test set: each type has 1000 pictures for mining visual representation; and secondly, training the convolutional neural network model by using 8 ten thousand pictures in the training set.
2. Simulation content and analysis of results:
Simulation experiment 1: pattern mining is performed on five classes of picture data with the DRFC method of the invention under the above simulation conditions; the results are shown in Fig. 2, where:
Fig. 2(a) is from The Little Mermaid class, Fig. 2(b) from the Santa Justa Lift class, Fig. 2(c) from the Lisbon District Central Port class, Fig. 2(d) from the Merlion Park Singapore class, and Fig. 2(e) from the Kiyomizu-dera Temple class.
As can be seen from Fig. 2, the visual representations mined by the invention contain representative targets and represent these different categories of data well. Notably, more than one visual representation may be mined per category. Representative representations of a statue, a tram, and a pagoda are shown in Figs. 2(a), 2(c), and 2(e) respectively, while Figs. 2(b) and 2(d) each contain two types of visual representations for one sight, marked by red boxes (black boxes in the corresponding grayscale figure) and green boxes (gray boxes), respectively: Fig. 2(b) shows two different viewing angles, looking up at the tower and looking down from its top, and Fig. 2(d) shows the statue by day and by night.
Simulation experiment 2: the present invention and the four existing methods MDPM, CBMS, P-CNN, and PatternNet are used to perform pattern mining on one class of picture data under the above simulation conditions; the results are shown in Fig. 3.
Fig. 3(a) shows the visual pattern mining experiment of the prior-art MDPM method on one class of picture data under the above simulation conditions.
Fig. 3(b) shows the same experiment with the prior-art CBMS method.
Fig. 3(c) shows the same experiment with the prior-art P-CNN method.
Fig. 3(d) shows the same experiment with the prior-art PatternNet method.
Fig. 3(e) shows the same experiment with the DRFC method of the present invention.
Comparing the panels of Fig. 3, the mining result of the MDPM method is the worst (marked with a blue box; a gray box in the corresponding grayscale figure), because mining visual patterns with image blocks can lose parts of the representative objects. CBMS only finds frequent images, not the visual patterns within them (marked with a yellow box; a white box in the grayscale figure). P-CNN and PatternNet can find visual patterns, but some of the mined patterns contain no target or an incomplete target (marked with red boxes; black boxes in the grayscale figure). In contrast, the invention finds consistent instances of the visual representation.
Simulation experiment 3: the discriminability of the mined visual patterns is evaluated with the present invention and the four existing methods MDPM, CBMS, P-CNN, and PatternNet under the above simulation conditions. All results are listed in Table 1:

Table 1. Comparison of pattern classification accuracy between the invention and the prior-art mining methods in the simulation experiments

Method         MDPM    CBMS    P-CNN   PatternNet   Invention
Accuracy (%)   84.08   94.75   96.75   90.00        99.54
As can be seen from Table 1, the average accuracy of the method is 99.54%, higher than all four prior-art methods, showing that the visual patterns obtained by the invention have high discriminability. MDPM has the lowest accuracy because it samples the image into image blocks, which loses discriminative information.
Simulation experiment 4: the frequency of the mined visual patterns is evaluated with the present invention and the four existing methods MDPM, CBMS, P-CNN, and PatternNet under the above simulation conditions:
The frequency rate FR is calculated with the following formula, and all results are listed in Table 2:

$$FR = \frac{1}{N_w}\sum_{w=1}^{N_w} \frac{1}{N_u N}\sum_{u=1}^{N_u}\sum_{i=1}^{N} \mathbb{1}\big[\cos(f_i^w, g_u^w) > T_f\big]$$

where cos(·,·) is the cosine similarity, f_i^w and g_u^w are both taken from the feature maps of the last convolutional layer of the network, f_i^w is the feature map of one of the photos from the w-th attraction, g_u^w is the feature map of a visual representation mined from the w-th attraction, N_w, N_u, and N are respectively the number of sights, the number of visual representations, and the number of photos per sight, and T_f is a similarity threshold.
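A numpy sketch of this frequency-rate computation, under the reconstruction above: for each sight, the fraction of photo/representation feature pairs whose cosine similarity exceeds T_f is computed, then averaged over sights. The shapes, random features, and T_f = 0.5 are illustrative assumptions.

```python
import numpy as np

def frequency_rate(photo_feats, pattern_feats, T_f=0.5):
    """photo_feats: per-sight list of (N, d) photo features;
    pattern_feats: per-sight list of (N_u, d) mined-representation features."""
    per_sight = []
    for F, G in zip(photo_feats, pattern_feats):
        Fn = F / np.linalg.norm(F, axis=1, keepdims=True)
        Gn = G / np.linalg.norm(G, axis=1, keepdims=True)
        sim = Fn @ Gn.T                          # pairwise cosine similarities
        per_sight.append((sim > T_f).mean())     # fraction of pairs above T_f
    return float(np.mean(per_sight))

rng = np.random.default_rng(0)
photos = [rng.random((100, 256)) for _ in range(5)]    # 5 sights, 100 photos each (toy)
patterns = [rng.random((20, 256)) for _ in range(5)]   # 20 representations per sight (toy)
print(frequency_rate(photos, patterns))
```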
Table 2. Comparison of the frequency rate (FR) of patterns mined by the different methods at different cosine similarity thresholds T_f

(The values of Table 2 appear only as an image in the source publication and are not reproduced here.)
As can be seen from Table 2, the frequency rate FR of the visual representations mined by the invention is the highest at all thresholds T_f. Although MDPM uses a frequent pattern mining algorithm, its result is the worst; CBMS uses a mean-shift algorithm to find frequent images, and its result is lower than those of P-CNN and the invention; PatternNet and P-CNN focus on mining discriminative patterns, which are also relatively frequent.
The above simulation experiments show that the DRFC method proposed by the invention, based on feature clustering, solves the problem of mining visual patterns from massive photos. Compared with existing work that mines patterns that are only frequent or only discriminative, the visual patterns mined by the method are both discriminative and frequent. The classification and frequency-rate experiments also show that the method achieves higher accuracy and frequency rate than the four other state-of-the-art methods, demonstrating its effectiveness for visual pattern mining tasks.

Claims (4)

1. An image pattern mining method based on feature clustering is characterized by comprising the following steps:
(1) acquire 20 classes of picture data, about 100,000 pictures in total; select 1,000 pictures from each class, 20,000 pictures in total, as the test set for mining visual representations, and use the remaining pictures as the training set for training the convolutional neural network;
(2) scale the pictures in the training set to 227 × 227, input the scaled pictures into an AlexNet network, and train it until convergence, obtaining a fine-tuned classification network, i.e. the AlexNet model, which comprises a feature extraction part and a classification part; the feature extraction part comprises 5 convolutional layers, and the classification part comprises a first fully connected layer, a second fully connected layer, and an output layer;
(3) set a screening threshold T_p, input the test set into the AlexNet classification model, and take the pictures whose output-layer value exceeds the threshold T_p as discriminative pictures;
(4) using layer-wise relevance propagation, continue back-propagating through the model the test-set pictures from (3) whose outputs exceed the threshold T_p, until the second fully connected layer of the AlexNet model is reached, obtaining high-level relevance features of the discriminative pictures for feature clustering;
(5) perform feature clustering on the high-level relevance features obtained in (4) with a density-based clustering algorithm, and select the 20 highest-density relevance feature vectors in each cluster of the clustering result, obtaining high-level relevance features with frequency;
(6) following layer-wise relevance propagation, continue back-propagating the frequent high-level relevance features obtained in (5) through the 5 convolutional layers of the AlexNet model until the original image at the input layer is reached; the regions of the original image corresponding to these high-level relevance features are the mined visual patterns, which are both discriminative and frequent.
2. The method of claim 1, wherein the scaled pictures in (2) are input into the AlexNet network to train the network, implemented as follows:
(2a) initialize the AlexNet network parameters, setting the initial iteration count K = 1 and the learning rate L = 1e-3;
(2b) compute the network loss L_CE using the cross-entropy loss function:

$$L_{CE} = -\frac{1}{n}\sum_{j=1}^{n}\sum_{i=1}^{m} y_{ji}\,\log \hat{y}_{ji}$$

where m is the number of classes, n is the number of images per class, y_ji is the image label, and ŷ_ji is the output of the network;
(2c) judge whether the loss L_CE is still decreasing: if so, add 1 to K and return to step (2b); otherwise, when the loss begins to oscillate and no longer decreases, stop training and save the AlexNet network model at that point.
3. The method of claim 1, wherein the layer-wise relevance propagation in (4) continues to back-propagate through the model the test-set pictures from (3) whose outputs exceed the threshold T_p, implemented as follows:
(4a) the i-th neuron of layer l in the network maps a set of inputs x_i to an output x_j through the weights w_ij and an activation function h(·):

$$x_j = h\Big(\sum_i x_i w_{ij} + b_j\Big)$$

and the relevance R_i of an input neuron x_i is computed from all the relevances R_j of the outputs x_j; back-propagation proceeds according to the following formula:

$$R_i^{(l)} = \sum_j \frac{x_i^{(l)}\, w_{ij}^{(l,l+1)}}{\sum_k x_k^{(l)}\, w_{kj}^{(l,l+1)}}\, R_j^{(l+1)}$$

where R_i^(l) is the relevance of the i-th neuron in layer l, R_j^(l+1) is the relevance of the j-th neuron in layer l+1, x_k^(l) is the activation of the k-th neuron in layer l, w_kj^(l,l+1) is the weight between the j-th neuron in layer l+1 and the k-th neuron in layer l, and w_ij^(l,l+1) is the weight between the j-th neuron in layer l+1 and the i-th neuron in layer l;
(4b) following this back-propagation rule, back-propagate the test-set pictures whose network output exceeds the threshold T_p to the second fully connected layer of the AlexNet model, and save the relevance features of that layer for density clustering.
4. The method of claim 1, wherein the density-based clustering of the relevance features in (5) is implemented as follows:
(5a) set the parameter radius r = 0.35 and the minimum number of points m = 20;
(5b) in the relevance feature space, mark each relevance feature vector: if the number of points within the radius of a feature point is greater than the minimum number of points m, the feature point is marked as a core point; if a feature point contains fewer than the minimum number of points within its radius but contains a core point, it is marked as a boundary point; if a feature point contains fewer than the minimum number of points within its radius and contains no core point, it is marked as a noise point;
(5c) connect the core points with the boundary points belonging to them to form clusters; each cluster is one class of mined visual patterns.
CN202011353678.XA 2020-11-26 2020-11-26 Image pattern mining method based on feature clustering Pending CN112465024A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011353678.XA CN112465024A (en) 2020-11-26 2020-11-26 Image pattern mining method based on feature clustering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011353678.XA CN112465024A (en) 2020-11-26 2020-11-26 Image pattern mining method based on feature clustering

Publications (1)

Publication Number Publication Date
CN112465024A true CN112465024A (en) 2021-03-09

Family

ID=74808918

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011353678.XA Pending CN112465024A (en) 2020-11-26 2020-11-26 Image pattern mining method based on feature clustering

Country Status (1)

Country Link
CN (1) CN112465024A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112950601A (en) * 2021-03-11 2021-06-11 成都微识医疗设备有限公司 Method, system and storage medium for screening pictures for esophageal cancer model training

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034224A (en) * 2018-07-16 2018-12-18 西安电子科技大学 Hyperspectral classification method based on double branching networks
CN110210555A (en) * 2019-05-29 2019-09-06 西南交通大学 Rail fish scale hurt detection method based on deep learning
CN110689081A (en) * 2019-09-30 2020-01-14 中国科学院大学 Weak supervision target classification and positioning method based on bifurcation learning

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109034224A (en) * 2018-07-16 2018-12-18 西安电子科技大学 Hyperspectral classification method based on double branching networks
CN110210555A (en) * 2019-05-29 2019-09-06 西南交通大学 Rail fish scale hurt detection method based on deep learning
CN110689081A (en) * 2019-09-30 2020-01-14 中国科学院大学 Weak supervision target classification and positioning method based on bifurcation learning

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ALEX KRIZHEVSKY et al.: "ImageNet Classification with Deep Convolutional Neural Networks", Communications of the ACM *
ALEXANDER BINDER et al.: "Layer-wise Relevance Propagation for Neural Networks with Local Renormalization Layers", arXiv:1604.00825v1 *
QIANNAN WANG et al.: "Deep Relevance Feature Clustering for Discovering Visual Representation of Tourism Destination", PRCV 2020 *
FU JIAYUN et al.: "Review and progress of the spatial density clustering pattern mining method DBSCAN", Science of Surveying and Mapping (测绘科学) *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112950601A (en) * 2021-03-11 2021-06-11 成都微识医疗设备有限公司 Method, system and storage medium for screening pictures for esophageal cancer model training
CN112950601B (en) * 2021-03-11 2024-01-09 成都微识医疗设备有限公司 Picture screening method, system and storage medium for esophageal cancer model training

Similar Documents

Publication Publication Date Title
CN107679250B (en) Multi-task layered image retrieval method based on deep self-coding convolutional neural network
CN106126581B (en) Cartographical sketching image search method based on deep learning
Yu et al. Exploiting the complementary strengths of multi-layer CNN features for image retrieval
CN104036012B (en) Dictionary learning, vision bag of words feature extracting method and searching system
Kadam et al. Detection and localization of multiple image splicing using MobileNet V1
CN105528575B (en) Sky detection method based on Context Reasoning
CN111680176A (en) Remote sensing image retrieval method and system based on attention and bidirectional feature fusion
CN108897791B (en) Image retrieval method based on depth convolution characteristics and semantic similarity measurement
Zheng et al. Differential Learning: A Powerful Tool for Interactive Content-Based Image Retrieval.
CN109492589A (en) The recognition of face working method and intelligent chip merged by binary features with joint stepped construction
Liu et al. Texture classification in extreme scale variations using GANet
Taheri et al. Effective features in content-based image retrieval from a combination of low-level features and deep Boltzmann machine
Zhao et al. An angle structure descriptor for image retrieval
Dong et al. Multilayer convolutional feature aggregation algorithm for image retrieval
Naiemi et al. Scene text detection using enhanced extremal region and convolutional neural network
Guo Research on sports video retrieval algorithm based on semantic feature extraction
CN112465024A (en) Image pattern mining method based on feature clustering
Maheswari et al. Facial expression analysis using local directional stigma mean patterns and convolutional neural networks
CN110674334B (en) Near-repetitive image retrieval method based on consistency region deep learning features
Chen et al. Large-scale indoor/outdoor image classification via expert decision fusion (edf)
JP4302799B2 (en) Document search apparatus, method, and recording medium
John et al. A multi-modal cbir framework with image segregation using autoencoders and deep learning-based pseudo-labeling
Patil et al. Improving the efficiency of image and video forgery detection using hybrid convolutional neural networks
CN116108217A (en) Fee evasion vehicle similar picture retrieval method based on depth hash coding and multitask prediction
Huang et al. Automatic image annotation using multi-object identification

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (Application publication date: 20210309)