CN115439718A - Industrial detection method, system and storage medium combining supervised learning and feature matching technology - Google Patents
- Publication number
- CN115439718A (application number CN202211244936.XA)
- Authority
- CN
- China
- Prior art keywords
- picture
- training
- network
- patches
- positive sample
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/82—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
Abstract
The invention discloses an industrial detection method, system and storage medium combining supervised learning and feature matching techniques. The method comprises: selecting positive sample pictures and constructing a patchcore-plus model; inputting the selected positive sample pictures into a backbone network as training pictures, extracting their features, and forming fragmented patches; screening similar patches and storing the screened patches in a memory bank; selecting a test picture, inputting it into the backbone network, and extracting its features to form a patch at each feature point; inputting the patches formed from the test picture together with the patches in the memory bank into a supervised network, and outputting a segmentation image; and performing industrial defect analysis on the segmentation image. The method produces a clearly visible segmentation map during image segmentation, can segment different defect types, and shortens the time required to segment the image.
Description
Technical Field
The invention relates to the field of machine vision, in particular to an industrial detection method, system and storage medium combining supervised learning and feature matching technology.
Background
PatchCore is a feature matching technique applied to industrial detection. Unlike traditional learning algorithms, which require a large number of defect samples to train a model, it needs only positive sample pictures in the training phase, which greatly alleviates the long-tail distribution problem of industrial detection.
However, it has disadvantages: because it relies on simple feature matching, it produces only a rough heat map during detection, so the detected region is segmented inaccurately; it cannot effectively classify defect types and can only determine whether a defect exists; and its image segmentation time is too long. As a result, it cannot meet the practical requirements of industrial detection.
Disclosure of Invention
The invention aims to provide an industrial detection method combining supervised learning and feature matching technology, which comprises the following steps:
s1, selecting a positive sample picture, and constructing a patchcore-plus model;
s2, inputting the selected positive sample picture as a training picture into a backbone network, extracting the characteristics of the positive sample picture, and forming a fragmented patch;
s3, screening similar patches, and storing the screened patches into a memory bank;
s4, selecting a test picture, inputting the test picture into a backbone network, and performing feature extraction on the test picture to form a patch on each feature point;
s5, simultaneously inputting patches formed by the test pictures and patches in the memory bank into a supervised network, and outputting a segmentation image;
and S6, performing semantic segmentation on the segmented image, and forming a corresponding detection frame based on a result of the semantic segmentation.
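Steps S1 to S6 can be illustrated with a toy numpy sketch. All shapes and thresholds below are invented for illustration, and the supervised segmentation network of S5 and S6 is replaced by a simple nearest-neighbour distance threshold, not the trained network the patent describes:

```python
import numpy as np

def to_patches(feature_map):
    """Flatten an (H, W, C) feature map into (H*W, C) patch vectors."""
    h, w, c = feature_map.shape
    return feature_map.reshape(h * w, c)

rng = np.random.default_rng(0)

# S1-S3: features of positive-sample pictures form the memory bank of patches
positive_maps = [rng.normal(size=(8, 8, 16)) for _ in range(4)]
memory_bank = np.concatenate([to_patches(m) for m in positive_maps])  # (256, 16)

# S4: extract one patch per feature point of a test picture
test_patches = to_patches(rng.normal(size=(8, 8, 16)))  # (64, 16)

# S5 (stand-in): score each test patch by its nearest-neighbour L2 distance
dists = np.linalg.norm(test_patches[:, None, :] - memory_bank[None, :, :], axis=-1)
anomaly_score = dists.min(axis=1).reshape(8, 8)  # per-location anomaly map

# S6 (stand-in): threshold the map to obtain a binary segmentation
segmentation = anomaly_score > anomaly_score.mean() + 2 * anomaly_score.std()
print(segmentation.shape)  # prints (8, 8)
```

In the actual method, the thresholding step is replaced by the supervised segmentation network, which receives the test patches together with their memory-bank neighbours.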
Preferably, the patchcore-plus model comprises a backbone network, a memory bank and a supervision module;
the backbone network is used for extracting picture features;
the memory bank is used for storing positive sample picture characteristics;
the supervision module is a segmentation network module which is used for segmenting the picture.
Further, the selected positive sample picture is used as a training picture and input to a backbone network for positive sample training; the training stage comprises positive sample training and segmentation network training;
wherein, the positive sample training process is as follows:
inputting the selected positive sample picture into a pre-trained backbone neural network, wherein the backbone neural network is a ResNet feature extraction network pre-trained by using an ImageNet data set and is used for extracting features in the positive sample;
the ResNet network will produce several layers of feature maps of different sizes,
and selecting characteristic graphs of the middle two layers of all good pictures, and storing the characteristic graphs into a memory bank.
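The selection of the two middle layers can be sketched as follows. The stage shapes are invented stand-ins for ResNet outputs, and the smaller map is upsampled by nearest-neighbour repetition so the two layers can be concatenated channel-wise; the patent does not specify the alignment method, so that part is an assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
# Stand-ins for four ResNet stage outputs (H, W, C); shapes are invented
stages = [rng.normal(size=(32, 32, 64)), rng.normal(size=(16, 16, 128)),
          rng.normal(size=(8, 8, 256)), rng.normal(size=(4, 4, 512))]

mid2, mid3 = stages[1], stages[2]  # the two middle layers

# Upsample layer 3 to layer 2's spatial size by nearest-neighbour repetition
scale = mid2.shape[0] // mid3.shape[0]
mid3_up = mid3.repeat(scale, axis=0).repeat(scale, axis=1)  # (16, 16, 256)

# Concatenate channel-wise and flatten into patch vectors for the memory bank
combined = np.concatenate([mid2, mid3_up], axis=-1)  # (16, 16, 384)
patches = combined.reshape(-1, combined.shape[-1])   # (256, 384)
```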
Preferably, the segmentation network training process is as follows:
randomly selecting a plurality of pictures from a data set as a training set, wherein the training set comprises positive samples and negative samples;
inputting the training pictures into a backbone network to extract training characteristics to form a characteristic diagram;
for each feature point of the training image, finding the K patches in the memory bank with the smallest L2 distance to that feature point;
and simultaneously inputting the K patches and the corresponding feature points into a segmentation network to obtain a predicted segmentation picture.
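The k-nearest-patch lookup by L2 distance can be sketched as below; `k_nearest_patches` and all shapes are illustrative names, not from the patent:

```python
import numpy as np

def k_nearest_patches(feature, memory_bank, k=3):
    """Return the k memory-bank patches with the smallest L2 distance to `feature`."""
    dists = np.linalg.norm(memory_bank - feature, axis=1)
    idx = np.argsort(dists)[:k]
    return memory_bank[idx], dists[idx]

rng = np.random.default_rng(2)
memory_bank = rng.normal(size=(100, 16))
feature = rng.normal(size=16)

neighbours, d = k_nearest_patches(feature, memory_bank, k=3)
# The segmentation network would receive the feature point and its k neighbours together
net_input = np.concatenate([feature[None, :], neighbours])  # (4, 16)
```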
Preferably, in step S4, after the test picture is input into the backbone network, the backbone network extracts features to form corresponding patches; then, for each feature patch, the K memory-bank patches with the smallest L2 distances are selected, and the feature patches together with their selected neighbours are input into the trained model to obtain the predicted segmentation image.
The invention also claims an industrial detection system combining supervised learning and feature matching technologies, which comprises a memory, a processor and a computer program stored in the memory and operable on the processor, wherein the processor executes the computer program and executes the instructions of the industrial detection method combining supervised learning and feature matching technologies.
The invention also claims a computer-readable storage medium storing a computer program which, when executed by a processor, executes instructions of the above-described industrial detection method in combination with supervised learning and feature matching techniques.
Due to the application of the technical scheme, compared with the prior art, the invention has the following advantages:
according to the method and the device, the clearly visible segmentation graph can be formed when the patchcore model is used for image segmentation, and in addition, different defect types can be segmented, so that the time for segmenting the image is shortened.
Drawings
Fig. 1 is an industrial detection method combining supervised learning and feature matching techniques according to the present invention.
Detailed Description
In order to make those skilled in the art better understand the technical solutions in the present specification, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only a part of the embodiments of the present specification, and not all of the embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present specification without making any creative effort shall fall within the protection scope of the present specification.
Example 1
As shown in fig. 1, the present invention discloses an industrial detection method combining supervised learning and feature matching technology, comprising the following steps,
s1, selecting a positive sample picture, and constructing a patchcore-plus model;
further, selecting the positive sample picture includes: firstly, selecting pictures with the most forms and background semantics. That is, it is desirable to avoid excessively similar pictures from being simultaneously selected into the training set. Although in an industrial inspection scene, due to the fixation of the lens and the lighting mode, the pictures are very similar. There are still subtle differences between different parts, such as small burrs, different textures, etc. Our model can capture these features very sensitively, so we need to choose patches containing as much semantics as possible.
The constructed patchcore-plus model comprises a backbone network responsible for extracting features, a memory bank responsible for storing positive sample features, and a supervision module responsible for segmentation.
S2, inputting the selected positive sample picture as a training picture into a backbone network, extracting the characteristics of the positive sample picture, and forming a fragmented patch;
s3, screening similar patches, and storing the screened patches into a memory bank;
s4, selecting a test picture, inputting the test picture into a backbone network, and performing feature extraction on the test picture to form a patch on each feature point;
s5, simultaneously inputting the patch formed by the test picture and the patch in the memory bank into a supervision network, and outputting a segmentation image;
and S6, performing semantic segmentation on the segmented image, and forming a corresponding detection frame based on a result of the semantic segmentation.
Preferably, the patchcore-plus model comprises a backbone network, a memory bank and a supervision module;
the backbone network is used for extracting picture characteristics;
the memory bank is used for storing the positive sample picture characteristics;
the supervision module is a segmentation network module which is used for segmenting the picture.
Further, the supervision module (namely the segmentation network module) is a supervised learning network for classifying each element on the picture according to different semantics.
Further, the number of patches in the memory bank is reduced. Different positive sample pictures contain the same features, but a given feature should not be stored in the memory bank multiple times. A K-means clustering algorithm is therefore adopted to reduce the size of the memory bank (this step also exists in the original patchcore model): all patches in the memory bank are clustered, and only the cluster centre of each class is retained, reducing the number of patches in the memory bank.
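The memory-bank reduction can be sketched with a minimal k-means implementation; the cluster count and shapes are invented for illustration:

```python
import numpy as np

def kmeans_reduce(patches, n_centres=8, iters=20, seed=0):
    """Reduce a patch bank to its k-means cluster centres (a minimal k-means)."""
    rng = np.random.default_rng(seed)
    centres = patches[rng.choice(len(patches), n_centres, replace=False)]
    for _ in range(iters):
        # Assign every patch to its nearest centre by L2 distance
        d = np.linalg.norm(patches[:, None, :] - centres[None, :, :], axis=-1)
        labels = d.argmin(axis=1)
        # Move each centre to the mean of its assigned patches
        for c in range(n_centres):
            if np.any(labels == c):
                centres[c] = patches[labels == c].mean(axis=0)
    return centres

rng = np.random.default_rng(3)
bank = rng.normal(size=(500, 16))
reduced_bank = kmeans_reduce(bank, n_centres=8)  # memory bank shrinks 500 -> 8
```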
Further, the selected positive sample picture is used as a training picture and input to the backbone network for positive sample training; the training stage comprises positive sample training and segmentation network training;
wherein, the positive sample training process is as follows:
inputting the selected positive sample picture into a pre-trained backbone neural network, wherein the backbone neural network is a ResNet feature extraction network pre-trained by using an ImageNet data set and is used for extracting features in the positive sample;
the ResNet network can generate a plurality of layers of feature maps with different sizes, the feature maps are a series of results formed after the training network is input into the backbone network, and the two middle feature maps in the series of feature maps are selected as the feature maps of the training pictures;
and selecting characteristic graphs of the middle two layers of all good pictures, and storing the characteristic graphs into a memory bank.
Preferably, the segmentation network training process is as follows:
randomly selecting a plurality of pictures from a data set as a training set, wherein the training set comprises positive samples and negative samples;
inputting the training pictures into a backbone network to extract training characteristics to form a characteristic diagram;
for each feature point of the training image, finding out K patches which are most similar to the corresponding L2 distance in the memory bank;
and simultaneously inputting the K patches and the corresponding feature points into a segmentation network to obtain a predicted segmentation picture.
Further, the training phase is divided into two stages. The first is the positive sample training stage: the selected positive sample pictures are input into a pre-trained backbone neural network, a ResNet feature extraction network pre-trained on the ImageNet data set, whose purpose is to extract features from the positive samples. The ResNet network generates several layers of feature maps of different sizes, and the feature maps of the two middle layers of all good-product pictures are selected and stored in the memory bank.
Further, ImageNet is a large public data set available on the network, containing a large number of objects found in nature.
Further, in the second stage of training, the segmentation network is trained. First, several pictures are randomly selected from the data set as a training set, and the training pictures are input into the backbone network to extract training features, forming feature maps of the same size as those in positive sample training. Then, for each feature point of the training image, the K patches with the smallest L2 distance are found in the memory bank, where the L2 distance between a feature point p and a patch q is d(p, q) = ||p - q||_2 = sqrt(sum_i (p_i - q_i)^2). Finally, the K patches and the corresponding feature point are input together into the segmentation network to obtain a predicted segmentation picture.
The predicted picture is compared with the ground truth to calculate the error. For the choice of loss, an MSE (mean square error) loss, commonly used in regression prediction problems, is selected first:

L_MSE = (1/n) * sum_i (y_i - y'_i)^2

where y_i denotes the predicted segmentation picture and y'_i denotes the correct picture.

Second, for the final classification result, a cross-entropy loss, commonly used in classification prediction problems, is selected:

L_CE = - sum_i y'_i * log(y_i)

where y_i denotes the predicted segmentation picture and y'_i denotes the correct picture.

The regression loss is used for the error between the patch of the sample under test and the k patches selected from the memory bank, and the classification loss is used for the final classification result.
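The two losses can be sketched in numpy as follows; the shapes and sample values are invented for illustration:

```python
import numpy as np

def mse_loss(pred, truth):
    """Mean square error between predicted and ground-truth segmentation."""
    return np.mean((pred - truth) ** 2)

def cross_entropy_loss(probs, truth, eps=1e-12):
    """Cross entropy between predicted class probabilities and one-hot truth."""
    return -np.sum(truth * np.log(probs + eps)) / len(probs)

pred = np.array([0.2, 0.8, 0.1, 0.9])
truth = np.array([0.0, 1.0, 0.0, 1.0])
reg = mse_loss(pred, truth)  # small when prediction matches ground truth

probs = np.array([[0.7, 0.3], [0.2, 0.8]])   # per-pixel class probabilities
onehot = np.array([[1.0, 0.0], [0.0, 1.0]])  # ground-truth classes
cls = cross_entropy_loss(probs, onehot)
```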
After the loss is calculated, gradient descent is used as the back-propagation algorithm to update the weights of the classification and segmentation network.
Once the loss converges, the training process ends.
Preferably, in step S4, after the test picture is input into the backbone network, the backbone network extracts features to form corresponding patches; then, for each feature patch, the K memory-bank patches with the smallest L2 distances are selected, and the feature patches together with their selected neighbours are input into the trained model to obtain the predicted segmentation image.
Example 2
The present disclosure also provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to execute the instructions of the industrial detection method combining supervised learning and feature matching techniques described in the above embodiments.
The computer device may include one or more processors, such as one or more Central Processing Units (CPUs) or Graphics Processors (GPUs), each of which may implement one or more hardware threads. The computer device may also comprise any memory for storing any kind of information, such as code, settings, data, etc., and in a particular embodiment a computer program on the memory and executable on the processor, which computer program when executed by the processor may perform the instructions of the method of any of the above embodiments. For example, and without limitation, memory may include any one or combination of the following: any type of RAM, any type of ROM, flash memory devices, hard disks, optical disks, etc. More generally, any memory may use any technology to store information. Further, any memory may provide volatile or non-volatile retention of information. Further, any memory may represent fixed or removable components of the computer device. In one case, the computer device may perform any of the operations of the associated instructions when the processor executes the associated instructions, which may be stored in any memory or combination of memories. The computer device also includes one or more drive mechanisms for interacting with any memory, such as a hard disk drive mechanism, an optical disk drive mechanism, and so forth.
The present disclosure also provides a computer-readable medium, which may be contained in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the method described in Embodiment 1 or 2 above. Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computer device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media) such as modulated data signals and carrier waves.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the system embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and reference may be made to the partial description of the method embodiment for relevant points. In the description of the specification, reference to the description of "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the embodiments of the specification. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art to which the present application pertains. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of the present application shall be included in the scope of the claims of the present application.
Claims (7)
1. An industrial detection method combining supervised learning and feature matching technology is characterized by comprising the following steps,
s1, selecting a positive sample picture, and constructing a patchcore-plus model;
s2, inputting the selected positive sample picture as a training picture into a backbone network, extracting the characteristics of the positive sample picture, and forming a fragmented patch;
s3, screening similar patches, and storing the screened patches into a memory bank;
s4, selecting a test picture, inputting the test picture into a backbone network, and performing feature extraction on the test picture to form a patch on each feature point;
s5, simultaneously inputting patches formed by the test pictures and patches in the memory bank into a supervised segmentation network, and outputting segmentation images;
and S6, performing semantic segmentation on the segmented image, and forming a corresponding detection frame based on a result of the semantic segmentation.
2. The industrial detection method combining supervised learning and feature matching techniques of claim 1, wherein the patchcore-plus model comprises a backbone network, a memory bank and a supervision module;
the backbone network is used for extracting picture features;
the memory bank is used for storing positive sample picture characteristics, namely fragmented patches which pass through a backbone network and are screened;
the supervision module is a segmentation network module which is used for segmenting the picture.
3. The industrial detection method combining supervised learning and feature matching technology as recited in claim 2, wherein the selected positive sample picture is input to a backbone network as a training picture for positive sample training, and the training phase comprises positive sample training and segmentation network training;
wherein, the positive sample training process is as follows:
inputting the selected positive sample picture into a pre-trained backbone neural network, wherein the backbone neural network is a ResNet feature extraction network pre-trained by using an ImageNet data set and is used for extracting features in the positive sample;
the ResNet network can generate a plurality of layers of feature maps with different sizes;
and selecting characteristic graphs of the middle two layers of all good pictures, and storing the characteristic graphs into a memory bank.
4. The industrial detection method combining supervised learning and feature matching techniques as recited in claim 3, wherein the segmentation network training process is as follows:
randomly selecting a plurality of samples from industrial samples to be detected as a training set, wherein the training set comprises positive samples and negative samples;
inputting the training pictures into a backbone network to extract training characteristics to form a characteristic diagram;
finding out K patches with the closest L2 distance corresponding to each feature point of the training image in a memory bank;
and simultaneously inputting the K patches and the corresponding feature points into a segmentation network to obtain a predicted segmentation picture.
5. The industrial inspection method according to claim 4, wherein in step S4, after the test picture is input into the backbone network, the backbone network extracts features to form corresponding patches; then, for each feature patch, the K memory-bank patches with the smallest L2 distances are selected, and the feature patches together with their selected neighbours are input into the trained model to obtain the predicted segmentation image.
6. An industrial detection system incorporating supervised learning and feature matching techniques, comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, executes instructions of an industrial detection method incorporating supervised learning and feature matching techniques according to any one of claims 1 to 5.
7. A computer-readable storage medium, storing a computer program, wherein the computer program, when executed by a processor, executes instructions of the industrial detection method in combination with supervised learning and feature matching techniques of any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211244936.XA CN115439718A (en) | 2022-10-12 | 2022-10-12 | Industrial detection method, system and storage medium combining supervised learning and feature matching technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115439718A true CN115439718A (en) | 2022-12-06 |
Family
ID=84250675
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211244936.XA Withdrawn CN115439718A (en) | 2022-10-12 | 2022-10-12 | Industrial detection method, system and storage medium combining supervised learning and feature matching technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115439718A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115661160A (en) * | 2022-12-29 | 2023-01-31 | 成都数之联科技股份有限公司 | Panel defect detection method, system, device and medium |
CN117557872A (en) * | 2024-01-12 | 2024-02-13 | 苏州大学 | Unsupervised anomaly detection method and device for optimizing storage mode |
CN117557872B (en) * | 2024-01-12 | 2024-03-22 | 苏州大学 | Unsupervised anomaly detection method and device for optimizing storage mode |
Legal Events

Date | Code | Title | Description
---|---|---|---
 | PB01 | Publication | 
 | SE01 | Entry into force of request for substantive examination | 
 | WW01 | Invention patent application withdrawn after publication | Application publication date: 20221206