CN116523917A - Defect detection method, device, computer equipment and storage medium - Google Patents

Defect detection method, device, computer equipment and storage medium

Info

Publication number
CN116523917A
CN116523917A
Authority
CN
China
Prior art keywords
coding
features
decoding
network
module
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310806554.XA
Other languages
Chinese (zh)
Other versions
CN116523917B (en)
Inventor
吴凯
江冠南
王智玉
束岸楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Contemporary Amperex Technology Co Ltd
Original Assignee
Contemporary Amperex Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Contemporary Amperex Technology Co Ltd filed Critical Contemporary Amperex Technology Co Ltd
Priority to CN202310806554.XA priority Critical patent/CN116523917B/en
Publication of CN116523917A publication Critical patent/CN116523917A/en
Application granted granted Critical
Publication of CN116523917B publication Critical patent/CN116523917B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/0002 Inspection of images, e.g. flaw detection
    • G06T7/0004 Industrial image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00 Image coding
    • G06T9/002 Image coding using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Molecular Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Medical Informatics (AREA)
  • Databases & Information Systems (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to a defect detection method and apparatus, a computer device, and a storage medium. The method comprises: inputting an image to be detected into a teacher network for encoding to obtain a plurality of coding features; fusing the coding features and inputting the result into a student network for decoding to obtain a plurality of decoding features; and determining a defect area in the image to be detected from the coding features and the decoding features. Because an asymmetric student network is built on the output of the teacher network and trained in an unsupervised manner, the method is less sensitive to noise than detection algorithms in which the teacher network and the student network learn independently, which improves the accuracy of defect detection.

Description

Defect detection method, device, computer equipment and storage medium
Technical Field
The present disclosure relates to the field of battery detection technologies, and in particular, to a defect detection method, a defect detection device, a computer device, and a storage medium.
Background
Welding of the sealing nail is an indispensable step in power battery production, and whether the weld meets the standard directly affects battery safety. The welded area of the sealing nail is called the weld bead; owing to variations in temperature, environment, and other conditions during welding, defects such as pinholes, burst points, burst lines (cold welds), missed welds, and melted beads often occur on the weld bead. These defects directly affect the welding quality of the sealing nail, so detecting them is very important.
However, existing methods for detecting weld bead defects suffer from inaccurate detection.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a defect detection method, apparatus, computer device, and storage medium capable of improving detection accuracy.
In a first aspect, the present application provides a defect detection method. The method comprises the following steps:
inputting an image to be detected into a teacher network for coding to obtain a plurality of coding features;
fusing the plurality of coding features and inputting the result into a student network for decoding to obtain a plurality of decoding features;
and determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
In the defect detection method above, the image to be detected is input into the teacher network for coding to obtain a plurality of coding features; the coding features are fused and then input into the student network for decoding to obtain a plurality of decoding features; and the defect area in the image to be detected is determined according to the coding features and the decoding features. Because an asymmetric student network is built on the output of the teacher network and trained in an unsupervised manner, the method is less sensitive to noise than detection algorithms in which the teacher network and the student network learn independently, which improves the accuracy of defect detection.
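The encode-then-fuse flow can be sketched with plain NumPy. This is a minimal illustration, not the patent's actual networks: `teacher_encode` and `fuse` are hypothetical stand-ins (average pooling and nearest-neighbour upsampling) for the trained teacher network and the fusion step, and the student decoding step is omitted here.

```python
import numpy as np

# Hypothetical stand-ins for the trained networks; names are illustrative only.
def teacher_encode(image):
    """Produce multi-scale 'coding features' by repeated 2x average pooling."""
    feats, f = [], image
    for _ in range(3):
        h, w = f.shape[0] // 2, f.shape[1] // 2
        f = f[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3))
        feats.append(f)
    return feats

def fuse(features, size):
    """Fuse features by nearest-neighbour upsampling to a common size and averaging."""
    out = np.zeros(size)
    for f in features:
        ry, rx = size[0] // f.shape[0], size[1] // f.shape[1]
        out += np.kron(f, np.ones((ry, rx)))
    return out / len(features)

image = np.random.rand(32, 32)
coding_features = teacher_encode(image)   # step 1: encode into multi-stage features
fused = fuse(coding_features, (16, 16))   # step 2: fuse before the student network
print([f.shape for f in coding_features], fused.shape)
```

The fused map would then be handed to the student network for decoding, and the defect area derived from coding/decoding feature similarity.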
In one embodiment, the teacher network includes a first convolution module and a plurality of cascaded coding modules, and inputting the image to be detected into the teacher network for coding to obtain a plurality of coding features includes:
inputting the image to be detected into the first convolution module to obtain a convolution feature;
and inputting the convolution feature into the plurality of coding modules for coding to obtain the plurality of coding features.
In the teacher network of this embodiment, the coding network performs coding at different stages (or at the same stage), producing a plurality of coding features of different or identical sizes; these features provide rich information for the subsequent defect detection and can thereby improve detection accuracy to a certain extent.
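As a rough illustration of this structure, the sketch below chains a stand-in "first convolution module" (a 3x3 box filter) with cascaded stand-in "coding modules" (strided average pooling) and collects each stage's output. All names and operations are illustrative assumptions, not the patent's actual layers.

```python
import numpy as np

def conv3x3_box(x):
    """Stand-in for the 'first convolution module': a 3x3 box filter (zero-padded)."""
    p = np.pad(x, 1)
    return sum(p[i:i + x.shape[0], j:j + x.shape[1]]
               for i in range(3) for j in range(3)) / 9.0

def coding_module(x):
    """Stand-in for one cascaded coding module: 2x2 average pooling (stride 2)."""
    h, w = x.shape[0] // 2, x.shape[1] // 2
    return x[:2 * h, :2 * w].reshape(h, 2, w, 2).mean(axis=(1, 3))

def teacher_network(image, num_stages=3):
    x = conv3x3_box(image)            # first convolution module
    coding_features = []
    for _ in range(num_stages):       # cascaded coding modules
        x = coding_module(x)
        coding_features.append(x)     # collect the per-stage outputs
    return coding_features

feats = teacher_network(np.random.rand(32, 32))
print([f.shape for f in feats])  # progressively smaller per-stage feature maps
```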
In one embodiment, inputting the convolution feature into the plurality of coding modules for coding to obtain the plurality of coding features includes:
inputting the convolution feature into the plurality of coding modules for coding to obtain a plurality of initial coding features;
and resizing each initial coding feature to obtain the plurality of coding features.
In this coding network, the plurality of cascaded coding modules perform coding at different stages, producing a plurality of coding features of different or identical sizes; these features provide rich information for the subsequent defect detection and can thereby improve detection accuracy to a certain extent.
In one embodiment, the coding network includes a plurality of cascaded coding modules, and inputting the convolution feature into the coding network for coding to obtain the plurality of coding features includes:
inputting the convolution feature into the plurality of coding modules for coding to obtain the plurality of coding features.
In this coding network, the cascaded coding modules perform coding at different stages, producing a plurality of coding features of the same size; this simplifies the later fusion of the coding features and defect detection based on the fused features, thereby improving detection accuracy to a certain extent.
In one embodiment, inputting the convolution feature into the plurality of coding modules for coding to obtain the plurality of coding features includes:
inputting the convolution feature into the plurality of coding modules for coding to obtain a plurality of initial coding features;
and resizing each initial coding feature to obtain the plurality of coding features.
In this coding network, the N cascaded coding modules perform N stages of coding, producing N coding features of the same size; this simplifies the later fusion of the N coding features and defect detection based on the fused features, thereby improving detection accuracy to a certain extent.
In one embodiment, the coding network includes N coding modules and N second convolution modules, the output of the nth coding module being connected to the input of the nth second convolution module, and resizing each initial coding feature to obtain the plurality of coding features includes:
inputting the initial coding feature output by each coding module into the corresponding second convolution module for resizing, to obtain the plurality of coding features, all of which have the same size.
In this coding network, the N cascaded coding modules perform N stages of coding, producing N coding features of the same size; this simplifies the later fusion of the N coding features and defect detection based on the fused features, thereby improving detection accuracy to a certain extent.
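The per-stage size adjustment can be illustrated as follows. Here `second_conv_module` is a hypothetical stand-in (nearest-neighbour resampling) for the patent's second convolution modules, which in practice would be learned layers; the stage shapes are assumed for illustration.

```python
import numpy as np

def second_conv_module(feature, target_hw):
    """Stand-in for the nth 'second convolution module': nearest-neighbour
    resampling so every stage's output has the same spatial size."""
    th, tw = target_hw
    ys = np.arange(th) * feature.shape[0] // th
    xs = np.arange(tw) * feature.shape[1] // tw
    return feature[np.ix_(ys, xs)]

# Hypothetical per-stage outputs of N = 3 cascaded coding modules.
initial = [np.random.rand(16, 16), np.random.rand(8, 8), np.random.rand(4, 4)]
coding_features = [second_conv_module(f, (8, 8)) for f in initial]
print([f.shape for f in coding_features])  # all (8, 8)
```

The same routine both downsamples larger stages and upsamples smaller ones, so every stage lands on the common size chosen for fusion.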
In one embodiment, the coding network includes M+1 coding modules and M second convolution modules, the output of the mth coding module being connected to the input of the mth second convolution module, and resizing each initial coding feature to obtain the plurality of coding features includes:
inputting the initial coding features output by the first M coding modules into the corresponding second convolution modules for resizing, to obtain M coding features;
and taking the M coding features together with the initial coding feature output by the (M+1)th coding module as the plurality of coding features.
In this coding network, the M cascaded coding modules perform M stages of coding, producing M coding features whose size matches that of the coding feature output by the (M+1)th coding module; this simplifies the later fusion of the M+1 coding features and defect detection based on the fused features, thereby improving detection accuracy to a certain extent.
In one embodiment, the student network includes a transformation module and a plurality of decoding modules, and inputting the fused coding features into the student network for decoding to obtain the plurality of decoding features includes:
fusing the plurality of coding features and inputting the result into the transformation module to obtain an abstract feature;
and inputting the abstract feature into the plurality of decoding modules for decoding to obtain the plurality of decoding features.
In the student network of this embodiment, the plurality of decoding modules perform decoding at different stages, producing a plurality of decoding features that correspond one-to-one with the coding features; defect detection can then be performed by analysing the similarity between corresponding coding and decoding features, yielding an accurate and efficient defect detection method.
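A minimal sketch of this student-network shape, with a `tanh` nonlinearity standing in for the transformation module and 2x nearest-neighbour upsampling standing in for each decoding module; both are illustrative assumptions, not the patent's layers.

```python
import numpy as np

def upsample2x(x):
    """Stand-in for one decoding module: 2x nearest-neighbour upsampling."""
    return np.kron(x, np.ones((2, 2)))

def student_network(fused, num_stages=3):
    """'Transformation module' (a simple nonlinearity here) followed by
    cascaded decoding modules that upsample stage by stage."""
    abstract = np.tanh(fused)            # transformation module (stand-in)
    decoding_features, x = [], abstract
    for _ in range(num_stages):
        x = upsample2x(x)                # one decoding module (stand-in)
        decoding_features.append(x)
    return decoding_features

fused = np.random.rand(4, 4)             # fused coding features from the teacher
decs = student_network(fused)
print([d.shape for d in decs])  # (8, 8), (16, 16), (32, 32)
```

Each decoding stage's output can then be paired with the teacher's coding feature of the matching scale.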
In one embodiment, the plurality of decoding modules are cascaded, and inputting the abstract feature into the plurality of decoding modules for decoding to obtain the plurality of decoding features includes:
inputting the abstract feature into the plurality of cascaded decoding modules for decoding to obtain the plurality of decoding features.
In the student network of this embodiment, the plurality of cascaded decoding modules perform decoding at different stages, producing a plurality of decoding features that correspond one-to-one with the coding features; defect detection can then be performed by analysing the similarity between corresponding coding and decoding features, yielding an accurate and efficient defect detection method.
In one embodiment, the plurality of decoding modules includes K decoding modules, of which the 1st through (K-1)th are cascaded, and inputting the abstract feature into the plurality of decoding modules for decoding to obtain the plurality of decoding features includes:
inputting the abstract feature into the 1st through (K-1)th decoding modules for decoding to obtain the decoding feature output by each of the 1st through (K-1)th decoding modules;
and inputting the coding feature output by the second coding module of the teacher network into the Kth decoding module for decoding to obtain the decoding feature output by the Kth decoding module.
In the student network of this embodiment, the plurality of decoding modules perform decoding at different stages, producing a plurality of decoding features that correspond one-to-one with the coding features, so that defect detection can be performed by analysing the similarity between corresponding coding and decoding features. In addition, because the input of the last decoding module is a coding feature output by a coding module of the teacher network, the detail lost during image reconstruction in decoding is compensated for to a certain extent, improving the quality of the decoding features.
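The variant above, in which the Kth decoder takes a teacher feature instead of the cascade output, can be sketched as follows. The decoding modules are again stand-ins (2x nearest-neighbour upsampling), and the teacher feature shape is assumed purely for illustration.

```python
import numpy as np

def decoding_module(x):
    """Stand-in decoder: 2x nearest-neighbour upsampling."""
    return np.kron(x, np.ones((2, 2)))

def student_with_skip(abstract, teacher_second_feature, K=3):
    """The first K-1 decoders are cascaded on the abstract feature; the Kth
    decoder instead takes the teacher's second-stage coding feature, which
    restores detail lost during decoding (a sketch, not the patent's layers)."""
    outs, x = [], abstract
    for _ in range(K - 1):
        x = decoding_module(x)
        outs.append(x)
    outs.append(decoding_module(teacher_second_feature))  # Kth decoder, skip input
    return outs

decs = student_with_skip(np.random.rand(4, 4), np.random.rand(16, 16))
print([d.shape for d in decs])  # (8, 8), (16, 16), (32, 32)
```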
In one embodiment, determining the defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features includes:
obtaining a plurality of similarity feature images from the similarity value between each coding feature and the corresponding decoding feature;
fusing the plurality of similarity feature images to obtain a target detection image;
and performing defect detection on the target detection image to obtain the defect area.
In this embodiment, the plurality of similarity feature images are fused and defect detection is performed on the fused image. Because the similarity feature images reflect the learning of the teacher network and the student network at multiple stages, that is, features at multiple scales, defect detection is based on multi-scale features, which improves detection accuracy to a certain extent.
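One plausible reading of the similarity computation is per-position cosine similarity between paired feature maps, fused by averaging; the patent does not fix the similarity measure, so the choice of cosine similarity and channels-first layout here are assumptions for illustration.

```python
import numpy as np

def similarity_map(enc, dec, eps=1e-8):
    """Per-position cosine similarity between a coding feature and its paired
    decoding feature (both channels-first, shape (C, H, W))."""
    num = (enc * dec).sum(axis=0)
    den = np.linalg.norm(enc, axis=0) * np.linalg.norm(dec, axis=0) + eps
    return num / den

rng = np.random.default_rng(0)
enc_feats = [rng.standard_normal((8, 16, 16)) for _ in range(3)]
# Decoding features nearly matching the coding features (a defect-free case).
dec_feats = [f + 0.1 * rng.standard_normal(f.shape) for f in enc_feats]
maps = [similarity_map(e, d) for e, d in zip(enc_feats, dec_feats)]
target = np.mean(maps, axis=0)   # fuse the similarity feature images
anomaly = 1.0 - target           # low similarity marks a likely defect
print(target.shape)
```

Where the student fails to reproduce the teacher's features, similarity drops and the anomaly value rises, flagging candidate defect positions.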
In one embodiment, performing defect detection on the target detection image to obtain the defect area includes:
determining the pixels whose gray value exceeds a preset gray threshold as target pixels;
and determining the connected region formed by the target pixels as the defect area.
This method detects the defect area by connecting the pixels whose values exceed the preset gray threshold; it is simple and efficient and can improve detection accuracy to a certain extent.
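A dependency-free sketch of this thresholding-plus-connected-region step, using 4-connectivity via breadth-first search; the threshold value and test image are illustrative only.

```python
import numpy as np
from collections import deque

def defect_regions(gray, threshold):
    """Flag pixels whose value exceeds the threshold, then group flagged
    pixels into 4-connected regions (plain BFS; no external libraries)."""
    mask = gray > threshold
    labels = np.zeros(gray.shape, dtype=int)
    current = 0
    for sy, sx in zip(*np.nonzero(mask)):
        if labels[sy, sx]:
            continue
        current += 1
        q = deque([(sy, sx)])
        labels[sy, sx] = current
        while q:
            y, x = q.popleft()
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if (0 <= ny < gray.shape[0] and 0 <= nx < gray.shape[1]
                        and mask[ny, nx] and not labels[ny, nx]):
                    labels[ny, nx] = current
                    q.append((ny, nx))
    return labels, current

img = np.zeros((6, 6))
img[1:3, 1:3] = 0.9   # one bright blob
img[4, 4] = 0.8       # a second, isolated bright pixel
labels, n = defect_regions(img, threshold=0.5)
print(n)  # 2 connected defect regions
```

In a production pipeline the BFS would typically be replaced by a library routine such as connected-component labelling, but the logic is the same.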
In one embodiment, the method further comprises:
resizing the plurality of similarity feature images to obtain a plurality of resized similarity feature images, each of which matches the size of the image to be detected;
in this case, fusing the plurality of similarity feature images to obtain the target detection image includes:
fusing the plurality of resized similarity feature images to obtain the target detection image.
In this embodiment, adjusting the similarity feature images to a common size improves, to a certain extent, the quality of their later fusion and thus the subsequent detection accuracy.
In one embodiment, the method further comprises:
training a detection network according to the sample image to obtain the multi-stage teacher network and the multi-stage student network; the sample image is an image without a defect area; the detection network comprises a multi-stage initial teacher network and a multi-stage initial student network.
The training method of this embodiment trains on defect-free sample images; such images are easy to obtain, and acquiring a large number of them can improve the training effect to a certain extent.
In one embodiment, training the detection network according to the sample image to obtain the teacher network and the student network includes:
inputting the sample image into the initial teacher network for coding to obtain a plurality of sample coding features;
the plurality of sample coding features are fused and then input into an initial student network for decoding, so that a plurality of sample decoding features are obtained;
determining a target image according to the plurality of sample coding features and the plurality of sample decoding features;
and determining target loss according to the target image and the sample image, and training the initial teacher network and the initial student network according to the target loss to obtain the teacher network and the student network.
In this training method, the teacher network and the student network are trained simultaneously so that they learn from each other and form a self-distillation structure; this effectively suppresses, to a certain extent, the student network's sensitivity to noise, improves the training effect, and thereby improves the detection accuracy of the detection network formed by the trained teacher and student networks.
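A sketch of one plausible target loss under this scheme, assuming, as an illustration rather than the patent's exact objective, that on defect-free samples the loss penalises low cosine similarity between paired sample coding and decoding features.

```python
import numpy as np

def target_loss(sample_enc_feats, sample_dec_feats, eps=1e-8):
    """Illustrative training objective: on defect-free samples the student's
    decoding features should match the teacher's coding features, so we
    penalise 1 - cosine similarity, averaged over positions and stages."""
    losses = []
    for enc, dec in zip(sample_enc_feats, sample_dec_feats):
        num = (enc * dec).sum(axis=0)
        den = np.linalg.norm(enc, axis=0) * np.linalg.norm(dec, axis=0) + eps
        losses.append(np.mean(1.0 - num / den))
    return float(np.mean(losses))

rng = np.random.default_rng(1)
enc = [rng.standard_normal((8, 16, 16)) for _ in range(3)]
perfect = target_loss(enc, enc)   # identical features: loss near zero
mismatch = target_loss(enc, [rng.standard_normal(f.shape) for f in enc])
print(perfect, mismatch)
```

During actual training this scalar would be minimised by gradient descent over both networks' parameters, which the NumPy sketch does not model.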
In a second aspect, the present application further provides a defect detection apparatus. The device comprises:
the coding module is used for inputting the image to be detected into a teacher network for coding to obtain a plurality of coding features;
the decoding module is used for fusing the plurality of coding features and inputting the result into a student network for decoding to obtain a plurality of decoding features;
and the detection module is used for determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor which when executing the computer program performs the steps of:
inputting an image to be detected into a teacher network for coding to obtain a plurality of coding features;
fusing the plurality of coding features and inputting the result into a student network for decoding to obtain a plurality of decoding features;
and determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of:
inputting an image to be detected into a teacher network for coding to obtain a plurality of coding features;
fusing the plurality of coding features and inputting the result into a student network for decoding to obtain a plurality of decoding features;
and determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of:
inputting an image to be detected into a teacher network for coding to obtain a plurality of coding features;
fusing the plurality of coding features and inputting the result into a student network for decoding to obtain a plurality of decoding features;
and determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application may be understood more clearly and implemented according to the content of the specification, and so that the above and other objects, features, and advantages of the present application may be more readily apparent, a detailed description of the present application is given below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the accompanying drawings. In the drawings:
FIG. 1 is an internal block diagram of a computer device in one embodiment;
FIG. 2 is a flow chart of a defect detection method according to an embodiment;
FIG. 3 is a flow chart of a defect detection method according to another embodiment;
FIG. 4 is a schematic diagram of one implementation of a teacher network in one embodiment;
FIG. 5 is a schematic diagram of one implementation of a coding network in one embodiment;
FIG. 6 is a flow chart of a defect detection method according to another embodiment;
FIG. 7 is a schematic diagram of one implementation of a coding network in one embodiment;
FIG. 8 is a schematic diagram of another implementation of a coding network in one embodiment;
FIG. 9 is a flow chart of a defect detection method according to another embodiment;
FIG. 10 is a flow chart of a defect detection method according to another embodiment;
FIG. 11 is a schematic diagram of an implementation of a student network in one embodiment;
FIG. 12 is a schematic diagram of another implementation of a student network in one embodiment;
FIG. 13 is a flow chart of a defect detection method according to another embodiment;
FIG. 14 is a flow chart of a defect detection method according to another embodiment;
FIG. 15 is a flow chart of a training method in another embodiment;
FIG. 16 is a schematic diagram of a detection network in one embodiment;
FIG. 17 is a block diagram of a defect detection apparatus in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
Currently, weld bead defect detection is mainly based on artificial intelligence (AI), which can automatically detect whether defects exist on the weld bead and determine their types. Existing methods generally use a co-propagating teacher-student structure: the image to be detected is input separately into a teacher network and a student network for image processing, and the similarity between the images output by the two networks is used to detect the weld bead defect areas in the image to be detected. Because the teacher and student networks are generally similar in structure, the overall network may fail to cover all weld bead defect areas or may mis-estimate their locations, so the finally detected weld bead defect areas are inaccurate. The present application provides a defect detection method that aims to solve these technical problems; the following embodiments describe the method in detail.
The defect detection method provided in the embodiments of the present application can be applied to the computer device shown in fig. 1. The computer device may be a terminal, and its internal structure may be as shown in fig. 1. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. The processor provides computing and control capabilities. The memory includes a non-volatile storage medium, which stores an operating system and a computer program, and an internal memory, which provides an environment for running the operating system and the computer program. The communication interface performs wired or wireless communication with external terminals; wireless communication may be realized through WIFI, a mobile cellular network, NFC (near field communication), or other technologies. The computer program, when executed by the processor, implements a defect detection method. The display screen may be a liquid crystal display or an electronic ink display, and the input device may be a touch layer covering the display screen, keys, a trackball, or a touchpad on the housing of the computer device, or an external keyboard, touchpad, or mouse.
It will be appreciated by those skilled in the art that the structure shown in fig. 1 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
In one embodiment, as shown in fig. 2, a defect detection method is provided, and the method is applied to the computer device in fig. 1 for illustration, and includes the following steps:
S201, inputting the image to be detected into a teacher network for coding, and obtaining a plurality of coding features.
The image to be detected is an image of a welding bead area, and may contain defects on the welding bead. The teacher network is a pre-trained neural network or encoder, which can be used to perform multi-stage image coding on an input image, and output coding features of each stage, i.e. the multiple coding features can be coding features output by the teacher network in different coding stages. Each coding feature may be represented using a feature map. Alternatively, the teacher network may also be used to encode the input image once, so as to directly obtain multiple encoding features, that is, the multiple encoding features may be the encoding features output by the teacher network in the same encoding stage.
In the embodiment of the application, the computer device may acquire the image of the weld bead region, that is, the image to be detected, in the battery production process. The image to be detected may or may not contain a weld bead defect. Then, the computer device may also input the image to be detected into a pre-trained teacher network to perform encoding at different stages, and output encoding features with different sizes or the same size at each stage or several stages thereof, so as to obtain a plurality of encoding features. Optionally, the computer device may further input the image to be detected to a pre-trained teacher network to perform the encoding at the same stage, so as to obtain a plurality of encoding features with the same size or different sizes at a time.
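The multi-stage encoding described above can be sketched as follows. This is a minimal numpy illustration in which 2x2 average pooling stands in for the trained convolutional stages of the teacher network; the function names and stage count are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

def avg_pool2(x):
    """Downsample a (C, H, W) feature map by 2x2 average pooling
    (a stand-in for one trained convolutional encoding stage)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def multi_stage_encode(image, num_stages=3):
    """Run the image through `num_stages` stand-in encoding stages,
    collecting the feature map emitted by each stage."""
    features = []
    x = image
    for _ in range(num_stages):
        x = avg_pool2(x)
        features.append(x)
    return features

image = np.random.rand(3, 64, 64)  # (channels, height, width)
feats = multi_stage_encode(image)
# feats[0]: (3, 32, 32), feats[1]: (3, 16, 16), feats[2]: (3, 8, 8)
```

Each element of `feats` corresponds to one encoding stage's output, at a progressively smaller spatial size, matching the "coding features with different sizes" described above.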
S202, inputting the fused multiple coding features into a student network for decoding to obtain multiple decoding features.
The student network is a pre-trained neural network or decoder, and is used for performing multi-stage image decoding on the input characteristic image, and outputting decoding characteristics of each stage, namely the plurality of decoding characteristics can be decoding characteristics output by the student network in different decoding stages. Each decoding feature may be represented using a feature map. The plurality of encoding features and the plurality of decoding features are in a one-to-one correspondence.
In this embodiment of the present application, after a computer device obtains a plurality of coding features output by a teacher network, the plurality of coding features may be fused in channel dimensions of an image to obtain a fused feature, then the fused feature is input into a pre-trained student network to perform decoding at different stages, and decoding features with different sizes or the same size are output at each stage or several stages thereof to obtain a plurality of decoding features. Optionally, the computer device may also input the fusion feature to a pre-trained student network for decoding at the same stage, i.e. to disassemble the fusion feature into a plurality of decoding features of different sizes or the same size.
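The channel-dimension fusion step can be sketched as a simple concatenation of same-sized feature maps along the channel axis (a hedged numpy illustration; the patent also allows a pre-trained fusion network or other fusion algorithm in place of this):

```python
import numpy as np

def fuse_channelwise(features):
    """Fuse same-sized (C, H, W) coding features by concatenating
    them along the channel axis."""
    return np.concatenate(features, axis=0)

f1, f2, f3 = (np.random.rand(16, 8, 8) for _ in range(3))
fused = fuse_channelwise([f1, f2, f3])
# fused.shape == (48, 8, 8): channels stack, spatial size is unchanged
```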
S203, determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
In the embodiment of the application, after the computer equipment obtains a plurality of coding features and a plurality of decoding features based on the steps, a similar feature map between each coding feature and a corresponding decoding feature can be calculated, then a target similar feature map is calculated according to the similar feature maps between all coding features and all decoding features, and finally a defect area in an image to be detected is determined according to the target similar feature map; optionally, when obtaining the similar feature graphs between each coding feature and the corresponding decoding feature, that is, a plurality of similar feature graphs, the computer device may also screen a part of the similar feature graphs from the plurality of similar feature graphs according to the screening condition, then obtain the target similar feature graph according to the part of the similar feature graphs, and finally determine the defect area in the image to be detected according to the target similar feature graph. 
When the similarity value corresponding to each pixel point in the similar feature map is larger, it is indicated that the image to be detected may include a defect, and in this scenario, the screening condition may be: the similar feature images with the similarity value corresponding to each pixel point in the similar feature images larger than the first preset similarity threshold value are screened out, so that the intensity of the pixel point corresponding to the defect area in the final target similar feature image can be enhanced, and the detection based on the target similar feature image in the later period is facilitated; when the similarity value corresponding to each pixel point in the similar feature map is smaller, it is indicated that the image to be detected may not include a defect, and in this scenario, the screening condition may be: the similar feature images, of which the similarity values corresponding to the pixel points in the similar feature images are smaller than a second preset similarity threshold, are screened out, so that the strength of the pixel points corresponding to the non-defect area in the target similar feature images can be enhanced, and the detection based on the target similar feature images in the later period is facilitated.
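The two per-pixel screening conditions above can be sketched in numpy as follows (`screen_similarity_maps` and the threshold value are assumed names/values for illustration):

```python
import numpy as np

def screen_similarity_maps(maps, threshold, keep_high=True):
    """Apply the screening condition above: keep a map only if the similarity
    value at every pixel exceeds the threshold (keep_high=True), or only if
    every pixel falls below it (keep_high=False)."""
    if keep_high:
        return [m for m in maps if np.all(m > threshold)]
    return [m for m in maps if np.all(m < threshold)]

high = np.full((4, 4), 0.9)   # per the text above, suggests a defect
low = np.full((4, 4), 0.1)    # per the text above, suggests no defect
kept = screen_similarity_maps([high, low], threshold=0.5)
# only the high-similarity map survives the first screening condition
```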
According to the defect detection method described above, the image to be detected is input into the teacher network for encoding to obtain a plurality of encoding features; the encoding features are fused and then input into the student network for decoding to obtain a plurality of decoding features; and the defect area in the image to be detected is determined according to the encoding features and the decoding features. In this method, an asymmetric student network is built on the output of the teacher network to perform unsupervised learning. Compared with detection algorithms in which the teacher network and the student network learn independently, the sensitivity of each network to noise during learning can be reduced to a certain extent, thereby improving the accuracy of defect detection.
In one embodiment, a structure of a teacher network is provided, as shown in fig. 4, the teacher network including a first convolution module and a coding network. Based on this teacher network, S201 "inputting the image to be detected into the teacher network for coding to obtain a plurality of coding features", as shown in fig. 3, includes:
S301, inputting an image to be detected into a first convolution module to obtain convolution characteristics.
The first convolution module is used for carrying out convolution processing on the input image, namely carrying out size adjustment or dimension reduction convolution processing on the input image.
In this embodiment of the present application, when the computer device obtains the image to be detected, the image to be detected may be input to the first convolution module in the teacher network to perform convolution processing, so as to obtain a convolution feature after the convolution processing, where the convolution feature may be represented by using a convolution feature map. Optionally, the computer device may further input the image to be detected to a first convolution module in the teacher network to perform convolution processing, so as to obtain a plurality of convolution features after the convolution processing, where the plurality of convolution features may be represented by using a plurality of convolution feature graphs.
S302, inputting the convolution characteristics into a coding network for coding, and obtaining a plurality of coding characteristics.
In this embodiment of the present application, when the computer device acquires the convolution feature, the convolution feature may be input to a pre-trained encoding network to perform encoding at different stages, and in each stage or several stages thereof, encoding features with different sizes or the same size may be output, so as to obtain a plurality of encoding features. Optionally, when the computer device acquires a plurality of convolution features, the convolution features can be input into the pre-trained coding network to perform the same-stage coding at the same time, so as to obtain a plurality of coding features with the same size or different sizes at a time.
According to the teacher network provided by the embodiment of the application, coding at different stages or at the same stage is realized through the coding network, so that a plurality of coding features with different sizes or the same size are obtained, rich feature information can be provided for defect detection based on the coding features at a later stage, and further detection accuracy can be improved to a certain extent.
Further, in an embodiment, as shown in fig. 5, a structure of an encoding network is also provided, where the encoding network includes a plurality of cascade-connected encoding modules. In this scenario, when the computer device performs step S302 "inputting the convolution characteristics into the encoding network to encode and obtain a plurality of encoding characteristics", it specifically performs the step of: inputting the convolution characteristics into the plurality of coding modules to code, so as to obtain a plurality of coding features.
In this embodiment of the present application, when the computer device obtains the convolution feature, the convolution feature may be input to a plurality of cascade connected coding modules to perform step-by-step coding, and each coding module outputs the coding feature to obtain a plurality of coding features, or several coding modules output the coding feature to obtain a plurality of coding features. Optionally, when the computer device acquires a plurality of convolution features, the plurality of convolution features may be input to a plurality of encoding modules for step-by-step encoding, and each encoding module outputs a plurality of encoding features, or several encoding modules output a plurality of encoding features.
According to the coding network, coding at different stages is realized through the plurality of cascade connected coding modules, a plurality of coding features with different sizes or the same size are obtained, abundant feature information can be provided for defect detection based on the plurality of coding features at a later stage, and further detection accuracy can be improved to a certain extent.
Further, in an embodiment, a method for encoding with a plurality of encoding modules is also provided; that is, when the computer device performs the above step "inputting the convolution characteristics into the plurality of encoding modules to encode and obtain a plurality of encoding features", as shown in fig. 6, it specifically performs the following steps:
S401, inputting the convolution characteristics into a plurality of coding modules to code, and obtaining a plurality of initial coding characteristics.
In this embodiment of the present application, when the computer device obtains the convolution feature, the convolution feature may be input to a plurality of cascade connected encoding modules to perform step-by-step encoding, and each encoding module outputs an initial encoding feature to obtain a plurality of initial encoding features, or several encoding modules output initial encoding features to obtain a plurality of initial encoding features. Optionally, when the computer device acquires a plurality of convolution features, the convolution features may be input to a plurality of encoding modules for step-by-step encoding, and each encoding module outputs a plurality of initial encoding features, or several encoding modules output a plurality of initial encoding features.
S402, performing size adjustment on each initial coding feature to obtain a plurality of coding features.
Wherein the plurality of encoded features are of uniform size.
When the computer device obtains the initial coding feature output by each coding module, or the initial coding features output by several coding modules, the size of each initial coding feature can be adjusted to a target size. The target size can be determined according to the actual detection or coding requirements; optionally, the target size can be consistent with the size of any one of the multiple initial coding features, and preferably with the size of the initial coding feature output by the last-stage coding module.
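The size adjustment to a common target size can be illustrated with a nearest-neighbor resize in numpy; this is only a stand-in for whatever resizing operation the network actually learns, and the names are assumptions:

```python
import numpy as np

def resize_nearest(feature, target_hw):
    """Resize a (C, H, W) feature map to a target (H, W) by
    nearest-neighbor sampling."""
    c, h, w = feature.shape
    th, tw = target_hw
    rows = np.arange(th) * h // th   # source row index for each target row
    cols = np.arange(tw) * w // tw   # source column index for each target column
    return feature[:, rows][:, :, cols]

feat = np.random.rand(2, 4, 4)
up = resize_nearest(feat, (8, 8))    # upscale toward a larger target size
down = resize_nearest(feat, (2, 2))  # downscale toward a smaller target size
# up.shape == (2, 8, 8), down.shape == (2, 2, 2)
```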
According to the coding network, coding at different stages is achieved through the coding modules connected in a cascade mode, a plurality of coding features with the same size are obtained, convenience can be brought to fusion of the coding features at the later stage, defect detection is conducted on the basis of the fused coding features, and then detection accuracy can be improved to a certain extent.
Based on the encoding method described in the foregoing embodiments, the present application further provides a structure of an encoding network, as shown in fig. 7, where the encoding network includes N cascade-connected coding modules and N second convolution modules: the output end of the 1st coding module is connected with the input end of the 1st second convolution module, the output end of the 2nd coding module is connected with the input end of the 2nd second convolution module, and so on, until the output end of the (N-1)th coding module is connected with the input end of the (N-1)th second convolution module and the output end of the Nth coding module is connected with the input end of the Nth second convolution module. In this scenario, when the computer device executes S402 "performing size adjustment on each initial coding feature to obtain a plurality of coding features", it specifically executes the step of: inputting the initial coding features output by each coding module to the corresponding second convolution module for size adjustment to obtain a plurality of coding features; each coding feature is uniform in size.
The second convolution module is used for carrying out convolution processing on the input image, namely carrying out size adjustment or dimension reduction convolution processing on the input image.
In the embodiment of the application, when the computer device acquires the initial coding feature output by the 1st coding module, it inputs that feature to the corresponding 1st second convolution module for size adjustment, obtaining the 1st size-adjusted coding feature; when it acquires the initial coding feature output by the 2nd coding module, it inputs that feature to the corresponding 2nd second convolution module, obtaining the 2nd size-adjusted coding feature. This continues until the initial coding feature output by the (N-1)th coding module is input to the corresponding (N-1)th second convolution module to obtain the (N-1)th size-adjusted coding feature, and the initial coding feature output by the Nth coding module is input to the corresponding Nth second convolution module to obtain the Nth size-adjusted coding feature. The sizes of the 1st to Nth size-adjusted coding features are consistent.
According to the coding network, N stages of coding are realized through the N coding modules which are connected in a cascade mode, N coding features with the same size are obtained, convenience can be brought to the fusion of the N coding features in the later stage, defect detection is conveniently carried out based on the fused coding features, and then detection accuracy can be improved to a certain extent.
Based on the encoding method described in the foregoing embodiments, the present application further provides another structure of an encoding network, as shown in fig. 8, where the encoding network includes M+1 cascade-connected coding modules and M second convolution modules: the output end of the 1st coding module is connected with the input end of the 1st second convolution module, the output end of the 2nd coding module is connected with the input end of the 2nd second convolution module, and so on, until the output end of the Mth coding module is connected with the input end of the Mth second convolution module; the (M+1)th coding module has no corresponding second convolution module. In this scenario, when the computer device executes S402 "performing size adjustment on each initial coding feature to obtain a plurality of coding features", as shown in fig. 9, it specifically executes the following steps:
S501, the initial coding features output by the M coding modules are respectively input to the corresponding second convolution modules for size adjustment, and M coding features are obtained.
In the embodiment of the application, when the computer device acquires the initial coding feature output by the 1st coding module, it inputs that feature to the corresponding 1st second convolution module for size adjustment, obtaining the 1st size-adjusted coding feature; when it acquires the initial coding feature output by the 2nd coding module, it inputs that feature to the corresponding 2nd second convolution module, obtaining the 2nd size-adjusted coding feature. This continues until the initial coding feature output by the Mth coding module is input to the corresponding Mth second convolution module, obtaining the Mth size-adjusted coding feature. The sizes of the 1st to Mth size-adjusted coding features are consistent.
S502, determining M coding features and initial coding features output by an M+1th coding module as a plurality of coding features.
When the computer device acquires the M coding features, the M coding features and the initial coding feature output by the (M+1)th coding module can together be used as the plurality of coding features, where the sizes of the M coding features are consistent with the size of the initial coding feature output by the (M+1)th coding module.
According to this coding network, M stages of coding are realized through the M cascade-connected coding modules, yielding M coding features with the same size, and this size is consistent with that of the coding feature output by the (M+1)th coding module. This facilitates later fusion of the M+1 coding features and defect detection based on the fused coding features, which can in turn improve detection accuracy to a certain extent.
In one embodiment, an architecture of a student network is provided, as shown in fig. 11, the student network including a transformation module and a plurality of decoding modules. Based on this student network, the above S202 "inputting the fused multiple coding features into the student network for decoding to obtain multiple decoding features", as shown in fig. 10, includes:
S601, merging a plurality of coding features and inputting the merged coding features into a transformation module to obtain abstract features.
The transformation module is used for extracting abstract features of the input image. The transformation module may be a neural network model.
In the embodiment of the application, after the computer device acquires the plurality of coding features, the plurality of coding features can be fused in the channel dimension of the image to obtain the fused features, and then the fused features are input into the transformation module to perform feature extraction to obtain the abstract features, wherein the abstract features can be represented by using an abstract feature map. Specifically, when a plurality of coding features are fused, the method can be realized by adopting a pre-trained fusion network or a corresponding fusion algorithm.
S602, inputting the abstract features into a plurality of decoding modules to decode, so as to obtain a plurality of decoding features.
The decoding module is a pre-trained neural network or decoder and is used for performing image decoding on the input characteristic images.
In this embodiment, when the computer device obtains the abstract feature, the abstract feature may be input to a plurality of decoding modules for decoding, with each decoding module outputting a decoded feature, or with several of the decoding modules outputting decoded features, so as to obtain a plurality of decoded features.
According to the student network provided by the embodiment of the application, decoding at different stages is realized through the plurality of decoding modules, so that a plurality of decoding features are obtained, the plurality of decoding features can be in one-to-one correspondence with a plurality of coding features, and defect detection can be performed by analyzing the similarity between the plurality of coding features and the plurality of decoding features in the later stage, so that an accurate and efficient defect detection method is realized.
Furthermore, the plurality of decoding modules may or may not be cascade-connected. When the plurality of decoding modules are cascade-connected, the computer device executes step S602 "inputting the abstract features into a plurality of decoding modules to decode, and obtaining a plurality of decoded features" by specifically executing the step of: inputting the abstract features into the plurality of cascade-connected decoding modules for step-by-step decoding to obtain a plurality of decoding features.
In this embodiment, when the computer device obtains the abstract feature, the abstract feature may be input to a plurality of decoding modules connected in cascade to perform progressive decoding, and each decoding module outputs the decoded feature to obtain a plurality of decoded features, or several decoding modules output the decoded feature to obtain a plurality of decoded features.
According to the student network provided by the embodiment of the application, decoding at different stages is realized through the plurality of cascaded decoding modules, so that a plurality of decoding features are obtained, the plurality of decoding features can be in one-to-one correspondence with a plurality of coding features, and defect detection can be performed by analyzing the similarity between the plurality of coding features and the plurality of decoding features in the later stage, so that an accurate and efficient defect detection method is realized.
Based on the decoding method described in the foregoing embodiments, the present application further provides a specific connection manner of a plurality of decoding modules, as shown in fig. 12, where the plurality of decoding modules includes K decoding modules, the 1st to (K-1)th of which are cascade-connected. In this scenario, when the computer device executes step S602 "inputting abstract features into a plurality of decoding modules to decode and obtain a plurality of decoding features", as shown in fig. 13, the specific implementation steps are as follows:
S701, inputting the abstract features into the 1st decoding module to the (K-1)th decoding module for decoding to obtain decoding features output by each of the 1st decoding module to the (K-1)th decoding module.
In the embodiment of the application, when the computer equipment acquires the abstract feature output by the transformation module, the abstract feature can be input to the 1 st decoding module for decoding (feature image reconstruction) to obtain the decoding feature output by the 1 st decoding module; and then, continuously inputting the decoding characteristics output by the 1 st decoding module into the 2 nd decoding module for decoding (characteristic image reconstruction) to obtain the decoding characteristics output by the 2 nd decoding module, and executing the steps until the computer equipment inputs the decoding characteristics output by the K-2 nd decoding module into the K-1 st decoding module for decoding (characteristic image reconstruction) to obtain the decoding characteristics output by the K-1 st decoding module, and finally obtaining the decoding characteristics output by each of the 1 st decoding module to the K-1 st decoding module to obtain the K-1 decoding characteristics.
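The step-by-step decoding above can be sketched with pixel-repetition upsampling standing in for each learned decoding (feature image reconstruction) module; the names and module count are illustrative assumptions, not the patent's actual network:

```python
import numpy as np

def upsample2(x):
    """Double the spatial size of a (C, H, W) map by repeating pixels
    (a stand-in for one learned decoding/reconstruction module)."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def cascade_decode(abstract, num_modules=3):
    """Feed the abstract feature through cascade-connected decoding
    modules, collecting each module's output."""
    outputs = []
    x = abstract
    for _ in range(num_modules):
        x = upsample2(x)   # each module decodes the previous module's output
        outputs.append(x)
    return outputs

abstract = np.random.rand(8, 4, 4)
decoded = cascade_decode(abstract)
# decoded[0]: (8, 8, 8), decoded[1]: (8, 16, 16), decoded[2]: (8, 32, 32)
```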
S702, inputting the coding features output by the second coding module in the teacher network to the Kth decoding module in the K decoding modules for decoding to obtain the decoding features output by the Kth decoding module.
When the Kth decoding module among the K decoding modules works, the coding features output by the 2nd coding module in the teacher network are input to the Kth decoding module for decoding (feature image reconstruction), obtaining the decoding features output by the Kth decoding module. This step may be performed before S701, simultaneously with S701, or after S701.
According to the student network provided by the embodiment of the application, decoding at different stages is realized through the plurality of decoding modules, so that a plurality of decoding features are obtained, the plurality of decoding features can be in one-to-one correspondence with a plurality of coding features, and defect detection can be performed by analyzing the similarity between the plurality of coding features and the plurality of decoding features in the later stage, so that an accurate and efficient defect detection method is realized. On the other hand, the input of the last decoding module is the coding characteristic output by the coding module in the teacher network, so that the detail characteristic lost in the image reconstruction in the decoding process can be made up to a certain extent, and the quality of the decoding characteristic is improved.
In one embodiment, after the computer device obtains the plurality of coding features and the plurality of decoding features based on any of the foregoing embodiments, the defect detection may be further performed on the image to be detected using the plurality of coding features and the plurality of decoding features, so the present application further provides a defect detection method, as shown in fig. 14, including:
S801, obtaining a plurality of similar feature images according to the similarity between each coding feature and each decoding feature.
When the computer equipment obtains a plurality of coding features and a plurality of decoding features, as the plurality of coding features and the plurality of decoding features are in one-to-one correspondence, similarity calculation is carried out on each decoding feature and the corresponding coding feature, specifically, the similarity between each pixel point on the feature map corresponding to the decoding feature and each corresponding pixel point on the feature map corresponding to the coding feature can be calculated, then a similar feature image is determined according to the calculated similarity of each pixel point, and finally, the similar feature image corresponding to each coding feature and each decoding feature is obtained.
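The per-pixel similarity computation can be sketched, for example, as a channel-wise cosine similarity between corresponding feature maps; the patent does not fix the similarity measure, so this choice and the names are assumptions:

```python
import numpy as np

def pixelwise_cosine_similarity(enc, dec, eps=1e-8):
    """Cosine similarity between the channel vectors of two (C, H, W)
    feature maps at every spatial location; returns an (H, W) map."""
    num = (enc * dec).sum(axis=0)
    denom = np.linalg.norm(enc, axis=0) * np.linalg.norm(dec, axis=0) + eps
    return num / denom

enc = np.random.rand(16, 8, 8)
sim = pixelwise_cosine_similarity(enc, enc)
# identical inputs give similarity close to 1 at every pixel
```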
S802, fusing the plurality of similar characteristic images to obtain a target detection image.
After the computer equipment acquires a plurality of similar characteristic images, a corresponding fusion algorithm can be adopted to fuse the similar characteristic images, so as to obtain a fused target detection image; optionally, the computer device may screen a part of the similar feature images from the plurality of similar feature images according to the screening condition, and then fuse the part of the similar feature images to obtain the target detection image. When the similarity value corresponding to each pixel point in the similar feature image is larger, the image to be detected may include a defect, and in such a scene, the screening condition may be: the similar characteristic images with the similarity value corresponding to each pixel point in the similar characteristic images being larger than the first preset similarity threshold value are screened out, so that the intensity of the pixel point corresponding to the defect area in the final target detection image can be enhanced, and the detection based on the target detection image in the later period is facilitated; when the similarity value corresponding to each pixel point in the similar feature image is smaller, the image to be detected may not include a defect, and in such a scene, the screening condition may be: the similar characteristic images with the similarity value corresponding to each pixel point in the similar characteristic images smaller than the second preset similarity threshold value are screened out, so that the intensity of the pixel point corresponding to the non-defect area in the target detection image can be enhanced, and the detection based on the target detection image in the later period is facilitated.
Optionally, before fusing the multiple similar feature images, the steps are further required to be performed: performing size adjustment on the multiple similar feature images to obtain multiple similar feature images with adjusted sizes; the size of each of the resized similar feature maps corresponds to the size of the image to be detected.
Correspondingly, when the computer device performs the step of fusing the plurality of similar feature images to obtain the target detection image, the computer device specifically performs the steps of: and fusing the multiple similar characteristic images after the size adjustment to obtain a target detection image.
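A minimal fusion sketch, assuming element-wise averaging of the (already resized) similarity maps as the fusion algorithm; the patent leaves the concrete algorithm open, so this is only one possible choice:

```python
import numpy as np

def fuse_similarity_maps(maps):
    """Fuse same-sized similarity maps into a single target detection
    image by element-wise averaging (one simple fusion algorithm)."""
    return np.mean(np.stack(maps, axis=0), axis=0)

m1 = np.ones((4, 4))
m2 = np.zeros((4, 4))
target = fuse_similarity_maps([m1, m2])
# every pixel of `target` is the average of the two maps, i.e. 0.5
```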
S803, performing defect detection on the target detection image to obtain a defect area.
When the computer equipment obtains the target detection image, the gray value of each pixel point in the target detection image can be further determined, and then the defect area in the target detection image is determined according to the gray value of each pixel point. Specifically, pixel points corresponding to gray values other than 0 in the target detection image can be communicated, and the formed area is a defect area; alternatively, a pixel point corresponding to a gray value greater than a preset gray threshold in the target detection image may be determined as a target pixel point, and then a region formed by connecting the target pixel points may be determined as a defect region.
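Connecting the above-threshold pixels into defect regions can be sketched with a simple 4-connected flood fill in pure Python (`defect_regions` is an illustrative name; a library connected-components routine could be used instead):

```python
import numpy as np
from collections import deque

def defect_regions(gray, threshold=0.0):
    """Group 4-connected pixels whose gray value exceeds `threshold`
    into regions; each region is a list of (row, col) coordinates."""
    h, w = gray.shape
    visited = np.zeros((h, w), dtype=bool)
    regions = []
    for i in range(h):
        for j in range(w):
            if gray[i, j] > threshold and not visited[i, j]:
                queue, region = deque([(i, j)]), []
                visited[i, j] = True
                while queue:
                    y, x = queue.popleft()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and gray[ny, nx] > threshold
                                and not visited[ny, nx]):
                            visited[ny, nx] = True
                            queue.append((ny, nx))
                regions.append(region)
    return regions

gray = np.zeros((5, 5))
gray[0, 0] = 1.0               # one isolated defect pixel
gray[3, 3] = gray[3, 4] = 1.0  # one two-pixel defect region
regions = defect_regions(gray)
# two separate regions are found, of sizes 1 and 2
```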
According to the embodiment of the application, after the plurality of similar characteristic images are fused, the defect detection is performed based on the fused similar characteristic images, and as the plurality of similar characteristic images reflect the learning processes of a teacher network and a student network in a plurality of stages, namely reflect the characteristics of a plurality of scales, the defect detection based on the characteristics of a plurality of scales is realized, and the detection accuracy can be improved to a certain extent.
Based on the defect detection method, the application further provides a method for training the teacher network and the student network, namely the method in the embodiment of fig. 2 further includes the steps of:
training the detection network according to the sample image to obtain a teacher network and a student network.
Wherein the sample image is an image without a defective area, and the detection network includes an initial teacher network and an initial student network. The training method of this embodiment of the application is as follows: a detection network is built from an initial teacher network and an initial student network, a sample image that does not contain a defect area is acquired, and the sample image is input into the detection network for training, so as to obtain a trained teacher network and a trained student network. Because this training method trains on defect-free sample images, which are easier to obtain in large numbers, the training effect can be improved to a certain extent.
Further, the training method, as shown in fig. 15, includes:
and S901, inputting the sample image into an initial teacher network for coding to obtain a plurality of sample coding features.
Wherein the sample image is an image of a bead region, which does not contain defects on the bead. The initial teacher network is a pre-trained neural network or encoder that can be used to encode the input image in multiple stages, outputting sample encoding features for each stage, i.e., the plurality of sample encoding features can be the sample encoding features output by the initial teacher network at different encoding stages. Each sample encoding feature may be represented using a feature map. Alternatively, the initial teacher network may also be used to encode the input image once, so as to directly obtain a plurality of sample coding features, that is, the plurality of sample coding features may be the sample coding features output by the initial teacher network in the same encoding stage.
In this embodiment of the present application, the computer device may acquire the sample image first, then input the sample image to the built initial teacher network to perform encoding at different stages, and output sample encoding features with different sizes or the same size at each stage or several stages thereof, so as to obtain a plurality of sample encoding features. Optionally, the computer device may further input the sample image to the constructed initial teacher network for encoding at the same stage, so as to obtain a plurality of sample encoding features with the same size or different sizes at a time.
Optionally, the sample image may be preprocessed, for example downsampled, before being input to the detection network for training; for instance, it may be downsampled to 1024 × 1024 as the detection network input. During training, 32 sample images can be randomly selected as a batch of sample images, and the initial learning rate can be set to 10⁻⁴.
S902, inputting the fused multiple sample coding features to an initial student network for decoding to obtain multiple sample decoding features.
The initial student network is a neural network or a decoder, which is used for performing multi-stage image decoding on an input characteristic image, and outputting sample decoding characteristics of each stage, that is, the plurality of sample decoding characteristics can be sample decoding characteristics output by the initial student network in different decoding stages. Each sample decoding feature may be represented using a feature map. The plurality of sample encoding features and the plurality of sample decoding features are in a one-to-one correspondence.
In this embodiment of the present application, after a computer device obtains multiple sample coding features output by an initial teacher network, the multiple sample coding features may be fused in a channel dimension of an image to obtain a sample fusion feature, then the sample fusion feature is input into an initial student network to perform decoding at different stages, and sample decoding features with different sizes or the same size are output at each stage or several stages thereof to obtain multiple sample decoding features. Optionally, the computer device may also input the sample fusion feature to the initial student network for the same stage of decoding, i.e. to disassemble the sample fusion feature into a plurality of sample decoding features of different sizes or the same size.
S903, determining a target image according to the plurality of sample coding features and the plurality of sample decoding features.
When the computer equipment acquires the plurality of sample coding features and the plurality of sample decoding features, a plurality of sample similar feature images can be obtained according to the similarity between each sample coding feature and each sample decoding feature. And fusing the similar characteristic images of the plurality of samples to obtain a target image.
S904, determining target loss according to the target image and the sample image, and training an initial teacher network and an initial student network according to the target loss to obtain the teacher network and the student network.
After the target image is determined, a target loss can be determined according to the target image and the sample image. Specifically, the target loss can be determined by calculating the similarity between the target image and the sample image. The respective model parameters of the initial teacher network and the initial student network are then adjusted according to the target loss until training is completed, so as to obtain the trained teacher network and student network.
According to the training method, the teacher network and the student network are trained simultaneously, so that the teacher network and the student network learn each other to form a self-distillation structure, the sensitivity of the student network to noise can be effectively restrained to a certain extent, the training effect is improved, and the detection accuracy of a detection network formed by the trained teacher network and the trained student network is further improved.
In summary, the present application further provides a detection network. As shown in fig. 16, the detection network includes a first convolution module, a second convolution module, a third convolution module, a first encoding module, a second encoding module, a third encoding module, a fusion module, a conversion module, a first decoding module, a second decoding module, and a third decoding module. An output end of the first convolution module is connected with the first encoding module; an output end of the first encoding module is connected with the second encoding module and the second convolution module; an output end of the second encoding module is connected with the third encoding module and the third convolution module; an output end of the third encoding module, an output end of the second convolution module, and an output end of the third convolution module are connected with the fusion module; an output end of the fusion module is connected with the conversion module; an output end of the conversion module is connected with the first decoding module; an output end of the first decoding module is connected with the second decoding module; and the output end of the second encoding module is further connected with an input end of the third decoding module.
The detection network has the following working principle: the image I to be detected is input to the first convolution module for convolution to obtain a first convolution feature J1^t; the first convolution feature J1^t is input to the first encoding module for coding to obtain a first coding feature F1^t; the first coding feature F1^t is input to the second encoding module for coding to obtain a second coding feature F2^t, and is also input to the second convolution module for convolution to obtain a second convolution feature J2^t; the second coding feature F2^t is then input to the third encoding module for coding to obtain a third coding feature F3^t, and is also input to the third convolution module for convolution to obtain a third convolution feature J3^t. The third coding feature F3^t, the second convolution feature J2^t, and the third convolution feature J3^t are then input to the fusion module for fusion, and the fused feature is input to the conversion module to obtain an abstract feature C. The abstract feature C is input to the first decoding module for decoding to obtain a first decoding feature F1^S, and the first decoding feature F1^S is input to the second decoding module for decoding to obtain a second decoding feature F2^S. In addition, the second coding feature F2^t output by the second encoding module is input to the third decoding module for decoding to obtain a third decoding feature F3^S.
Then, the similarity between the feature map corresponding to the first decoding feature F1^S and the feature map corresponding to the first coding feature F1^t is calculated to determine a first similar feature image corresponding to F1^S and F1^t; the similarity between the feature map corresponding to the second decoding feature F2^S and the feature map corresponding to the second coding feature F2^t is calculated to determine a second similar feature image corresponding to F2^S and F2^t; and the similarity between the feature map corresponding to the third decoding feature F3^S and the feature map corresponding to the third coding feature F3^t is calculated to determine a third similar feature image corresponding to F3^S and F3^t. Finally, the first similar feature image, the second similar feature image, and the third similar feature image are fused to obtain a target detection image, and defect detection is performed based on the target detection image to determine the defect area in the image to be detected. It should be noted that, when calculating the similarity between feature maps, the cosine similarity may be calculated point by point; for example, the point-by-point cosine similarity Cos(Fi^S, Fi^t) between Fi^S and Fi^t is calculated, and the corresponding similar feature image is determined based on this point-by-point cosine similarity. By using this method to detect defect images, the accuracy of weld bead region positioning can be improved by about 20%, and in terms of noise suppression the accuracy can be improved by about 10%-15%. Further, the detected image of the defect area can assist related personnel in defect marking on the real weld bead image, reducing the time spent on manual marking by about 50%.
According to the defect detection method, the image to be detected is input to the plurality of coding modules connected step by step to be coded, so that a plurality of coding features are obtained, the plurality of coding features are fused and then are input to the plurality of decoding modules connected step by step to be decoded, a plurality of decoding features are obtained, and the defect area in the image to be detected is determined according to the plurality of coding features and the plurality of decoding features. According to the defect detection method, the asymmetric decoding module is established on the output of the encoding module to perform unsupervised learning, so that the sensitivity of noise can be reduced to a certain extent in the learning process of the encoding module and the decoding module, and the defect detection accuracy is improved.
It should be understood that, although the steps in the flowcharts related to the embodiments described above are sequentially shown as indicated by arrows, these steps are not necessarily sequentially performed in the order indicated by the arrows. The steps are not strictly limited to the order of execution unless explicitly recited herein, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts described in the above embodiments may include a plurality of steps or a plurality of stages, which are not necessarily performed at the same time, but may be performed at different times, and the order of the steps or stages is not necessarily performed sequentially, but may be performed alternately or alternately with at least some of the other steps or stages.
Based on the same inventive concept, the embodiment of the application also provides a defect detection device for realizing the defect detection method. The implementation of the solution provided by the device is similar to that described in the above method, so the specific limitation of one or more embodiments of the defect detection device provided below may be referred to above for limitation of the defect detection method, and will not be repeated here.
In one embodiment, as shown in fig. 17, there is provided a defect detecting apparatus including:
the coding module 10 is used for inputting the image to be detected into a teacher network for coding to obtain a plurality of coding features;
the decoding module 11 is configured to fuse the plurality of coding features and input them to a student network for decoding, so as to obtain a plurality of decoding features;
a detection module 12, configured to determine a defect area in the image to be detected according to the plurality of encoding features and the plurality of decoding features.
The respective modules in the above defect detection apparatus may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided comprising a memory and a processor, the memory having stored therein a computer program, the processor when executing the computer program performing the steps of:
inputting an image to be detected into a teacher network for coding to obtain a plurality of coding features;
the plurality of coding features are input into a student network for decoding after being fused, and a plurality of decoding features are obtained;
and determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
The computer device provided in the foregoing embodiments has similar implementation principles and technical effects to those of the foregoing method embodiments, and will not be described herein in detail.
In one embodiment, a computer readable storage medium is provided having a computer program stored thereon, which when executed by a processor, performs the steps of:
inputting an image to be detected into a teacher network for coding to obtain a plurality of coding features;
the plurality of coding features are input into a student network for decoding after being fused, and a plurality of decoding features are obtained;
and determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
The foregoing embodiment provides a computer readable storage medium, which has similar principles and technical effects to those of the foregoing method embodiment, and will not be described herein.
In one embodiment, a computer program product is provided comprising a computer program which, when executed by a processor, performs the steps of:
inputting an image to be detected into a teacher network for coding to obtain a plurality of coding features;
the plurality of coding features are input into a student network for decoding after being fused, and a plurality of decoding features are obtained;
and determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
The foregoing embodiment provides a computer program product, which has similar principles and technical effects to those of the foregoing method embodiment, and will not be described herein.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the various embodiments provided herein may include at least one of non-volatile and volatile memory. The nonvolatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high density embedded nonvolatile memory, resistive random access memory (ReRAM), magnetoresistive random access memory (Magnetoresistive Random Access Memory, MRAM), ferroelectric memory (Ferroelectric Random Access Memory, FRAM), phase change memory (Phase Change Memory, PCM), graphene memory, and the like. Volatile memory can include random access memory (Random Access Memory, RAM), external cache memory, and the like. By way of illustration, and not limitation, RAM can be in a variety of forms, such as static random access memory (Static Random Access Memory, SRAM) or dynamic random access memory (Dynamic Random Access Memory, DRAM), and the like. The databases referred to in the various embodiments provided herein may include at least one of relational databases and non-relational databases. The non-relational database may include, but is not limited to, a blockchain-based distributed database, and the like. The processors referred to in the embodiments provided herein may be general purpose processors, central processing units, graphics processors, digital signal processors, programmable logic units, quantum computing-based data processing logic units, etc., without being limited thereto.
The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.
The above examples only represent a few embodiments of the present application, which are described in more detail and are not to be construed as limiting the scope of the present application. It should be noted that it would be apparent to those skilled in the art that various modifications and improvements could be made without departing from the spirit of the present application, which would be within the scope of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.

Claims (17)

1. A method of defect detection, the method comprising:
inputting an image to be detected into a teacher network for coding to obtain a plurality of coding features;
the plurality of coding features are input into a student network for decoding after being fused, and a plurality of decoding features are obtained;
and determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
2. The method of claim 1, wherein the teacher network comprises a first convolution module and a coding network, and wherein the inputting the image to be detected into the teacher network for coding to obtain a plurality of coding features comprises:
inputting the image to be detected into the first convolution module to obtain convolution characteristics;
and inputting the convolution characteristics into the coding network for coding to obtain the plurality of coding characteristics.
3. The method of claim 2, wherein the coding network comprises a plurality of coding modules, the plurality of coding modules being cascade connected, and wherein the inputting the convolution features into the coding network for coding to obtain the plurality of coding features comprises:
and inputting the convolution characteristics to the plurality of coding modules to code so as to obtain the plurality of coding characteristics.
4. The method according to claim 3, wherein the inputting the convolution features into the plurality of coding modules for coding to obtain the plurality of coding features comprises:
inputting the convolution characteristics into the plurality of coding modules to code so as to obtain a plurality of initial coding characteristics;
and carrying out size adjustment on each initial coding feature to obtain the plurality of coding features.
5. The method of claim 4, wherein the coding network comprises N coding modules and N second convolution modules, an output end of the Nth coding module being connected with an input end of the Nth second convolution module, and wherein the carrying out size adjustment on each initial coding feature to obtain the plurality of coding features comprises:
inputting the initial coding features output by each coding module to a corresponding second convolution module for size adjustment to obtain a plurality of coding features; each of the encoding features is of uniform size.
6. The method of claim 4, wherein the coding network comprises M+1 coding modules and M second convolution modules, an output end of the Mth coding module being connected with an input end of the Mth second convolution module, and wherein the carrying out size adjustment on each initial coding feature to obtain the plurality of coding features comprises:
the initial coding features output by the M coding modules are respectively input to the corresponding second convolution modules for size adjustment, so that M coding features are obtained;
and determining the M coding features and the initial coding features output by the M+1th coding module as the plurality of coding features.
7. The method of claim 1, wherein the student network comprises a transformation module and a plurality of decoding modules, and wherein the inputting the plurality of coding features after fusion into the student network for decoding to obtain the plurality of decoding features comprises:
the plurality of coding features are fused and then input to the transformation module, so that abstract features are obtained;
and inputting the abstract features to the plurality of decoding modules for decoding to obtain the plurality of decoding features.
8. The method of claim 7, wherein the plurality of decoding modules are cascade connected, and wherein the inputting the abstract features into the plurality of decoding modules for decoding to obtain the plurality of decoding features comprises:
and inputting the abstract features to the plurality of decoding modules for decoding to obtain the plurality of decoding features.
9. The method of claim 7, wherein the plurality of decoding modules includes K decoding modules, and wherein a 1 st decoding module to a K-1 st decoding module of the K decoding modules are connected in cascade, and wherein inputting the abstract feature into the plurality of decoding modules for decoding, to obtain the plurality of decoding features, comprises:
Inputting the abstract features to the 1 st decoding module to the K-1 st decoding module for decoding to obtain decoding features output by each of the 1 st decoding module to the K-1 st decoding module;
and inputting the coding features output by the second coding module in the teacher network to a Kth decoding module in the K decoding modules for decoding to obtain the decoding features output by the Kth decoding module.
10. The method according to any one of claims 1-9, wherein said determining a defective area in the image to be detected from the plurality of coding features and the plurality of decoding features comprises:
obtaining a plurality of similar feature images according to the similarity between each coding feature and each decoding feature;
fusing the plurality of similar characteristic images to obtain a target detection image;
and carrying out defect detection on the target detection image to obtain the defect region.
11. The method according to claim 10, wherein performing defect detection on the target detection image to obtain the defect area includes:
determining a pixel point corresponding to a gray value larger than a preset gray threshold value as a target pixel point;
And determining a region formed by connecting the target pixel points as the defect region.
12. The method according to claim 10, wherein the method further comprises:
performing size adjustment on the plurality of similar feature images to obtain a plurality of similar feature images with adjusted sizes; the size of each of the similar feature images after the size adjustment is consistent with the size of the image to be detected;
the fusing the plurality of similar characteristic images to obtain a target detection image comprises the following steps:
and fusing the plurality of the similar characteristic images after the size adjustment to obtain the target detection image.
13. The method according to claim 1, wherein the method further comprises:
training a detection network according to the sample image to obtain the teacher network and the student network; the sample image is an image without a defect area; the detection network includes an initial teacher network and an initial student network.
14. The method of claim 13, wherein training the test network from the sample image results in the teacher network and the student network, comprising:
Inputting the sample image into the initial teacher network for coding to obtain a plurality of sample coding features;
the plurality of sample coding features are fused and then input into an initial student network for decoding, so that a plurality of sample decoding features are obtained;
determining a target image according to the plurality of sample coding features and the plurality of sample decoding features;
and determining target loss according to the target image and the sample image, and training the initial teacher network and the initial student network according to the target loss to obtain the teacher network and the student network.
15. A defect detection apparatus, the apparatus comprising:
the coding module is used for inputting the image to be detected into a teacher network for coding to obtain a plurality of coding features;
the decoding module is used for inputting the fused multiple coding features into a student network for decoding to obtain multiple decoding features;
and the detection module is used for determining a defect area in the image to be detected according to the plurality of coding features and the plurality of decoding features.
16. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 14 when the computer program is executed.
17. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 14.
CN202310806554.XA 2023-07-04 2023-07-04 Defect detection method, device, computer equipment and storage medium Active CN116523917B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310806554.XA CN116523917B (en) 2023-07-04 2023-07-04 Defect detection method, device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN116523917A true CN116523917A (en) 2023-08-01
CN116523917B CN116523917B (en) 2023-10-13

Family

ID=87399797

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310806554.XA Active CN116523917B (en) 2023-07-04 2023-07-04 Defect detection method, device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116523917B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111443096A (en) * 2020-04-03 2020-07-24 联觉(深圳)科技有限公司 Method and system for detecting defects of printed circuit board, electronic device and storage medium
CN114332007A (en) * 2021-12-28 2022-04-12 福州大学 Transformer-based industrial defect detection and identification method
US20220253998A1 (en) * 2021-02-09 2022-08-11 Hon Hai Precision Industry Co., Ltd. Image defect detection method, electronic device using the same
CN115439483A (en) * 2022-11-09 2022-12-06 四川川锅环保工程有限公司 High-quality welding seam and welding seam defect identification system, method and storage medium
CN115471645A (en) * 2022-11-15 2022-12-13 南京信息工程大学 Knowledge distillation anomaly detection method based on U-shaped student network
CN115861256A (en) * 2022-12-15 2023-03-28 南京信息工程大学 Anomaly detection method based on knowledge distillation combined with image reconstruction
CN116227582A (en) * 2023-01-31 2023-06-06 国网智能电网研究院有限公司 Knowledge distillation method, device, equipment and storage medium of mask self-encoder



Similar Documents

Publication Publication Date Title
CN110490082B (en) Road scene semantic segmentation method capable of effectively fusing neural network features
CN111127468B (en) Road crack detection method and device
CN110490205B (en) Road scene semantic segmentation method based on full-residual dilated convolutional neural network
CN112613375B (en) Tire damage detection and identification method and equipment
CN112348770A (en) Bridge crack detection method based on multi-resolution convolution network
CN111784673A (en) Defect detection model training and defect detection method, device and storage medium
CN110826588A (en) Drainage pipeline defect detection method based on attention mechanism
CN116051549B (en) Method, system, medium and equipment for segmenting solar cell defects
CN114581300A (en) Image super-resolution reconstruction method and device
CN111811694A (en) Temperature calibration method, device, equipment and storage medium
CN113965659A (en) HEVC (high efficiency video coding) video steganalysis training method and system based on network-to-network
CN114913599A (en) Video abnormal behavior detection method and system based on automatic encoder
CN116523917B (en) Defect detection method, device, computer equipment and storage medium
CN114612803A (en) Transmission line insulator defect detection method for improving CenterNet
CN116188391A (en) Method and device for detecting broken gate defect, electronic equipment and storage medium
CN112700426A (en) Method for detecting salient object in complex environment
CN116433647A (en) Insulator image quality evaluation method and system based on multitask learning
CN117173098A (en) Gear surface defect detection method based on RDMS
CN116152637A (en) Evaluation method of automatic driving perception model, computer equipment and storage medium
CN114399459A (en) Two-branch neural network defect detection method, device, equipment, medium and product
CN113034432B (en) Product defect detection method, system, device and storage medium
CN114494697A (en) Semantic understanding method for hip bone image of newborn
CN114359229A (en) Defect detection method based on DSC-UNET model
CN113609957A (en) Human behavior recognition method and terminal
CN112015932A (en) Image storage method, medium and device based on neural network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant