CN116958739B - Attention mechanism-based carbon fiber channel real-time dynamic numbering method - Google Patents

Attention mechanism-based carbon fiber channel real-time dynamic numbering method Download PDF

Info

Publication number
CN116958739B
Authority
CN
China
Prior art keywords
network
carbon fiber
image
channel
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310752546.1A
Other languages
Chinese (zh)
Other versions
CN116958739A (en)
Inventor
古玲
武继杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Jushi Technology Co ltd
Original Assignee
Nanjing Jushi Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Jushi Technology Co ltd filed Critical Nanjing Jushi Technology Co ltd
Priority to CN202310752546.1A priority Critical patent/CN116958739B/en
Publication of CN116958739A publication Critical patent/CN116958739A/en
Application granted granted Critical
Publication of CN116958739B publication Critical patent/CN116958739B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/70Labelling scene content, e.g. deriving syntactic or semantic representations
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02PCLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30Computing systems specially adapted for manufacturing

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Image Processing (AREA)

Abstract

A carbon fiber channel real-time dynamic numbering method based on an attention mechanism relates to the technical field of image processing. Sampling data are obtained with an industrial camera; the sampled data are preprocessed, the preprocessing comprising training a channel prediction network and labeling carbon fiber channel numbers in actual scenes; and the processing result is output to obtain the label numbers of the carbon fiber yarn channels. The invention adopts a vision Transformer network to process carbon fiber images acquired by a line-scan industrial camera and dynamically numbers the carbon fiber yarns in the image data in real time, in order from top to bottom. Data with doubled or broken yarns are handled well even in the absence of obvious distinguishing features.

Description

Attention mechanism-based carbon fiber channel real-time dynamic numbering method
Technical Field
The invention relates to the technical field of image processing, in particular to a carbon fiber yarn channel real-time dynamic numbering method based on an attention mechanism.
Background
In the production process of carbon fiber yarn, surface defects (broken yarns, joints and the like) are a prominent problem and seriously threaten the quality of carbon fiber products (such as carbon fiber composite materials and reinforced carbon fiber concrete structures). These surface defects arise because, during carbon fiber production, yarn that solidifies slowly retains a certain surface tackiness, or uneven impregnation of the oiling agent causes yarns to double together; such defects are difficult to locate and trace and seriously affect the quality and service life of carbon fiber products.
Because production environments are complex and varied, the quality of the carbon fiber yarn images acquired by the line-scan industrial camera is uneven, and the features that distinguish yarn channels are not obvious. In addition, carbon fiber yarn in the actual environment changes in many ways and is prone to breakage, shifts in yarn position, and the like. Traditional image processing algorithms cannot cope with such complex data, so the accuracy of yarn channel prediction cannot be guaranteed.
Owing to doubling, yarn breakage and similar problems, algorithms based on traditional image processing currently on the market are not applicable. That is, the image data contain multiple carbon yarns that overlap or appear discontinuous in the middle; as captured by an industrial camera, these yarns often cannot be told apart visually and have no obvious distinguishing features in the image. Conventional algorithms have difficulty processing such data. How to number doubled or broken yarns more accurately in the absence of obvious distinguishing features is therefore a problem worth discussing.
Disclosure of Invention
The invention provides a real-time carbon fiber yarn channel numbering algorithm based on an attention mechanism, which uses a Transformer to process carbon fiber images acquired by a line-scan industrial camera, outputs predictions for the carbon fiber yarns and the background in the images, and realizes the determination and numbering of carbon fiber yarn channels.
A carbon fiber channel real-time dynamic numbering method based on an attention mechanism comprises the following steps:
step S1: obtaining sampling data by using an industrial camera;
step S2: preprocessing the sampled data;
Step S3: outputting the processing result to obtain the label numbers of the carbon fiber yarn channels.
Preferably, the preprocessing of the sampled data in step S2 of the present invention includes step S21, training the yarn channel prediction network, which specifically includes the following steps:
Step S211: preprocessing data; specific:
The sampled data D = {(x_i, y_i)} are divided into two parts, a training set D_train and a validation set D_val; where x_i is a 3×h×w carbon fiber yarn original image and y_i is a 1×w yarn channel label whose value is 1 at yarn channel positions and 0 at background positions;
For the training set, each carbon fiber yarn original image undergoes the preprocessing operations of Resize, normalization, random jitter of color brightness and saturation, and random vertical flipping, and each yarn channel label must be adjusted according to the width of the carbon fiber yarn original image after Resize; a bilinear interpolation algorithm is used for the Resize of the original image, and a bicubic interpolation algorithm is used for the adjustment of the label;
For the validation set, only the Resize and normalization preprocessing operations are needed for each carbon fiber yarn original image, and each yarn channel label is adjusted according to the width of the image after Resize;
The image sizes after Resize are the same for the training set and the validation set;
Step S212: outputting a classification predicted value and a position predicted value;
Step S213: calculating loss;
Step S214: optimizing and predicting network parameters;
Step S215: outputting the optimal prediction model.
Preferably, step S212 of the present invention outputs the classification predicted value and the position predicted value; the specific process is as follows:
Initializing parameters, including the Linear mapping layer, the Transformer encoding network, the classification network and the channel numbering network; the optimizer Adam; and the labeled carbon yarn data set;
Taking a [3, h, w] image as the input sample, starting from the i-th row, i.e. taking the image rows from i to i+h;
Then dividing the [3, h, w] input sample by using a grid of size [h, w_p] to obtain k = w/w_p image blocks; compressing the first 3 dimensions to obtain inputs of dimension 3×h×w_p; letting d = 3×h×w_p, the input sample of the i-th position x_i ∈ R^(d×k) is obtained;
Linearly mapping the input sample and splicing on the position embedding vector to obtain the Transformer network input x_i ∈ R^((d+1)×k);
Inputting x_i into the Transformer encoding network to obtain the encoded output o_i = Transformer(x_i) ∈ R^((d+1)×k);
Feeding o_i into the classification network f_cls and the channel numbering network f_loc respectively to obtain the classification prediction ŷ_cls ∈ R^(1×k) and the position prediction ŷ_loc ∈ R^(1×k) of the k input blocks.
Preferably, step S213 of the present invention calculates the loss; the specific process is as follows:
Calculating the classification prediction loss l_cls = MSELoss(ŷ_cls, y_cls), where MSELoss is the mean square error loss and y_cls ∈ R^(1×k) contains the k true categories, each valued 0 or 1; when the area occupied by the carbon yarn in the corresponding k-th input image exceeds a certain threshold (0.5), the value is 1, otherwise the value is 0;
Calculating the position prediction loss l_loc = MSELoss(ŷ_loc, y_loc), where y_loc ∈ R^(1×k) is the true label corresponding to the channel numbers;
where the true value corresponding to the k-th image block is determined by the channel index j, j = 1, …, N, with N the total number of channels;
Calculating the total loss l = l_cls + l_loc, and updating the network parameters with the Adam optimizer, including the Linear mapping layer, the Transformer encoding network, the classification network f_cls and the channel numbering network f_loc.
Preferably, step S215 of the present invention outputs the optimal prediction model; the specific process is as follows:
Optimizing the whole network parameters by using an Adam optimizer;
The value of i is increased by 1 and the whole process is repeated from step 1; when i = w − w_p, i is reset to 0, the next picture is taken, and the whole process is repeated from step 1;
The loss on the validation set is calculated using the validation set, the model with the minimum loss is selected as the optimal prediction network, and the final yarn channel prediction network is output.
Preferably, the preprocessing of the sampled data in step S2 of the present invention further includes labeling the carbon fiber channel numbers in the actual scene; the specific process is as follows:
For the test data, the Resize and normalization operations are carried out on each original picture;
The processed test image is input into the final yarn channel prediction network obtained in step S215 to obtain the output prediction vector, and the carbon fiber channel numbers are numbered accordingly.
The method trains a vision Transformer in advance with images and annotation data of carbon fiber yarns, and predicts the channel numbers in carbon fiber yarn images. The model has few parameters, computes quickly and efficiently, requires little post-processing, realizes real-time dynamic numbering of the carbon fiber yarn channels, and has a certain generalization capability and stability.
Drawings
FIG. 1 is a flow chart of the real-time dynamic numbering method of the present invention.
FIG. 2 is a flow chart of the training phase of the present invention.
FIG. 3 is a flow chart of the prediction phase of the present invention.
Fig. 4 is a network configuration diagram of the present invention.
Detailed Description
The technical scheme of the invention is described in detail below with reference to the accompanying drawings:
as shown in fig. 1, a method for dynamically numbering carbon fiber channels in real time based on an attention mechanism comprises the following steps:
step S1: obtaining sampling data by using an industrial camera;
step S2: preprocessing the sampled data;
Step S3: outputting the processing result to obtain the label numbers of the carbon fiber yarn channels.
As shown in fig. 2, the preprocessing of the sampled data in step S2 of the present invention includes step S21, training the yarn channel prediction network, which specifically includes the following steps:
Step S211: preprocessing data; specific:
The sampled data D = {(x_i, y_i)} are divided into two parts, a training set D_train = {(x_i, y_i)}, i = 1, …, N_train, and a validation set D_val = {(x_i, y_i)}, i = 1, …, N_val; where x_i is a 3×h×w carbon fiber yarn original image, y_i is a 1×w carbon fiber yarn channel label whose value is 1 at yarn channel positions and 0 at background positions, and N_train and N_val are respectively the total numbers of samples in the training set and the validation set;
For the training set, each carbon fiber yarn original image undergoes the preprocessing operations of Resize, normalization, color jitter and random vertical flipping, and each yarn channel label must be adjusted according to the width of the carbon fiber yarn original image after Resize; a bilinear interpolation algorithm is used for the Resize of the original image, and a bicubic interpolation algorithm is used for the adjustment of the label;
the specific operation of normalization is as follows: Mu is the mean of the sample and sigma is the variance.
The specific operation of the color jitter is as follows: for a pixel (r, g, b), assume the value of each channel lies in [0, 1]. ColorJitter performs the following on each channel: C′_i = C_i × (1 + α × rand(−1, 1)), where C_i is the value of the current channel, C′_i is the new value after processing, α is the adjustment intensity, typically a fraction in [0, 1], and rand(−1, 1) is a random number uniformly distributed in [−1, 1]. After the above operation, the new color is obtained by applying trunc() to each processed channel, where trunc() is a rounding function that rounds the calculation result to an integer and converts it to a pixel value on the image.
Random vertical flip: the current image sample x_i is randomly flipped vertically with a certain probability p, where p is set to 0.5.
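As a concrete illustration, the three operations above can be sketched as follows (a minimal Python/NumPy sketch under stated assumptions: images are [3, h, w] float arrays in [0, 1], the value α = 0.4 and the clipping back to [0, 1] are illustrative choices, and the vertical flip is assumed to turn the image upside down, which leaves the 1×w column-wise yarn channel label unchanged):

```python
import numpy as np

def normalize(img, mu, sigma):
    # x' = (x - mu) / sigma, with mu and sigma taken from the sample statistics
    return (img - mu) / sigma

def color_jitter(img, alpha=0.4, rng=np.random):
    # C'_i = C_i * (1 + alpha * rand(-1, 1)), drawn independently per channel,
    # then clipped back into [0, 1] (the clipping is an illustrative choice)
    factors = 1.0 + alpha * rng.uniform(-1.0, 1.0, size=(img.shape[0], 1, 1))
    return np.clip(img * factors, 0.0, 1.0)

def random_vflip(img, label, p=0.5, rng=np.random):
    # flip the [3, h, w] image upside down with probability p; the 1 x w
    # column-wise channel label is unaffected by a vertical flip
    if rng.random() < p:
        img = img[:, ::-1, :]
    return img, label
```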
Resize: suppose an image of size M×N is to be resized to m×n (m < M, n < N). First, the position in the original image corresponding to each pixel of the reduced image must be computed; if the pixel at position (i, j) of the reduced image corresponds to position (x, y) in the original image, then x = i×M/m and y = j×N/n. The bilinear interpolation algorithm and the bicubic interpolation algorithm used in Resize are respectively:
(1) Bilinear interpolation algorithm:
1. For each position (i, j) of the image after Resize, calculate the coordinates of the four pixels nearest to the corresponding position (x, y) in the original image: (x_1, y_1), (x_1, y_2), (x_2, y_1) and (x_2, y_2), where x_1 and y_1 are the largest integers satisfying x_1 ≤ x and y_1 ≤ y, and x_2 and y_2 are the smallest integers satisfying x_2 ≥ x and y_2 ≥ y.
2. For each coordinate position (i, j) in the resized image, the pixel value f(i, j) is calculated as follows:
f(i, j) = (1−w)(1−h)f(x_1, y_1) + w(1−h)f(x_2, y_1) + (1−w)h·f(x_1, y_2) + wh·f(x_2, y_2),
where w = (x − x_1)/(x_2 − x_1) and h = (y − y_1)/(y_2 − y_1) are the weight coefficients.
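A direct, unoptimized sketch of this bilinear Resize follows (assuming [C, M, N] float arrays; taking x_2 = x_1 + 1 makes the weights reduce to the fractional parts of x and y and avoids a zero division when x or y is already an integer; in practice a library routine such as OpenCV's cv2.resize would normally be used):

```python
import numpy as np

def resize_bilinear(img, m, n):
    # resize a [C, M, N] image to [C, m, n] by bilinear interpolation
    C, M, N = img.shape
    out = np.zeros((C, m, n), dtype=np.float64)
    for i in range(m):
        for j in range(n):
            x, y = i * M / m, j * N / n      # corresponding position in the original
            x1, y1 = int(x), int(y)
            x2, y2 = min(x1 + 1, M - 1), min(y1 + 1, N - 1)
            w, h = x - x1, y - y1            # fractional-part weights
            out[:, i, j] = ((1 - w) * (1 - h) * img[:, x1, y1]
                            + w * (1 - h) * img[:, x2, y1]
                            + (1 - w) * h * img[:, x1, y2]
                            + w * h * img[:, x2, y2])
    return out
```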
(2) Bicubic interpolation algorithm:
The pixel value f(i, j) of the reduced image must be calculated from the values around (x, y). Specifically, for the (i, j)-th pixel, the pixel value can be calculated as
f(i, j) = Σ_{u=a}^{a+3} Σ_{v=b}^{b+3} g(u, v)·w_{u−a, v−b},
where (a, b) denotes the position of the upper-left pixel among the 16 pixels adjacent to the corresponding position (x, y) in the original image, g(u, v) denotes the pixel value in the u-th row and v-th column of the original image, and w_{u−a, v−b} is a weighting coefficient calculated from (x, y). Specifically, the weighting coefficients can be calculated from the following formula:
w_{u−a, v−b} = w(s)·w(t)
where s = u − x + 1 and t = v − y + 1, and w(s) and w(t) denote the weighting coefficients in the s and t directions respectively, both given by the bicubic interpolation weight function w(z).
The coefficient function w(z) is called the bicubic interpolation weight function and controls the smoothness of the interpolation: the value of w(z) is largest when z is 0, where the gray value of the corresponding pixel has the greatest influence on the interpolation result, and as z moves away from 0 the value of w(z) gradually decreases, so its influence on the interpolation result becomes smaller and smaller.
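The closed form of w(z) is not reproduced in the text above; a common kernel consistent with that description (largest at z = 0, vanishing by |z| = 2) is the standard bicubic (Keys) kernel, sketched here as an assumption with the usual parameter a = −0.5:

```python
def bicubic_weight(z, a=-0.5):
    # standard bicubic (Keys) kernel: w(0) = 1 and w(z) = 0 for |z| >= 2
    z = abs(z)
    if z <= 1:
        return (a + 2) * z**3 - (a + 3) * z**2 + 1
    if z < 2:
        return a * z**3 - 5 * a * z**2 + 8 * a * z - 4 * a
    return 0.0
```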
For the validation set data D_val, each carbon fiber yarn original image only needs the Resize (bilinear interpolation algorithm, as above) and normalization preprocessing operations, and each yarn channel label is adjusted by Resize (bicubic interpolation algorithm, as above) according to the width of the image after Resize;
The image sizes after Resize are the same for the training set and the validation set;
Step S212: outputting a classification predicted value and a position predicted value; the specific process is as follows:
Initializing parameters, including the Linear mapping layer, the Transformer encoding network, the classification network and the channel numbering network; the optimizer Adam; and the carbon fiber yarn original images with their corresponding yarn channel labels; as shown in fig. 4,
Taking a [3, h, w] image as the input sample, starting from the i-th row, i.e. taking the image rows from i to i+h;
Then dividing the [3, h, w] input sample by using a grid of size [h, w_p] to obtain k = w/w_p image blocks; compressing the first 3 dimensions to obtain inputs of dimension 3×h×w_p; letting d = 3×h×w_p, the input sample of the i-th position x_i ∈ R^(d×k) is obtained, where k is the total number of divided image blocks and d is the dimension of each compressed image block. R denotes the real vector space, and R^(d×k) denotes the d-by-k-dimensional vector space. If the width w_p of each image block is set to 5, labeling a single picture yields w/w_p training samples, so the total number of pictures to label can be kept within 100.
Linearly mapping the input sample and splicing on the position embedding vector to obtain the Transformer network input x_i ∈ R^((d+1)×k);
Inputting x_i into the Transformer encoding network to obtain the encoded output o_i = Transformer(x_i) ∈ R^((d+1)×k);
The encoded output o_i is fed into the classification network f_cls and the channel numbering network f_loc respectively, giving the classification prediction ŷ_cls ∈ R^(1×k) and the position prediction ŷ_loc ∈ R^(1×k) of the k input blocks; specifically, ŷ_cls = σ(W_cls·o_i + b_cls), where W_cls and b_cls are respectively the weight parameters and bias of the classification network f_cls and σ(x) is the ReLU activation function, calculated as σ(x) = max(0, x); and ŷ_loc = σ(W_loc·o_i + b_loc), where W_loc and b_loc are respectively the weight parameters and bias of the channel numbering network f_loc.
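A window-level forward pass consistent with step S212 can be sketched in PyTorch as follows (a minimal sketch under stated assumptions: the encoder depth and the learned one-dimensional position row are illustrative; concatenating one position value per block reproduces the stated (d+1)×k shape but forces the number of attention heads to divide d+1, hence nhead=1; the ReLU heads follow the formulas above):

```python
import torch
import torch.nn as nn

class ChannelNumberingNet(nn.Module):
    # patchify -> Linear mapping -> concatenate position row -> Transformer
    # encoder -> classification head f_cls and channel numbering head f_loc
    def __init__(self, d, k, depth=2):
        super().__init__()
        self.proj = nn.Linear(d, d)                  # Linear mapping layer
        self.pos = nn.Parameter(torch.zeros(k, 1))   # learned 1-dim position embedding
        layer = nn.TransformerEncoderLayer(d_model=d + 1, nhead=1, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.f_cls = nn.Linear(d + 1, 1)
        self.f_loc = nn.Linear(d + 1, 1)

    def forward(self, window, w_p):
        # window: [3, h, w] with w divisible by w_p; k = w // w_p blocks of length d
        c, h, w = window.shape
        k = w // w_p
        blocks = window.reshape(c, h, k, w_p).permute(2, 0, 1, 3).reshape(k, -1)
        x = self.proj(blocks)                        # [k, d]
        x = torch.cat([x, self.pos], dim=1)          # [k, d+1]
        o = self.encoder(x.unsqueeze(0)).squeeze(0)  # o_i, shape [k, d+1]
        y_cls = torch.relu(self.f_cls(o)).squeeze(-1)   # sigma(W_cls o_i + b_cls)
        y_loc = torch.relu(self.f_loc(o)).squeeze(-1)   # sigma(W_loc o_i + b_loc)
        return y_cls, y_loc
```

For example, with h = 32 and w_p = 5, a window of width w = 480 gives k = 96 blocks of dimension d = 3×32×5 = 480, so net = ChannelNumberingNet(d=480, k=96) and y_cls, y_loc = net(torch.rand(3, 32, 480), w_p=5) each return 96 values.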
Step S213: calculating loss; the specific process is as follows:
Calculating the classification prediction loss l_cls = MSELoss(ŷ_cls, y_cls), where MSELoss is the mean square error loss and y_cls ∈ R^(1×k) contains the true categories of the corresponding k image blocks, each valued 0 or 1; when the area occupied by the carbon fiber channel in the corresponding k-th input image block exceeds a certain threshold (0.5), the value is 1, otherwise the value is 0;
Calculating the position prediction loss l_loc = MSELoss(ŷ_loc, y_loc), where y_loc ∈ R^(1×k) is the true label corresponding to the channel numbers;
where the true value corresponding to the k-th image block is determined by the channel index j, j = 1, …, N, with N the total number of channels and j the index of the channel from left to right; the total loss l = l_cls + l_loc is calculated, and the network parameters are updated with the Adam optimizer, including the Linear mapping layer, the Transformer encoding network, the classification network f_cls and the channel numbering network f_loc. The purpose of the classification network is to assist learning; only the channel numbering network is used for channel prediction in the test stage.
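A single optimization step combining the two losses with the Adam update of step S214 might look like this (a sketch reusing the ChannelNumberingNet above; the learning rate is an illustrative choice, and y_cls and y_loc are assumed to be precomputed per-block float targets as described in the text):

```python
import torch

net = ChannelNumberingNet(d=480, k=96)
mse = torch.nn.MSELoss()
opt = torch.optim.Adam(net.parameters(), lr=1e-4)   # Adam updates all parameters

def train_step(window, y_cls, y_loc, w_p=5):
    # y_cls, y_loc: float tensors of shape [k]; y_cls is 0/1, y_loc holds
    # the channel-number targets for each block
    pred_cls, pred_loc = net(window, w_p)
    loss = mse(pred_cls, y_cls) + mse(pred_loc, y_loc)   # l = l_cls + l_loc
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```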
Step S214: Optimizing the prediction network parameters;
step S215: the optimal prediction model is output, and the specific process is as follows:
Optimizing the whole network parameters by using an Adam optimizer;
The value of i is increased by 1 and the whole process is repeated from step 1; when i = w − w_p, i is reset to 0, the next picture is taken, and the whole process is repeated from step 1;
The loss on the validation set is calculated using the validation set, the model with the minimum loss is selected as the optimal prediction network, and the final yarn channel prediction network is output.
As shown in fig. 3, the preprocessing of the sampled data in step S2 of the present invention further includes labeling the carbon fiber channel numbers in actual scenes; the specific process is as follows:
For the test data, the Resize and normalization operations are carried out on each original picture;
The processed test image is input into the final yarn channel prediction network obtained in step S215 to obtain the output prediction vector produced by the channel numbering network f_loc. When one or more consecutive values greater than the threshold (0.5) appear, the current position or continuous region is judged to be a new carbon fiber yarn, and the carbon fiber yarn number is incremented by 1 (starting from 0). The carbon fiber channel numbers are numbered accordingly.
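The thresholding-and-counting rule can be sketched as follows (a minimal sketch: each run of consecutive above-threshold values in the prediction vector is assigned the next channel number):

```python
import numpy as np

def number_channels(pred, thresh=0.5):
    # pred: 1-D prediction vector; returns per-position channel numbers,
    # with 0 for background and 1, 2, ... for successive yarn channels
    numbers = np.zeros(len(pred), dtype=int)
    current, in_run = 0, False
    for idx, v in enumerate(pred):
        if v > thresh:
            if not in_run:
                current += 1      # a new above-threshold run starts a new channel
                in_run = True
            numbers[idx] = current
        else:
            in_run = False
    return numbers
```

For example, number_channels([0.1, 0.8, 0.9, 0.2, 0.7]) returns [0, 1, 1, 0, 2], numbering the two detected yarns from left to right.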
The invention uses a line-scan industrial camera and trains a vision Transformer with carbon fiber yarn images and annotation data collected in advance to predict the channel numbers in carbon fiber yarn images. The model has few parameters, computes quickly and efficiently, requires little post-processing, realizes real-time dynamic numbering of the carbon fiber yarn channels, and has a certain generalization capability and stability. With only a shallow vision Transformer structure, the invention can number the yarn channels of an image directly; the overall model has few parameters, low hardware cost and high computational efficiency, and can number the carbon fiber yarn channels in real time.

Claims (1)

1. A carbon fiber channel real-time dynamic numbering method based on an attention mechanism is characterized by comprising the following steps:
step S1: obtaining sampling data by using an industrial camera;
step S2: preprocessing the sampled data;
The preprocessing comprises step S21, training the yarn channel prediction network, wherein the specific process is as follows:
Step S211: preprocessing data; specific:
The sampled data D = {(x_i, y_i)} are divided into two parts, a training set D_train = {(x_i, y_i)}, i = 1, …, N_train, and a validation set D_val = {(x_i, y_i)}, i = 1, …, N_val; wherein x_i is a 3×h×w carbon fiber yarn original image, y_i is a 1×w yarn channel label whose value is 1 at yarn channel positions and 0 at background positions, and N_train and N_val are respectively the total numbers of samples in the training set and the validation set;
For the training set, each carbon fiber yarn original image undergoes the preprocessing operations of Resize, normalization, color jitter and random vertical flipping, and each yarn channel label must be adjusted according to the width of the carbon fiber yarn original image after Resize; a bilinear interpolation algorithm is used for the Resize of the original image, and a bicubic interpolation algorithm is used for the adjustment of the label;
the specific operation of normalization is x′ = (x − μ)/σ, wherein μ is the mean of the sample and σ is the variance;
the specific operation of the color jitter is as follows: for a pixel (r, g, b), assume the value of each channel lies in [0, 1]; ColorJitter performs C′_i = C_i × (1 + α × rand(−1, 1)) on each channel, wherein C_i is the value of the current channel, C′_i is the new value after processing, α is the adjustment intensity, usually a fraction in [0, 1], and rand(−1, 1) is a random number uniformly distributed in [−1, 1]; after the above operation, the new color is obtained by applying trunc() to each processed channel, wherein trunc() is a rounding function that rounds the calculation result to an integer and converts it to a pixel value on the image;
Random vertical flip: the current image sample x_i is randomly flipped vertically with a certain probability p, wherein p is set to 0.5;
Resize: an image of size M×N is resized to m×n, wherein m < M and n < N; first, the position in the original image corresponding to each pixel of the reduced image needs to be calculated; if the pixel at position (c, d) of the reduced image corresponds to position (k, l) in the original image, then k = c×M/m and l = d×N/n; the bilinear interpolation algorithm and the bicubic interpolation algorithm used in Resize are respectively:
(1) Bilinear interpolation algorithm:
1. Calculating, for each position (c, d) of the image after Resize, the coordinates of the four pixels nearest to the corresponding position (k, l) in the original image: (k_1, l_1), (k_1, l_2), (k_2, l_1) and (k_2, l_2), wherein k_1 and l_1 are the largest integers satisfying k_1 ≤ k and l_1 ≤ l, and k_2 and l_2 are the smallest integers satisfying k_2 ≥ k and l_2 ≥ l;
2. For each position (c, d) in the resized image, the pixel value f(c, d) is calculated as follows:
f(c, d) = (1−w′)(1−h′)f(k_1, l_1) + w′(1−h′)f(k_2, l_1) + (1−w′)h′·f(k_1, l_2) + w′h′·f(k_2, l_2)
wherein w′ = (k − k_1)/(k_2 − k_1) and h′ = (l − l_1)/(l_2 − l_1) are the weight coefficients;
(2) Bicubic interpolation algorithm:
The pixel value f(c, d) of the reduced image is calculated from the values around the corresponding position (k, l); for the pixel at the (c, d)-th position, its pixel value is calculated by the following formula:
f(c, d) = Σ_{u=a}^{a+3} Σ_{v=b}^{b+3} g(u, v)·w_{u−a, v−b}
wherein (a, b) denotes the position of the upper-left pixel among the 16 pixels adjacent to the corresponding position (k, l), g(u, v) denotes the pixel value in the u-th row and v-th column of the original image, and w_{u−a, v−b} is a weighting coefficient calculated from (k, l);
for the validation set data D_val, each carbon fiber yarn original image only needs the Resize and normalization preprocessing operations, and each yarn channel label needs to be adjusted by Resize according to the width of the image after Resize;
The sizes of images after the training set and the verification set are the same;
Step S212: the prediction network outputs the prediction vectors; specifically:
Initializing parameters, including the Linear mapping layer, the Transformer encoding network, the classification network and the channel numbering network; the optimizer Adam; and the yarn channel labels corresponding to the carbon fiber yarn original images;
Dividing the [3, h, w] input sample by using a grid of size [h, w_p] to obtain k′ = w/w_p image blocks; compressing the first 3 dimensions to obtain inputs of dimension 3×h×w_p; letting 3×h×w_p = d′, the input sample of the q-th position x_q ∈ R^(d′×k′) is obtained, wherein q takes the values 1, …, k′, k′ is the total number of divided image blocks, and d′ is the dimension of each compressed image block;
respectively carrying out linear mapping on the k′ image blocks obtained after division, and splicing the position embedding vector to obtain the input of the Transformer network and the encoded output o_i;
feeding the encoded output o_i into the classification network f_cls and the channel numbering network f_loc respectively to obtain the classification prediction ŷ_cls ∈ R^(1×k′) and the position prediction ŷ_loc ∈ R^(1×k′) of the k′ input blocks; specifically calculated as ŷ_cls = σ′(W_cls·o_i + b_cls), wherein W_cls and b_cls are respectively the weight parameters and bias of the classification network f_cls and σ′(x) is the ReLU activation function; and ŷ_loc = σ′(W_loc·o_i + b_loc), wherein W_loc and b_loc are respectively the weight parameters and bias of the channel numbering network f_loc;
step S213: calculating loss; the specific process is as follows:
Calculating the classification prediction loss l_cls = MSELoss(ŷ_cls, y_cls), wherein MSELoss is the mean square error loss and y_cls ∈ R^(1×k′) contains the true categories corresponding to the k′ image blocks, each valued 0 or 1; when the area occupied by the carbon fiber yarn channel in the corresponding q-th input image block is larger than a certain threshold, the value is 1, otherwise the value is 0;
Calculating the position prediction loss l_loc = MSELoss(ŷ_loc, y_loc), wherein y_loc ∈ R^(1×k′) is the true label corresponding to the channel numbers;
wherein the true value corresponding to the q-th image block is determined by the channel index j′, j′ = 1, …, J, J being the total number of channels and j′ the index of the channel from left to right; and calculating the total loss l = l_cls + l_loc;
Step S214: optimizing the prediction network parameters; updating the network parameters by using the Adam optimizer, including the Linear mapping layer, the Transformer encoding network, the classification network f_cls and the channel numbering network f_loc; the purpose of the classification network is to assist learning, and only the channel numbering network is used for channel prediction in the test stage;
step S215: the optimal prediction model is output, and the specific process is as follows:
Repeating the whole process from step S212 until all images in the training set are traversed;
Calculating the loss on the validation set by using the validation set, selecting the channel numbering network model with the minimum loss as the optimal prediction network, and outputting the final yarn channel prediction network;
for the test data, carrying out the Resize and normalization operations on each original picture;
inputting the processed test image into the final yarn channel prediction network obtained in step S215 to obtain the output prediction vector ŷ_loc; when one or more continuous values larger than a threshold appear, judging the current position or continuous region to be a new carbon fiber yarn, and adding 1 to the carbon fiber yarn number;
Step S3: outputting the processing result to obtain the label numbers of the carbon fiber yarn channels.
CN202310752546.1A 2023-06-25 2023-06-25 Attention mechanism-based carbon fiber channel real-time dynamic numbering method Active CN116958739B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310752546.1A CN116958739B (en) 2023-06-25 2023-06-25 Attention mechanism-based carbon fiber channel real-time dynamic numbering method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310752546.1A CN116958739B (en) 2023-06-25 2023-06-25 Attention mechanism-based carbon fiber channel real-time dynamic numbering method

Publications (2)

Publication Number Publication Date
CN116958739A CN116958739A (en) 2023-10-27
CN116958739B (en) 2024-06-21

Family

ID=88445312

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310752546.1A Active CN116958739B (en) 2023-06-25 2023-06-25 Attention mechanism-based carbon fiber channel real-time dynamic numbering method

Country Status (1)

Country Link
CN (1) CN116958739B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175985A (en) * 2019-04-22 2019-08-27 国网江苏省电力有限公司电力科学研究院 Carbon fiber composite core wire damage detecting method, device and computer storage medium

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6375875B1 (en) * 2000-01-27 2002-04-23 Ut-Battelle, Llc Diagnostic monitor for carbon fiber processing
CN109345449B (en) * 2018-07-17 2020-11-10 西安交通大学 Image super-resolution and non-uniform blur removing method based on fusion network
CN110781729B (en) * 2019-09-16 2023-04-07 长安大学 Evaluation model and evaluation method for fiber dispersibility of carbon fiber reinforced cement-based material
JP2023120457A (en) * 2020-07-08 2023-08-30 国立大学法人愛媛大学 Device for inspecting composite material, method for inspecting composite material, and program for inspecting composite material
JP2022155690A (en) * 2021-03-31 2022-10-14 キヤノン株式会社 Image processing device, image processing method, and program
CN113673489B (en) * 2021-10-21 2022-04-08 之江实验室 Video group behavior identification method based on cascade Transformer
CN114140687A (en) * 2021-11-22 2022-03-04 浙江省轻工业品质量检验研究院 Wool and cashmere fiber identification method based on improved Mask R-CNN neural network
CN114140353B (en) * 2021-11-25 2023-04-07 苏州大学 Swin-Transformer image denoising method and system based on channel attention
CN114387269B (en) * 2022-03-22 2022-06-03 南京矩视科技有限公司 Fiber yarn defect detection method based on laser
CN115294077A (en) * 2022-08-10 2022-11-04 东华大学 Textile fiber nondestructive testing method, device and storage medium
CN115358977A (en) * 2022-08-11 2022-11-18 南京耘瞳科技有限公司 Carbon filament surface defect detection method based on deep learning
CN115830323A (en) * 2022-12-08 2023-03-21 浙江理工大学 Deep learning segmentation method for carbon fiber composite material data set
CN116189139A (en) * 2022-12-16 2023-05-30 重庆邮电大学 Traffic sign detection method based on Transformer
CN115984215A (en) * 2022-12-29 2023-04-18 南京矩视科技有限公司 Fiber bundle defect detection method based on twin network
CN116051410A (en) * 2023-01-18 2023-05-02 内蒙古工业大学 Wool cashmere fiber surface morphology structure diagram identification method based on image enhancement
CN116012344B (en) * 2023-01-29 2023-10-20 东北林业大学 Cardiac magnetic resonance image registration method based on mask self-encoder CNN-transducer
CN116091830A (en) * 2023-02-07 2023-05-09 广东技术师范大学 Multi-view image classification method based on global filtering module

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110175985A (en) * 2019-04-22 2019-08-27 国网江苏省电力有限公司电力科学研究院 Carbon fiber composite core wire damage detecting method, device and computer storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Alexey Dosovitskiy, et al. An image is worth 16×16 words: Transformers for image recognition at scale. arXiv:2010.11929v2, 2021, pp. 1–22. *

Also Published As

Publication number Publication date
CN116958739A (en) 2023-10-27

Similar Documents

Publication Publication Date Title
CN112329658B (en) Detection algorithm improvement method for YOLOV3 network
CN111968150B (en) Weak surveillance video target segmentation method based on full convolution neural network
CN110517329A (en) A kind of deep learning method for compressing image based on semantic analysis
CN110569875A (en) deep neural network target detection method based on feature multiplexing
CN112291562B (en) Fast CU partition and intra mode decision method for H.266/VVC
CN112669324B (en) Rapid video target segmentation method based on time sequence feature aggregation and conditional convolution
CN109740563A (en) A kind of moving target detecting method of facing video monitoring
CN112215859B (en) Texture boundary detection method based on deep learning and adjacency constraint
CN112270691A (en) Monocular video structure and motion prediction method based on dynamic filter network
Kamble et al. Modified three-step search block matching motion estimation and weighted finite automata based fractal video compression
CN116030361A (en) CIM-T architecture-based high-resolution image change detection method
CN117409431B (en) Multi-mode large language model training method, electronic equipment and storage medium
CN116958739B (en) Attention mechanism-based carbon fiber channel real-time dynamic numbering method
CN114373109B (en) Natural image matting method and matting device based on deep learning
CN112446245A (en) Efficient motion characterization method and device based on small displacement of motion boundary
CN112070851B (en) Index map prediction method based on genetic algorithm and BP neural network
CN113709483B (en) Interpolation filter coefficient self-adaptive generation method and device
CN111432208B (en) Method for determining intra-frame prediction mode by using neural network
CN111294596B (en) Screen content index map prediction method based on 2D Markov and edge direction characteristics
CN113298714B (en) Image cross-scale super-resolution method based on deep learning
CN114596433A (en) Insulator identification method
JP2007067552A (en) Method, apparatus and program for inter-layer prediction processing and recording medium thereof
CN113988154A (en) Unsupervised decoupling image generation method based on invariant information distillation
CN115086663B (en) Multi-camera real-time video transmission method based on deep reinforcement learning
CN116503615B (en) Convolutional neural network-based carbon fiber channel identification and numbering method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant