CN114612399A - Picture identification system and method for mobile phone appearance mark - Google Patents

Picture identification system and method for mobile phone appearance mark

Info

Publication number
CN114612399A
CN114612399A (application CN202210205903.8A)
Authority
CN
China
Prior art keywords
image
pixel
value
mobile phone
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210205903.8A
Other languages
Chinese (zh)
Inventor
林乐新
王佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shanhui Technology Co ltd
Original Assignee
Shenzhen Shanhui Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shanhui Technology Co ltd filed Critical Shenzhen Shanhui Technology Co ltd
Priority to CN202210205903.8A priority Critical patent/CN114612399A/en
Publication of CN114612399A publication Critical patent/CN114612399A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0004Industrial image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/70Denoising; Smoothing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/187Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30204Marker

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a picture recognition system for mobile phone appearance marking. A region of interest in which the user marks the mobile phone appearance is obtained; the region of interest is a subset of the image to be detected, a partial region extracted from the image and composed of a plurality of irregular shapes. The image to be detected is divided into two or more non-overlapping regions according to preset similarity criteria for gray scale, color, texture and shape features. According to connected components, the regions obtained by segmenting the image are taken as target objects for subsequent feature extraction; a connected component is a set of pixels in the binary image that are connected together and labelled according to a preset rule. From the input-to-output mapping relation of the data, the optimal mapping function reflecting the target task is found through training, and low-dimensional target features are represented by high-dimensional abstract features. The network training module comprises convolution layers and pooling layers, and different data augmentations are used to increase the data volume, so that the data samples are richer and the precision and accuracy of defect extraction are improved.

Description

Picture identification system and method for mobile phone appearance mark
Technical Field
The invention belongs to the technical field of mobile phone detection, and particularly relates to a picture identification system and method for mobile phone appearance marks.
Background
The large data streams generated by smartphone glass surface-defect detection and manufacturing processes exhibit large volume, many categories, high velocity, high uncertainty and difficult identification, which generally causes traditional defect detection and quality diagnosis methods to fail. Compared with manual inspection, a deep-learning-based mobile phone appearance defect detection algorithm can control product quality through real-time inspection, effectively improve enterprise production efficiency, reduce labor and production cost, provide a new approach for big-data processing and analysis in large-scale smartphone production, and promote the agglomeration and development of the smartphone industry.
At present, mobile phone manufacturers generally detect the various defects of mobile phone cover plates by manual inspection, which suffers from many drawbacks: labor intensity is high, production cost is high and efficiency is low; the work is monotonous and tedious; inspection relies on touch and visual judgment, making high accuracy and reliability of the results difficult to guarantee; and the lack of a unified inspection standard leads to inconsistent product inspection.
Disclosure of Invention
In view of this, the invention provides a picture recognition system and method for mobile phone appearance marks, which combines the reliability, speed and stability of a computer with machine-vision automatic detection in place of human vision, improving the quality inspection efficiency of mobile phone accessories and reducing production cost.
In a first aspect, the present invention provides a picture recognition system for mobile phone appearance marks, including:
the image preprocessing module is used for acquiring the region of interest in which the user marks the mobile phone appearance, wherein the region of interest is a subset of the image to be detected, a partial region extracted from the image and composed of a plurality of irregular shapes;
the image segmentation module is used for segmenting the image to be detected into two or more non-overlapped areas according to the similarity criterion of preset gray scale, color, texture and shape characteristics;
the feature extraction module is used for taking the regions obtained by segmenting the image, according to connected components, as target objects for subsequent feature extraction; a connected component is a set of pixels in the binary image that are connected together and labelled according to a preset rule, wherein gray-threshold segmentation is adopted to extract the region below a target value and convert it into a binary image, connected-region analysis is performed according to pixel adjacency, and fine spurious points are removed according to region area;
the network training module is used for finding, through training, the optimal mapping function that best reflects the target task, according to the input-to-output mapping relation of the data, and for representing low-dimensional target features with high-dimensional abstract features; the convolutional layers are used for extracting features of the input data, and convolutional layers of different sizes represent receptive fields of different sizes; the pooling layer takes the maximum value of each sub-matrix of the input matrix (max pooling), while average pooling takes its mean.
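The max and average pooling described above can be sketched as follows (a minimal non-overlapping-window illustration; the function name, the NumPy implementation and the block sizes are assumptions for illustration, not part of the disclosure):

```python
import numpy as np

def pool2d(x, k, mode="max"):
    """Non-overlapping k x k pooling: max pooling takes the largest value of
    each sub-matrix of the input matrix, average pooling takes its mean."""
    h, w = x.shape[0] // k, x.shape[1] // k
    # split the input into an (h, k, w, k) grid of k x k blocks
    blocks = x[:h * k, :w * k].reshape(h, k, w, k)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))
```

Overlapping (strided) pooling only changes how the windows are slid; the per-window reduction is the same.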
As a further improvement of the technical scheme, the image preprocessing module comprises a denoising unit. The denoising unit performs top-hat transformation on the image with two structural elements of different sizes: it first selects a structural element larger than the sand grains to perform top-hat transformation on the image to be detected, so that the residual sand grains and noise points are made prominent, then selects a structural element smaller than the sand grains to perform top-hat transformation on the image to be detected, and takes the difference of the two top-hat transformations as the output of the preprocessing enhancement.
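The dual top-hat enhancement above can be sketched as follows (a minimal illustration assuming flat square structuring elements and edge padding; all function names and kernel sizes are assumptions, not the disclosed implementation):

```python
import numpy as np

def gray_erode(img, k):
    """Grayscale erosion with a flat k x k structuring element (edge-padded)."""
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].min()
    return out

def gray_dilate(img, k):
    """Grayscale dilation with a flat k x k structuring element (edge-padded)."""
    p = k // 2
    padded = np.pad(img, p, mode="edge")
    out = np.empty_like(img)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = padded[i:i + k, j:j + k].max()
    return out

def top_hat(img, k):
    """White top-hat: image minus its morphological opening (erode then dilate)."""
    return img - gray_dilate(gray_erode(img, k), k)

def dual_tophat_enhance(img, k_large, k_small):
    # the difference of the two top-hat responses is the enhanced output
    return top_hat(img, k_large) - top_hat(img, k_small)
```

A structuring element larger than a sand grain lets the top-hat respond strongly to it, while one smaller than the grain suppresses it, so the difference isolates grain-sized detail.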
As a further improvement of the above technical solution, the image preprocessing module further includes a support line processing unit. The support line processing unit is configured to coarsely divide the silk-screen region and the screen region, extract and extend the support line of the screen region, and remove by subtraction the support line covered by the silk-screen region, so as to obtain the complete support-line region to be removed, where the support lines in the image are thin lines penetrating the smartphone cover plate near its upper and lower ends, and the contact portion is on the back side of the imaged face.
As a further improvement of the above technical solution, the execution process of the supporting line processing unit includes the steps of:
operating the camera to take a picture in a dark environment, where the incoming light is zero, to obtain the offset G of each pixel; the expression is f_E(x, y) = K·i(x, y)·t + G, where f_E(x, y) is the pixel gray value obtained in the dark environment, K is a conversion coefficient in units of gray levels per electron, i(x, y) is the dark current of the given pixel in the dark field, t is the time for the line-scan camera to acquire one frame or one line, and G is the offset value of the image;
imaging an object with a flat surface once under uniform illumination to obtain a bright-field image; the bright-field correction expression is f_E0(x, y) = η(x, y)·X0 + K·i(x, y)·t + G, where f_E0(x, y) is the pixel gray value obtained under uniform illumination, η(x, y) is the light-conversion sensitivity of the given pixel, and X0 is the illumination constant;
taking the image to be corrected in the actual environment, where the variation of brightness X(x, y) represents the image feature; the expression is f(x, y) = η(x, y)·X(x, y) + K·i(x, y)·t + G, from which the above expressions give
X(x, y) = X0 · (f(x, y) − f_E(x, y)) / (f_E0(x, y) − f_E(x, y)),
where X0 is a constant and f_E0(x, y) is known; the gray function of each pixel in the captured image is thus multiplied by a uniform value, changing the gray scale of the whole image without losing image detail.
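The three calibration expressions combine into a standard flat-field correction; a minimal sketch under the stated assumptions (the function name, the X0 = 255 default and the dead-pixel guard are illustrative, not part of the disclosure):

```python
import numpy as np

def flat_field_correct(f, f_dark, f_bright, X0=255.0):
    """Flat-field correction from three shots:
        f_dark   -- dark-field image:   f_E  = K*i*t + G
        f_bright -- bright-field image: f_E0 = eta*X0 + K*i*t + G
        f        -- raw image:          f    = eta*X + K*i*t + G
    Eliminating eta and the dark term K*i*t + G gives
        X = X0 * (f - f_dark) / (f_bright - f_dark).
    """
    denom = (f_bright - f_dark).astype(float)
    denom[denom == 0] = 1.0  # guard against dead pixels with no bright response
    return X0 * (f - f_dark) / denom
```

Subtracting the dark field removes the offset and dark-current term; dividing by the dark-corrected bright field cancels the per-pixel sensitivity η(x, y).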
As a further improvement of the above technical solution, the convolutional layer performs the following processes in the whole input matrix:
performing element-wise multiplication between a sub-matrix of the input matrix and the receptive field, where the receptive field is the filter kernel; the initial weights of the receptive field are randomly generated, and a bias is set according to the configuration of the network;
the weights of the receptive fields and the bias of the network are both trained using a stochastic gradient descent algorithm, the size of the submatrix is equal to the size of the receptive field, but the receptive field is smaller than the input matrix, the multiplied values are summed, and the bias is added to the summed value.
As a further improvement of the above technical solution, the execution process of the image segmentation module is as follows: the image f(x, y) is composed of a background part and a foreground part; a gray threshold T is determined, and pixels are classified according to the expression
g(x, y) = 1 if f(x, y) > T, and g(x, y) = 0 otherwise,
where any point (x, y) of the input image satisfying the condition is called a target point and labelled 1, the remaining pixels are called background points and labelled 0; the target region and background region of the image are separated at the segmentation gray level, and when T is a constant, g(x, y) is the thresholded binary image.
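The thresholding rule above amounts to a one-line binarization; a minimal sketch (the function name and uint8 output type are assumptions):

```python
import numpy as np

def threshold_binary(f, T):
    """g(x, y) = 1 where f(x, y) > T (target point), 0 elsewhere (background)."""
    return (f > T).astype(np.uint8)
```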
As a further improvement of the above technical solution, in the detection process, a background image of the detection stage taken before the object to be inspected is conveyed onto it, and the image to be inspected on the stage, are photographed in real time; the two images are scaled to the same size and registered, and residual processing is performed pixel by pixel to obtain the final residual denoising effect image: g(x, y) = f1(x, y) − f2(x, y), where g(x, y) is the pixel value of the residual denoised effect image, f1(x, y) is the pixel value of the image to be inspected, and f2(x, y) is the pixel value of the detection-stage background image.
As a further improvement of the above technical solution, the execution process of the network training module includes the following steps:
inputting a preprocessed curved surface transparent sample picture into a network, and extracting image features through a convolutional neural network to generate a feature map;
inputting the feature map into an RPN network, generating a series of candidate frames and mapping them onto the feature map generated by the convolutional neural network; the generated candidate regions are taken as the input of the ROI Pooling layer, which produces a feature map of fixed size from each candidate region;
and inputting the fixed-size feature map from the ROI Pooling layer into the fully-connected layers for classification and position correction.
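The ROI Pooling step in the pipeline above can be sketched as follows (a simplified single-region illustration with max pooling; the output size, the quantization scheme and all names are common defaults assumed for illustration, not taken from the disclosure):

```python
import numpy as np

def roi_pooling(feature_map, roi, out_size=7):
    """Quantize one candidate box on the feature map into a fixed
    out_size x out_size grid and max-pool each grid cell."""
    x0, y0, x1, y1 = roi  # box in feature-map coordinates
    region = feature_map[y0:y1, x0:x1]
    h, w = region.shape
    ys = np.linspace(0, h, out_size + 1).astype(int)
    xs = np.linspace(0, w, out_size + 1).astype(int)
    out = np.zeros((out_size, out_size))
    for i in range(out_size):
        for j in range(out_size):
            # make sure each cell covers at least one pixel
            cell = region[ys[i]:max(ys[i + 1], ys[i] + 1),
                          xs[j]:max(xs[j + 1], xs[j] + 1)]
            out[i, j] = cell.max()
    return out
```

Candidate regions of any size are thus reduced to the same fixed shape, which is what allows the fully-connected classification head to accept them.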
In a second aspect, the present invention further provides a picture identification method for mobile phone appearance marks, including the following steps:
acquiring the region of interest in which the user marks the mobile phone appearance, wherein the region of interest is a subset of the image to be detected, a partial region extracted from the image and composed of a plurality of irregular shapes;
dividing an image to be detected into two or more regions which are not overlapped mutually according to the similarity criterion of preset gray scale, color, texture and shape characteristics;
according to connected components, taking the regions obtained by segmenting the image as target objects for subsequent feature extraction; a connected component is a set of pixels in the binary image that are connected together and labelled according to a preset rule, wherein gray-threshold segmentation is adopted to extract the region below a target value and convert it into a binary image, connected-region analysis is performed according to pixel adjacency, and fine spurious points are removed according to region area;
from the input-to-output mapping relation of the data, finding through training the optimal mapping function that best reflects the target task, and representing low-dimensional target features with high-dimensional abstract features. The network training module comprises convolution layers and pooling layers; each layer contains trainable parameters, and the several feature maps of each layer extract one feature of the input data through a convolution kernel, each feature map containing a plurality of interconnected neurons. The convolutional layers are used for extracting features of the input data, and convolutional layers of different sizes represent receptive fields of different sizes; the pooling layer takes the maximum value of each sub-matrix of the input matrix, while average pooling takes its mean.
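The connected-component labelling and small-region removal in the steps above can be sketched as follows (a minimal flood-fill illustration; function names, the 4/8-connectivity switch and the area threshold are assumptions, not the disclosed implementation):

```python
import numpy as np
from collections import deque

def label_components(binary, connectivity=4):
    """Label connected foreground pixels of a binary image by flood fill."""
    nbrs = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        nbrs += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    labels = np.zeros(binary.shape, dtype=int)
    current = 0
    h, w = binary.shape
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and not labels[sy, sx]:
                current += 1  # start a new connected component
                q = deque([(sy, sx)])
                labels[sy, sx] = current
                while q:
                    y, x = q.popleft()
                    for dy, dx in nbrs:
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and binary[ny, nx] and not labels[ny, nx]):
                            labels[ny, nx] = current
                            q.append((ny, nx))
    return labels, current

def remove_small_regions(binary, min_area, connectivity=4):
    """Drop connected regions whose pixel count is below min_area
    (the 'fine spurious points' removed by area in the text)."""
    labels, n = label_components(binary, connectivity)
    keep = np.zeros_like(binary)
    for lbl in range(1, n + 1):
        mask = labels == lbl
        if mask.sum() >= min_area:
            keep[mask] = 1
    return keep
```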
The invention provides a picture identification system and method for mobile phone appearance marks, which have the following beneficial effects compared with the prior art:
the method comprises the steps of obtaining an interested area marked with the appearance of the mobile phone by a user, wherein the interested area is a subset of an image to be detected and is a partial area extracted from the image, the partial area is composed of a plurality of irregular shapes, the image to be detected is divided into two or more non-overlapped areas according to the similarity criterion of preset gray scale, color, texture and shape characteristics, the divided area of the image is used as a target object for subsequent characteristic extraction according to a connected component, the two-way component is a pixel which is connected together in a binary image according to a preset rule, the mapping relation of data from input to output is obtained, an optimal mapping function which can most reflect a target task is found through training, a low-dimensional target characteristic is represented by adopting a high-dimensional abstract characteristic, a network training module comprises a convolution layer and a pooling layer, and the problems of single data sample and small data amount in the actual production process are solved, different data amplification is used to increase the data volume, so that the data sample is richer, and the precision and accuracy of defect part extraction are improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the embodiments will be briefly described below, it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and for those skilled in the art, other related drawings can be obtained according to the drawings without inventive efforts.
FIG. 1 is a block diagram of a picture recognition system for a mobile phone appearance mark according to the present invention;
FIG. 2 is a diagram of an implementation of the support line handling unit of the present invention;
FIG. 3 is a diagram of the implementation of the network training module of the present invention;
fig. 4 is a flowchart of a picture recognition method for mobile phone appearance marks according to the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
Referring to fig. 1, the present invention provides a picture recognition system for mobile phone appearance marks, comprising:
the image preprocessing module is used for acquiring the region of interest in which the user marks the mobile phone appearance, wherein the region of interest is a subset of the image to be detected, a partial region extracted from the image and composed of a plurality of irregular shapes;
the image segmentation module is used for segmenting the image to be detected into two or more non-overlapped areas according to the similarity criterion of preset gray scale, color, texture and shape characteristics;
the feature extraction module is used for taking the regions obtained by segmenting the image, according to connected components, as target objects for subsequent feature extraction; a connected component is a set of pixels in the binary image that are connected together and labelled according to a preset rule, wherein gray-threshold segmentation is adopted to extract the region below a target value and convert it into a binary image, connected-region analysis is performed according to pixel adjacency, and fine spurious points are removed according to region area;
the network training module is used for finding, through training, the optimal mapping function that best reflects the target task, according to the input-to-output mapping relation of the data, and for representing low-dimensional target features with high-dimensional abstract features; the convolution layers are used for extracting features of the input data, and convolution layers of different sizes represent receptive fields of different sizes; the pooling layer takes the maximum value of each sub-matrix of the input matrix, while average pooling takes its mean.
In this embodiment, the image preprocessing module includes a denoising unit. The denoising unit performs top-hat transformation on the image with two structural elements of different sizes: it first selects a structural element larger than the sand point to perform top-hat transformation on the image to be detected, making the residual sand points and noise points prominent, then selects a structural element smaller than the sand point to perform top-hat transformation on the image to be detected, and takes the difference of the two top-hat transformations as the output of the preprocessing enhancement. The image preprocessing module further comprises a support line processing unit, used for coarsely dividing the silk-screen region and the screen region, extracting and extending the support lines of the screen region, and removing the support lines covered by the silk-screen region to obtain the complete support-line region to be removed; the support lines in the image are thin lines penetrating the smart phone cover plate near its upper and lower ends, and the contact portion is on the back side of the imaged face.
It should be noted that the convolutional layer performs the following processes over the whole input matrix: element-wise multiplication between a sub-matrix of the input matrix and the receptive field, where the receptive field is the filter kernel; the initial weights of the receptive field are randomly generated, and a bias is set according to the configuration of the network. The weights of the receptive fields and the biases of the network are both trained using a stochastic gradient descent algorithm; the size of the sub-matrix equals the size of the receptive field, but the receptive field is smaller than the input matrix; the multiplied values are summed, and the bias is added to the sum. The image segmentation module executes as follows: the image f(x, y) is composed of a background part and a foreground part; a gray threshold T is determined, and pixels are classified according to the expression
g(x, y) = 1 if f(x, y) > T, and g(x, y) = 0 otherwise,
where any point (x, y) of the input image satisfying the condition is called a target point and labelled 1, the remaining pixels are called background points and labelled 0; the target region and background region of the image are separated at the segmentation gray level, and when T is a constant, g(x, y) is the thresholded binary image.
It should be understood that appearance defects of curved transparent glass place high requirements on the imaging system; a high-quality source image is the basis for improving the robustness of the subsequent detection algorithm. A good source image separates target from background, has high contrast and uniform brightness, which benefits subsequent image processing; the background must be uniform and must not interfere with the target image, and highly reflective targets must be captured without over-exposure. After the detected image undergoes the improved top-hat transform filtering, noise is effectively suppressed and a gray image with prominent sand-grain regions is obtained; next, the preprocessed image is converted into a binary image using the threshold segmentation technique. To segment the sand-grain points more completely, threshold selection is the key, and also the difficulty, of segmentation: for most images, missed detections occur easily when the threshold is large, while noise cannot be completely removed when the threshold is small, so a perfect result is hard to obtain. To adapt to the differences between images and achieve a good segmentation effect, the threshold t is selected by the expression t = k·max(f_d), where f_d is the result image of the detected image after the improved top-hat transform filtering, and k is a constant indicating that the threshold t is a multiple of the maximum response gray value of f_d. To detect the sand points accurately and eliminate interference, the targets can be further screened using features such as area, aspect ratio and gray value, according to the geometric characteristics of the sand-grain regions.
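The adaptive threshold t = k·max(f_d) can be sketched as follows (a minimal illustration; the function name and the k = 0.5 placeholder are assumptions, since the disclosure leaves k as a tuning constant):

```python
import numpy as np

def adaptive_tophat_threshold(f_d, k=0.5):
    """Pick t as a fraction k of the strongest top-hat response in f_d,
    then binarize f_d at that threshold."""
    t = k * f_d.max()
    return (f_d > t).astype(np.uint8), t
```

Because t scales with each image's own maximum response, the same k adapts to images of different overall contrast.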
Referring to fig. 2, optionally, the process performed by the support line processing unit includes the steps of:
s20: operating the camera to take a picture in a dark environment, where the incoming light is zero, to obtain the offset G of each pixel; the expression is f_E(x, y) = K·i(x, y)·t + G, where f_E(x, y) is the pixel gray value obtained in the dark environment, K is a conversion coefficient in units of gray levels per electron, i(x, y) is the dark current of the given pixel in the dark field, t is the time for the line-scan camera to acquire one frame or one line, and G is the offset value of the image;
s21: imaging an object with a flat surface once under uniform illumination to obtain a bright-field image; the bright-field correction expression is f_E0(x, y) = η(x, y)·X0 + K·i(x, y)·t + G, where f_E0(x, y) is the pixel gray value obtained under uniform illumination, η(x, y) is the light-conversion sensitivity of the given pixel, and X0 is the illumination constant;
s22: taking the image to be corrected in the actual environment, where the variation of brightness X(x, y) represents the image feature; the expression is f(x, y) = η(x, y)·X(x, y) + K·i(x, y)·t + G, from which the above expressions give
X(x, y) = X0 · (f(x, y) − f_E(x, y)) / (f_E0(x, y) − f_E(x, y));
X0 is a constant and f_E0(x, y) is known, so the gray function of each pixel in the captured image is multiplied by a uniform value, changing the gray scale of the whole image without losing image detail.
In this embodiment, in the detection process, a background image of the detection stage taken before the object to be inspected is conveyed onto it, and the image to be inspected on the stage, are photographed in real time; the two images are scaled to the same size and registered, and residual processing is performed pixel by pixel to obtain the final residual denoising effect image: g(x, y) = f1(x, y) − f2(x, y), where g(x, y) is the pixel value of the residual denoised effect image, f1(x, y) is the pixel value of the image to be inspected, and f2(x, y) is the pixel value of the detection-stage background image. The support line is mainly characterized in that the gray range of the thin line overlaps the background but does not obviously overlap the silk-screen region or the screen region; a method that removes pixel points at fixed positions is therefore infeasible, and the support line must be accurately extracted based on its own characteristics and then denoised.
The support line is extracted and then removed; removing it directly leaves traces on the image and limits subsequent defect detection processing, so the support-line region is selectively covered and filled with the gray values of the surrounding background pixels, reconstructing and restoring the mobile phone glass cover plate image and leaving sufficient processing space for subsequent defect detection. Owing to the support lines, the serious drawback of mean filtering is that the gray values of surrounding pixels are pulled down and the background darkens overall, which seriously interferes with defect extraction; median filtering removes the support lines completely, but at silk-screen edge positions where pixel gray values change abruptly, the silk-screen region diffuses to both sides and the silk-screen boundary becomes blurred.
Referring to fig. 3, optionally, the execution process of the network training module includes the following steps:
s30: inputting a preprocessed curved surface transparent sample picture into a network, and extracting image features through a convolutional neural network to generate a feature map;
s31: inputting the feature map into an RPN (Region Proposal Network), generating a series of candidate boxes and mapping them onto the feature map generated by the convolutional neural network; the resulting candidate regions are taken as the input of an ROI Pooling layer, which converts each candidate region into a feature map of fixed size;
s32: and inputting the fixed-size feature map output by the ROI Pooling layer into the fully connected layers for classification and position correction.
In this embodiment, target detection goes through two stages, training and testing; before training, data preparation and the setting of network initialization parameters must be completed. The Faster R-CNN algorithm is usually pre-trained on the ImageNet data set to initialize the network and fine-tune network parameters, and newly added network layers are initialized randomly with a Gaussian distribution whose standard deviation and mean are 0.01 and 0 respectively. The update speed of the weights during training is governed by the learning rate of the network: setting it too large can prevent proper fitting, while setting it too small makes the network descend too slowly. The learning rate is generally chosen from values such as 1, 0.5, 0.1, 0.01 and 0.005, but the specific choice must be judged by comparison, combining the samples and the network conditions. When the learning rate is set too large, the loss first decreases as the iteration count increases, then plateaus at a high value and does not converge; when it is set small, the loss decreases and converges as iterations proceed, but only slowly; a suitable learning rate gives fast convergence, and the network training model then performs best.
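The Gaussian initialization of newly added layers mentioned above (standard deviation 0.01, mean 0) can be sketched with NumPy; the layer shape and random seed below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_init(shape, std=0.01, mean=0.0):
    """Random Gaussian initialization for a newly added network layer,
    using the standard deviation (0.01) and mean (0) quoted in the text."""
    return rng.normal(loc=mean, scale=std, size=shape)

# Hypothetical new fully connected layer of 256 units over 4096 features.
w = gaussian_init((256, 4096))

# Candidate learning rates listed in the text, to be compared empirically.
learning_rates = [1, 0.5, 0.1, 0.01, 0.005]
```

With roughly a million samples, the empirical mean and standard deviation of `w` land very close to the requested 0 and 0.01.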
Referring to fig. 4, the present invention further provides a picture recognition method for a mobile phone appearance mark, including the following steps:
s40: acquiring a region of interest in which a user has marked the mobile phone appearance, wherein the region of interest is a subset of the image to be detected, a partial region extracted from the image, and consists of a plurality of irregular shapes;
s41: dividing an image to be detected into two or more regions which are not overlapped with each other according to the similarity criterion of preset gray scale, color, texture and shape characteristics;
s42: taking the regions obtained by segmenting the image as target objects for subsequent feature extraction according to their connected components; connected-component labeling marks pixels that are connected together in a binary image according to a preset rule, wherein the region below a target value is extracted and converted into a binary image by gray-threshold segmentation, connected-region analysis is performed according to a pixel-adjacency mode, and fine speckle points are removed according to region area;
s43: establishing the mapping relation of data from input to output, finding through training the optimal mapping function that best reflects the target task, and representing low-dimensional target features with high-dimensional abstract features; the network training module comprises convolution layers and pooling layers, each layer containing trainable parameters; each layer has a plurality of feature maps, each of which extracts one feature of the input data through a convolution kernel, and each feature map comprises a plurality of interconnected neurons; the convolution layers extract the features of the input data, and convolution kernels of different sizes correspond to receptive fields of different sizes; max pooling takes the maximum value from each sub-matrix of the input matrix, and average pooling takes the mean value.
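The thresholding and connected-component step (S42) can be sketched with a plain breadth-first labeling; this is an illustrative stand-in for the patent's unspecified labeling rule, with 4-connectivity, a toy image, and a minimum-area filter chosen as assumptions:

```python
import numpy as np
from collections import deque

def label_and_filter(binary, min_area=2):
    """4-connected component labeling of a binary image; components whose
    area falls below min_area (the 'fine speckle points') are erased."""
    h, w = binary.shape
    labels = np.zeros((h, w), dtype=int)
    areas = {}
    current = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy, sx] and labels[sy, sx] == 0:
                current += 1
                labels[sy, sx] = current
                queue, area = deque([(sy, sx)]), 0
                while queue:            # flood-fill one component
                    y, x = queue.popleft()
                    area += 1
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        yy, xx = y + dy, x + dx
                        if (0 <= yy < h and 0 <= xx < w
                                and binary[yy, xx] and labels[yy, xx] == 0):
                            labels[yy, xx] = current
                            queue.append((yy, xx))
                areas[current] = area
    for lab, area in areas.items():
        if area < min_area:             # remove fine speckle points by area
            labels[labels == lab] = 0
    kept = {lab: a for lab, a in areas.items() if a >= min_area}
    return labels, kept

# Gray-threshold segmentation: pixels below the target value (T = 128)
# become foreground, then isolated specks are removed by area.
gray = np.array([[200, 200, 10, 10],
                 [200, 200, 10, 10],
                 [200,  30, 200, 200],
                 [200, 200, 200, 200]])
binary = gray < 128
labels, kept = label_and_filter(binary, min_area=2)
```

The 2x2 dark block survives as one labeled region, while the lone dark pixel at (2, 1) is discarded as a speck.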
In this embodiment, the values of each sub-matrix of the input matrix and the convolution receptive field are multiplied element-wise and summed, the bias is added, and the result is assigned to the corresponding output position, which reduces the size of the input data and thus the computation cost. An additional parameter of the convolution layer is the convolution stride, which defines by how many columns and rows of pixels the receptive field slides across the width and height of the input matrix; a larger stride reduces the number of receptive-field applications and yields a smaller output size. To reduce the spatial size of the input matrix, the CNN introduces a pooling layer based on the down-sampling principle, with two different choices, max pooling and average pooling: max pooling takes the maximum value from each sub-matrix of the input matrix, and average pooling takes the mean value. This improves both the accuracy and the detection efficiency of mobile phone appearance identification.
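The convolution and pooling operations described above reduce to a few lines of NumPy (a minimal sketch with invented shapes; real CNNs add channels, padding and learned kernels):

```python
import numpy as np

def conv2d_valid(x, kernel, bias=0.0, stride=1):
    """Element-wise product of each sub-matrix with the receptive field,
    summed and offset by the bias -- the operation described above."""
    kh, kw = kernel.shape
    oh = (x.shape[0] - kh) // stride + 1
    ow = (x.shape[1] - kw) // stride + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            sub = x[i * stride:i * stride + kh, j * stride:j * stride + kw]
            out[i, j] = np.sum(sub * kernel) + bias
    return out

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling: max pooling takes the maximum of each
    sub-matrix, average pooling takes the mean."""
    h, w = x.shape[0] // size, x.shape[1] // size
    blocks = x[:h * size, :w * size].reshape(h, size, w, size)
    return blocks.max(axis=(1, 3)) if mode == "max" else blocks.mean(axis=(1, 3))

x = np.arange(16, dtype=float).reshape(4, 4)
k = np.ones((2, 2))
feat = conv2d_valid(x, k, bias=1.0)   # 3x3 output, 4x4 input shrinks
mp = pool2d(x, 2, "max")              # [[5, 7], [13, 15]]
ap = pool2d(x, 2, "avg")              # [[2.5, 4.5], [10.5, 12.5]]
```

A larger stride in `conv2d_valid` would shrink `oh` and `ow` further, matching the stride discussion above.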
In all examples shown and described herein, any particular value should be construed as merely exemplary, and not as a limitation, and thus other examples of example embodiments may have different values.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures.
The above examples are merely illustrative of several embodiments of the present invention, and the description thereof is more specific and detailed, but not to be construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the inventive concept, which falls within the scope of the present invention.

Claims (9)

1. A picture recognition system for mobile phone appearance marks, comprising:
the image preprocessing module is used for acquiring a region of interest in which a user has marked the mobile phone appearance, wherein the region of interest is a subset of the image to be detected, a partial region extracted from the image, and consists of a plurality of irregular shapes;
the image segmentation module is used for segmenting the image to be detected into two or more non-overlapped areas according to the similarity criterion of preset gray scale, color, texture and shape characteristics;
the feature extraction module is used for taking the regions obtained by segmenting the image as target objects of subsequent feature extraction according to their connected components; connected-component labeling marks pixels that are connected together in the binary image according to a preset rule, wherein the region below a target value is extracted and converted into a binary image by gray-threshold segmentation, the connected regions are analyzed according to a pixel-adjacency mode, and fine speckle points are removed according to region area;
the network training module is used for finding, through training, the optimal mapping function that best reflects the target task according to the mapping relation of data from input to output, and for representing low-dimensional target features with high-dimensional abstract features; the convolution layers are used for extracting the features of the input data, and convolution kernels of different sizes correspond to receptive fields of different sizes; max pooling takes the maximum value from each sub-matrix of the input matrix, and average pooling takes the mean value.
2. The picture recognition system for mobile phone appearance marks according to claim 1, wherein the image preprocessing module comprises a denoising unit that performs top-hat transformations on the image with two structuring elements of different sizes: it first selects a structuring element larger than the sand points to top-hat transform the image to be detected, so that the residual sand points and noise points stand out, then selects a structuring element smaller than the sand points to top-hat transform the image once more, and uses the difference of the two top-hat transformations as the enhanced output of the preprocessing.
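The dual top-hat enhancement of claim 2 can be illustrated with a pure-NumPy top-hat (image minus its morphological opening with a flat square structuring element); the element sizes and the single "sand point" below are toy assumptions:

```python
import numpy as np

def _filter(img, k, reduce_fn):
    """Sliding k x k min/max filter with edge padding, acting as
    gray-scale erosion (np.min) or dilation (np.max) with a flat
    square structuring element."""
    pad = k // 2
    p = np.pad(img.astype(float), pad, mode="edge")
    out = np.empty(img.shape, dtype=float)
    for y in range(img.shape[0]):
        for x in range(img.shape[1]):
            out[y, x] = reduce_fn(p[y:y + k, x:x + k])
    return out

def white_tophat(img, k):
    """Top-hat transform: image minus its morphological opening
    (erosion followed by dilation), keeping bright details smaller
    than the k x k structuring element."""
    opening = _filter(_filter(img, k, np.min), k, np.max)
    return img - opening

# A single bright 'sand point' on a dark background.
img = np.zeros((7, 7))
img[3, 3] = 200.0

# A 3x3 element (larger than the 1-pixel point) makes the point stand
# out; a 1x1 element leaves the image unchanged, so its top-hat is zero.
enhanced = white_tophat(img, 3) - white_tophat(img, 1)
```

The difference of the two top-hats isolates details sized between the two structuring elements, which is the selectivity the claim relies on.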
3. The picture recognition system for mobile phone appearance marks according to claim 1, wherein the image preprocessing module further comprises a support line processing unit for roughly segmenting the silk-screen area and the screen area, extracting and extending the support lines of the screen area, and removing by subtraction the support lines covered by the silk-screen area, so as to obtain the complete support-line region to be removed, wherein the support lines in the image are thin lines penetrating the smart phone cover plate close to its upper and lower ends, the contact part being on the back of the cover plate.
4. The picture recognition system for mobile phone appearance marks according to claim 3, wherein the execution process of the support line processing unit comprises the following steps:
operating the camera to take a picture in a dark environment with zero light input, and obtaining the offset G of each pixel, with the expression fE(x, y) = K × i(x, y) × t + G, where fE(x, y) is the pixel gray value obtained in the dark environment, K is the conversion relation in units of gray levels per electron, i(x, y) is the dark current of the given pixel in the dark field, t is the time for the line-scan camera to acquire one frame or one line, and G is the offset value of the image;
performing one imaging pass on an object with a flat surface under uniform illumination to obtain a bright-field image, where the bright-field correction expression is: fE0(x, y) = η(x, y) × X0 + K × i(x, y) × t + G, where fE0(x, y) is the pixel gray value obtained under uniform illumination, η(x, y) is the light-conversion sensitivity of the given pixel, and X0 is the illumination constant;
taking a corrected image in the actual environment, where the change of the brightness X(x, y) represents the image features, with the expression: f(x, y) = η(x, y) × X(x, y) + K × i(x, y) × t + G. From the above expressions it follows that
X(x, y) = X0 × (f(x, y) - fE(x, y)) / (fE0(x, y) - fE(x, y)).
Since X0 is a constant and fE0(x, y) is known, the pixel gray function of the collected image is multiplied by a uniform value, applying an overall gray-scale change to the whole image without losing its detailed information.
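The correction in claim 4 is the classic flat-field correction; the sketch below builds synthetic dark-field, bright-field and actual frames from the stated model and checks that the formula recovers the true brightness X(x, y) (all numeric values are invented for the check):

```python
import numpy as np

def flat_field_correct(f, f_dark, f_bright, X0=255.0):
    """Recover X(x, y) = X0 * (f - fE) / (fE0 - fE) from a dark-field
    frame fE, a bright-field frame fE0 and the actual frame f."""
    return X0 * (f - f_dark) / (f_bright - f_dark)

# Synthetic check: build the three frames from the model
# f = eta * X + K * i * t + G and confirm the correction recovers X.
eta = np.array([[0.8, 1.0], [1.2, 0.9]])    # per-pixel sensitivity
dark = np.array([[5.0, 6.0], [4.0, 5.5]])   # the K*i*t + G term
X0 = 100.0
X_true = np.array([[30.0, 70.0], [50.0, 10.0]])

f_dark = dark                   # zero light input: only dark term
f_bright = eta * X0 + dark      # uniform illumination at X0
f = eta * X_true + dark         # actual scene
X_rec = flat_field_correct(f, f_dark, f_bright, X0)
```

Both the per-pixel sensitivity η(x, y) and the dark-current offset cancel exactly, which is why the correction preserves image detail.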
5. The picture recognition system for mobile phone appearance marks according to claim 1, wherein the convolution layer performs the following process over the whole input matrix:
performing element-point product one by one between a sub-matrix of the input matrix and a receptive field, wherein the receptive field is a filter kernel, randomly generating an initial weight value of the receptive field, and setting the bias according to the configuration of the network;
the weights of the receptive fields and the biases of the network are both trained using a stochastic gradient descent algorithm; the size of the sub-matrix equals the size of the receptive field, but the receptive field is smaller than the input matrix; the multiplied values are summed, and the bias is added to the sum.
6. The picture recognition system for mobile phone appearance marks according to claim 1, wherein the execution process of the image segmentation module is: the image f(x, y) is composed of a background part and a foreground part; a gray threshold T is determined, and the pixels are classified according to the expression
g(x, y) = 1 if f(x, y) > T, and g(x, y) = 0 otherwise,
so that any point (x, y) at which the input image satisfies the condition is called a target point and is marked as 1, the remaining pixel points are called background points and are marked as 0, and the image target region and background region are separated by the segmentation gray level; when T is a constant, g(x, y) is the binary image after thresholding.
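The thresholding rule of claim 6 is a one-liner in NumPy (the toy gray values below are assumptions):

```python
import numpy as np

def threshold(f, T):
    """g(x, y) = 1 where f(x, y) > T (target point), 0 elsewhere
    (background point)."""
    return (f > T).astype(np.uint8)

gray = np.array([[10, 200],
                 [128, 129]])
g = threshold(gray, 128)   # only pixels strictly above T become targets
```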
7. The system of claim 6, wherein a background image of the detection table, captured before the object to be inspected is conveyed onto it, and the image to be inspected are shot in real time during the inspection process; the two images are scaled to the same size and registered, and the residual is computed pixel by pixel to obtain the final residual-denoised image: g(x, y) = f1(x, y) - f2(x, y), where g(x, y) is the pixel value of the residual-denoised image, f1(x, y) is the pixel value of the image to be inspected, and f2(x, y) is the pixel value of the detection-table background image.
8. The picture recognition system for mobile phone appearance marks according to claim 1, wherein the network training module executes the process including the steps of:
inputting a preprocessed curved surface transparent sample picture into a network, and extracting image features through a convolutional neural network to generate a feature map;
inputting the feature map into an RPN (Region Proposal Network), generating a series of candidate boxes and mapping them onto the feature map generated by the convolutional neural network; the resulting candidate regions are taken as the input of an ROI Pooling layer, which converts each candidate region into a feature map of fixed size;
and inputting the fixed-size feature map output by the ROI Pooling layer into the fully connected layers for classification and position correction.
9. A picture recognition method for mobile phone appearance marks, using the picture recognition system for mobile phone appearance marks according to any one of claims 1-8, characterized by comprising the following steps:
acquiring a region of interest in which a user has marked the mobile phone appearance, wherein the region of interest is a subset of the image to be detected, a partial region extracted from the image, and consists of a plurality of irregular shapes;
dividing an image to be detected into two or more regions which are not overlapped with each other according to the similarity criterion of preset gray scale, color, texture and shape characteristics;
taking the regions obtained by segmenting the image as target objects for subsequent feature extraction according to their connected components; connected-component labeling marks pixels that are connected together in a binary image according to a preset rule, wherein the region below a target value is extracted and converted into a binary image by gray-threshold segmentation, connected-region analysis is performed according to a pixel-adjacency mode, and fine speckle points are removed according to region area;
establishing the mapping relation of data from input to output, finding through training the optimal mapping function that best reflects the target task, and representing low-dimensional target features with high-dimensional abstract features; the network training module comprises convolution layers and pooling layers, each layer containing trainable parameters; each layer has a plurality of feature maps, each of which extracts one feature of the input data through a convolution kernel, and each feature map comprises a plurality of interconnected neurons; the convolution layers extract the features of the input data, and convolution kernels of different sizes correspond to receptive fields of different sizes; max pooling takes the maximum value from each sub-matrix of the input matrix, and average pooling takes the mean value.
CN202210205903.8A 2022-03-03 2022-03-03 Picture identification system and method for mobile phone appearance mark Pending CN114612399A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210205903.8A CN114612399A (en) 2022-03-03 2022-03-03 Picture identification system and method for mobile phone appearance mark

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210205903.8A CN114612399A (en) 2022-03-03 2022-03-03 Picture identification system and method for mobile phone appearance mark

Publications (1)

Publication Number Publication Date
CN114612399A true CN114612399A (en) 2022-06-10

Family

ID=81861813

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210205903.8A Pending CN114612399A (en) 2022-03-03 2022-03-03 Picture identification system and method for mobile phone appearance mark

Country Status (1)

Country Link
CN (1) CN114612399A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117094989A (en) * 2023-10-16 2023-11-21 南通蓬盛机械有限公司 Lens quality management method and system for optical sighting telescope
CN117094989B (en) * 2023-10-16 2024-01-26 南通蓬盛机械有限公司 Lens quality management method and system for optical sighting telescope
CN117593285A (en) * 2023-12-14 2024-02-23 江苏恒兆电缆有限公司 Quality detection system and method for flexible mineral insulation flexible fireproof cable

Similar Documents

Publication Publication Date Title
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
CN110060237B (en) Fault detection method, device, equipment and system
CN109509187B (en) Efficient inspection algorithm for small defects in large-resolution cloth images
CN113240626B (en) Glass cover plate concave-convex type flaw detection and classification method based on neural network
CN111915704A (en) Apple hierarchical identification method based on deep learning
CN109978848B (en) Method for detecting hard exudation in fundus image based on multi-light-source color constancy model
CN111768392B (en) Target detection method and device, electronic equipment and storage medium
CN107808138B (en) Communication signal identification method based on FasterR-CNN
CN114612399A (en) Picture identification system and method for mobile phone appearance mark
CN107871316B (en) Automatic X-ray film hand bone interest area extraction method based on deep neural network
CN112949704B (en) Tobacco leaf maturity state identification method and device based on image analysis
CN111161222B (en) Printing roller defect detection method based on visual saliency
CN110245697B (en) Surface contamination detection method, terminal device and storage medium
CN111178121B (en) Pest image positioning and identifying method based on spatial feature and depth feature enhancement technology
CN114549507B (en) Improved Scaled-YOLOv fabric flaw detection method
CN116012291A (en) Industrial part image defect detection method and system, electronic equipment and storage medium
CN112085017A (en) Tea tender shoot image segmentation method based on significance detection and Grabcut algorithm
CN111666813B (en) Subcutaneous sweat gland extraction method of three-dimensional convolutional neural network based on non-local information
CN116071560A (en) Fruit identification method based on convolutional neural network
CN114332079A (en) Plastic lunch box crack detection method, device and medium based on image processing
CN115829942A (en) Electronic circuit defect detection method based on non-negative constraint sparse self-encoder
CN112381751A (en) Online intelligent detection system and method based on image processing algorithm
CN109145770B (en) Automatic wheat spider counting method based on combination of multi-scale feature fusion network and positioning model
CN112750113B (en) Glass bottle defect detection method and device based on deep learning and linear detection
CN117636045A (en) Wood defect detection system based on image processing

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination