CN116933041B - Force sensor number checking system and method - Google Patents

Force sensor number checking system and method

Info

Publication number
CN116933041B
CN116933041B CN202311187017.8A
Authority
CN
China
Prior art keywords
numbering
fusion
feature map
image
numbered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311187017.8A
Other languages
Chinese (zh)
Other versions
CN116933041A (en)
Inventor
汪星星
王建国
王梦茹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Lizhun Sensing Technology Co ltd
Original Assignee
Shenzhen Lizhun Sensing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Lizhun Sensing Technology Co ltd filed Critical Shenzhen Lizhun Sensing Technology Co ltd
Priority to CN202311187017.8A priority Critical patent/CN116933041B/en
Publication of CN116933041A publication Critical patent/CN116933041A/en
Application granted granted Critical
Publication of CN116933041B publication Critical patent/CN116933041B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715 Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774 Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77 Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/80 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V10/806 Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level of extracted features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Image Analysis (AREA)

Abstract

The application relates to the field of intelligent checking, and in particular to a force sensor number checking system and method, which use computer vision and deep learning techniques, through image processing and feature extraction, to automatically extract and recognize force sensor numbers and to perform number checking on that basis.

Description

Force sensor number checking system and method
Technical Field
The application relates to the field of intelligent checking, and more particularly, to a force sensor number checking system and method.
Background
A force sensor converts an applied force or pressure into an electrical signal and is widely used in industry, medicine, aviation and other fields. To ensure the quality and traceability of force sensors, each force sensor has a unique number, typically printed on its surface.
During production and inspection, the force sensor number must be checked against the sensor's code to avoid confusion and misuse. At present, the common checking method is manual observation and comparison, which is time-consuming, labor-intensive, and prone to errors caused by visual fatigue.
Therefore, there is a need for a system and method that automatically identify and check force sensor numbers, improving the accuracy and efficiency of the checking process.
Disclosure of Invention
The present application has been made to solve the above-mentioned technical problems. Embodiments of the application provide a force sensor number checking system and method, which use computer vision and deep learning techniques, through image processing and feature extraction, to automatically extract and recognize the force sensor number and to perform number checking based on the recognition result.
According to one aspect of the present application, there is provided a force sensor number checking method, including:
acquiring a calibration image, captured by a camera, that contains a force sensor number;
analyzing the calibration image to obtain a number identification result; and
performing force sensor number checking based on the number identification result.
According to another aspect of the present application, there is provided a force sensor number checking system, comprising:
an image acquisition module, configured to acquire a calibration image, captured by the camera, that contains the force sensor number;
an image analysis module, configured to analyze the calibration image to obtain a number identification result; and
a number checking module, configured to perform force sensor number checking based on the number identification result.
Compared with the prior art, the force sensor number checking system and method provided by the application use computer vision and deep learning techniques, through image processing and feature extraction, to automatically extract and recognize force sensor numbers and to perform number checking on that basis.
Drawings
The above and other objects, features and advantages of the present application will become more apparent from the following detailed description of embodiments of the present application with reference to the accompanying drawings. The accompanying drawings are included to provide a further understanding of the embodiments of the application, are incorporated in and constitute a part of this specification, and, together with the embodiments of the application, serve to illustrate the application without limiting it. In the drawings, like reference numerals generally refer to like parts or steps.
FIG. 1 is a flow chart of a force sensor number checking method according to an embodiment of the present application;
FIG. 2 is a system architecture diagram of a force sensor number verification method according to an embodiment of the present application;
FIG. 3 is a flowchart of sub-step S2 of a force sensor number checking method according to an embodiment of the present application;
FIG. 4 is a flowchart of sub-step S21 of a force sensor number checking method according to an embodiment of the present application;
FIG. 5 is a flowchart of sub-step S212 of a force sensor number checking method according to an embodiment of the present application;
FIG. 6 is a flowchart of sub-step S22 of a force sensor number checking method according to an embodiment of the present application;
FIG. 7 is a block diagram of a force sensor numbering collation system according to an embodiment of the application.
Detailed Description
Hereinafter, exemplary embodiments according to the present application will be described in detail with reference to the accompanying drawings. It should be apparent that the described embodiments are only some embodiments of the present application and not all embodiments of the present application, and it should be understood that the present application is not limited by the example embodiments described herein.
As used in the specification and in the claims, the terms "a," "an," and "the" do not denote the singular only and may include the plural, unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the expressly identified steps and elements are included; they do not constitute an exclusive list, and a method or apparatus may also include other steps or elements.
Although the present application makes various references to certain modules in a system according to embodiments of the present application, any number of different modules may be used and run on a user terminal and/or server. The modules are merely illustrative, and different aspects of the systems and methods may use different modules.
A flowchart is used in the present application to describe the operations performed by a system according to embodiments of the present application. It should be understood that the preceding or following operations are not necessarily performed in order precisely. Rather, the various steps may be processed in reverse order or simultaneously, as desired. Also, other operations may be added to or removed from these processes.
In the production and detection process, the number of the force sensor needs to be checked to ensure that the number is consistent with the code of the force sensor, so that confusion and misuse are avoided. At present, the common calibration method is manual observation and comparison, which is time-consuming and labor-consuming and is easy to generate errors due to visual fatigue. Therefore, there is a need for a system and method that automatically identifies and collates force sensor numbers, improving the accuracy and efficiency of the collation.
In the technical scheme of the application, a force sensor number checking method is provided. FIG. 1 is a flow chart of a force sensor number checking method according to an embodiment of the present application. FIG. 2 is a system architecture diagram of a force sensor number verification method according to an embodiment of the present application. As shown in fig. 1 and 2, the force sensor number checking method according to an embodiment of the present application includes the steps of: s1, acquiring a calibration image which is acquired by a camera and contains a force sensor number; s2, analyzing the proofreading image to obtain a serial number identification result; and S3, based on the number identification result, performing force sensor number checking.
In particular, in step S1, a calibration image containing the force sensor number acquired by the camera is acquired. The force sensor is a sensor capable of converting received force or pressure into an electric signal and is widely applied to the fields of industry, medical treatment, aviation and the like. To ensure the quality and traceability of the force sensors, each force sensor has a unique number, typically printed on the surface of the force sensor. Therefore, during production and inspection, the number of the force sensor needs to be checked to ensure that the number is consistent with the code of the force sensor, thereby avoiding confusion and misuse. In the technical scheme of the application, the calibration image containing the number of the force sensor is obtained through the camera, the automatic extraction and identification of the number of the force sensor are realized through the processing and feature extraction of the calibration image, and the number calibration is completed.
Accordingly, in one possible implementation, a calibration image acquired by a camera containing a force sensor number may be acquired, for example, by: the camera is correctly connected to the computer or the mobile equipment, and the camera can work normally; opening an appropriate camera application on a computer or mobile device; adjusting camera settings as needed, such as resolution, focus, exposure, etc.; and placing a proofreading image containing the force sensor number in the field of view of the camera. The image is ensured to be clearly visible, and the number of the force sensor can be captured by the camera; capturing a proof image using a camera application; after capturing an image, checking the image quality to ensure that the number of the force sensor is clear and readable; the captured collation image is saved to a suitable location on the computer or mobile device, such as in a photo library or designated folder.
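As a concrete illustration of this acquisition step, the following Python sketch captures and saves one calibration image with OpenCV; the device index, resolution and file name are assumptions for illustration and are not specified by the patent.

```python
# Minimal sketch of the image-acquisition step using OpenCV; the device index,
# resolution and output path are illustrative assumptions, not values from the patent.
import cv2

def capture_calibration_image(device_index: int = 0,
                              output_path: str = "calibration_image.jpg") -> str:
    cap = cv2.VideoCapture(device_index)
    if not cap.isOpened():
        raise RuntimeError("Camera could not be opened")
    # Adjust camera settings as needed (resolution, exposure, ...).
    cap.set(cv2.CAP_PROP_FRAME_WIDTH, 1920)
    cap.set(cv2.CAP_PROP_FRAME_HEIGHT, 1080)
    ok, frame = cap.read()
    cap.release()
    if not ok:
        raise RuntimeError("Failed to capture a frame")
    cv2.imwrite(output_path, frame)  # save the calibration image for later analysis
    return output_path
```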
In particular, in step S2, the proof images are analyzed to obtain a numbered recognition result. In particular, in one specific example of the present application, as shown in fig. 3, the step S2 includes S21, performing image feature extraction on the proof image to obtain a fusion numbering pooled feature map; and S22, determining the number identification result based on the fusion number pooling feature map.
Specifically, in S21, image feature extraction is performed on the calibration image to obtain a fusion numbering pooled feature map. In particular, in one specific example of the present application, as shown in fig. 4, the S21 includes: s211, extracting a numbered region of interest image in the proofreading image; and S212, carrying out feature extraction based on context association on the numbered region-of-interest images to obtain the fusion numbering pooling feature map.
More specifically, the step S211 extracts a numbered region-of-interest image from the collation image. In the technical solution of the present application, the S211 includes: and passing the proofreading image through a numbering target detection network to obtain the numbering region-of-interest image. Here, the reason for the region of interest identification and extraction of the proof images is that the proof images may contain other elements, devices or background objects that may obscure or confuse the force sensor numbers. In addition, other words and marks may be present in the collation image, such as company name, product model number, etc., which are independent of the force sensor number. By processing in this way, the influence of extraneous background and interference noise on subsequent feature extraction can be greatly attenuated.
Notably, an object detection network is a type of deep learning model for detecting and locating objects in images or video. Its main purpose is to identify the different objects in an image and to determine their positions and bounding boxes. An object detection network is generally composed of two main components: a backbone network (Backbone Network) and a detection head (Detection Head). The backbone network is a deep convolutional neural network (CNN) used to extract image features; it performs feature extraction and representation learning on the input image and captures semantic and contextual information. Common backbone networks include VGG, ResNet, MobileNet and the like. The detection head is a series of network layers behind the backbone network that predict the location and class of objects in the image. It typically includes a classification sub-network and a regression sub-network. The classification sub-network classifies each candidate object to determine which class it belongs to; a common classification sub-network is a classifier based on fully connected or convolutional layers, which outputs a probability distribution over classes using a softmax function. The regression sub-network predicts the bounding box (location and size) of each candidate object, typically using a regressor to predict the coordinates and size of the bounding box.
Accordingly, in one possible implementation, the collation image may be passed through a numbering target detection network to obtain the numbering region of interest image by, for example: collecting calibration images with force sensor numbers, creating corresponding annotation files for each image, and annotating positions and boundary boxes of the force sensor numbers; selecting a suitable target detection network, such as Faster R-CNN, YOLO, SSD and the like; the calibration image is preprocessed to accommodate the input requirements of the target detection network. The preprocessing step may include operations such as image scaling, normalization, cropping, etc.; training the selected target detection network by using the prepared proofreading image and the labeling data. The training process comprises the steps of inputting images into a network, calculating a loss function, optimizing network parameters and the like; and performing target detection on the new proofreading image by using a trained target detection network. Inputting the image into a network, and outputting a bounding box where the force sensor number is positioned and a corresponding class label by the network; and extracting an image of the region of interest where the force sensor number is located according to the output of the target detection network. Cutting out an interested region image from the original image according to the position information of the boundary box; and verifying the extracted region-of-interest image, and ensuring the accuracy of the region. The region of interest image is saved to a file or further processed and analyzed as needed.
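To make this step concrete, the sketch below crops the numbered region of interest with a generic pretrained detector from torchvision; in the patent the detector would be trained on annotated number regions, so the model choice, the score threshold and the "take the highest-scoring box" rule are illustrative assumptions.

```python
# Hedged sketch of the ROI-extraction step with a generic detector
# (torchvision Faster R-CNN); class-specific filtering is omitted for brevity.
import torch
import torchvision
from PIL import Image
from torchvision.transforms.functional import to_tensor

def extract_number_roi(image_path: str, score_threshold: float = 0.5) -> Image.Image:
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    model.eval()
    image = Image.open(image_path).convert("RGB")
    with torch.no_grad():
        detections = model([to_tensor(image)])[0]
    # Keep the highest-scoring box above the threshold as the numbered region.
    keep = detections["scores"] >= score_threshold
    if not keep.any():
        raise ValueError("No number region detected")
    best = detections["scores"][keep].argmax()
    x1, y1, x2, y2 = detections["boxes"][keep][best].tolist()
    return image.crop((x1, y1, x2, y2))  # numbered region-of-interest image
```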
More specifically, step S212 performs feature extraction based on context correlation on the numbered region-of-interest image to obtain the fusion numbering pooled feature map. It is considered that interference information that has not been filtered out, such as occlusions, may still be present in the numbered region-of-interest image. Therefore, in the technical solution of the present application, attention must be paid to the context association relationships in the numbered region-of-interest image, so as to correctly capture the implicit feature information about the number. In particular, in one specific example of the present application, as shown in fig. 5, S212 includes: S2121, passing the numbered region-of-interest image through a backbone network-based feature extractor to obtain a numbered feature map; S2122, performing multi-scale pooling on the numbered feature map using pooling kernels of different scales to obtain a plurality of numbered pooled feature maps; and S2123, fusing the plurality of numbered pooled feature maps using a content context encoder to obtain the fusion numbering pooled feature map.
The step S2121 is to make the numbering interested region image pass through a backbone network based feature extractor to obtain a numbering feature map. It should be appreciated that numbering the region of interest images through a backbone network based feature extractor may convert the region of interest images into numbered feature maps, thereby extracting and enhancing a feature representation of the region of interest, capturing context information, and guaranteeing dimensional matching of features. This has an important role in the subsequent tasks, and can improve the accuracy and effect of the tasks.
Notably, the backbone network (backbone) is an underlying network structure in deep-learning computer vision tasks used to extract feature representations of images. It is typically composed of multiple convolutional layers, which extract local features of the image, and pooling layers, which reduce the size of the feature map while preserving important features. The depth and width of the backbone network can be adjusted according to the complexity of the task and the available computing resources; in general, the deeper the backbone network, the richer the semantic information its features represent. Backbone design must balance computational efficiency against feature expression capability. Some advanced backbone structures, such as ResNet, MobileNet and EfficientNet, improve performance and efficiency by introducing techniques such as residual connections, depthwise separable convolutions and compound scaling. The effect is to convert the original image into a high-level abstract feature representation with better semantic information and expressive power that can be used for subsequent tasks. For example, in an object detection task the backbone extracts features such as edges, textures and shapes to help locate and identify the target object; in an image classification task it learns global features of the image for classifying different object classes; and in a semantic segmentation task it extracts semantic information for each pixel, enabling pixel-level classification and segmentation.
Accordingly, in one possible implementation, the numbered region of interest images may be passed through a backbone network-based feature extractor to obtain a numbered feature map, for example, by: an image of the region of interest containing the number is acquired. This may be a local region extracted from the whole image, or a region of interest obtained by a target detection algorithm; an appropriate backbone network model, e.g., resNet, etc., is selected and pre-trained weights are loaded. The pre-trained weights can be obtained by training on a large image data set, and have good feature extraction capability; and preprocessing the region-of-interest image to meet the input requirement of the backbone network model. Typically including image scaling, normalization, cropping, etc.; and inputting the preprocessed region-of-interest image into a backbone network model, and obtaining a feature map through forward propagation. The feature map is a high-dimensional feature representation output at the middle layer of the network, and has rich semantic information; the appropriate feature map is selected from the intermediate layer output of the backbone network as the numbered feature map. The proper characteristic diagram can be selected according to the requirements of tasks and the design of a specific network architecture; further post-processing, such as normalization, filtering, thresholding, etc., may be performed on the numbered feature map as needed to enhance the robustness of the features or to extract features of interest.
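A minimal sketch of such a backbone-based feature extractor follows: a pretrained ResNet-50 truncated before its pooling and classification head returns a spatial feature map. The choice of ResNet-50 and the input shape are assumptions for illustration, not requirements of the patent.

```python
# Backbone feature extractor sketch: the classification head of a pretrained
# ResNet-50 is removed so the output keeps its spatial layout ("numbered feature map").
import torch
import torchvision
from torch import nn

class NumberFeatureExtractor(nn.Module):
    def __init__(self):
        super().__init__()
        resnet = torchvision.models.resnet50(weights="DEFAULT")
        # Drop the global-average-pooling and fully connected layers.
        self.backbone = nn.Sequential(*list(resnet.children())[:-2])

    def forward(self, roi_batch: torch.Tensor) -> torch.Tensor:
        # roi_batch: [B, 3, H, W] preprocessed region-of-interest images
        return self.backbone(roi_batch)  # e.g. [B, 2048, H/32, W/32]
```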
In S2122, multi-scale pooling is performed on the numbered feature map using pooling kernels of different scales to obtain a plurality of numbered pooled feature maps. The feature extractor is limited by its receptive field during feature encoding and filtering, and therefore tends to aggregate wrong context information. In particular, receptive fields generally expand as the number of network layers increases. However, in the present scenario there may be objects of multiple categories in the image, and an expanding receptive field may aggregate false context information and blend together features of objects or regions of different categories, making accurate feature extraction difficult. Therefore, in the technical scheme of the application, pooling kernels of different scales are used to control the size of the receptive field, so as to obtain context information over a larger range or at more scales and thereby obtain more discriminative features.
Notably, the pooling kernel (Pooling Kernel) is part of the pooling operation in deep learning. The pooling operation performs dimensionality reduction and feature extraction on an input feature map: it reduces the size of the features by merging the features of local areas while extracting important feature information. A pooling kernel is the structure in a pooling operation that defines the window size and stride of the operation. Within each pooling window, the features inside the window are aggregated or extracted to generate a pooled output value.
Accordingly, in one possible implementation, the numbered feature map may be multi-scale pooled using pooling kernels of different scales to obtain a plurality of numbered pooled feature maps, for example: input: the numbered feature map; defining a plurality of pooling kernel scales: several pooling kernels of different scales, e.g., 3x3, 5x5, 7x7, etc., are selected, and the dimensions of these kernels determine the size of the pooling window; performing the pooling operation at each kernel scale: for each pooling kernel scale, slide the kernel over the numbered feature map with a fixed stride and apply the pooling operation to the features inside the window at each position; for max pooling, select the maximum feature value within the window as the output; for average pooling, compute the average of the features within the window as the output; the pooling operation keeps the number of channels of the feature map unchanged and only changes its spatial size; obtaining a plurality of numbered pooled feature maps: one numbered pooled feature map is obtained for each pooling kernel scale; application of the multi-scale feature maps: the multiple numbered pooled feature maps may be used for subsequent tasks such as object detection, classification and segmentation; having different scales and semantic information, they provide richer and more diverse feature representations, which helps improve the performance and robustness of the model.
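The sketch below illustrates this multi-scale pooling step; the kernel sizes (3, 5, 7) follow the example above, and the use of stride 1 with same-style padding so that all pooled maps keep the input size is an assumption made for simplicity.

```python
# Multi-scale pooling sketch: the same numbered feature map is pooled with
# several kernel sizes to obtain pooled maps with different receptive fields.
import torch
import torch.nn.functional as F

def multi_scale_pool(feature_map: torch.Tensor,
                     kernel_sizes=(3, 5, 7)) -> list[torch.Tensor]:
    pooled_maps = []
    for k in kernel_sizes:
        # stride 1 with padding k // 2 keeps the spatial size comparable across scales
        pooled = F.max_pool2d(feature_map, kernel_size=k, stride=1, padding=k // 2)
        pooled_maps.append(pooled)
    return pooled_maps
```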
In S2123, the plurality of numbered pooled feature maps are fused using a content context encoder to obtain the fusion numbering pooled feature map. That is, context semantic feature extraction is performed with a content context encoder so that the fusion numbering pooled feature map has stronger feature expression capability. More specifically, in an embodiment of the present application, the encoding process for fusing the plurality of numbered pooled feature maps using the content context encoder to obtain the fusion numbering pooled feature map includes: first, performing global average pooling on each feature matrix of the plurality of numbered pooled feature maps along the channel dimension to obtain a plurality of numbering pooled feature vectors, where global average pooling is a special pooling operation that pools an entire feature map and outputs the average of the feature values of each channel; subsequently, passing the plurality of numbering pooled feature vectors through a transformer-based context encoder to obtain a fusion numbering pooled feature vector; and performing feature vector reconstruction on the fusion numbering pooled feature vector to obtain the fusion numbering pooled feature map. Passing the plurality of numbering pooled feature vectors through the transformer-based context encoder includes: arranging the plurality of numbering pooled feature vectors one-dimensionally to obtain a global numbering pooled feature vector; computing the product between the global numbering pooled feature vector and the transpose of each local feature vector among the plurality of numbering pooled feature vectors to obtain a plurality of self-attention correlation matrices; normalizing each of the self-attention correlation matrices to obtain a plurality of normalized self-attention correlation matrices; passing each normalized self-attention correlation matrix through a Softmax classification function to obtain a plurality of probability values; weighting each of the numbering pooled feature vectors with the corresponding probability value to obtain a plurality of context semantic numbering pooled feature vectors; and concatenating the context semantic numbering pooled feature vectors to obtain the fusion numbering pooled feature vector. Finally, feature vector reconstruction is performed on the fusion numbering pooled feature vector to obtain the fusion numbering pooled feature map. It is worth mentioning that the aim of the reconstruction is to approximately recover the original data by preserving the most important information and structure in the low-dimensional vector space; this can be used in application scenarios such as data visualization, feature extraction and data compression.
For example, in image processing, an image may be compressed into a lower-dimensional vector representation, and then the visual quality of the image is restored by vector reconstruction, that is, the expressive power and distinguishability of features may be enhanced by reconstructing feature vectors. The reconstructed feature vector may better capture the key information of the image, thereby improving the performance and robustness of the model.
Notably, a context encoder is a neural network structure for processing sequence data, commonly used in tasks such as natural language processing and speech recognition. Its function is to encode the input sequence into a fixed-length vector representation that captures the contextual information and semantic features of the sequence. In one example, a Transformer is an attention-based model that models context in a sequence through a self-attention mechanism and positional encoding; it has advantages in processing long sequences and in parallel computation, and has achieved significant results in tasks such as machine translation. The context encoder converts each element in the input sequence into a vector representation and uses these vectors to capture the context information and semantic features of the sequence. Such vector representations may be used for subsequent tasks such as sequence classification and sequence generation. Through the learned context encoding, the model can better understand and exploit the related information in the sequence data, thereby improving task performance.
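A hedged sketch of the content context encoder described above follows: each numbered pooled feature map is reduced to a vector by global average pooling, the vectors are jointly encoded with a small Transformer encoder, and the result is folded back into a fused feature map. The layer sizes, the single encoder layer, and the broadcast-style reconstruction are simplifying assumptions, not the exact attention-weighting procedure of the patent.

```python
# Simplified content context encoder: GAP -> Transformer encoder -> reconstruction.
import torch
from torch import nn

class ContentContextEncoder(nn.Module):
    def __init__(self, channels: int = 2048, num_heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=channels, nhead=num_heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=1)

    def forward(self, pooled_maps: list[torch.Tensor]) -> torch.Tensor:
        # Global average pooling over the spatial dimensions: [B, C, H, W] -> [B, C]
        vectors = [m.mean(dim=(2, 3)) for m in pooled_maps]
        tokens = torch.stack(vectors, dim=1)          # [B, num_scales, C]
        encoded = self.encoder(tokens)                # context-aware scale tokens
        fused_vector = encoded.mean(dim=1)            # fused numbering pooled vector
        # Reconstruct a feature map by broadcasting the fused vector over the
        # spatial grid of the first pooled map (one simple reconstruction choice).
        b, c, h, w = pooled_maps[0].shape
        return fused_vector.view(b, c, 1, 1).expand(b, c, h, w)
```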
It should be noted that, in other specific examples of the present application, context-association-based feature extraction may be performed on the numbered region-of-interest image in other manners to obtain the fusion numbering pooled feature map, for example: selecting a suitable CNN model, such as VGG or ResNet, and loading its pre-trained weights; preprocessing the numbered region-of-interest image similarly to the target detection network; inputting the preprocessed image into the CNN model and obtaining its feature representation through forward propagation, where a feature map is typically obtained by removing the last fully connected layer; fusing the context information of the numbered region-of-interest image with different methods according to the requirements of the context association, a common approach being spatial pyramid pooling (Spatial Pyramid Pooling), which pools feature maps at different scales to capture context information at different scales; applying pooling operations to the context-associated feature maps and converting them into numbered pooled feature maps of fixed size, where the pooling operation may be average pooling, max pooling, etc., reducing the size of the feature map while preserving important feature information; and verifying the fusion numbering pooled feature map to ensure the accuracy of the features, then saving it to a file or using it for subsequent tasks such as classification and identification as needed.
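For the spatial-pyramid-pooling alternative just mentioned, a minimal sketch is given below: the feature map is adaptively pooled to several fixed grid sizes and the results are flattened and concatenated into one fixed-length descriptor. The grid sizes are illustrative assumptions.

```python
# Spatial pyramid pooling sketch: multi-scale adaptive pooling + concatenation.
import torch
import torch.nn.functional as F

def spatial_pyramid_pool(feature_map: torch.Tensor,
                         grid_sizes=(1, 2, 4)) -> torch.Tensor:
    levels = []
    for g in grid_sizes:
        pooled = F.adaptive_max_pool2d(feature_map, output_size=g)  # [B, C, g, g]
        levels.append(pooled.flatten(start_dim=1))                  # [B, C*g*g]
    return torch.cat(levels, dim=1)  # fixed-length multi-scale representation
```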
It should be noted that, in other specific examples of the present application, the image feature extraction may be performed on the calibration image in other manners to obtain a fused numbering pooled feature map, for example: importing the proofed image into image processing software or a programming environment; preprocessing the imported image to improve the accuracy of feature extraction; image processing techniques or deep learning models are used to extract features of the image. Alternatively, feature extraction may be performed using conventional computer vision feature extraction methods, such as SIFT, SURF, HOG, etc., or using deep learning models, such as Convolutional Neural Networks (CNNs); depending on the location of the force sensor number, a specific algorithm or manual operation is used to extract the region of the force sensor number. This may be a rectangular area or other shaped area where the force sensor number is located; the same feature extraction method is applied to the extracted force sensor numbered regions to obtain a feature representation of the force sensor numbers. This may be a local feature, a global feature, or a set of feature vectors; the features of the calibration image and the features of the force sensor numbers are fused. Features can be fused using simple feature stitching, weighted averaging, feature connection, etc.; and carrying out pooling operation on the fused feature graphs to reduce the dimension of the features and retain key information. Common pooling methods include maximum pooling, average pooling, and the like; and obtaining a fusion numbering pooled feature map, which is a comprehensive representation of the numbers of the force sensors in the proofreading image. The feature map may be used for further classification, identification or other tasks.
Specifically, the step S22 is to determine the number recognition result based on the fusion numbering pooling feature map. In particular, in one specific example of the present application, as shown in fig. 6, the S22 includes: s221, performing feature distribution optimization on the fusion numbering pooling feature images to obtain optimized fusion numbering pooling feature images; and S222, enabling the optimized fusion numbering pooling feature map to pass through a classifier to obtain a classification result, wherein the classification result is the numbering identification result.
More specifically, in S221, feature distribution optimization is performed on the fusion numbering pooled feature map to obtain an optimized fusion numbering pooled feature map. In particular, in the technical scheme of the application, when the numbered feature map is subjected to multi-scale spatial pooling with pooling kernels of different scales to obtain the plurality of numbered pooled feature maps, and the plurality of numbered pooled feature maps are subjected to feature-content-based context aggregation by the content context encoder, the fusion numbering pooled feature map is expected to express the context-associated image semantic features of the numbered pooled feature maps at different scales while still expressing well, under the classification rule, the local spatially-associated image semantic features of the numbered region-of-interest image conveyed by the numbered pooled feature maps at different scales; the fusion numbering pooled feature map therefore needs to be corrected based on the local spatially-associated image semantic feature representation of the numbered feature map. Based on this, the applicant performs smooth-response parameterized decoupling fusion of the numbered feature map, denoted F1, and the fusion numbering pooled feature map, denoted F2, to obtain the optimized fusion numbering pooled feature map, denoted F'.
In the optimization formula, F1 represents the numbered feature map, F2 represents the fusion numbering pooled feature map, d(F1, F2) represents the cosine distance between the numbered feature map and the fusion numbering pooled feature map, log is the base-2 logarithm, exp(·) represents the position-wise exponential of a vector, ⊖ represents position-wise subtraction, ⊗ represents position-wise multiplication, ⊕ represents position-wise addition, and F' represents the optimized fusion numbering pooled feature map. Here, the smooth-response parameterized decoupling fusion uses the decoupling principle of a smooth parameterization function and the non-negative symmetry of the cosine distance between F1 and F2 to encode a point-by-point embedding between the features of F1 and F2, so as to infer, under the spatial transformation between the features, the information distribution shift between the features of F1 and F2, thereby expressing an information-structured fusion of smooth responses between the features under the class rule. This improves the expression, in the optimized fusion numbering pooled feature map F', of the local spatially-associated image semantic features of the numbered feature map under the classification rule, and thus improves the accuracy of the classification result obtained by passing the optimized fusion numbering pooled feature map through the classifier.
More specifically, in S222, the optimized fusion numbering pooled feature map is passed through a classifier to obtain a classification result, where the classification result is the numbering identification result. Specifically, the optimized fusion numbering pooling feature map is unfolded to be a classification feature vector based on a row vector or a column vector; performing full-connection coding on the classification feature vectors by using a plurality of full-connection layers of the classifier to obtain coded classification feature vectors; and passing the coding classification feature vector through a Softmax classification function of the classifier to obtain the classification result.
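The classification head described here can be sketched as follows: the optimized fused pooled feature map is flattened into a classification feature vector, passed through fully connected layers, and mapped to class probabilities with Softmax. The hidden width and number of classes are illustrative assumptions.

```python
# Classification head sketch: flatten -> fully connected layers -> Softmax.
import torch
from torch import nn

class NumberClassifier(nn.Module):
    def __init__(self, in_features: int, num_classes: int, hidden: int = 512):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Flatten(),                    # unfold the feature map into a vector
            nn.Linear(in_features, hidden),  # fully connected encoding
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, fused_map: torch.Tensor) -> torch.Tensor:
        logits = self.fc(fused_map)
        return torch.softmax(logits, dim=-1)  # probability distribution over classes
```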
A Classifier (Classifier) refers to a machine learning model or algorithm that is used to classify input data into different categories or labels. The classifier is part of supervised learning, which performs classification tasks by learning mappings from input data to output categories.
The fully connected layer (Fully Connected Layer) is one type of layer that is common in neural networks. In the fully connected layer, each neuron is connected to all neurons of the upper layer, and each connection has a weight. This means that each neuron in the fully connected layer receives inputs from all neurons in the upper layer, and weights these inputs together, and then passes the result to the next layer.
The Softmax classification function is a commonly used activation function for multi-classification problems. It converts each element of the input vector into a probability value between 0 and 1, and the sum of these probability values equals 1. The Softmax function is commonly used at the output layer of a neural network, and is particularly suited for multi-classification problems, because it can map the network output into probability distributions for individual classes. During the training process, the output of the Softmax function may be used to calculate the loss function and update the network parameters through a back propagation algorithm. Notably, the output of the Softmax function does not change the relative magnitude relationship between elements, but rather normalizes them. Thus, the Softmax function does not change the characteristics of the input vector, but simply converts it into a probability distribution form.
It should be noted that, in other specific examples of the present application, the number identification result may also be determined by other manners based on the fused number pooling feature map, for example: inputting a plurality of numbering pooled feature graphs; for each pooled feature map, a different fusion method can be selected, such as stitching, weighted summation, etc.; one common fusion method is to splice multiple pooled feature maps in the channel dimension to obtain a larger feature map; the spliced feature images keep the space information and semantic information of each pooled feature image; further processing is carried out on the fused characteristic diagram, such as convolution operation, normalization operation and the like; these processing operations help to further extract features and enhance the expressive power of the features; classifying the processed feature images by using a classifier, and mapping the feature images to specific numbered categories; common classifiers include full connection layer, support Vector Machine (SVM), decision tree, etc.; determining a final number identification result according to the output of the classifier; decisions may be made based on the confidence or probability values of the classifier, selecting the most likely numbered class as the result.
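For the concatenation-based fusion alternative mentioned in this example, a brief sketch is given below: the pooled feature maps are concatenated along the channel dimension and reduced with a 1x1 convolution before classification. The channel counts are assumptions for illustration.

```python
# Concatenation fusion sketch: channel-wise concatenation + 1x1 convolution.
import torch
from torch import nn

class ConcatFusion(nn.Module):
    def __init__(self, in_channels: int, num_scales: int, out_channels: int):
        super().__init__()
        self.reduce = nn.Conv2d(in_channels * num_scales, out_channels, kernel_size=1)

    def forward(self, pooled_maps: list[torch.Tensor]) -> torch.Tensor:
        stacked = torch.cat(pooled_maps, dim=1)   # concatenate along channels
        return self.reduce(stacked)               # fused feature map
```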
It should be noted that, in other specific examples of the present application, the calibration image may also be analyzed in other manners to obtain a number identification result, for example: importing the acquired calibration image into image processing software or a programming environment on the computer, such as Adobe Photoshop or GIMP, or an image processing library in a programming language, such as OpenCV in Python; preprocessing the imported image to improve recognition accuracy, where the preprocessing steps may include image denoising, contrast and brightness adjustment, image enhancement, etc., selected according to the image quality and specific requirements; extracting the region of interest, based on the location of the force sensor number, using an image processing technique or manual operation, where the region may be rectangular or of another shape; applying an Optical Character Recognition (OCR) algorithm to the extracted region of interest to convert the text in the image into machine-readable text data, for example using an OCR library or API such as Tesseract OCR or the Google Cloud Vision API; processing and analyzing the extracted text data to obtain the force sensor number identification result, which may involve techniques such as text cleaning, character matching and pattern recognition, chosen according to the characteristics and format of the force sensor number; and verifying the identified force sensor number to ensure the accuracy of the identification result, outputting the result to a file, database or other system for further use as needed.
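A minimal sketch of this OCR-based alternative is shown below, using pytesseract (a Python wrapper for Tesseract OCR); the binarization step and the character whitelist are illustrative assumptions about the number format, not details taken from the patent.

```python
# OCR sketch for the numbered region of interest using OpenCV + pytesseract.
import cv2
import pytesseract

def ocr_number(roi_bgr) -> str:
    gray = cv2.cvtColor(roi_bgr, cv2.COLOR_BGR2GRAY)
    # Simple Otsu binarization to make the printed number stand out.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    config = "--psm 7 -c tessedit_char_whitelist=0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ-"
    text = pytesseract.image_to_string(binary, config=config)
    return text.strip()
```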
Specifically, in step S3, force sensor number checking is performed based on the number identification result. In particular, in one specific example of the present application, S3 includes: checking the number identification result against a pre-stored force sensor code.
Accordingly, in one possible implementation, the number identification result may be collated with a pre-stored force sensor code, for example, by: collecting a number identification result: the image or video is processed using an appropriate method (e.g., computer vision algorithms) to identify the number in the image. This may involve techniques such as object detection, character recognition or pattern matching; extracting force sensor codes: a pre-stored code is obtained from the force sensor. This may involve reading sensor data or retrieving encoded information from the sensor via a communications interface; alignment of data: the number recognition results and force sensor codes are aligned to ensure that they correspond in time or space. This may require synchronizing the data according to the time stamp or location information; and (3) checking and comparing: the identified number is compared to a pre-stored force sensor code. This can be achieved by simple equality comparisons or other more complex matching algorithms, depending on the needs and the nature of the data; judging a checking result: and judging whether the number identification result is matched with the force sensor code or not according to the comparison result. If so, the collation is considered successful; if there is no match, further analysis or processing may be required.
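The final comparison step can be sketched as follows; the lookup structure (a dictionary keyed by a sensor identifier) and the normalization rule are assumptions, since in practice the pre-stored codes might come from a database or production system.

```python
# Checking sketch: compare the recognized number with the pre-stored code.
def check_number(recognized: str, expected_codes: dict[str, str], sensor_id: str) -> bool:
    expected = expected_codes.get(sensor_id)
    if expected is None:
        raise KeyError(f"No pre-stored code for sensor {sensor_id}")
    # Normalize simple formatting differences before the equality comparison.
    return recognized.strip().upper() == expected.strip().upper()
```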
In summary, a force sensor number checking method according to an embodiment of the present application is explained, which implements automatic extraction and identification of force sensor numbers through image processing and feature extraction by using computer vision and deep learning techniques, and performs number checking therewith.
Further, a force sensor number checking system is provided.
FIG. 7 is a block diagram of a force sensor numbering collation system according to an embodiment of the application. As shown in fig. 7, a force sensor numbering collating system 300 according to an embodiment of the present application includes: an image acquisition module 310 for acquiring a calibration image acquired by the camera and containing the force sensor number; the image analysis module 320 is configured to analyze the proof images to obtain a number identification result; and a number checking module 330, configured to perform number checking of the force sensor based on the number identification result.
As described above, the force sensor number checking system 300 according to the embodiment of the present application may be implemented in various wireless terminals, such as a server or the like having a force sensor number checking algorithm. In one possible implementation, the force sensor numbering collation system 300 according to embodiments of the present application may be integrated into a wireless terminal as a software module and/or hardware module. For example, the force sensor numbering collation system 300 may be a software module in the operating system of the wireless terminal, or may be an application developed for the wireless terminal; of course, the force sensor numbering collation system 300 could equally be one of many hardware modules of the wireless terminal.
Alternatively, in another example, the force sensor numbering collation system 300 and the wireless terminal may be separate devices, and the force sensor numbering collation system 300 may be connected to the wireless terminal through a wired and/or wireless network and transmit interactive information in an agreed data format.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the improvement of technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (5)

1. A force sensor number checking method, characterized by comprising the following steps:
acquiring a calibration image, captured by a camera, that contains a force sensor number;
analyzing the calibration image to obtain a number identification result;
performing force sensor number checking based on the number identification result;
wherein analyzing the calibration image to obtain the number identification result comprises:
performing image feature extraction on the calibration image to obtain a fusion numbering pooled feature map;
Determining the number identification result based on the fusion number pooling feature map;
wherein performing image feature extraction on the calibration image to obtain the fusion numbering pooled feature map comprises:
extracting a numbered region-of-interest image from the calibration image;
performing feature extraction based on context correlation on the numbered region-of-interest image to obtain the fusion numbering pooled feature map;
Performing feature extraction based on context correlation on the numbered region-of-interest images to obtain the fusion numbering pooled feature map, including:
passing the numbered region-of-interest image through a backbone network-based feature extractor to obtain a numbered feature map;
performing multi-scale pooling on the numbered feature map by using pooling kernels of different scales to obtain a plurality of numbered pooled feature maps;
fusing the plurality of numbered pooled feature maps using a content context encoder to obtain the fused numbered pooled feature map;
based on the fusion numbering pooling feature map, determining the numbering identification result comprises:
Performing feature distribution optimization on the fusion numbering pooling feature images to obtain optimized fusion numbering pooling feature images; and
The optimized fusion numbering pooling feature images pass through a classifier to obtain classification results, wherein the classification results are the numbering identification results;
performing feature distribution optimization on the fusion numbering pooling feature map to obtain the optimized fusion numbering pooling feature map comprises: carrying out smooth-response parameterized decoupling fusion on the numbering feature map and the fusion numbering pooling feature map according to an optimization formula to obtain the optimized fusion numbering pooling feature map;
wherein, in the optimization formula, F1 represents the numbering feature map, F2 represents the fusion numbering pooling feature map, d(F1, F2) represents the cosine distance between the numbering feature map and the fusion numbering pooling feature map, log is the base-2 logarithm, exp(·) represents the position-wise exponential of a vector, ⊖ represents position-wise subtraction, ⊗ represents position-wise multiplication, ⊕ represents position-wise addition, and F' represents the optimized fusion numbering pooling feature map;
the smooth-response parameterized decoupling fusion uses the decoupling principle of a smooth parameterization function and the non-negative symmetry of the cosine distance between the numbering feature map F1 and the fusion numbering pooling feature map F2 to encode a point-by-point embedding between their features, so as to infer, under the spatial transformation between the features, the information distribution shift between the features of F1 and F2.
2. The force sensor number checking method according to claim 1, wherein extracting the numbered region-of-interest image from the calibration image comprises:
passing the calibration image through a numbering target detection network to obtain the numbered region-of-interest image.
3. The force sensor number checking method according to claim 2, wherein performing force sensor number checking based on the number identification result comprises: checking the number identification result against a pre-stored force sensor code.
4. A force sensor number checking system implemented based on the force sensor number checking method of any one of claims 1-3, characterized in that the system comprises:
The image acquisition module is used for acquiring a calibration image which is acquired by the camera and contains the serial number of the force sensor;
The image analysis module is used for analyzing the calibration image to obtain a number identification result;
The number checking module is used for checking the number of the force sensor based on the number identification result;
The image analysis module is further configured to:
extracting image features of the calibration image to obtain a fusion numbering pooling feature map;
determining the number identification result based on the fusion numbering pooling feature map;
The image analysis module is further configured to:
extracting a numbered region-of-interest image from the calibration image;
performing context-correlation-based feature extraction on the numbered region-of-interest image to obtain the fusion numbering pooling feature map;
The image analysis module is further configured to:
passing the numbered region-of-interest image through a backbone-network-based feature extractor to obtain a numbering feature map;
performing multi-scale pooling on the numbering feature map using pooling kernels of different scales to obtain a plurality of numbering pooled feature maps; and
fusing the plurality of numbering pooled feature maps using a content context encoder to obtain the fusion numbering pooling feature map;
The image analysis module is further configured to:
performing feature distribution optimization on the fusion numbering pooling feature map to obtain an optimized fusion numbering pooling feature map; and
passing the optimized fusion numbering pooling feature map through a classifier to obtain a classification result, wherein the classification result is the number identification result;
performing smooth-response parameterized decoupling fusion on the numbering feature map and the fusion numbering pooling feature map with the following optimization formula to obtain the optimized fusion numbering pooling feature map;
wherein the formula is defined over: F1, denoting the numbering feature map; F2, denoting the fusion numbering pooling feature map; d(F1, F2), the cosine distance between the numbering feature map and the fusion numbering pooling feature map; a base-2 logarithm; an exponential operation applied to a vector; position-wise subtraction; position-wise multiplication; position-wise addition; and F', denoting the optimized fusion numbering pooling feature map;
the smooth-response parameterized decoupling fusion uses the decoupling principle of a smooth parameterized function and the non-negative symmetry of the cosine distance between the numbering feature map F1 and the fusion numbering pooling feature map F2 to encode a point-by-point embedding between the features of F1 and F2, so as to infer, under the spatial transformation between features, the information distribution transfer between the numbering feature map F1 and the fusion numbering pooling feature map F2.
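An illustrative sketch of how the three claimed modules could be wired together, reusing check_sensor_number from the claim 3 sketch; the class name, method names, and the camera/analyzer interfaces are assumptions, not taken from the patent.

class ForceSensorNumberCheckSystem:
    """Illustrative wiring of the three claimed modules."""

    def __init__(self, camera, analyzer, stored_codes):
        self.camera = camera              # image acquisition module
        self.analyzer = analyzer          # image analysis module (e.g. the pipeline sketched above)
        self.stored_codes = stored_codes  # pre-stored force sensor codes

    def run_once(self) -> bool:
        image = self.camera.capture()                # calibration image containing the number
        result = self.analyzer.recognize(image)      # number identification result (a string)
        return check_sensor_number(result, self.stored_codes)  # number checking module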
5. The force sensor number checking system according to claim 4, wherein the image analysis module is further configured to: pass the calibration image through a numbering target detection network to obtain the numbered region-of-interest image.
CN202311187017.8A 2023-09-14 2023-09-14 Force sensor number checking system and method Active CN116933041B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311187017.8A CN116933041B (en) 2023-09-14 2023-09-14 Force sensor number checking system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311187017.8A CN116933041B (en) 2023-09-14 2023-09-14 Force sensor number checking system and method

Publications (2)

Publication Number Publication Date
CN116933041A CN116933041A (en) 2023-10-24
CN116933041B true CN116933041B (en) 2024-05-03

Family

ID=88386362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311187017.8A Active CN116933041B (en) 2023-09-14 2023-09-14 Force sensor number checking system and method

Country Status (1)

Country Link
CN (1) CN116933041B (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11155259B2 (en) * 2018-09-13 2021-10-26 Honda Motor Co., Ltd. System and method for egocentric-vision based future vehicle localization

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111539321A (en) * 2020-04-22 2020-08-14 中国飞机强度研究所 Force sensor serial number checking system and method
CN112329766A (en) * 2020-10-14 2021-02-05 北京三快在线科技有限公司 Character recognition method and device, electronic equipment and storage medium
CN114913325A (en) * 2022-03-24 2022-08-16 北京百度网讯科技有限公司 Semantic segmentation method, device and computer program product
CN115019182A (en) * 2022-07-28 2022-09-06 北京卫星信息工程研究所 Remote sensing image target fine-grained identification method, system, equipment and storage medium
CN115783919A (en) * 2022-11-28 2023-03-14 崔铁良 Electric vehicle monitoring system and method for elevator

Also Published As

Publication number Publication date
CN116933041A (en) 2023-10-24

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant