CN113591967A - Image processing method, device and equipment and computer storage medium - Google Patents

Image processing method, device and equipment and computer storage medium

Info

Publication number
CN113591967A
Authority
CN
China
Prior art keywords
graphic code
coordinate information
information
detection model
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110852248.0A
Other languages
Chinese (zh)
Other versions
CN113591967B (en)
Inventor
杜森林
杜松
王邦军
杨怀宇
李磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Xurui Software Technology Co ltd
Original Assignee
Nanjing Xurui Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Xurui Software Technology Co ltd filed Critical Nanjing Xurui Software Technology Co ltd
Priority to CN202110852248.0A priority Critical patent/CN113591967B/en
Publication of CN113591967A publication Critical patent/CN113591967A/en
Application granted granted Critical
Publication of CN113591967B publication Critical patent/CN113591967B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06K — GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K 19/00 — Record carriers for use with machines and with at least a part designed to carry digital markings
    • G06K 19/06 — Record carriers characterised by the kind of the digital marking, e.g. shape, nature, code
    • G06K 19/06009 — Record carriers with optically detectable marking
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/04 — Architecture, e.g. interconnection topology
    • G06N 3/045 — Combinations of networks
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/08 — Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application provides an image processing method, an image processing device, image processing equipment and a computer storage medium, relates to the field of image detection, and is used for improving the accuracy of positioning and detecting a graphic code in an image in a complex scene. The method comprises the following steps: acquiring an image to be detected; inputting an image to be detected into a pre-trained key point detection model, determining reference confidence of an object in the image to be detected and multiple groups of reference corner point coordinate information of the object, wherein the key point detection model is obtained by training according to a training sample set, the training sample set comprises a plurality of sample graphic codes and label information of each sample graphic code, and the label information comprises a plurality of corner point coordinate information and central point coordinate information of the sample graphic codes; determining the object as a graphic code to be positioned under the condition that the reference confidence coefficient is greater than a preset confidence coefficient threshold value; and screening a plurality of reference frames determined by a plurality of groups of reference corner point coordinate information to determine frame position information of the graphic code to be positioned.

Description

Image processing method, device and equipment and computer storage medium
Technical Field
The present application relates to the field of image detection, and in particular, to an image processing method, apparatus, device, and computer storage medium.
Background
The graphic code is one of the most common coding forms in daily life and is widely applied in scenes such as mobile payment and information acquisition. However, printing defects such as streaks and under-printing, together with a small image proportion, blurring, smudging, uneven illumination, and geometric deformation, make the imaging of the graphic code abnormal, so that the graphic code cannot be accurately located.
In the existing graphic code positioning and recognition technology, the main positioning method scans the image point by point, row by row and column by column, to judge boundaries; searches for pattern features according to the light-dark width streams obtained by gradient transformation; screens candidates with sets of transverse and longitudinal line segments; and determines the approximate position of the position detection pattern according to the ratio of its black and white pixels. For a perfect graphic code, i.e., an image that is well printed, has no imaging abnormality, and has a simple background, this method achieves a high recognition rate in real time. In complex scenes, however, its generalization degrades and positioning becomes slow and inaccurate, so the recognition rate and recognition speed of the graphic code are low and cannot meet requirements.
Disclosure of Invention
The embodiment of the application provides an image processing method, an image processing device, image processing equipment and a computer storage medium, which are used for improving the accuracy of positioning and detecting a graphic code in an image in a complex scene.
In a first aspect, an embodiment of the present application provides an image processing method, where the method includes:
acquiring an image to be detected;
inputting the image to be detected into a pre-trained key point detection model, and determining a reference confidence of an object in the image to be detected and multiple groups of reference corner point coordinate information of the object, wherein the reference confidence represents the probability that the object is a graphic code, the key point detection model is obtained by training on a training sample set, the training sample set comprises a plurality of sample graphic codes and label information of each sample graphic code, and the label information comprises a plurality of corner point coordinate information and center point coordinate information of the sample graphic codes;
determining the object as a graphic code to be positioned under the condition that the reference confidence coefficient is greater than a preset confidence coefficient threshold value;
and screening a plurality of reference frames determined by a plurality of groups of reference corner point coordinate information, and determining frame position information of the graphic code to be positioned, wherein the frame position information comprises a plurality of corner point coordinate information.
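As a rough illustration only (not the patent's actual implementation), the thresholding and screening steps above might be sketched as follows; the `Detection` container and the keep-highest-confidence screening rule are assumptions made for this sketch:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Corner = Tuple[float, float]

@dataclass
class Detection:
    confidence: float      # reference confidence that the object is a graphic code
    corners: List[Corner]  # one group of reference corner coordinates (x, y)

def locate_graphic_code(detections: List[Detection],
                        conf_threshold: float = 0.5) -> Optional[List[Corner]]:
    """Keep objects whose reference confidence exceeds the preset threshold,
    then screen the surviving reference frames; here the frame with the
    highest confidence is kept (one possible screening rule)."""
    candidates = [d for d in detections if d.confidence > conf_threshold]
    if not candidates:
        return None  # no graphic code located in the image
    return max(candidates, key=lambda d: d.confidence).corners
```

The frame position information returned is the list of corner coordinates, not an axis-aligned rectangle, matching the corner-based framing described above.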
In a second aspect, an embodiment of the present application provides an image processing apparatus, including:
the acquisition module is used for acquiring an image to be detected;
the processing module is used for inputting the image to be detected into a pre-trained key point detection model, and determining a reference confidence of an object in the image to be detected and multiple groups of reference corner point coordinate information of the object, wherein the reference confidence represents the probability that the object is a graphic code, the key point detection model is obtained by training on a training sample set, the training sample set comprises a plurality of sample graphic codes and label information of each sample graphic code, and the label information comprises a plurality of corner point coordinate information and center point coordinate information of the sample graphic codes;
the first determining module is used for determining that the object is a graphic code to be positioned under the condition that the reference confidence coefficient is greater than a preset confidence coefficient threshold value;
and the second determining module is used for screening a plurality of reference frames determined by a plurality of groups of reference corner point coordinate information and determining frame position information of the graphic code to be positioned, wherein the frame position information comprises a plurality of corner point coordinate information.
In a third aspect, an embodiment of the present application provides an image processing apparatus, including:
a processor, and a memory storing computer program instructions; the processor reads and executes the computer program instructions to implement the image processing method as provided by the first aspect of the embodiments of the present application.
In a fourth aspect, an embodiment of the present application provides a computer storage medium, on which computer program instructions are stored, and when executed by a processor, the computer program instructions implement the image processing method provided in the first aspect of the embodiment of the present application.
The image processing method provided by the embodiment of the application comprises: firstly, acquiring an image to be detected; inputting the image to be detected into a pre-trained key point detection model, and determining a reference confidence of an object in the image to be detected and multiple groups of reference corner point coordinate information of the object, wherein the reference confidence represents the probability that the object is a graphic code, the key point detection model is obtained by training on a training sample set, the training sample set comprises a plurality of sample graphic codes and label information of each sample graphic code, and the label information comprises a plurality of corner point coordinate information and center point coordinate information of the sample graphic codes; determining that the object is a graphic code to be positioned when the reference confidence is greater than a preset confidence threshold; and screening a plurality of reference frames determined by the multiple groups of reference corner point coordinate information to determine frame position information of the graphic code to be positioned, wherein the frame position information comprises a plurality of corner point coordinate information.
Compared with the prior art, the plurality of corner coordinate information of the sample graphic codes are marked in the training samples of the key point detection model, so that image processing is not limited to a rectangular frame: the frame position of the graphic code is located through the plurality of corner coordinates, and graphic codes affected by abnormalities such as smudging, uneven illumination, and geometric deformation can still be well located. In addition, dual detection of the confidence and the corner coordinate information is performed on the graphic code to be located, which reduces the probability of misrecognizing the graphic code while accurately locating its position, improving recognition accuracy.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below; those skilled in the art can obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flow chart of a method for positioning and detecting a graphic code in the prior art;
FIG. 2 is a schematic diagram of a prior art key point detection model;
fig. 3 is a schematic structural diagram of a graphic code functional area according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a keypoint detection model provided in an embodiment of the present application;
fig. 5 is a schematic flowchart of an image processing method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application.
Detailed Description
Features and exemplary embodiments of various aspects of the present application will be described in detail below, and in order to make objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are intended to be illustrative only and are not intended to be limiting. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another without necessarily requiring or implying any actual such relationship or order between them. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may include other elements not expressly listed or inherent to it. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
The image processing algorithm is one of the important research directions in the field of computer vision and plays an important role in fields such as public safety, road traffic, and video surveillance. The graphic code is one of the most common coding forms in daily life and is widely applied in scenes such as mobile payment and information acquisition. However, printing defects such as streaks and under-printing, together with a small image proportion, blurring, smudging, uneven illumination, and geometric deformation, make the imaging of the graphic code abnormal, so that the graphic code in the image cannot be accurately located.
In recent years, with the development of image processing algorithms based on deep learning, the accuracy of image processing has been increasing. In the prior art, the graphic code in an image is located in the following two ways:
one, conventional computer vision algorithm:
In a traditional computer vision algorithm, as shown in fig. 1, the image is scanned point by point, row by row and column by column, to judge boundaries; light-dark width streams obtained by gradient transformation are searched for pattern features for positioning; screening is performed with sets of transverse and longitudinal line segments; and the approximate position of the position detection pattern is determined according to the ratio of its black pixels to white pixels, after which the center coordinates of the position detection pattern are refined. To improve the accuracy of locating the graphic code, methods have also been proposed that use line detection, such as the Hough transform, to locate the graphic code by searching for two groups of mutually perpendicular straight lines, as well as morphological methods.
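For QR-style codes, the black-white pixel ratio along a scanline through a position detection pattern is the well-known 1:1:3:1:1 module ratio. A minimal sketch of such a ratio check (the run-length scheme and the tolerance value are assumptions for illustration):

```python
def run_lengths(scanline):
    """Run-length encode a binary scanline (1 = black, 0 = white)."""
    runs = []
    for px in scanline:
        if runs and runs[-1][0] == px:
            runs[-1][1] += 1
        else:
            runs.append([px, 1])
    return runs

def has_finder_ratio(scanline, tol=0.5):
    """Check whether any five consecutive runs starting on black roughly
    match the 1:1:3:1:1 module ratio of a position detection pattern."""
    runs = run_lengths(scanline)
    for i in range(len(runs) - 4):
        window = runs[i:i + 5]
        if window[0][0] != 1:          # pattern must start on a black run
            continue
        widths = [w for _, w in window]
        module = sum(widths) / 7.0     # the five runs span 7 modules in total
        expected = [1, 1, 3, 1, 1]
        if all(abs(w - e * module) <= tol * module
               for w, e in zip(widths, expected)):
            return True
    return False
```

This is exactly the kind of hand-crafted rule that degrades under blurring and geometric deformation, motivating the learned approach below.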
Secondly, a key point detection algorithm based on Mask R-CNN:
As shown in fig. 2, the Mask Region-based Convolutional Neural Network (Mask R-CNN) is an instance segmentation algorithm improved from the Faster Region-based Convolutional Neural Network (Faster R-CNN): a Fully Convolutional Network (FCN) branch is added under the Faster R-CNN framework to output a mask, and a Region of Interest Align (RoIAlign) layer is provided as an improvement on the Region of Interest Pooling (RoI Pooling) layer. Mask R-CNN can locate the outline of an instance, classify each pixel, and achieve accurate pixel-level segmentation. For the key point detection task, the positions of the key points are modeled independently and cross entropy is used as the loss function, so high key point positioning accuracy can be achieved.
The two algorithms above are common technologies for positioning and detecting the graphic code in an image, and they achieve a high recognition rate on a perfect graphic code, i.e., an image that is well printed, has no imaging abnormality, and has a simple background. However, they do not consider the imaging abnormalities caused in complex backgrounds by printing streaks, under-printing, blurring, smudging, uneven illumination, geometric deformation, and the like, so their application in more complex scenes is neglected and their application range is narrow; positioning of the position detection pattern is slow and inaccurate, and the recognition rate and recognition speed of the graphic code are therefore low.
Based on the above, the embodiment of the present application provides an image processing method that, using a pre-trained key point detection model, can better locate graphic codes in an image affected by abnormalities such as smudging, uneven illumination, and geometric deformation. It performs dual detection of the confidence and the corner coordinate information of the graphic code to be located, reducing the probability of misrecognition while accurately locating the position of the graphic code, and it is applicable to more complex scenes, improving the accuracy of locating graphic codes in images.
The main purpose of the embodiments of the present application is to locate a graphic code in an image, so the nature of the graphic code is first introduced. The graphic code is a matrix two-dimensional code symbol; compared with one-dimensional bar codes and other two-dimensional bar codes, it has the advantages of large information capacity, high reliability, the ability to represent various data types, and strong confidentiality and anti-counterfeiting performance. As shown in fig. 3, each part of the graphic code has its own role, and the symbol can basically be divided into a functional area and an encoding area. The functional area contains three position detection patterns used to identify the position and orientation of the graphic code, timing patterns used for positioning to prevent distortion of the graphic code, and alignment patterns used to align the symbol. In the encoding area, the format information stores formatted data, the version information is used for version 7 and above, the data and error correction codewords store the actual graphic code information, and the error correction codewords are used to correct errors caused by damage to the graphic code.
It should be noted that, in the image processing method provided in the embodiment of the present application, it is necessary to recognize an image by using a pre-trained keypoint detection model, and therefore, before performing image processing by using the keypoint detection model, the keypoint detection model needs to be trained first. Therefore, a specific implementation of the method for training the keypoint detection model provided in the embodiment of the present application is described first below.
The embodiment of the application provides a method for training a key point detection model. The key point detection model comprises a first network and a second network: the first network comprises a convolution cascade combination consisting of a first convolutional layer, a normalization layer, and an activation layer, and the second network comprises a maximum pooling layer and a second convolutional layer. A preset key point detection model is iteratively trained through a classification and regression detection algorithm until a training stop condition is met. The method can be realized by the following steps:
firstly, acquiring an original graphic code.
In order to improve the generalization and robustness of the key point detection model, in the embodiment of the present application a large number of randomly generated synthetic graphic codes are used to simulate the poor printing and abnormal imaging that may occur on an industrial production line. The synthetic graphic codes are randomly projected onto backgrounds from the COCO public data set and then randomly transformed and combined with a large number of graphic codes from real industrial scenes, for example by random scaling, blurring, distortion, noise, and smudging. The synthesized graphic codes are used as the original graphic codes, simulating the transformations possible in real industry, for subsequent sample labeling.
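A minimal numpy sketch of the kind of random degradations described above; the specific operations, parameters, and random seed are illustrative assumptions, not the patent's synthesis pipeline:

```python
import numpy as np

rng = np.random.default_rng(0)

def add_noise(img, sigma=8.0):
    """Additive Gaussian noise, simulating sensor and printing artefacts."""
    noisy = img.astype(np.float32) + rng.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)

def box_blur(img, k=3):
    """Naive k x k mean filter, simulating defocus blur."""
    pad = k // 2
    padded = np.pad(img.astype(np.float32), pad, mode="edge")
    out = np.zeros(img.shape, dtype=np.float32)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return (out / (k * k)).astype(np.uint8)

def random_scale(img, lo=0.5, hi=1.5):
    """Nearest-neighbour rescale by a random factor, simulating size variation."""
    s = rng.uniform(lo, hi)
    h, w = img.shape
    ys = (np.arange(int(h * s)) / s).astype(int).clip(0, h - 1)
    xs = (np.arange(int(w * s)) / s).astype(int).clip(0, w - 1)
    return img[np.ix_(ys, xs)]
```

Applying a random subset of such functions to each synthetic code, then compositing it onto a COCO background, yields one original graphic code for labeling.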
And secondly, marking the original graphic code to generate a sample training set.
Firstly, the original graphic codes are labeled manually; the labeled contents are a class label and the coordinate information of the four corner points of each graphic code.
Secondly, to ensure labeling accuracy, only the class label and the four corner point coordinates are labeled manually; the center point coordinate information is generated from the version information of the graphic code and the four corner point coordinates based on a transformation matrix, where the version information is determined by the class label of the graphic code.
And finally, taking the graphic code with the central point coordinate information and the four corner point coordinate information as a sample graphic code, wherein a plurality of sample graphic codes form a sample training set.
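The patent derives the center point from the version information and a transformation matrix. As a simplified stand-in for illustration, the intersection of the quadrilateral's diagonals can serve: under a perspective transform, the center of the square code maps exactly to the intersection of the diagonals of the imaged quadrilateral.

```python
def diagonal_intersection(corners):
    """corners: [(x1, y1), ..., (x4, y4)] in order TL, TR, BR, BL.
    Returns the intersection of the two diagonals, a perspective-consistent
    proxy for the centre point of a distorted graphic code."""
    (x1, y1), (x2, y2), (x3, y3), (x4, y4) = corners
    # Diagonal A: TL -> BR with direction d1; diagonal B: TR -> BL with d2.
    d1x, d1y = x3 - x1, y3 - y1
    d2x, d2y = x4 - x2, y4 - y2
    denom = d1x * d2y - d1y * d2x
    if denom == 0:
        raise ValueError("degenerate quadrilateral")
    # Solve P1 + t*d1 = P2 + s*d2 for t by Cramer's rule.
    t = ((x2 - x1) * d2y - (y2 - y1) * d2x) / denom
    return (x1 + t * d1x, y1 + t * d1y)
```

For an undistorted code this reduces to the ordinary geometric center, so the proxy agrees with the labeled center in the easy case.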
It should be noted that in prior-art schemes that detect graphic codes with a deep neural network, when the graphic codes are labeled to form a sample training set, the coordinate information of five points is marked: the centers of the three position detection patterns, the center of the alignment pattern, and the center point of the graphic code.
However, with this marking mode, when poor printing of the graphic code causes abnormal imaging, the marked centers of the position detection patterns degrade the trained model, and obvious coordinate deviation then occurs when the graphic code is positioned and detected by the trained model, so the recognition accuracy of the graphic code is low.
And thirdly, integrating the labeled graphic codes and the labeling information corresponding to each graphic code into a training sample set, wherein the training sample set comprises a plurality of sample graphic codes.
It should be noted that the key point detection model must be iteratively trained multiple times to adjust the loss function value until it meets the training stop condition, yielding the trained key point detection model. In each iteration, if only one sample graphic code were input, the sample amount would be too small for effective training adjustment, so the multiple sample graphic codes in the training sample set are used to iteratively train the model.
And fourthly, training a preset key point detection model by using the sample graphic codes in the training sample set until the training stopping condition is met, and obtaining the trained key point detection model.
It should be noted that, as shown in fig. 4, the keypoint detection model includes a first network and a second network, where the first network includes a convolution cascade combination, the convolution cascade combination includes a first convolution layer, a normalization layer, and an activation layer, and the second network includes a maximum pooling layer and a second convolution layer.
In the prior art, after the convolutional layers of the first network extract the image features, the convolutional neural network is followed by a fully connected layer and then an activation function. The fully connected layer reduces the dimension of the key point feature map, so the parameter quantity is large and overfitting is easily caused. In the key point detection model of the embodiment of the present application, the second network replaces the usual fully connected layer after the convolutional neural network, and the whole model is a CNN without fully connected layers. The second network is composed of a pooling layer and a convolutional layer; using it instead of a fully connected layer has the following advantages:
First, the fully connected layer requires a large number of parameters for training and tuning; after the replacement, the parameter count is greatly reduced, so the model is more robust and more resistant to overfitting.
Second, the second network retains the spatial information extracted by each convolutional layer and pooling layer, so the effect is obviously improved in practical application.
Specifically, the training process of the keypoint detection model may have the following steps:
and 4.1, extracting a plurality of depth features of the sample graphic code by utilizing a first network in a preset key point detection model to form a key point feature map, wherein the sizes of the depth features are different.
And 4.2, inputting the feature map into a second network in a preset key point detection model for fusion, and determining the prediction information corresponding to the sample graphic code, wherein the prediction information comprises the prediction confidence coefficient of the sample graphic code and the coordinate information of the four prediction corner points.
Specifically, the prediction confidence may be obtained by:
inputting the training sample set into a preset key point detection model, and determining the coordinate information of a predicted central point of a target sample graphic code, wherein the target sample graphic code is any one sample in the training sample set;
determining the position of a predicted central point of the target sample graphic code according to the coordinate information of the predicted central point;
and determining the prediction confidence coefficient of the target sample graphic code according to the position relation between the prediction central point position and a target grid, wherein the target grid is a preset area determined according to the central point coordinate information in the target sample graphic code label information.
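The positional rule in the three steps above can be sketched as follows; the grid size, the image resolution, and the exact same-cell criterion are assumptions made for illustration, not the patent's verbatim definition of the target grid:

```python
def grid_cell(point, img_w, img_h, s):
    """Return the (col, row) of the s x s grid cell containing a point."""
    x, y = point
    col = min(int(x / img_w * s), s - 1)
    row = min(int(y / img_h * s), s - 1)
    return col, row

def confidence_target(pred_center, label_center, img_w, img_h, s=13):
    """1.0 if the predicted centre falls in the same grid cell as the
    labelled centre (the 'target grid'), else 0.0 -- one plausible reading
    of the positional relationship described above."""
    same = grid_cell(pred_center, img_w, img_h, s) == \
           grid_cell(label_center, img_w, img_h, s)
    return 1.0 if same else 0.0
```

The resulting 0/1 target is then what the confidence loss of the next step regresses the predicted confidence toward.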
And 4.3, calculating a loss value between the prediction information and the label information of the sample graphic code.
In one embodiment, the method for determining the loss function value of the preset keypoint detection model may include the following steps:
calculating the regression loss of the prediction confidence coefficient by using a cross entropy function to obtain a first loss function value;
calculating the regression loss of the coordinate information of the prediction reference corner point by using a mean square error function to obtain a second loss function value;
and determining a loss function value of the preset key point detection model according to the first loss function value and the second loss function value.
Specifically, binary cross-entropy is used to calculate the classification loss of the confidence; the confidence error is a weighted average of the confidence where a graphic code is present and the confidence where no graphic code is present, which can be expressed by Equation 1.
Loss_conf = λ_obj · Σ_(i∈obj) BCE(Ĉ_i, C_i) + λ_noobj · Σ_(i∈noobj) BCE(Ĉ_i, C_i)    (Equation 1)
where λ_obj denotes the weight coefficient when a target is present and λ_noobj the weight coefficient when no target is present. In the embodiments of the present application a weight ratio of 1:100 is used, so that the confidence is pushed toward 0 where no target is present and toward 1 where a target is present.
In addition, the Sigmoid function is used as the activation function for the confidence, and the tanh function for the corner point coordinate information. The regression loss on the four corner coordinates of an object containing a target is the mean square error, i.e., the average of the sum of squares of the differences between the predicted values and the target values. To balance the weights of the two loss functions, the loss function of the corner coordinate information is multiplied by the height and width of the grid, which can be expressed by Equation 2.
Loss_prexy = Σ_{i∈obj} Σ_{k=1..4} [ w·(x_ik − x̂_ik)² + h·(y_ik − ŷ_ik)² ]    (Equation 2)
where (x_ik, y_ik) are the predicted coordinates of the k-th corner point, (x̂_ik, ŷ_ik) the labeled corner coordinates, and w and h the width and height of the grid.
The total loss function value of the key point detection model is determined by adding the above confidence loss to the loss of the corner coordinate information, which can be expressed by Equation 3.
Loss = Loss_conf + Loss_prexy    (Equation 3)
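The text specifies only the structure of the loss (weighted binary cross-entropy for the confidence, grid-scaled mean square error for the four corners, and their sum). A minimal numpy sketch under those assumptions might look as follows; the function names, array shapes, and the concrete λ values are illustrative, not taken from the application:

```python
import numpy as np

def confidence_loss(pred_conf, has_obj, lam_obj=1.0, lam_noobj=0.01):
    """Weighted binary cross-entropy over per-position confidences (cf. Equation 1).

    pred_conf: predicted confidences in (0, 1), after the sigmoid activation.
    has_obj:   1 where a graphic code is present, 0 elsewhere.
    The lam_obj/lam_noobj values are an illustrative assumption; the text
    only states that the two weights stand in a 1:100 ratio.
    """
    eps = 1e-7  # avoid log(0)
    bce = -(has_obj * np.log(pred_conf + eps)
            + (1 - has_obj) * np.log(1 - pred_conf + eps))
    weights = np.where(has_obj == 1, lam_obj, lam_noobj)
    return float(np.sum(weights * bce))

def corner_loss(pred_corners, true_corners, grid_w, grid_h):
    """Grid-scaled mean square error over four corners (cf. Equation 2).

    pred_corners/true_corners: (N, 4, 2) arrays of (x, y) per corner,
    computed only for positions that contain a target.
    """
    sq = (pred_corners - true_corners) ** 2
    sq[..., 0] *= grid_w  # weight squared x-error by grid width
    sq[..., 1] *= grid_h  # weight squared y-error by grid height
    return float(np.mean(np.sum(sq, axis=(1, 2))))

def total_loss(pred_conf, has_obj, pred_corners, true_corners, grid_w, grid_h):
    """Loss = Loss_conf + Loss_prexy (cf. Equation 3)."""
    return (confidence_loss(pred_conf, has_obj)
            + corner_loss(pred_corners, true_corners, grid_w, grid_h))
```

During training, this total loss would be minimized by gradient descent until it falls below the preset value, as described below.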
The key point detection model is optimized according to the loss functions shown in Equation 1 to Equation 3: the network parameters are updated backward using a gradient descent algorithm to obtain an updated key point detection model, and the optimization training is stopped once the loss function value is smaller than a preset value, at which point the trained key point detection model is determined.
It should be noted that, in practical applications, the key point detection model may be further trained with new training samples, so as to keep the model up to date and improve both its accuracy and, in turn, the accuracy of image processing.
The above is a specific implementation of the method for training a key point detection model provided in the embodiments of the present application; the key point detection model obtained through this training can be applied in the image processing method provided in the following embodiments.
A specific implementation of the image processing method provided in the present application is described in detail below with reference to fig. 5.
As shown in fig. 5, an embodiment of the present application provides an image processing method, including:
s501, acquiring an image to be detected.
In some embodiments, the image to be detected may be acquired by a camera, or a frame extraction process may be performed on a pre-acquired video to determine the image to be detected.
S502, inputting the image to be detected into a pre-trained key point detection model, and determining a reference confidence of an object in the image to be detected and multiple groups of reference corner point coordinate information of the object, wherein the reference confidence represents the probability that the object is a graphic code. The key point detection model is obtained by training on a training sample set, the training sample set comprises a plurality of sample graphic codes and label information of each sample graphic code, and the label information comprises a plurality of corner point coordinate information and the center point coordinate information of the sample graphic codes.
In order to improve the generalization and robustness of the key point detection model, in the embodiments of the present application a large number of randomly generated non-real graphic codes are used to simulate the poor printing and abnormal imaging that may occur on an industrial production line. These non-real graphic codes are randomly projected onto backgrounds from the COCO public data set and then randomly transformed and synthesized together with a large number of graphic codes from real industrial scenes, for example with random scaling, blurring, distortion, noise, and dirt. The synthesized graphic codes, simulating the transformations that may occur in real industry, are used as original graphic codes for subsequent sample labeling.
After the original graphic codes are obtained, they are labeled manually; the labeled content is the label phrase (label) of each graphic code and its four corner point coordinate information. To ensure labeling accuracy, only the label and the four corner point coordinates are labeled manually; the center point coordinate information is generated, based on a transformation matrix, from the version information of the graphic code and the four corner point coordinates, where the version information is determined by the label of the graphic code. The graphic code carrying the center point coordinate information and the four corner point coordinate information is taken as a sample graphic code, and a plurality of sample graphic codes form the training sample set.
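The application derives the center point from a transformation matrix built from the version information and the four corners. As a simplified, hypothetical illustration (not the patented procedure), note that under any projective distortion the center of a square code maps to the intersection of the images of its two diagonals, so a center estimate can be computed directly from the four labeled corners:

```python
def center_from_corners(tl, tr, br, bl):
    """Center of a projectively distorted square code, taken as the
    intersection of its two diagonals (tl-br and tr-bl).

    Each corner is an (x, y) pair, assumed labeled in a consistent order
    (top-left, top-right, bottom-right, bottom-left).
    """
    # Line through points p and q in homogeneous form: cross((x1,y1,1),(x2,y2,1)).
    def line(p, q):
        (x1, y1), (x2, y2) = p, q
        return (y1 - y2, x2 - x1, x1 * y2 - x2 * y1)

    a1, b1, c1 = line(tl, br)
    a2, b2, c2 = line(tr, bl)
    det = a1 * b2 - a2 * b1
    if abs(det) < 1e-12:
        raise ValueError("degenerate quadrilateral: diagonals are parallel")
    # Intersection of the two homogeneous lines, dehomogenized.
    x = (b1 * c2 - b2 * c1) / det
    y = (a2 * c1 - a1 * c2) / det
    return x, y
```

For an undistorted unit square this returns (0.5, 0.5), and for a sheared or perspective-warped code it still returns the image of the geometric center.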
In the above S502, the keypoint detection model includes a first network and a second network, where the first network includes a convolution cascade combination, the convolution cascade combination includes a first convolution layer, a normalization layer, and an activation layer, and the second network includes a maximum pooling layer and a second convolution layer;
the above inputting the image to be detected into the pre-trained key point detection model, determining the reference confidence of the object in the image to be detected and the multi-group reference corner coordinate information of the object, may include:
inputting an image to be detected into a first network, and determining a plurality of depth features of the image to be detected, wherein the sizes of the depth features are different;
and inputting the depth features into a second network for fusion, and determining the reference confidence of the object in the image to be detected and the multiple groups of reference corner point coordinate information of the object.
In the training sample labeling scheme of the key point detection model provided by the embodiments of the present application, images are labeled with the four corner points of a polygon around the graphic code, and the labeling of the center point is completed automatically from the inherent properties of the graphic code, which greatly improves the training process and the detection of the whole model. Meanwhile, the model structure of the key point detection model contains no fully connected layer; while keeping the multiple depth features extracted by the first network, the structure based on the second network greatly reduces the number of spatial parameters, so the model resists over-fitting better and can be well applied to the accurate positioning of graphic codes in industry.
And S503, determining the object as a graphic code to be positioned under the condition that the reference confidence is greater than a preset confidence threshold.
In the embodiments of the present application, the graphic code to be positioned undergoes double detection of both confidence and corner point coordinate information, so that the position of the graphic code is located accurately while the probability of misidentifying a graphic code is reduced, improving the recognition accuracy of graphic codes.
S504, a plurality of reference frames determined by a plurality of groups of reference corner point coordinate information are screened, and frame position information of the graphic code to be positioned is determined, wherein the frame position information comprises a plurality of corner point coordinate information.
In the above S504, the screening a plurality of reference frames determined by the plurality of sets of reference corner point coordinate information to determine frame position information of the graphic code to be positioned may include:
obtaining local maximum values of a plurality of reference frames by adopting a non-maximum value suppression algorithm;
and screening the plurality of reference frames according to the local maximum value and a preset intersection ratio threshold value until no overlapped frame exists, and determining frame position information of the graphic code to be positioned.
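The two screening steps above follow the standard greedy non-maximum suppression pattern. In the sketch below the frames are simplified to axis-aligned rectangles (x1, y1, x2, y2) for the intersection-over-union computation, whereas the application works with four-corner quadrilaterals, so this is an illustrative assumption rather than the patented implementation:

```python
def iou(a, b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def nms(boxes, scores, iou_thresh=0.5):
    """Greedy non-maximum suppression: repeatedly keep the highest-scoring
    frame and discard every remaining frame whose IoU with it exceeds the
    preset threshold, until no overlapping frames remain."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep
```

The indices returned by `nms` identify the reference frames whose corner coordinates become the frame position information of the graphic code to be positioned.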
Specifically, non-maximum suppression uses a local-maximum search to determine the most accurate bounding frame among the multiple reference frames of the graphic code to be located. Local peak extreme points can be found with a sliding window, but the window size determines the quality of the algorithm, and the window method has difficulty with runs of consecutive equal peaks. In the embodiments of the present application, the morphological idea of expanding from the inside is adopted: the original image is dilated, so that adjacent local maxima closer together than the dilation size are merged, and the positions where the original image equals the dilated image are returned as the local maxima.
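The dilation-based local-maximum search described above can be sketched with numpy alone; the 3×3 window and the −∞ padding at the borders are assumptions made for illustration:

```python
import numpy as np

def local_maxima(img, size=3):
    """Grey-dilate the image with a size x size window and return a mask of
    the positions where the original equals the dilated image; adjacent
    peaks closer together than the window are merged into one plateau."""
    img = np.asarray(img, dtype=float)
    pad = size // 2
    padded = np.pad(img, pad, mode="constant", constant_values=-np.inf)
    h, w = img.shape
    dilated = np.full((h, w), -np.inf)
    for dy in range(size):          # maximum over every window offset
        for dx in range(size):
            dilated = np.maximum(dilated, padded[dy:dy + h, dx:dx + w])
    return img == dilated           # True where original equals dilation
```

For example, on a response map with a single peak at the center, only that position survives as a local maximum.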
According to the image processing method provided by the embodiments of the present application, multiple corner point coordinates of each sample graphic code are labeled in the training samples of the key point detection model, so the image processing is not limited to rectangular frames: the frame position of the graphic code is located through multiple corner point coordinates, and graphic codes affected by abnormalities such as dirt, uneven illumination, and geometric deformation can still be located well. In addition, the graphic code to be positioned undergoes double detection of confidence and corner point coordinate information, so the position of the graphic code is located accurately while the probability of misrecognition is reduced, improving the recognition accuracy of graphic codes.
Based on the same inventive concept of the image processing method, the embodiment of the application also provides an image processing device.
As shown in fig. 6, an embodiment of the present application provides an image processing apparatus including:
an obtaining module 601, configured to obtain an image to be detected;
the processing module 602 is configured to input an image to be detected into a pre-trained keypoint detection model, and determine a reference confidence of an object in the image to be detected and multiple sets of reference corner coordinate information of the object, where the reference confidence represents the probability that the object is a graphic code, the keypoint detection model is obtained by training according to a training sample set, the training sample set includes multiple sample graphic codes and label information of each sample graphic code, and the label information includes multiple corner coordinate information and center point coordinate information of the sample graphic codes;
a first determining module 603, configured to determine that the object is a graphic code to be positioned when the reference confidence is greater than a preset confidence threshold;
the second determining module 604 is configured to screen multiple reference frames determined by multiple sets of reference corner coordinate information, and determine frame position information of the graphic code to be positioned, where the frame position information includes multiple corner coordinate information.
In some embodiments, the second determining module may be specifically configured to:
obtaining local maximum values of a plurality of reference frames by adopting a non-maximum value suppression algorithm;
and screening the plurality of reference frames according to the local maximum value and a preset intersection ratio threshold value until no overlapped frame exists, and determining frame position information of the graphic code to be positioned.
In some embodiments, the keypoint detection model comprises a first network comprising a convolutional cascaded combination comprising a first convolutional layer, a normalization layer, and an activation layer, and a second network comprising a max-pooling layer and a second convolutional layer;
the processing module may specifically be configured to:
inputting an image to be detected into a first network, and determining a plurality of depth features of the image to be detected, wherein the sizes of the depth features are different;
and inputting the depth features into a second network for fusion, and determining the reference confidence of the object in the image to be detected and the multiple groups of reference corner point coordinate information of the object.
In some embodiments, the apparatus may further comprise:
the second acquisition module is used for acquiring a training sample set, wherein the training sample set comprises a plurality of sample graphic codes and label information corresponding to each sample graphic code, and the label information is marked with four corner point coordinate information and central point coordinate information of the sample graphic codes;
and the training module is used for training the preset key point detection model by using the sample graphic code in the training sample set until the training stopping condition is met, so as to obtain the trained key point detection model.
In some embodiments, the apparatus may further include a labeling module, and the labeling module may be specifically configured to:
acquiring a plurality of original graphic codes;
for each original graphic code, the following steps are respectively executed:
marking the version information and the four corner point coordinate information of the original graphic code;
generating central point coordinate information of the original graphic code according to the version information and the four corner point coordinate information;
and determining the original graphic code marked with the coordinate information of the four corner points and the coordinate information of the central point as a sample graphic code.
In some embodiments, the training module may be specifically configured to:
inputting the training sample set into a preset key point detection model, and determining the prediction information of each sample graphic code, wherein the prediction information comprises the prediction confidence coefficient of the sample graphic code and the coordinate information of four prediction angular points;
determining a loss function value of a preset key point detection model according to the prediction information and the label information of each sample graphic code;
and under the condition that the loss function value does not meet the training stopping condition, adjusting model parameters of the key point detection model, training the adjusted key point detection model by using the sample image until the loss function value meets the training stopping condition, and obtaining the trained key point detection model.
In some embodiments, the training module may be specifically configured to:
inputting the training sample set into a preset key point detection model, and determining the coordinate information of a predicted central point of a target sample graphic code, wherein the target sample graphic code is any one sample in the training sample set;
determining the position of a predicted central point of the target sample graphic code according to the coordinate information of the predicted central point;
and determining the prediction confidence coefficient of the target sample graphic code according to the position relation between the prediction central point position and a target grid, wherein the target grid is a preset area determined according to the central point coordinate information in the target sample graphic code label information.
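As a hedged illustration of the last step, the positional check between the predicted center point and the target grid cell might look like the following; the cell indexing and the half-open bounds are assumptions, since the application does not spell them out:

```python
def center_in_target_grid(pred_cx, pred_cy, grid_x, grid_y, cell_w, cell_h):
    """True when the predicted center point (pred_cx, pred_cy) falls inside
    the target grid cell, i.e. the cell at column grid_x, row grid_y of a
    grid whose cells measure cell_w x cell_h pixels — the preset area that
    contains the labeled center point."""
    return (grid_x * cell_w <= pred_cx < (grid_x + 1) * cell_w
            and grid_y * cell_h <= pred_cy < (grid_y + 1) * cell_h)
```

A prediction whose center lands in the target cell would then be treated as responsible for the graphic code when forming the prediction confidence.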
In some embodiments, the training module may be specifically configured to:
calculating the regression loss of the prediction confidence coefficient by using a cross entropy function to obtain a first loss function value;
calculating the regression loss of the coordinate information of the prediction reference corner point by using a mean square error function to obtain a second loss function value;
and determining a loss function value of the preset key point detection model according to the first loss function value and the second loss function value.
Other details of the image processing apparatus according to the embodiment of the present application are similar to those of the image processing method according to the embodiment of the present application described above with reference to fig. 5, and are not repeated herein.
Fig. 7 shows a hardware structure diagram of an image processing device provided by an embodiment of the present application.
The image processing method and apparatus described above in conjunction with fig. 5 and 6 may be implemented by an image processing device. Fig. 7 shows a hardware configuration 700 of an image processing device according to an embodiment of the present application.
The image processing device may include a processor 701 and a memory 702 storing computer program instructions.
Specifically, the processor 701 may include a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present application.
Memory 702 may include mass storage for data or instructions. By way of example, and not limitation, memory 702 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. In one example, memory 702 may include removable or non-removable (or fixed) media, or memory 702 may be non-volatile solid-state memory. The memory 702 may be internal or external to the image processing device.
In one example, the memory 702 may be a read-only memory (ROM). In one example, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically rewritable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 701 reads and executes the computer program instructions stored in the memory 702 to implement the methods/steps S501 to S504 in the embodiment shown in fig. 5, and achieve the corresponding technical effects achieved by the embodiment shown in fig. 5 executing the methods/steps thereof, which are not described herein again for brevity.
In one example, the image processing device may also include a communication interface 703 and a bus 710. As shown in fig. 7, the processor 701, the memory 702, and the communication interface 703 are connected by a bus 710 to complete mutual communication.
The communication interface 703 is mainly used for implementing communication between modules, apparatuses, units and/or devices in this embodiment of the application.
Bus 710 comprises hardware, software, or both, coupling the components of the image processing device to each other. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), another suitable bus, or a combination of two or more of these. Bus 710 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated.
The image processing device provided by the embodiments of the present application labels multiple corner point coordinates of each sample graphic code in the training samples of the key point detection model, so the image processing is not limited to rectangular frames; the frame position of the graphic code is located through multiple corner point coordinates, and graphic codes affected by abnormalities such as dirt, uneven illumination, and geometric deformation can be better positioned.
In addition, in combination with the image processing method in the foregoing embodiments, the embodiments of the present application may be implemented by providing a computer storage medium. The computer storage medium having computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the image processing methods in the above embodiments.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions or change the order between the steps after comprehending the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic Circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include electronic circuits, semiconductor memory devices, ROM, flash memory, Erasable ROM (EROM), floppy disks, CD-ROMs, optical disks, hard disks, fiber optic media, Radio Frequency (RF) links, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above-described steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed simultaneously.
Aspects of the present application are described above with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be, but is not limited to, a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit. It will also be understood that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware for performing the specified functions or acts, or combinations of special purpose hardware and computer instructions.
As described above, only the specific embodiments of the present application are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (11)

1. An image processing method, characterized in that the method comprises:
acquiring an image to be detected;
inputting the image to be detected into a pre-trained key point detection model, and determining reference confidence of an object in the image to be detected and multiple groups of reference corner point coordinate information of the object, wherein the reference confidence represents the probability that the object is a graphic code, the key point detection model is obtained by training according to a training sample set, the training sample set comprises a plurality of sample graphic codes and label information of each sample graphic code, and the label information comprises a plurality of corner point coordinate information and central point coordinate information of the sample graphic codes;
determining the object as a graphic code to be positioned under the condition that the reference confidence coefficient is greater than a preset confidence coefficient threshold value;
and screening a plurality of reference frames determined by the plurality of groups of reference corner point coordinate information to determine frame position information of the graphic code to be positioned, wherein the frame position information comprises a plurality of corner point coordinate information.
2. The method according to claim 1, wherein the screening the plurality of reference frames determined by the plurality of sets of reference corner point coordinate information to determine frame position information of the graphic code to be positioned comprises:
obtaining local maximum values of the plurality of reference frames by adopting a non-maximum value suppression algorithm;
and screening the plurality of reference frames according to the local maximum value and a preset intersection ratio threshold value until no overlapped frame exists, and determining frame position information of the graphic code to be positioned.
3. The method of claim 1, wherein the keypoint detection model comprises a first network and a second network, wherein the first network comprises a convolutional cascaded combination comprising a first convolutional layer, a normalization layer, and an activation layer, and wherein the second network comprises a max-pooling layer and a second convolutional layer;
the inputting the image to be detected into a pre-trained key point detection model, and determining the reference confidence of the object in the image to be detected and the coordinate information of multiple groups of reference corner points of the object, includes:
inputting the image to be detected into the first network, and determining a plurality of depth features of the image to be detected, wherein the sizes of the depth features are different;
and inputting the depth features into the second network for fusion, and determining the reference confidence of the object in the image to be detected and the multiple groups of reference corner point coordinate information of the object.
4. The method according to claim 1, wherein prior to said acquiring an image to be detected, said method further comprises:
acquiring a training sample set, wherein the training sample set comprises a plurality of sample graphic codes and label information corresponding to each sample graphic code, and the label information is marked with four corner point coordinate information and central point coordinate information of the sample graphic codes;
and training a preset key point detection model by using the sample graphic codes in the training sample set until a training stopping condition is met, and obtaining the trained key point detection model.
5. The method of claim 4, wherein prior to said obtaining a set of training samples, the method further comprises:
acquiring a plurality of original graphic codes;
for each original graphic code, the following steps are respectively executed:
marking the version information and the four corner point coordinate information of the original graphic code;
generating central point coordinate information of the original graphic code according to the version information and the four corner point coordinate information;
and determining the original graphic code marked with the four corner point coordinate information and the central point coordinate information as the sample graphic code.
6. The method according to claim 4, wherein the training of the preset keypoint detection model by using the sample pattern code in the training sample set until a training stop condition is met to obtain a trained keypoint detection model comprises:
inputting the training sample set into the preset key point detection model, and determining prediction information of each sample graphic code, wherein the prediction information comprises a prediction confidence coefficient of the sample graphic code and four prediction corner coordinate information;
determining a loss function value of the preset key point detection model according to the prediction information and the label information of each sample graphic code;
and under the condition that the loss function value does not meet the training stopping condition, adjusting model parameters of the key point detection model, and training the adjusted key point detection model by using the sample graphic code until the loss function value meets the training stopping condition to obtain the trained key point detection model.
7. The method according to claim 6, wherein the inputting the training sample set into the predetermined keypoint detection model and determining the prediction information of each sample pattern code comprises:
inputting the training sample set into the preset key point detection model, and determining the coordinate information of a predicted central point of a target sample graphic code, wherein the target sample graphic code is any sample in the training sample set;
determining the position of a predicted central point of the target sample graphic code according to the coordinate information of the predicted central point;
and determining the prediction confidence of the target sample graphic code according to the position relation between the prediction central point position and a target grid, wherein the target grid is a preset area determined according to the central point coordinate information in the target sample graphic code label information.
8. The method of claim 6, wherein determining the loss function value of the predetermined keypoint detection model according to the prediction information and the label information of each sample graphic code comprises:
calculating the regression loss of the prediction confidence coefficient by using a cross entropy function to obtain a first loss function value;
calculating the regression loss of the coordinate information of the prediction reference corner point by using a mean square error function to obtain a second loss function value;
and determining a loss function value of the preset key point detection model according to the first loss function value and the second loss function value.
9. An image processing apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring an image to be detected;
the processing module is used for inputting the image to be detected into a pre-trained key point detection model, and determining a reference confidence coefficient of an object in the image to be detected and multiple groups of reference corner point coordinate information of the object, wherein the reference confidence coefficient represents the probability that the object is a graphic code, the key point detection model is obtained by training according to a training sample set, the training sample set comprises a plurality of sample graphic codes and label information of each sample graphic code, and the label information comprises a plurality of corner point coordinate information and central point coordinate information of the sample graphic codes;
the first determining module is used for determining the object as a graphic code to be positioned under the condition that the reference confidence coefficient is greater than a preset confidence coefficient threshold value;
and the second determining module is used for screening a plurality of reference frames determined by the plurality of groups of reference corner point coordinate information and determining frame position information of the graphic code to be positioned, wherein the frame position information comprises a plurality of corner point coordinate information.
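The module pipeline of claim 9 (threshold on the reference confidence, then screen the candidate reference frames) can be sketched as below. The screening rule shown, keeping the highest-confidence frame, is an assumption; the claim does not fix how the frames are screened.

```python
def locate_graphic_code(detections, conf_threshold=0.5):
    """Hypothetical sketch of the claim-9 module pipeline.

    detections: list of (confidence, corners) pairs, where corners is a
    list of (x, y) corner coordinates forming one reference frame.
    Returns the corner coordinates of the selected reference frame, or
    None if no object passes the confidence threshold."""
    # First determining module: keep only objects whose reference
    # confidence exceeds the preset confidence threshold.
    candidates = [(c, box) for c, box in detections if c > conf_threshold]
    if not candidates:
        return None
    # Second determining module: screen the reference frames; keeping the
    # highest-confidence frame is one simple screening rule.
    _, best_box = max(candidates, key=lambda t: t[0])
    return best_box
```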
10. An image processing device, characterized in that the device comprises: a processor, and a memory storing computer program instructions; wherein the processor reads and executes the computer program instructions to implement the image processing method of any one of claims 1 to 8.
11. A computer storage medium having computer program instructions stored thereon which, when executed by a processor, implement the image processing method of any one of claims 1 to 8.
CN202110852248.0A 2021-07-27 2021-07-27 Image processing method, device, equipment and computer storage medium Active CN113591967B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110852248.0A CN113591967B (en) 2021-07-27 2021-07-27 Image processing method, device, equipment and computer storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110852248.0A CN113591967B (en) 2021-07-27 2021-07-27 Image processing method, device, equipment and computer storage medium

Publications (2)

Publication Number Publication Date
CN113591967A true CN113591967A (en) 2021-11-02
CN113591967B CN113591967B (en) 2024-06-11

Family

ID=78250677

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110852248.0A Active CN113591967B (en) 2021-07-27 2021-07-27 Image processing method, device, equipment and computer storage medium

Country Status (1)

Country Link
CN (1) CN113591967B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107038429A (en) * 2017-05-03 2017-08-11 四川云图睿视科技有限公司 A kind of multitask cascade face alignment method based on deep learning
US20190220653A1 (en) * 2018-01-12 2019-07-18 Qualcomm Incorporated Compact models for object recognition
CN110991443A (en) * 2019-10-29 2020-04-10 北京海益同展信息科技有限公司 Key point detection method, image processing method, key point detection device, image processing device, electronic equipment and storage medium
CN111104813A (en) * 2019-12-16 2020-05-05 北京达佳互联信息技术有限公司 Two-dimensional code image key point detection method and device, electronic equipment and storage medium
CN111160085A (en) * 2019-11-19 2020-05-15 天津中科智能识别产业技术研究院有限公司 Human body image key point posture estimation method
CN111950318A (en) * 2020-08-12 2020-11-17 上海连尚网络科技有限公司 Two-dimensional code image identification method and device and storage medium
WO2021051650A1 (en) * 2019-09-18 2021-03-25 北京市商汤科技开发有限公司 Method and apparatus for association detection for human face and human hand, electronic device and storage medium
CN112560980A (en) * 2020-12-24 2021-03-26 深圳市优必选科技股份有限公司 Training method and device of target detection model and terminal equipment
CN112966574A (en) * 2021-02-22 2021-06-15 厦门艾地运动科技有限公司 Human body three-dimensional key point prediction method and device and electronic equipment
CN113052159A (en) * 2021-04-14 2021-06-29 ***通信集团陕西有限公司 Image identification method, device, equipment and computer storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
DOU XINZE ET AL.: "Vehicle re-identification optimization algorithm based on high-confidence local features", JOURNAL OF BEIJING UNIVERSITY OF AERONAUTICS AND ASTRONAUTICS, vol. 46, no. 9, 30 September 2020 (2020-09-30) *
JIN ZIFENG ET AL.: "Person re-identification combining multi-scale feature learning and feature alignment", COMPUTER ENGINEERING AND APPLICATIONS, vol. 58, no. 20, 22 April 2021 (2021-04-22) *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114139564A (en) * 2021-12-07 2022-03-04 Oppo广东移动通信有限公司 Two-dimensional code detection method and device, terminal equipment and training method for detection network
CN114139564B (en) * 2021-12-07 2024-05-07 Oppo广东移动通信有限公司 Two-dimensional code detection method and device, terminal equipment and training method of detection network
WO2023130717A1 (en) * 2022-01-05 2023-07-13 深圳思谋信息科技有限公司 Image positioning method and apparatus, computer device and storage medium
CN114494398A (en) * 2022-01-18 2022-05-13 深圳市联洲国际技术有限公司 Processing method and device for inclined target, storage medium and processor
CN114494398B (en) * 2022-01-18 2024-05-07 深圳市联洲国际技术有限公司 Processing method and device of inclined target, storage medium and processor
CN114663904A (en) * 2022-04-02 2022-06-24 成都卫士通信息产业股份有限公司 PDF document layout detection method, device, equipment and medium
CN114743018A (en) * 2022-04-21 2022-07-12 平安科技(深圳)有限公司 Image description generation method, device, equipment and medium
CN114743018B (en) * 2022-04-21 2024-05-31 平安科技(深圳)有限公司 Image description generation method, device, equipment and medium
WO2023236008A1 (en) * 2022-06-06 2023-12-14 Intel Corporation Methods and apparatus for small object detection in images and videos
CN114913232A (en) * 2022-06-10 2022-08-16 嘉洋智慧安全生产科技发展(北京)有限公司 Image processing method, apparatus, device, medium, and product
CN114913232B (en) * 2022-06-10 2023-08-08 嘉洋智慧安全科技(北京)股份有限公司 Image processing method, device, equipment, medium and product
CN115019136B (en) * 2022-08-05 2022-11-25 山东圣点世纪科技有限公司 Training method and detection method of target key point detection model for resisting boundary point drift
CN115019136A (en) * 2022-08-05 2022-09-06 山东圣点世纪科技有限公司 Training method and detection method of target key point detection model for resisting boundary point drift
CN117011575A (en) * 2022-10-27 2023-11-07 腾讯科技(深圳)有限公司 Training method and related device for small sample target detection model
CN115577728B (en) * 2022-12-07 2023-03-14 深圳思谋信息科技有限公司 One-dimensional code positioning method, device, computer equipment and storage medium
CN115577728A (en) * 2022-12-07 2023-01-06 深圳思谋信息科技有限公司 One-dimensional code positioning method, device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN113591967B (en) 2024-06-11

Similar Documents

Publication Publication Date Title
CN113591967B (en) Image processing method, device, equipment and computer storage medium
CN113160192B (en) Visual sense-based snow pressing vehicle appearance defect detection method and device under complex background
CN107545239B (en) Fake plate detection method based on license plate recognition and vehicle characteristic matching
US9014432B2 (en) License plate character segmentation using likelihood maximization
US20020051578A1 (en) Method and apparatus for object recognition
CN105160652A (en) Handset casing testing apparatus and method based on computer vision
CN111709416A (en) License plate positioning method, device and system and storage medium
CN111539330B (en) Transformer substation digital display instrument identification method based on double-SVM multi-classifier
CN108764235B (en) Target detection method, apparatus and medium
CN109190625B (en) Large-angle perspective deformation container number identification method
CN111091544A (en) Method for detecting breakage fault of side integrated framework of railway wagon bogie
CN111340041B (en) License plate recognition method and device based on deep learning
Özgen et al. Text detection in natural and computer-generated images
CN111507337A (en) License plate recognition method based on hybrid neural network
CN110659637A (en) Electric energy meter number and label automatic identification method combining deep neural network and SIFT features
CN113780492A (en) Two-dimensional code binarization method, device and equipment and readable storage medium
CN113033558A (en) Text detection method and device for natural scene and storage medium
CN117094975A (en) Method and device for detecting surface defects of steel and electronic equipment
CN115457044A (en) Pavement crack segmentation method based on class activation mapping
CN115512381A (en) Text recognition method, text recognition device, text recognition equipment, storage medium and working machine
Bala et al. Image simulation for automatic license plate recognition
CN114529555A (en) Image recognition-based efficient cigarette box in-and-out detection method
CN111127311B (en) Image registration method based on micro-coincident region
CN113658095A (en) Engineering pattern review identification processing method and device for drawing of manual instrument
CN118072289B (en) Image acquisition optimization method for intelligent driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant