CN114022558A - Image positioning method and device, computer equipment and storage medium - Google Patents


Info

Publication number
CN114022558A
CN114022558A (application CN202210003829.1A; granted as CN114022558B)
Authority
CN
China
Prior art keywords
dimensional code
code image
sample
target
neural network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210003829.1A
Other languages
Chinese (zh)
Other versions
CN114022558B (en)
Inventor
陈帅
徐威
陈逸伦
陈玉康
刘枢
吕江波
沈小勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Simou Intelligent Technology Co ltd
Shenzhen Smartmore Technology Co Ltd
Original Assignee
Beijing Simou Intelligent Technology Co ltd
Shenzhen Smartmore Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Simou Intelligent Technology Co ltd and Shenzhen Smartmore Technology Co Ltd
Priority to CN202210003829.1A
Publication of CN114022558A
Priority to PCT/CN2022/109346 (published as WO2023130717A1)
Application granted
Publication of CN114022558B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06K GRAPHICAL DATA READING; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K7/00 Methods or arrangements for sensing record carriers, e.g. for reading patterns
    • G06K7/10 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation
    • G06K7/14 Methods or arrangements for sensing record carriers, e.g. for reading patterns by electromagnetic radiation, e.g. optical sensing; by corpuscular radiation using light without selection of wavelength, e.g. sensing reflected white light
    • G06K7/1404 Methods for optical code recognition
    • G06K7/1408 Methods for optical code recognition the method being specifically adapted for the type of code
    • G06K7/1413 1D bar codes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]


Abstract

The application relates to an image positioning method and apparatus, a computer device, a storage medium, and a computer program product. The method comprises: acquiring a one-dimensional code image to be recognized; inputting it into a target neural network, obtained by training a plurality of network branches of a neural network to be trained on sample one-dimensional code images; acquiring a plurality of target edge feature points and a target direction corresponding to the image, output by the target neural network through the plurality of network branches; and determining the position of the one-dimensional code image from the target edge feature points and the target direction. Compared with the conventional method of positioning a one-dimensional code with a simple rectangular box, the target neural network recognizes the one-dimensional code image with nine degrees of freedom, obtaining the corner points and the direction of the code and thereby improving positioning accuracy.

Description

Image positioning method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of machine learning technologies, and in particular, to an image positioning method, an image positioning apparatus, a computer device, a storage medium, and a computer program product.
Background
With the development of computer technology, mobile terminals such as mobile phones have become everyday tools, and a phone can recognize a one-dimensional code image to obtain the information it encodes. Recognizing a one-dimensional code image first requires locating it, and current methods select the region containing the code with a rectangular box. In a real scene, however, the printing material, the illumination, and the acquisition device cause the imaged code to vary in position, deformation, and other respects; positioning with a rectangular box alone also encloses interfering background information, so the positioning accuracy is low.
Current methods for positioning one-dimensional code images therefore suffer from low positioning accuracy.
Disclosure of Invention
In view of the above, it is necessary to provide an image positioning method, an apparatus, a computer device, a computer readable storage medium and a computer program product, which can improve positioning accuracy.
In a first aspect, the present application provides an image localization method, including:
acquiring a one-dimensional code image to be identified;
inputting the one-dimensional code image to be identified into a target neural network, and acquiring a plurality of target edge characteristic points and target directions corresponding to the one-dimensional code image to be identified, which are output by the target neural network through a plurality of network branches; the target neural network is obtained by training a plurality of network branches in the neural network to be trained according to the sample one-dimensional code image;
and determining the position of the one-dimensional code image to be identified according to the plurality of target edge characteristic points and the target direction.
In one embodiment, the plurality of network branches includes a classification branch, a regression branch, and a direction branch;
the method further comprises the following steps:
acquiring a sample one-dimensional code image, and acquiring the real coordinates and the real direction of sample edge feature points in the sample one-dimensional code image;
inputting the sample one-dimensional code image into a classification branch in a neural network to be trained, and acquiring a sample identification result of edge feature points in the sample one-dimensional code image, which is output by the classification branch;
inputting the sample identification result into a regression branch in the neural network to be trained, and obtaining sample coordinates of edge feature points in the sample one-dimensional code image output by the regression branch;
inputting the sample one-dimensional code image into a direction branch in the neural network to be trained, and acquiring the sample direction of the sample one-dimensional code image output by the direction branch;
constructing a first loss function according to the sample identification result and the real identification result corresponding to the sample one-dimensional code image, constructing a second loss function according to the sample coordinate and the real coordinate corresponding to the sample one-dimensional code image, and constructing a third loss function according to the sample direction and the real direction corresponding to the sample one-dimensional code image;
detecting whether the first loss function is less than or equal to a first threshold, whether the second loss function is less than or equal to a second threshold, and whether the third loss function is less than or equal to a third threshold;
if not, adjusting the neural network to be trained according to the first loss function, the second loss function and the third loss function, and returning to the step of inputting the sample one-dimensional code image into a classification branch in the neural network to be trained;
and if so, taking the current neural network to be trained as the target neural network.
In one embodiment, the constructing a first loss function according to the sample recognition result and the real recognition result corresponding to the sample one-dimensional code image includes:
and constructing a two-classification loss function as a first loss function according to the number of pixels in the sample one-dimensional code image, the matrix of the sample identification result and the real identification result.
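The claim does not give the formula for the two-classification loss; a common reading is a binary cross-entropy averaged over the number of pixels in the identification-result matrix. The sketch below makes that assumption, and the function name is illustrative, not from the patent.

```python
import numpy as np

def binary_classification_loss(pred, truth, eps=1e-7):
    # pred: predicted edge-feature-point probabilities (H x W matrix);
    # truth: 0/1 ground-truth matrix. The loss is the binary cross-entropy
    # averaged over the number of pixels in the sample image.
    pred = np.clip(pred, eps, 1.0 - eps)
    return float(-np.mean(truth * np.log(pred) + (1.0 - truth) * np.log(1.0 - pred)))
```

A confident, correct prediction drives the loss toward zero, while confidently wrong pixels are penalized heavily.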
In one embodiment, the constructing a second loss function according to the sample coordinates and the real coordinates corresponding to the sample one-dimensional code image includes:
and constructing a first smooth average absolute error loss function as a second loss function according to the sample coordinate, the real coordinate and a preset characteristic threshold.
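The "smooth average absolute error" loss is, under a common reading, the smooth-L1 (Huber-style) loss: quadratic below the preset feature threshold and linear above it. The sketch and default threshold below are assumptions, since the patent does not state the exact form.

```python
def smooth_l1(pred, target, beta=1.0):
    # Quadratic for small residuals (|d| < beta), linear for large ones,
    # so outlier coordinate errors do not dominate the gradient.
    d = abs(pred - target)
    return 0.5 * d * d / beta if d < beta else d - 0.5 * beta
```

The same form can serve for both the coordinate loss here and the direction loss of the next embodiment, with different thresholds.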
In one embodiment, the constructing a third loss function according to the sample direction and the real direction corresponding to the sample one-dimensional code image includes:
and constructing a second smooth average absolute error loss function as a third loss function according to the sample direction, the real direction and a preset direction threshold.
In one embodiment, the obtaining of a plurality of target edge feature points and target directions corresponding to the one-dimensional code image to be recognized, which are output by the target neural network through a plurality of network branches, includes:
acquiring target coordinates of a plurality of target edge feature points of the one-dimensional code image to be identified, which are output by the target neural network through a plurality of network branches;
and acquiring target angles of the one-dimensional code image to be identified output by the plurality of network branches compared with a horizontal line, and determining the target direction according to the target angles.
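Because a one-dimensional code reads the same when rotated 180°, the target angle relative to the horizontal can be normalized to the range [0°, 180°) when determining the target direction. The helper names below are illustrative; the normalization convention is an assumption consistent with the symmetry the description mentions.

```python
def direction_from_angle(angle_deg):
    # Map any predicted angle to the canonical [0, 180) range, since
    # orientations 180 degrees apart describe the same barcode direction.
    return angle_deg % 180.0

def angle_residual(pred_deg, true_deg):
    # Smallest difference between two directions under that symmetry,
    # e.g. 179 degrees and 1 degree are only 2 degrees apart.
    d = abs(direction_from_angle(pred_deg) - direction_from_angle(true_deg))
    return min(d, 180.0 - d)
```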
In one embodiment, the one-dimensional code image to be identified comprises a plurality of images; the target edge feature points comprise target edge feature points corresponding to a plurality of one-dimensional code images to be identified; the target directions comprise target directions corresponding to a plurality of one-dimensional code images to be identified;
the determining the position of the one-dimensional code image to be identified according to the plurality of target edge feature points and the target direction includes:
carrying out regression analysis on the plurality of target edge feature points to obtain a plurality of regression points; the regression points represent central points corresponding to the target edge feature points;
aiming at each regression point, acquiring a preset number of target edge feature points of which the distances from the regression points are smaller than a preset distance threshold, and determining target directions belonging to the same one-dimensional code image to be identified according to the preset number of target edge feature points;
and determining the position of the one-dimensional code image to be identified according to the preset number of target edge feature points and the target direction.
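The grouping step above — attaching to each regression (center) point the nearby target edge feature points that belong to the same code — can be sketched as follows. The preset number defaults to four corners per code; the function name and parameter names are assumptions.

```python
def group_corners(corners, regression_points, max_dist, count=4):
    # For each regression (center) point, keep the `count` nearest corner
    # points whose distance to the center is below the preset threshold:
    # these are taken to belong to the same one-dimensional code image.
    groups = []
    for cx, cy in regression_points:
        near = [p for p in corners
                if ((p[0] - cx) ** 2 + (p[1] - cy) ** 2) ** 0.5 < max_dist]
        near.sort(key=lambda p: (p[0] - cx) ** 2 + (p[1] - cy) ** 2)
        groups.append(near[:count])
    return groups
```

With two codes in one image, each center collects only its own four corners, which is what lets the method separate multiple codes.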
In a second aspect, the present application provides an image localization apparatus, the apparatus comprising:
the acquisition module is used for acquiring a one-dimensional code image to be identified;
the input module is used for inputting the one-dimensional code image to be identified into a target neural network, and acquiring a plurality of target edge characteristic points and target directions corresponding to the one-dimensional code image to be identified, which are output by the target neural network through a plurality of network branches; the target neural network is obtained by training a plurality of network branches in the neural network to be trained according to the sample one-dimensional code image;
and the positioning module is used for determining the position of the one-dimensional code image to be identified according to the plurality of target edge characteristic points and the target direction.
In a third aspect, the present application provides a computer device comprising a memory storing a computer program and a processor implementing the steps of the method described above when the processor executes the computer program.
In a fourth aspect, the present application provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the method described above.
According to the image positioning method and apparatus, the computer device, the storage medium and the computer program product, a one-dimensional code image to be recognized is acquired and input into a target neural network obtained by training a plurality of network branches of a neural network to be trained on sample one-dimensional code images; a plurality of target edge feature points and a target direction corresponding to the image, output by the target neural network through the plurality of network branches, are acquired; and the position of the one-dimensional code image is determined from the target edge feature points and the target direction. Compared with the conventional method of positioning a one-dimensional code with a simple rectangular box, the target neural network recognizes the one-dimensional code image with nine degrees of freedom, obtaining the corner points and the direction of the code and thereby improving positioning accuracy.
Drawings
FIG. 1 is a diagram of an application environment of the image positioning method in one embodiment;
FIG. 2 is a flow diagram illustrating an exemplary method for image localization;
FIG. 3 is a diagram illustrating edge feature points and directions in one embodiment;
FIG. 4 is a schematic flow diagram of neural network identification in one embodiment;
FIG. 5 is a diagram of one-dimensional code regression analysis in one embodiment;
FIG. 6 is a diagram illustrating one-dimensional code directions in one embodiment;
FIG. 7 is a block diagram showing the structure of an image locating apparatus according to an embodiment;
FIG. 8 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The image positioning method provided by the embodiment of the application can be applied to the application environment shown in fig. 1. The terminal 102 may be provided with an image acquisition device, and the terminal 102 may acquire the one-dimensional code image to be recognized through the image acquisition device, so that the terminal 102 may input the one-dimensional code image to be recognized into the target neural network, and acquire a plurality of target edge feature points and target directions in the one-dimensional code image to be recognized, which are output by the target neural network through a plurality of network branches, so that the terminal 102 may determine the position of the one-dimensional code image to be recognized based on the plurality of target edge feature points and the target directions. Additionally, in some embodiments, a server 104 is also included. Wherein the terminal 102 communicates with the server 104 via a network. The data storage system may store data that the server 104 needs to process. The data storage system may be integrated on the server 104, or may be located on the cloud or other network server. The terminal 102 may also upload the identified one-dimensional code image location information to the server 104 for storage. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The portable wearable device can be a smart watch, a smart bracelet, a head-mounted device, and the like. The server 104 may be implemented as a stand-alone server or as a server cluster comprised of multiple servers.
In one embodiment, as shown in fig. 2, an image positioning method is provided, which is described by taking the method as an example applied to the terminal in fig. 1, and includes the following steps:
step S202, acquiring a one-dimensional code image to be identified.
The one-dimensional code image to be recognized may be an image containing one-dimensional codes, and the image may contain one or more such codes. The terminal 102 may acquire the image and locate each one-dimensional code it contains, obtaining the position of the code within the image.
Step S204, inputting the one-dimensional code image to be identified into a target neural network, and acquiring a plurality of target edge characteristic points and target directions corresponding to the one-dimensional code image to be identified, which is output by the target neural network through a plurality of network branches; and the target neural network is obtained by training a plurality of network branches in the neural network to be trained according to the sample one-dimensional code image.
The target neural network may be a convolutional neural network. A plurality of network branches may be included in the target neural network. For example, the target network branches may include classification branches, regression branches, and direction branches, among others. The terminal 102 may input the one-dimensional code image to be recognized into the target neural network, and the target neural network may perform multi-dimensional recognition on the one-dimensional code image to be recognized based on a plurality of network branches thereof, and output a target edge feature point corresponding to the one-dimensional code image to be recognized and a target direction of the one-dimensional code image to be recognized. The classification branch can be a network branch used for identifying whether a point in the one-dimensional code image to be identified is a target edge feature point; the regression branch can be a network branch used for identifying the coordinates of target edge feature points in the one-dimensional code image to be identified and the distances between a plurality of target edge feature points and the central point of the one-dimensional code where the target edge feature points are located; the direction branch may be a network branch for identifying the direction of the one-dimensional code in the one-dimensional code image to be identified.
Specifically, in one embodiment, obtaining a plurality of target edge feature points and target directions corresponding to the one-dimensional code image to be recognized, which are output by the target neural network through a plurality of network branches, includes: acquiring target coordinates of a plurality of target edge feature points of the one-dimensional code image to be identified, output by the target neural network through the plurality of network branches; and acquiring a target angle of the one-dimensional code image relative to the horizontal, output by the plurality of network branches, and determining the target direction from that angle. In this embodiment, the target edge feature points output by the target neural network may include their target coordinates, and the target direction may be given as the target angle of the one-dimensional code to be recognized. There may be multiple target edge feature points: the terminal 102 may obtain the target coordinates of each of them, together with the target angle of the image relative to the horizontal. Since the directions of a one-dimensional code are symmetrical, the terminal 102 may determine the target direction of the image from this angle.
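The patent does not specify how the branch outputs are decoded into coordinates; one common scheme (an assumption here) thresholds the classification heatmap and refines each hit with the per-pixel offsets from the regression branch:

```python
import numpy as np

def decode_corners(scores, offsets, thresh=0.5):
    # scores:  H x W heatmap from the classification branch;
    # offsets: H x W x 2 (dx, dy) map from the regression branch.
    # Pixels above the threshold are candidate corner points; their
    # coordinates are refined by the regressed sub-pixel offsets.
    ys, xs = np.nonzero(scores > thresh)
    return [(float(x + offsets[y, x, 0]), float(y + offsets[y, x, 1]))
            for y, x in zip(ys, xs)]
```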
The target neural network may be a network obtained by training a plurality of network branches in the neural network to be trained based on the sample one-dimensional code image. The sample one-dimensional code image may be an image of which position information of the one-dimensional code is known, that is, the sample one-dimensional code image may include position information of a known sample edge feature point, and information such as a sample direction of the sample one-dimensional code. The terminal 102 may train the neural network to be trained based on the sample one-dimensional code image, thereby obtaining a target neural network, and perform one-dimensional code positioning identification based on nine degrees of freedom on the one-dimensional code image to be identified.
And step S206, determining the position of the one-dimensional code image to be identified according to the plurality of target edge characteristic points and the target direction.
The terminal 102 may obtain a plurality of target edge feature points and target directions in the one-dimensional code image to be identified output by the target neural network. For example, the terminal 102 may obtain a target coordinate of the target edge feature point in the image of the one-dimensional code to be recognized and a target angle of the one-dimensional code to be recognized in the image, so that the terminal 102 may determine the position information, the shape information, and the like of the one-dimensional code to be recognized in the image by using a plurality of the target coordinates and the target angles.
The identification process of the one-dimensional code image to be identified may be a detection and identification process based on nine degrees of freedom; the final result may be as shown in fig. 3, which is a schematic diagram of edge feature points and directions in one embodiment. The target edge feature points may be the corner points of the one-dimensional code image; the image has four corner points, and the terminal 102 may identify their coordinates in the image through the target neural network as the target edge feature points. Since the directions of the one-dimensional code image are symmetrical, the terminal 102 may also determine, based on the angle information of the one-dimensional code to be recognized, its direction in the image as the target direction. Let the coordinates of the four corner points be (x1, y1), (x2, y2), (x3, y3) and (x4, y4), and let the target direction be α; the terminal 102 may then determine the position of the one-dimensional code in the image from the nine-degree-of-freedom information (x1, y1, x2, y2, x3, y3, x4, y4, α) output by the target neural network.
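The nine-degree-of-freedom descriptor above can be packed from the four corner points and the angle as follows; the helper names are illustrative, and the center computation is an assumption consistent with the regression branch's center-distance output.

```python
def pack_nine_dof(corners, alpha):
    # corners: [(x1, y1), (x2, y2), (x3, y3), (x4, y4)]; alpha: direction.
    # Returns the nine-tuple (x1, y1, ..., x4, y4, alpha).
    assert len(corners) == 4
    return tuple(v for pt in corners for v in pt) + (alpha,)

def code_center(corners):
    # Mean of the four corners: the center point the regression
    # branch measures distances against.
    xs = [p[0] for p in corners]
    ys = [p[1] for p in corners]
    return (sum(xs) / 4.0, sum(ys) / 4.0)
```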
In the image positioning method, the one-dimensional code image to be recognized is obtained, the one-dimensional code image to be recognized is input into a target neural network obtained by training a sample one-dimensional code image to a plurality of network branches in a neural network to be trained, a plurality of target edge feature points and target directions corresponding to the one-dimensional code image to be recognized, which is output by the target neural network through the plurality of network branches, are obtained, and the position of the one-dimensional code image to be recognized is determined according to the plurality of target edge feature points and the target directions. Compared with the traditional method for positioning the one-dimensional code based on a simple rectangular frame, the method has the advantages that the target neural network is utilized to identify the one-dimensional code image to be identified based on nine degrees of freedom, so that the information of a plurality of angular points and directions of the one-dimensional code image is obtained, and the effect of improving the positioning accuracy of the one-dimensional code image is realized.
In one embodiment, further comprising: acquiring a sample one-dimensional code image, and acquiring the real coordinates and the real direction of sample edge characteristic points in the sample one-dimensional code image; inputting the sample one-dimensional code image into a classification branch in a neural network to be trained, and acquiring a sample identification result of edge feature points in the sample one-dimensional code image, which is output by the classification branch; inputting the sample identification result into a regression branch in the neural network to be trained, and acquiring sample coordinates of edge feature points in the sample one-dimensional code image output by the regression branch; inputting the sample one-dimensional code image into a direction branch in the neural network to be trained, and acquiring the sample direction of the sample one-dimensional code image output by the direction branch; constructing a first loss function according to the sample identification result and the real identification result corresponding to the sample one-dimensional code image, constructing a second loss function according to the sample coordinates and the real coordinates corresponding to the sample one-dimensional code image, and constructing a third loss function according to the sample direction and the real direction corresponding to the sample one-dimensional code image; detecting whether the first loss function is less than or equal to a first threshold, whether the second loss function is less than or equal to a second threshold, and whether the third loss function is less than or equal to a third threshold; if not, adjusting the neural network to be trained according to the first loss function, the second loss function and the third loss function, and returning to the step of inputting the sample one-dimensional code image into the classification branch in the neural network to be trained; and if so, taking the current neural network to be trained as the target neural network.
In this embodiment, the terminal 102 may train the classification branch, the regression branch, and the direction branch in the neural network to be trained based on the sample one-dimensional code image to obtain the target neural network. The flow may be as shown in fig. 4, which is a flow diagram of neural network identification in one embodiment. The terminal 102 may acquire a sample one-dimensional code image, input it into the neural network to be trained, and identify the one-dimensional code in the sample image by using the classification branch (cls), the regression branch (reg), and the direction branch (ori) of the network, thereby training each network branch.
The training of each network branch on the sample one-dimensional code image may be an independent process. The terminal 102 may obtain the real coordinates of the sample edge feature points in the sample one-dimensional code image and the real direction of the sample one-dimensional code as the ground-truth labels for the neural network to be trained. The terminal 102 may input the sample one-dimensional code image into the neural network to be trained, which outputs through the classification branch the sample identification result of the edge feature points in the image. For example, the terminal 102 may use the classification branch to judge whether a point in the sample image is an edge feature point of the one-dimensional code; that is, the classification branch is a judging branch. The terminal 102 may further input the sample identification result into the regression branch and obtain the sample coordinates of the edge feature points output by that branch, thereby identifying the coordinates of the one-dimensional code's edge feature points; based on these coordinates, the regression branch may also obtain the distance from each point to the center of the one-dimensional code it belongs to. The terminal 102 may further input the sample one-dimensional code image into the direction branch and obtain the sample direction of the image output by that branch.
After obtaining the outputs of the network branches in the neural network to be trained, the terminal 102 may determine whether training is complete based on loss functions. The terminal 102 may construct a first loss function from the sample identification result of the edge feature points and the real identification result corresponding to the sample one-dimensional code image, a second loss function from the sample coordinates and the real coordinates corresponding to the sample one-dimensional code image, and a third loss function from the sample direction and the real direction corresponding to the sample one-dimensional code image. The three loss functions may be checked independently: the terminal 102 detects whether the first loss function is less than or equal to a first threshold, whether the second loss function is less than or equal to a second threshold, and whether the third loss function is less than or equal to a third threshold, obtaining a plurality of detection results. If the checks fail, that is, the first loss function is greater than the first threshold, the second loss function is greater than the second threshold, and the third loss function is greater than the third threshold, the terminal 102 may adjust the neural network to be trained according to the first, second, and third loss functions and return to the step of inputting the sample one-dimensional code image into the classification branch of the neural network to be trained, so that the network with adjusted weight parameters undergoes the next round of training.
If all three checks pass, that is, the terminal 102 detects that the first loss function is less than or equal to the first threshold, the second loss function is less than or equal to the second threshold, and the third loss function is less than or equal to the third threshold, it may be determined that training of each network branch in the neural network to be trained is complete, so the terminal 102 may end the loop and take the current neural network as the target neural network.
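The threshold check that ends the training loop can be sketched as follows; the function names and the placeholder loss values are illustrative assumptions, not part of the embodiment:

```python
def all_branches_converged(losses, thresholds):
    """True only when every branch loss is at or below its own threshold."""
    return all(l <= t for l, t in zip(losses, thresholds))

def train_until_converged(step_fn, thresholds, max_iters=1000):
    """Repeat training steps until the first, second, and third loss checks all pass."""
    for _ in range(max_iters):
        losses = step_fn()  # one training pass returning the three branch losses
        if all_branches_converged(losses, thresholds):
            return losses   # the current network would become the target network
    raise RuntimeError("did not converge within max_iters")
```

Note that, as in the embodiment, a round counts as converged only when every branch passes its own threshold; failing any single check triggers another adjustment of the weight parameters.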
The output of each network branch may be expressed as a matrix. For example, for the feature map x of the deep network, i.e. the sample one-dimensional code image, with height h and width w, the output of the classification branch for one map is 1x2xhxw, where the 2 channels are the two feature outputs, corner point and non-corner point; the output of the regression branch is 1x8xhxw, where the 8 channels are the abscissas and ordinates of the four corner points of the one-dimensional code; and the output of the direction branch is 1x1xhxw, where the single channel is the angle of the one-dimensional code. Based on these branch outputs, the terminal 102 finally obtains a nine-degree-of-freedom result for the one-dimensional code image: (x1, y1, x2, y2, x3, y3, x4, y4, α).
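Reading the nine-degree-of-freedom result out of the three branch outputs can be sketched with NumPy; the single-code case (one dominant corner pixel) and the helper name `decode_nine_dof` are assumptions for illustration:

```python
import numpy as np

def decode_nine_dof(cls_out, reg_out, ori_out):
    """Pick the strongest corner pixel and read the 9-DOF tuple at that position.

    cls_out: (1, 2, h, w) corner / non-corner scores
    reg_out: (1, 8, h, w) x1..y4 corner coordinates per pixel
    ori_out: (1, 1, h, w) barcode angle per pixel
    """
    corner_map = cls_out[0, 1]                                  # corner channel
    r, c = np.unravel_index(np.argmax(corner_map), corner_map.shape)
    coords = reg_out[0, :, r, c]                                # x1, y1, ..., x4, y4
    alpha = float(ori_out[0, 0, r, c])
    return tuple(float(v) for v in coords) + (alpha,)
```

With multiple codes in one image, the clustering step described later would replace this single-argmax readout.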
Through the embodiment, the terminal 102 can perform loss function-based training on a plurality of network branches in the neural network to be trained based on the sample one-dimensional code image, so that the terminal 102 can perform positioning monitoring on the one-dimensional code image by using the trained target neural network, and the accuracy of positioning the one-dimensional code image is improved.
In one embodiment, constructing the first loss function according to the sample identification result and the real identification result corresponding to the sample one-dimensional code image includes: constructing a two-classification loss function as the first loss function according to the number of pixels in the sample one-dimensional code image, the matrix of the sample identification result, and the real identification result.
In this embodiment, the terminal 102 may construct the first loss function corresponding to the classification branch. The sample identification result may be stored in matrix form. The terminal 102 may obtain the number of pixels contained in the sample one-dimensional code image and construct a two-classification loss function as the first loss function based on the number of pixels, the matrix of the sample identification result, and the real identification result. The edge feature points in the identification result may be the corner points of the one-dimensional code image, and the classification branch may be the network branch that identifies whether a pixel point in the one-dimensional code image is a corner point. For example, the task of the classification branch is to detect the positions of the corner points in the deeply learned feature map (x_n), i.e. the sample one-dimensional code image, and the loss function is constructed based on BCE loss (binary cross-entropy loss), whose formula is as follows:
l(x, y) = -(1/N) * Σ_{n=1}^{N} w * [ y_n * log(x_n) + (1 - y_n) * log(1 - x_n) ]
(ii) a Wherein l (x, y) represents the value of the loss function, N represents the number of pixels, y represents the true value of the label, i.e. the true identification result of the edge feature point in the sample one-dimensional code image, w is a preset weight, which may also be referred to as a hyper-parameter, and the value thereof may be set according to the actual situation, and may be 1, for example.
Through the embodiment, the terminal 102 can construct the first loss function corresponding to the classification branch based on the two classification loss functions, so that the terminal 102 can train the neural network to be trained based on the first loss function, and the accuracy of recognizing the one-dimensional code image is further improved.
In one embodiment, constructing the second loss function according to the sample coordinates and the real coordinates corresponding to the sample one-dimensional code image includes: and constructing a first smooth average absolute error loss function as a second loss function according to the sample coordinate, the real coordinate and a preset characteristic threshold.
In this embodiment, the terminal 102 may construct the second loss function corresponding to the regression branch. The terminal 102 may obtain a preset feature threshold, which may be set according to the actual situation, and construct a first smooth average absolute error loss function as the second loss function based on the sample coordinates, the real coordinates, and the preset feature threshold. The sample coordinates and the real coordinates may be the coordinates of the corner points in the sample one-dimensional code image. The edge feature points in the identification result may be the corner points of the one-dimensional code image, and the regression branch may be the network branch that identifies the corner coordinates. For example, the terminal 102 may use a smooth L1 loss (smooth mean absolute error loss) function as the second loss function, whose formula may be as follows:
L(x) = 0.5 * x^2 / β, if |x| < β; L(x) = |x| - 0.5 * β, otherwise
where L is the value of the loss function, x is the feature value of each point in the sample one-dimensional code image (the deviation between the sample coordinate and the real coordinate), and β is the preset feature threshold.
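A sketch of this smooth L1 loss in NumPy, with the preset feature threshold as the beta parameter; the exact parameterisation used in the embodiment is not given, so the standard form is assumed:

```python
import numpy as np

def smooth_l1_loss(pred, target, beta=1.0):
    """Quadratic near zero, linear for large errors; beta is the preset threshold."""
    diff = np.abs(np.asarray(pred, float) - np.asarray(target, float))
    per_elem = np.where(diff < beta,
                        0.5 * diff ** 2 / beta,   # smooth (quadratic) region
                        diff - 0.5 * beta)        # linear region
    return float(per_elem.mean())
```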
In addition, the terminal 102 may further determine, through a regression branch, a distance between the corner point and a center point of the one-dimensional code where the corner point is located, where a schematic diagram of the distance is shown in fig. 5, and fig. 5 is a schematic diagram of regression analysis of the one-dimensional code in one embodiment. The terminal 102 may first identify the corner points in the one-dimensional code through the regression branch, and obtain a distance from each corner point to a central point in the one-dimensional code where the corner point is located through the regression branch.
Through the embodiment, the terminal 102 may construct the second loss function corresponding to the regression branch based on the smooth average absolute error loss function, so that the terminal 102 may train the neural network to be trained based on the second loss function, thereby improving the accuracy of identifying the one-dimensional code image.
In one embodiment, constructing the third loss function according to the sample direction and the real direction corresponding to the sample one-dimensional code image includes: constructing a second smooth average absolute error loss function as the third loss function according to the sample direction, the real direction, and a preset direction threshold.
In this embodiment, the terminal 102 may construct the third loss function corresponding to the direction branch. The terminal 102 may construct a second smooth average absolute error loss function as the third loss function based on the sample direction obtained through training, the real direction corresponding to the sample one-dimensional code image, and a preset direction threshold. The sample direction and the real direction may be expressed as the angle of the one-dimensional code relative to the horizontal line. For example, as shown in fig. 6, which is a schematic diagram of the direction of a one-dimensional code in one embodiment: since the one-dimensional code is symmetrical, its direction only needs to cover the angular range (0, 180) degrees. The terminal 102 may construct a smooth L1 loss based on the angle corresponding to the sample direction, the angle corresponding to the real direction, and the preset direction threshold, thereby obtaining the third loss function.
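Because the code reads the same in both directions, an angle and that angle plus 180 degrees are equivalent; a hedged sketch of folding angles into (0, 180) and measuring the symmetric angular error (the helper names are illustrative):

```python
def canonical_angle(angle_deg):
    """Fold any angle into the [0, 180) range used by the direction branch."""
    return angle_deg % 180.0

def angular_error(pred_deg, true_deg):
    """Smallest difference between two barcode directions under 180-degree symmetry."""
    d = abs(canonical_angle(pred_deg) - canonical_angle(true_deg))
    return min(d, 180.0 - d)
```

The smooth L1 loss of the previous section would then be applied to this angular error rather than to the raw angle difference.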
Through the embodiment, the terminal 102 may construct the third loss function corresponding to the direction branch based on the smooth average absolute error loss function, so that the terminal 102 may train the neural network to be trained based on the third loss function, and further, the accuracy of recognizing the one-dimensional code image is improved.
In one embodiment, determining the position of the one-dimensional code image to be recognized according to a plurality of target edge feature points and a target direction includes: carrying out regression analysis on the plurality of target edge feature points to obtain a plurality of regression points; the regression points represent central points corresponding to the multiple target edge characteristic points; aiming at each regression point, acquiring a preset number of target edge feature points of which the distances from the regression points are smaller than a preset distance threshold, and determining target directions belonging to the same one-dimensional code image to be identified according to the preset number of target edge feature points; and determining the position of the one-dimensional code image to be identified according to the preset number of target edge feature points and the target direction.
In this embodiment, the one-dimensional code image to be identified includes a plurality of images; the target edge feature points comprise target edge feature points corresponding to the one-dimensional code images to be identified; the target direction comprises a plurality of target directions corresponding to the one-dimensional code images to be recognized. The terminal 102 may perform regression analysis on the plurality of target edge feature points to obtain a plurality of regression points, that is, the central points corresponding to the plurality of target edge feature points. The number of target edge feature points for obtaining the center point may be set according to actual conditions, and may be 4, for example. For each regression point, the terminal 102 may obtain a preset number of target edge feature points whose distances from the regression point are smaller than a preset distance threshold, so that the terminal 102 may determine, according to the preset number of target edge feature points, target directions belonging to the same one-dimensional code image to be recognized. And the terminal 102 may determine the position of the one-dimensional code image to be recognized based on the preset number of target edge feature points and the target direction.
The target edge feature points may be the corner points of one-dimensional codes. Since the one-dimensional code image may contain a plurality of one-dimensional codes, after recognizing the corner points through the target neural network, the terminal 102 may cluster them to determine which corner points belong to the same one-dimensional code. For example, one one-dimensional code is represented by one nine-degree-of-freedom tuple (x1, y1, x2, y2, x3, y3, x4, y4, α). When there is only one code in a picture, no complex aggregation strategy is needed; but when there are multiple codes in the scene, it is necessary to distinguish which code each detected key point belongs to. The terminal 102 may perform this detection through the regression branch: if the four centripetal vectors predicted by key-point regression all point to the same central region, the four key points belong to the same one-dimensional code. Specifically, the algorithm may be as follows:
inputting: feature diagram of classification branch output, feature diagram C of regression branch output, and feature diagram O of direction branch output of one-dimensional code
M, wherein M comprises M (x1, y1, x2, y2, x3, y3, x4, y4, alpha) pieces of one-dimensional code information
1. The terminal 102 performs maximum suppression on C to obtain an index, and searches corresponding positions in the C and O graphs through the index to obtain k key points (x, y), regression points (j, k) and directions (O);
2. terminal setting M = [ ]
3.For i->k:
4. The terminal 102 obtains the regression point (j _ i, k _ i)
5. If the point is already in M then exit and the next point calculation is performed.
6. If not in M, the terminal 102 calculates to obtain 4 points around the point satisfying a distance less than a preset distance threshold (x1_ i, y1_ i, x2_ i, y2_ i, x3_ i, y3_ i, x4_ i, y4_ i, α _ i)
7. The terminal 102 adds (x1_ i, y1_ i, x2_ i, y2_ i, x3_ i, y3_ i, x4_ i, y4_ i, α _ i) to M.
8.Return M
The terminal 102 may then cluster the one-dimensional codes in the one-dimensional code image based on the above algorithm to locate the position information of the plurality of one-dimensional codes in the image.
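The grouping strategy in the listing above can be sketched in Python; the dictionary layout of the detections and the averaging of the per-corner angles into α are illustrative assumptions:

```python
import math

def group_barcodes(detections, dist_thresh=5.0):
    """Collect corner key points whose regression (centre) points coincide.

    detections: list of {'pt': (x, y), 'center': (j, k), 'angle': o}.
    Four corners predicting (almost) the same centre are taken to belong to
    one barcode and yield one (x1, y1, ..., x4, y4, alpha) record.
    """
    results, used_centers = [], []
    for det in detections:
        cx, cy = det['center']
        # step 5 of the listing: skip centres already collected into M
        if any(math.hypot(cx - ux, cy - uy) < dist_thresh
               for ux, uy in used_centers):
            continue
        # step 6: gather the points whose centres fall within the threshold
        group = [d for d in detections
                 if math.hypot(d['center'][0] - cx, d['center'][1] - cy) < dist_thresh]
        if len(group) == 4:                       # a complete set of four corners
            coords = tuple(c for d in group for c in d['pt'])
            alpha = sum(d['angle'] for d in group) / 4.0
            results.append(coords + (alpha,))     # step 7: add to M
            used_centers.append((cx, cy))
    return results
```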
Through the embodiment, the terminal 102 can identify the plurality of one-dimensional codes contained in a one-dimensional code image based on the clustering algorithm, thereby improving the identification accuracy of the one-dimensional codes. In addition, the neural-network-based nine-degree-of-freedom one-dimensional code positioning predicts the four corner coordinates and the bar code direction of the one-dimensional code in the image through a convolutional neural network, and finally outputs the corner position information and the bar code direction. Further combining this with a traditional decoding algorithm improves decoding speed and precision, reduces the influence of interference such as illumination and deformation on the algorithm, and improves the robustness of the overall algorithm.
It should be understood that, although the steps in the flowcharts of the above embodiments are displayed sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, these steps are not strictly ordered and may be performed in other orders. Moreover, at least some of the steps in the flowcharts of the above embodiments may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and their execution order is not necessarily sequential; they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps.
Based on the same inventive concept, the embodiment of the present application further provides an image positioning apparatus for implementing the image positioning method. The implementation scheme for solving the problem provided by the device is similar to the implementation scheme described in the above method, so the specific limitations in one or more embodiments of the image positioning device provided below can be referred to the limitations of the image positioning method in the above, and are not described herein again.
In one embodiment, as shown in fig. 7, there is provided an image localization apparatus including: an acquisition module 500, an input module 502, and a positioning module 504, wherein:
an obtaining module 500, configured to obtain a one-dimensional code image to be identified.
An input module 502, configured to input the one-dimensional code image to be recognized into a target neural network, and obtain a plurality of target edge feature points and target directions corresponding to the one-dimensional code image to be recognized, which is output by the target neural network through a plurality of network branches; and the target neural network is obtained by training a plurality of network branches in the neural network to be trained according to the sample one-dimensional code image.
And the positioning module 504 is configured to determine the position of the one-dimensional code image to be identified according to the plurality of target edge feature points and the target direction.
In one embodiment, the above apparatus further comprises: the training module is used for acquiring a sample one-dimensional code image and acquiring the real coordinates and the real direction of sample edge feature points in the sample one-dimensional code image; inputting the sample one-dimensional code image into a classification branch in a neural network to be trained, and acquiring a sample identification result of edge feature points in the sample one-dimensional code image, which is output by the classification branch; inputting the sample identification result into a regression branch in the neural network to be trained, and acquiring sample coordinates of edge feature points in the sample one-dimensional code image output by the regression branch; inputting the sample one-dimensional code image into a direction branch in the neural network to be trained, and acquiring the sample direction of the sample one-dimensional code image output by the direction branch; constructing a first loss function according to the sample identification result and the real identification result corresponding to the sample one-dimensional code image, constructing a second loss function according to the sample coordinates and the real coordinates corresponding to the sample one-dimensional code image, and constructing a third loss function according to the sample direction and the real direction corresponding to the sample one-dimensional code image; detecting whether the first loss function is less than or equal to a first threshold, whether the second loss function is less than or equal to a second threshold, and whether the third loss function is less than or equal to a third threshold; if not, adjusting the neural network to be trained according to the first loss function, the second loss function, and the third loss function, and returning to the step of inputting the sample one-dimensional code image into the classification branch in the neural network to be trained; and if so, taking the current neural network to be trained as the target neural network.
In one embodiment, the above apparatus further comprises: and the training module is used for constructing a two-classification loss function as a first loss function according to the number of pixels in the sample one-dimensional code image, the matrix of the sample identification result and the real identification result.
In one embodiment, the above apparatus further comprises: and the training module is used for constructing a first smooth average absolute error loss function as a second loss function according to the sample coordinate, the real coordinate and the preset characteristic threshold.
In one embodiment, the above apparatus further comprises: and the training module is used for constructing a second smooth average absolute error loss function as a third loss function according to the sample direction, the real direction and the preset direction threshold.
In an embodiment, the input module 502 is specifically configured to obtain target coordinates of a plurality of target edge feature points of a to-be-identified one-dimensional code output by a target neural network through a plurality of network branches; and acquiring target angles of the one-dimensional code images to be identified output by the plurality of network branches compared with a horizontal line, and determining a target direction according to the target angles.
In an embodiment, the positioning module 504 is specifically configured to perform regression analysis on a plurality of target edge feature points to obtain a plurality of regression points; the regression points represent central points corresponding to the multiple target edge characteristic points; aiming at each regression point, acquiring a preset number of target edge feature points of which the distances from the regression points are smaller than a preset distance threshold, and determining target directions belonging to the same one-dimensional code image to be identified according to the preset number of target edge feature points; and determining the position of the one-dimensional code image to be identified according to the preset number of target edge feature points and the target direction.
The modules in the image positioning device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a terminal, and its internal structure diagram may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, a mobile cellular network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement an image localization method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the image localization method described above when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, is adapted to carry out the above-mentioned image localization method.
In an embodiment, a computer program product is provided, comprising a computer program which, when being executed by a processor, carries out the image localization method as described above.
It should be noted that, the user information (including but not limited to user device information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. The non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetic Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory can include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the various embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, data processing logic devices based on quantum computing, etc., without limitation.
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the present application. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

1. An image localization method, characterized in that the method comprises:
acquiring a one-dimensional code image to be identified;
inputting the one-dimensional code image to be identified into a target neural network, and acquiring a plurality of target edge characteristic points and target directions corresponding to the one-dimensional code image to be identified, which are output by the target neural network through a plurality of network branches; the target neural network is obtained by training a plurality of network branches in the neural network to be trained according to the sample one-dimensional code image;
and determining the position of the one-dimensional code image to be identified according to the plurality of target edge characteristic points and the target direction.
2. The method of claim 1, wherein the plurality of network branches comprises a classification branch, a regression branch, and a direction branch;
the method further comprises the following steps:
acquiring a sample one-dimensional code image, and acquiring the real coordinates and the real direction of sample edge feature points in the sample one-dimensional code image;
inputting the sample one-dimensional code image into a classification branch in a neural network to be trained, and acquiring a sample identification result of edge feature points in the sample one-dimensional code image, which is output by the classification branch;
inputting the sample identification result into a regression branch in the neural network to be trained, and obtaining sample coordinates of edge feature points in the sample one-dimensional code image output by the regression branch;
inputting the sample one-dimensional code image into a direction branch in the neural network to be trained, and acquiring the sample direction of the sample one-dimensional code image output by the direction branch;
constructing a first loss function according to the sample identification result and the real identification result corresponding to the sample one-dimensional code image, constructing a second loss function according to the sample coordinate and the real coordinate corresponding to the sample one-dimensional code image, and constructing a third loss function according to the sample direction and the real direction corresponding to the sample one-dimensional code image;
detecting whether the first loss function is less than or equal to a first threshold, whether the second loss function is less than or equal to a second threshold, and whether the third loss function is less than or equal to a third threshold;
if not, adjusting the neural network to be trained according to the first loss function, the second loss function and the third loss function, and returning to the step of inputting the sample one-dimensional code image into a classification branch in the neural network to be trained;
and if so, taking the current neural network to be trained as the target neural network.
3. The method of claim 2, wherein the constructing a first loss function according to the sample identification result and the real identification result corresponding to the sample one-dimensional code image comprises:
and constructing a two-classification loss function as a first loss function according to the number of pixels in the sample one-dimensional code image, the matrix of the sample identification result and the real identification result.
4. The method of claim 2, wherein the constructing a second loss function according to the sample coordinates and the real coordinates corresponding to the sample one-dimensional code image comprises:
and constructing a first smooth average absolute error loss function as a second loss function according to the sample coordinate, the real coordinate and a preset characteristic threshold.
5. The method of claim 2, wherein the constructing a third loss function according to the sample direction and the real direction corresponding to the sample one-dimensional code image comprises:
and constructing a second smoothed mean absolute error loss function as the third loss function according to the sample direction, the real direction and a preset direction threshold.
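The "smoothed mean absolute error" losses of claims 4 and 5 are commonly realised as a smooth-L1 (Huber-style) loss: quadratic while the error is below the preset threshold, linear beyond it. The exact form used by the patent is an assumption; this sketch shows the standard variant with the threshold as `beta`:

```python
import numpy as np

def smooth_mae_loss(pred, target, beta=1.0):
    """Smooth-L1 loss: quadratic for |error| < beta, linear beyond it.
    `beta` plays the role of the preset feature/direction threshold."""
    diff = np.abs(pred - target)
    loss = np.where(diff < beta,
                    0.5 * diff ** 2 / beta,   # small errors: quadratic
                    diff - 0.5 * beta)        # large errors: linear
    return loss.mean()

coords_pred = np.array([10.2, 20.0, 33.0])    # hypothetical sample coordinates
coords_true = np.array([10.0, 20.5, 30.0])    # hypothetical real coordinates
print(round(smooth_mae_loss(coords_pred, coords_true, beta=1.0), 4))  # → 0.8817
```

The same function applies unchanged to the direction branch of claim 5, with angles in place of coordinates and the preset direction threshold as `beta`.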
6. The method according to claim 1, wherein the acquiring of the plurality of target edge feature points and the target direction corresponding to the one-dimensional code image to be identified, output by the target neural network through a plurality of network branches, comprises:
acquiring target coordinates of a plurality of target edge feature points of the one-dimensional code image to be identified, output by the target neural network through the plurality of network branches;
and acquiring target angles of the one-dimensional code image to be identified relative to a horizontal line, output by the plurality of network branches, and determining the target direction according to the target angles.
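For illustration, the target direction of claim 6 can be derived from the predicted angle relative to the horizontal; a sketch converting an angle in degrees into a unit direction vector (this vector representation is an assumption, not specified by the claim):

```python
import math

def direction_from_angle(angle_deg):
    """Unit direction vector for a barcode rotated `angle_deg` degrees
    counter-clockwise from the horizontal."""
    rad = math.radians(angle_deg)
    return (math.cos(rad), math.sin(rad))

dx, dy = direction_from_angle(30.0)           # hypothetical predicted angle
print(round(dx, 3), round(dy, 3))             # → 0.866 0.5
```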
7. The method according to claim 1, wherein the one-dimensional code image to be identified comprises a plurality of one-dimensional code images to be identified; the target edge feature points comprise target edge feature points corresponding to the plurality of one-dimensional code images to be identified; and the target directions comprise target directions corresponding to the plurality of one-dimensional code images to be identified;
the determining the position of the one-dimensional code image to be identified according to the plurality of target edge feature points and the target direction includes:
carrying out regression analysis on the plurality of target edge feature points to obtain a plurality of regression points; the regression points represent central points corresponding to the target edge feature points;
for each regression point, acquiring a preset number of target edge feature points whose distances from the regression point are smaller than a preset distance threshold, and determining target directions belonging to the same one-dimensional code image to be identified according to the preset number of target edge feature points;
and determining the position of the one-dimensional code image to be identified according to the preset number of target edge feature points and the target direction.
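The grouping step of claim 7 — collecting, for each regression (centre) point, a preset number of edge feature points whose distance to that centre is below a threshold — can be sketched with NumPy. The nearest-first selection and all numeric values are assumptions for illustration:

```python
import numpy as np

def group_edge_points(edge_points, centers, max_dist, k):
    """For each centre point, return up to `k` edge feature points whose
    distance to the centre is below `max_dist`, nearest first."""
    groups = []
    for c in centers:
        d = np.linalg.norm(edge_points - c, axis=1)       # distances to this centre
        order = np.argsort(d)                             # nearest first
        near = [i for i in order if d[i] < max_dist][:k]  # keep k within threshold
        groups.append(edge_points[near])
    return groups

# Three edge points near one regression point, plus one outlier far away.
edges = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [9.0, 9.0]])
centers = np.array([[0.5, 0.5]])                          # one regression point
groups = group_edge_points(edges, centers, max_dist=2.0, k=3)
print(len(groups[0]))  # → 3 (the far point at (9, 9) is excluded)
```

Points grouped to the same centre are then treated as belonging to the same one-dimensional code image, whose position and direction are determined from that group.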
8. An image positioning device, characterized in that the device comprises:
the acquisition module is used for acquiring a one-dimensional code image to be identified;
the input module is used for inputting the one-dimensional code image to be identified into a target neural network, and acquiring a plurality of target edge characteristic points and target directions corresponding to the one-dimensional code image to be identified, which is output by the target neural network through a plurality of network branches; the target neural network is obtained by training a plurality of network branches in the neural network to be trained according to the sample one-dimensional code image;
and the positioning module is used for determining the position of the one-dimensional code image to be identified according to the plurality of target edge characteristic points and the target direction.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
CN202210003829.1A 2022-01-05 2022-01-05 Image positioning method, image positioning device, computer equipment and storage medium Active CN114022558B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202210003829.1A CN114022558B (en) 2022-01-05 2022-01-05 Image positioning method, image positioning device, computer equipment and storage medium
PCT/CN2022/109346 WO2023130717A1 (en) 2022-01-05 2022-08-01 Image positioning method and apparatus, computer device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210003829.1A CN114022558B (en) 2022-01-05 2022-01-05 Image positioning method, image positioning device, computer equipment and storage medium

Publications (2)

Publication Number Publication Date
CN114022558A true CN114022558A (en) 2022-02-08
CN114022558B CN114022558B (en) 2022-08-26

Family

ID=80069459

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210003829.1A Active CN114022558B (en) 2022-01-05 2022-01-05 Image positioning method, image positioning device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN114022558B (en)
WO (1) WO2023130717A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114549857A (en) * 2022-04-25 2022-05-27 深圳思谋信息科技有限公司 Image information identification method and device, computer equipment and storage medium
CN115577728A (en) * 2022-12-07 2023-01-06 深圳思谋信息科技有限公司 One-dimensional code positioning method, device, computer equipment and storage medium
CN115578606A (en) * 2022-12-07 2023-01-06 深圳思谋信息科技有限公司 Two-dimensional code identification method and device, computer equipment and readable storage medium
WO2023130717A1 (en) * 2022-01-05 2023-07-13 深圳思谋信息科技有限公司 Image positioning method and apparatus, computer device and storage medium

Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104766084A (en) * 2015-04-10 2015-07-08 南京大学 Nearly copied image detection method based on multi-target matching
CN109165538A (en) * 2018-07-18 2019-01-08 北京飞搜科技有限公司 Bar code detection method and device based on deep neural network
US10198648B1 (en) * 2015-04-10 2019-02-05 Digimarc Corporation Decoding 1D-barcodes in digital capture systems
CN109543486A (en) * 2018-10-29 2019-03-29 华南理工大学 Bar code localization method neural network based and system
CN110046530A (en) * 2019-03-15 2019-07-23 中科院微电子研究所昆山分所 Barcode tilt correction method based on multi-task object detection
CN110991201A (en) * 2019-11-25 2020-04-10 浙江大华技术股份有限公司 Bar code detection method and related device
CN111612002A (en) * 2020-06-04 2020-09-01 广州市锲致智能技术有限公司 Multi-target object motion tracking method based on neural network
CN111680680A (en) * 2020-06-09 2020-09-18 创新奇智(合肥)科技有限公司 Object code positioning method and device, electronic equipment and storage medium
CN111832655A (en) * 2020-07-16 2020-10-27 四川大学 Multi-scale three-dimensional target detection method based on characteristic pyramid network
CN111860027A (en) * 2020-06-11 2020-10-30 贝壳技术有限公司 Two-dimensional code identification method and device
CN112149620A (en) * 2020-10-14 2020-12-29 南昌慧亦臣科技有限公司 Method for constructing natural scene character region detection model based on no anchor point
US20200410273A1 (en) * 2018-08-24 2020-12-31 Tencent Technology (Shenzhen) Company Limited Target detection method and apparatus, computer-readable storage medium, and computer device
CN112287860A (en) * 2020-11-03 2021-01-29 北京京东乾石科技有限公司 Training method and device of object recognition model, and object recognition method and system
CN112347805A (en) * 2020-11-25 2021-02-09 广东开放大学(广东理工职业学院) Multi-target two-dimensional code detection and identification method, system, device and storage medium
US10963659B1 (en) * 2019-06-27 2021-03-30 Amazon Technologies, Inc. Identifying item barcodes
CN112766221A (en) * 2021-02-01 2021-05-07 福州大学 Ship direction and position multitask-based SAR image ship target detection method
CN112818903A (en) * 2020-12-10 2021-05-18 北京航空航天大学 Small sample remote sensing image target detection method based on meta-learning and cooperative attention
CN113221962A (en) * 2021-04-21 2021-08-06 哈尔滨工程大学 Three-dimensional point cloud single-stage target detection method for decoupling classification and regression tasks

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108038880B (en) * 2017-12-20 2019-12-13 百度在线网络技术(北京)有限公司 Method and apparatus for processing image
CN113591967B (en) * 2021-07-27 2024-06-11 南京旭锐软件科技有限公司 Image processing method, device, equipment and computer storage medium
CN113627207B (en) * 2021-08-02 2024-03-22 深圳思谋信息科技有限公司 Bar code identification method, device, computer equipment and storage medium
CN113361527B (en) * 2021-08-09 2021-11-19 浙江华睿科技股份有限公司 Multi-target object identification and positioning method and device, electronic equipment and storage medium
CN114022558B (en) * 2022-01-05 2022-08-26 深圳思谋信息科技有限公司 Image positioning method, image positioning device, computer equipment and storage medium

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
ADNAN SHARIF et al.: "An Accurate and Efficient 1-D Barcode Detector for Medium of Deployment in IoT Systems", 《IEEE INTERNET OF THINGS JOURNAL》 *
JUN JIA et al.: "EMBDN: An Efficient Multiclass Barcode Detection Network for Complicated Environments", 《IEEE INTERNET OF THINGS JOURNAL》 *
JUN JIA et al.: "Tiny-BDN: An Efficient and Compact Barcode Detection Network", 《IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING》 *
易帆 et al.: "Research on a Barcode Tilt Correction Algorithm Based on Multi-task Object Detection", 《Computer Applications and Software》 *
邓霖杰: "Research on End-to-End Text Recognition in Natural Scenes", 《China Excellent Doctoral and Master's Dissertations Full-text Database (Doctoral), Information Science and Technology Series》 *


Also Published As

Publication number Publication date
CN114022558B (en) 2022-08-26
WO2023130717A1 (en) 2023-07-13

Similar Documents

Publication Publication Date Title
CN114022558B (en) Image positioning method, image positioning device, computer equipment and storage medium
CN109389030B (en) Face characteristic point detection method and device, computer equipment and storage medium
JP6815707B2 (en) Face posture detection method, device and storage medium
CN114494260B (en) Object defect detection method and device, computer equipment and storage medium
CN113705297A (en) Training method and device for detection model, computer equipment and storage medium
CN111832561B (en) Character sequence recognition method, device, equipment and medium based on computer vision
CN112102342B (en) Plane contour recognition method, plane contour recognition device, computer equipment and storage medium
CN116596935B (en) Deformation detection method, deformation detection device, computer equipment and computer readable storage medium
CN116884045A (en) Identity recognition method, identity recognition device, computer equipment and storage medium
CN110532971A (en) Image procossing and device, training method and computer readable storage medium
CN115063473A (en) Object height detection method and device, computer equipment and storage medium
CN114549857A (en) Image information identification method and device, computer equipment and storage medium
CN112084874B (en) Object detection method and device and terminal equipment
CN114913541A (en) Human body key point detection method, device and medium based on orthogonal matching pursuit
CN112862002A (en) Training method of multi-scale target detection model, target detection method and device
CN111459395A (en) Gesture recognition method and system, storage medium and man-machine interaction device
CN115577728B (en) One-dimensional code positioning method, device, computer equipment and storage medium
CN116229326A (en) Object identification method, device, computer equipment and storage medium thereof
US20230351660A1 (en) Motion capture using light polarization
Zhang et al. Highly Adaptive Ship Detection Based on Arbitrary Quadrilateral Bounding Box
CN116188573A (en) Object gesture recognition method, object gesture recognition device, computer equipment, storage medium and product
CN114022515A (en) Sperm motility detection method, sperm motility detection device, computer equipment and storage medium
CN114119961A (en) Object detection method, device, apparatus, storage medium and program product
CN115965799A (en) Image recognition method and device, computer equipment and storage medium
CN115167409A (en) Object information identification method, path planning method, device, terminal and robot

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code
Ref country code: HK
Ref legal event code: DE
Ref document number: 40067415
Country of ref document: HK