CN111832554A - Image detection method, device and storage medium - Google Patents


Info

Publication number
CN111832554A
CN111832554A (application CN201910300190.1A)
Authority
CN
China
Prior art keywords
character
detected
determining
image
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910300190.1A
Other languages
Chinese (zh)
Inventor
张恒瑞
郭明坚
宋翔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SF Technology Co Ltd
SF Tech Co Ltd
Original Assignee
SF Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SF Technology Co Ltd filed Critical SF Technology Co Ltd
Priority to CN201910300190.1A priority Critical patent/CN111832554A/en
Publication of CN111832554A publication Critical patent/CN111832554A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/63Scene text, e.g. street names
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/60Type of objects
    • G06V20/62Text, e.g. of license plates, overlay texts or captions on TV images
    • G06V20/625License plates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/10Character recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present application discloses an image detection method, apparatus, and storage medium. The image detection apparatus acquires a plurality of images to be detected of a detection object; performs image detection processing on each image to be detected to obtain a corresponding detection result, where the detection result includes a character string; determines an edit distance matrix corresponding to the plurality of images according to the character strings; and finally determines a target character string corresponding to the detection object according to the edit distance matrix. Because the target character string is determined from the edit distance matrix over multiple images to be detected, that is, the detection combines multiple images, the accuracy of image recognition can be improved.

Description

Image detection method, device and storage medium
Technical Field
The present application relates to the field of image recognition, and in particular, to an image detection method, an image detection apparatus, and a storage medium.
Background
Recognition of objects such as license plates, house plates, etc. is a popular application in the field of Optical Character Recognition (OCR).
Many parking lots, toll stations, and similar facilities already use OCR for license plate recognition. In these application scenarios, however, the license plate must be roughly 1 meter from the camera, so the imaging requirements are relatively strict. When the camera is far from the plate (for example, in the logistics industry, when recognizing the plate of a vehicle at a loading/unloading dock, the camera may be more than 3 meters away), or when environmental conditions degrade imaging quality, applying this method to image recognition yields relatively low accuracy.
Disclosure of Invention
The embodiment of the application provides an image detection method, an image detection device and a storage medium, which can improve the accuracy of image identification.
In one aspect, the present application provides an image detection method, including:
acquiring a plurality of images to be detected of a detection object;
respectively carrying out image detection processing on the multiple images to be detected to obtain a detection result corresponding to each image to be detected, wherein the detection result comprises a character string;
determining an editing distance matrix corresponding to the plurality of images to be detected according to the character string;
and determining a target character string corresponding to the detection object according to the editing distance matrix.
Optionally, in some embodiments, the determining, according to the character string, an edit distance matrix corresponding to the multiple images to be detected includes:
calculating the edit distance between every two of the character strings to obtain the edit distance between any two of the character strings;
and determining the editing distance matrix according to the editing distance.
Optionally, in some embodiments, the determining, according to the edit distance matrix, a target character string corresponding to the detection object includes:
determining the number of zero elements in each row of the edit distance matrix;
determining, among the zero-element numbers of the rows, the maximum zero element number;
determining whether the maximum zero element number is greater than a number threshold;
and if the maximum zero element number is greater than the number threshold, determining the character string corresponding to the zero elements in that row as the target character string.
Optionally, in some embodiments, the detection result further includes a character, a frame corresponding to the character, and a confidence of the character, and after determining whether the maximum zero element number is greater than a number threshold, the method further includes:
if the maximum zero element number is not greater than the number threshold, determining a target character corresponding to each character position according to the characters, the frames corresponding to the characters, and the confidences of the characters;
and determining the target character string according to the target character corresponding to each character position.
Optionally, in some embodiments, the determining, according to the character, a frame corresponding to the character, and the confidence level of the character, a target character corresponding to each character position includes:
determining a character position corresponding to each character according to the frame corresponding to the character;
determining characters corresponding to each character position;
determining the score of each character in each character position according to the character corresponding to each character position and the confidence coefficient of the character;
and determining the character with the highest score in each character position as the target character corresponding to each character position.
Optionally, in some embodiments, the acquiring a plurality of images to be detected of the detection object includes:
acquiring a video to be detected of a detected object;
and acquiring the plurality of images to be detected from the video to be detected.
Optionally, in some embodiments, the acquiring the plurality of images to be detected from the video to be detected includes:
determining the confidence coefficient of each frame of image to be detected in the video to be detected and the target pixel area of each frame of image to be detected;
and determining the multiple images to be detected from each frame of image to be detected according to the confidence coefficient of each frame of image to be detected and the target pixel area of each frame of image to be detected.
Correspondingly, the present application further provides an image detection apparatus, specifically including:
the device comprises an acquisition unit, a detection unit and a processing unit, wherein the acquisition unit is used for acquiring a plurality of images to be detected of a detection object;
the processing unit is used for respectively carrying out image detection processing on the multiple images to be detected to obtain a detection result corresponding to each image to be detected, and the detection result comprises a character string;
the first determining unit is used for determining an editing distance matrix corresponding to the multiple images to be detected according to the character strings;
and the second determining unit is used for determining a target character string corresponding to the detection object according to the editing distance matrix.
Optionally, in some embodiments, the first determining unit is specifically configured to:
respectively carrying out editing distance calculation on any two character strings in the character strings to obtain the editing distance between any two character strings;
and determining the editing distance matrix according to the editing distance.
Optionally, in some embodiments, the second determining unit includes:
the first determining subunit is used for determining the number of zero elements in each row of the edit distance matrix;
the second determining subunit is used for determining, among the zero-element numbers of the rows, the maximum zero element number;
a third determining subunit, configured to determine whether the maximum zero element number is greater than a number threshold;
and a fourth determining subunit, configured to determine, when the maximum zero element number is greater than the number threshold, the character string corresponding to the zero elements in that row as the target character string.
Optionally, in some embodiments, the detection result further includes a character, a frame corresponding to the character, and a confidence level of the character, and the apparatus further includes:
a third determining unit, configured to determine, when the maximum zero element number is not greater than the number threshold, a target character corresponding to each character position according to the character, a frame corresponding to the character, and a confidence of the character;
and the fourth determining unit is used for determining the target character string according to the target character corresponding to each character position.
Optionally, in some embodiments, the third determining unit is specifically configured to:
determining a character position corresponding to each character according to the frame corresponding to the character;
determining characters corresponding to each character position;
determining the score of each character in each character position according to the character corresponding to each character position and the confidence coefficient of the character;
and determining the character with the highest score in each character position as the target character corresponding to each character position.
Optionally, in some embodiments, the obtaining unit is specifically configured to:
acquiring a video to be detected of a detected object;
and acquiring the plurality of images to be detected from the video to be detected.
Optionally, in some embodiments, the obtaining unit is further specifically configured to:
determining the confidence coefficient of each frame of image to be detected in the video to be detected and the target pixel area of each frame of image to be detected;
and determining the multiple images to be detected from each frame of image to be detected according to the confidence coefficient of each frame of image to be detected and the target pixel area of each frame of image to be detected.
Yet another aspect of the present application provides a computer program product containing instructions which, when run on a computer, cause the computer to perform the method of the above-described aspects.
In addition, a storage medium is further provided, where multiple instructions are stored, and the instructions are suitable for being loaded by a processor to perform the steps in any one of the image detection methods provided in the embodiments of the present application.
In the embodiment of the present application, an image detection apparatus acquires a plurality of images to be detected of a detection object; performs image detection processing on each image to be detected to obtain a corresponding detection result, where the detection result includes a character string; determines an edit distance matrix corresponding to the plurality of images according to the character strings; and finally determines a target character string corresponding to the detection object according to the edit distance matrix. Because the target character string is determined from the edit distance matrix over multiple images to be detected, that is, the detection combines multiple images, the accuracy of image recognition can be improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic view of an application scenario of an image detection method provided in an embodiment of the present application;
FIG. 2 is a schematic flowchart of an image detection method according to an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart of another image detection method provided in the embodiments of the present application;
FIG. 4 is a schematic structural diagram of an image detection apparatus provided in an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an image detection apparatus according to an embodiment of the present disclosure;
fig. 6 is a schematic structural diagram of a server according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description that follows, specific embodiments of the present application will be described with reference to steps and symbols executed by one or more computers, unless otherwise indicated. Accordingly, these steps and operations will be referred to, several times, as being performed by a computer, the computer performing operations involving a processing unit of the computer in electronic signals representing data in a structured form. This operation transforms the data or maintains it at locations in the computer's memory system, which may be reconfigured or otherwise altered in a manner well known to those skilled in the art. The data maintains a data structure that is a physical location of the memory that has particular characteristics defined by the data format. However, while the principles of the application have been described in language specific to above, it is not intended to be limited to the specific embodiments shown, and it will be recognized by those of ordinary skill in the art that various of the steps and operations described below may be implemented in hardware.
The principles of the present application may be employed in numerous other general-purpose or special-purpose computing, communication environments or configurations. Examples of well known computing systems, environments, and configurations that may be suitable for use with the application include, but are not limited to, hand-held telephones, personal computers, servers, multiprocessor systems, microcomputer-based systems, mainframe-based computers, and distributed computing environments that include any of the above systems or devices.
The terms "first", "second", and "third", etc. in this application are used to distinguish between different objects and not to describe a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions.
The embodiment of the application provides an image detection method, an image detection device and a storage medium.
The image detection device can be integrated in the server, and the accuracy of image identification can be improved by the image detection device.
In some embodiments, the image detection apparatus in the present application may be applied at a loading/unloading dock of a logistics transfer station to identify the license plate numbers of vehicles at the dock. As shown in fig. 1, a schematic diagram of an application scenario of the image detection method in the embodiments of the present application, the camera in fig. 1 may be installed near the ceiling of the dock with its lens facing oncoming vehicles. The image detection apparatus obtains, through the camera, a surveillance video of the license plate region of the truck and extracts a plurality of images to be detected from the surveillance video. It then performs image detection processing on the plurality of images to obtain a detection result for each image, where the detection result includes a character string; determines an edit distance matrix corresponding to the plurality of images according to the character strings; and finally determines a target character string corresponding to the detection object according to the edit distance matrix. In this application scenario, the target character string is the license plate string.
The image detection apparatus in the present application may also be used to detect other objects, such as a doorplate, and the specific application scenario and the detection object are not limited herein.
Referring to fig. 2, fig. 2 is a schematic flowchart of an image detection method according to an embodiment of the present disclosure. The method comprises the following specific processes:
201. and acquiring a plurality of images to be detected of the detection object.
The object to be detected can be a license plate, a doorplate, or any other object whose character string needs to be recognized from an image.
In some embodiments, acquiring a plurality of images to be detected of a detection object includes: acquiring a video to be detected of a detected object; and then acquiring a plurality of images to be detected from the video to be detected.
In some embodiments, a camera may be used to capture a video to be detected of the detection object, a multi-object detection network (SSD, Single Shot MultiBox Detector) may be used to continuously detect the license plate in the video, and each detected license plate image and its corresponding confidence may be stored in a queue.
More specifically, acquiring a plurality of images to be detected from a video to be detected includes: determining the confidence coefficient of each frame of image to be detected in the video to be detected and the target pixel area of each frame of image to be detected; and then determining a plurality of images to be detected from each frame of images to be detected according to the confidence coefficient of each frame of images to be detected and the target pixel area of each frame of images to be detected.
After the confidence and target pixel area of each frame to be detected are determined from the queue, each frame's confidence is multiplied by its target pixel area to serve as an evaluation score for that frame. The frames are then ranked by score, and the N highest-scoring license plate images are taken as the plurality of images to be detected of the detection object and normalized to a uniform size, where N may be any number from 6 to 10.
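The ranking step above can be sketched as follows. This is an illustrative Python sketch, not code from the patent; the `select_frames` name and the dictionary layout are assumptions.

```python
def select_frames(frames, n=8):
    """Select the n frames with the highest confidence * area score.

    frames: list of dicts with 'image', 'confidence', and 'area' keys,
    mirroring the queue of detected license plate images described above.
    """
    scored = sorted(frames, key=lambda f: f["confidence"] * f["area"],
                    reverse=True)
    return scored[:n]

# Hypothetical queue contents for illustration.
candidates = [
    {"image": "frame_0", "confidence": 0.91, "area": 4200},
    {"image": "frame_1", "confidence": 0.80, "area": 6100},
    {"image": "frame_2", "confidence": 0.60, "area": 3000},
]
best = select_frames(candidates, n=2)
```

A larger but lower-confidence detection can outrank a sharp but tiny one, which is the point of multiplying the two factors rather than using either alone.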
202. And respectively carrying out image detection processing on the plurality of images to be detected to obtain a detection result corresponding to each image to be detected.
The method and the device for recognizing the character strings can be used for respectively carrying out image detection processing on the multiple images to be detected based on a deep learning character string recognition method, and storing detailed detection results, wherein the detection results comprise the character strings of each image to be detected.
In some embodiments, the detection result further includes a character, a frame corresponding to the character, and a confidence level of the character.
203. And determining an edit distance matrix corresponding to the plurality of images to be detected according to the character strings.
Specifically, in some embodiments, the edit distance between any two character strings in the character strings is calculated respectively to obtain the edit distance between any two character strings; and then determining the edit distance matrix according to the edit distance.
That is, the edit distance (Levenshtein distance) is calculated pairwise between the character strings of the images to be detected. The plurality of images may be N images whose character strings are plateStr_i (i = 0, 1, ..., N-1), and the edit distance between string i and string j is L(i, j) = levenshteinDistance(plateStr_i, plateStr_j).
The edit distance quantifies the difference between two character strings; it is defined as the minimum number of single-character edits (substitutions, insertions, deletions) required to transform string A into string B.
After the editing distance calculation is carried out on every two character strings in the N character strings, the editing distance between every two character strings can be combined into an editing distance matrix.
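The pairwise computation above can be sketched as follows; this is an illustrative Python sketch using the standard Levenshtein dynamic program, and the function names are assumptions rather than text from the patent.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits (substitution, insertion,
    deletion) turning string a into string b, via a one-row DP table."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1,          # delete from a
                                     dp[j - 1] + 1,       # insert into a
                                     prev + (ca != cb))   # substitute
    return dp[len(b)]

def edit_distance_matrix(strings):
    """N x N matrix with L(i, j) = levenshtein(strings[i], strings[j])."""
    n = len(strings)
    return [[levenshtein(strings[i], strings[j]) for j in range(n)]
            for i in range(n)]
```

The resulting matrix is symmetric with a zero main diagonal, matching the description in the following step.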
204. And determining a target character string corresponding to the detection object according to the editing distance matrix.
In some embodiments, the number of zero elements in each row of the edit distance matrix is determined first; the maximum zero element number among the per-row counts is then determined; and whether that maximum is greater than a number threshold is checked. If it is, the character string corresponding to the zero elements in that row is determined as the target character string. If it is not, a target character is determined for each character position according to the characters, the frames corresponding to the characters, and the confidences of the characters, and the target character string is then assembled from the per-position target characters. The number threshold may be N/2 rounded up; for example, if N is 8 the threshold is 4, and if N is 9 the threshold is 5.
The edit distance matrix is a symmetric square matrix whose main diagonal is 0, and the number of zero elements in a row equals the number of times that row's character string appears among the N detection results. If all detection results are identical, every row is all zeros and the edit distance matrix L is the zero matrix.
If L is the zero matrix, all detection results are the same, so the result is output directly and computation is reduced. Otherwise, the number of zero elements in each row of L is counted, the rows are sorted by that count, and the row with the most zero elements is taken; its zero-element count n is the number of identical detection results.
It is then judged whether n > ceil(N/2) holds. If so, the fused detection result is taken to be those n identical detection results, and the character string corresponding to the n zero elements is determined as the target character string.
If n > ceil(N/2) does not hold, the target character for each character position must be determined separately, and the target character string is then assembled from the per-position target characters.
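The majority test above can be sketched as follows. This is an illustrative Python sketch of the n > ceil(N/2) rule; the function name and the None fallback are assumptions.

```python
import math

def fuse_by_majority(strings, matrix):
    """Return the majority string if one string matches more than
    ceil(N/2) of the N results exactly; otherwise return None to signal
    that per-character voting is needed.

    matrix: N x N edit-distance matrix over the detected strings.
    """
    n_imgs = len(strings)
    zero_counts = [row.count(0) for row in matrix]  # exact-match tallies
    best_row = max(range(n_imgs), key=lambda i: zero_counts[i])
    if zero_counts[best_row] > math.ceil(n_imgs / 2):
        return strings[best_row]
    return None
```

Note that the diagonal zero counts each string as matching itself, consistent with the row-count interpretation given above.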
Specifically, because the N images to be detected have been normalized, and the detection result also includes each character in each image, each character's frame (i.e., bounding-box), and each character's confidence, the characters are first aligned on the basis of this information and then voted on position by position to obtain the best character (the target character) at each position. The character alignment step is as follows:
according to the bounding-box of each character in the detection result, a vertex coordinate x and a character width can be obtained, and then the horizontal coordinate of the center point of the corresponding character is calculated according to the vertex coordinate x and the character width:
Figure BDA0002027983760000081
Then one of the N detection results (each containing all characters of one image to be detected) is taken as the reference; for example, if one result contains 8 characters, those 8 characters serve as the reference. Each character in the remaining N-1 results is then compared against the reference from left to right. Specifically, a threshold is set, and if the minimum distance between a character's center point and the horizontal coordinate of some reference character's center point is smaller than the threshold, the character is assigned to that reference character.
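The center-point alignment can be sketched as follows. This is an illustrative Python sketch; the tuple layout, names, and the threshold value are assumptions, not taken from the patent.

```python
def align_to_reference(chars, ref_centers, threshold=10.0):
    """Assign characters to reference positions by nearest center point.

    chars: list of (char, x, width, confidence) tuples from one result.
    ref_centers: horizontal center coordinates of the reference characters.
    Returns {ref_index: [(char, confidence), ...]} buckets; characters
    farther than the threshold from every reference center are dropped.
    """
    buckets = {i: [] for i in range(len(ref_centers))}
    for ch, x, w, conf in chars:
        center = x + w / 2.0          # center_x = x + w / 2, as above
        nearest = min(range(len(ref_centers)),
                      key=lambda i: abs(center - ref_centers[i]))
        if abs(center - ref_centers[nearest]) < threshold:
            buckets[nearest].append((ch, conf))
    return buckets
```

Because the images were normalized to a uniform size, the same pixel threshold can plausibly be reused across all N results.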
When all characters in the N-1 detection results have been assigned to the reference result (that is, when the characters corresponding to each character position have been determined), character voting is carried out position by position from left to right according to the character positions in the reference result. Suppose a character position holds m characters (C_1, C_2, ..., C_m); for character C_i (i = 1, 2, ..., m) occurring l times, the character score is:

score(C_i) = confidence_1 + confidence_2 + ... + confidence_l
That is, the confidences of all occurrences of character C_i are summed to obtain its final score. The character with the highest score at a position is taken as the optimal character (target character) for that position, and the optimal characters of all positions are ordered from left to right to obtain the target character string (the fused detection result).
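The per-position vote can be sketched as follows (illustrative Python; names are assumptions):

```python
from collections import defaultdict

def vote_position(candidates):
    """Pick the character with the highest summed confidence.

    candidates: list of (char, confidence) pairs observed at one
    character position across the aligned detection results.
    """
    scores = defaultdict(float)
    for ch, conf in candidates:
        scores[ch] += conf            # score(C_i) = sum of its confidences
    return max(scores, key=scores.get)

def fuse_positions(position_candidates):
    """Assemble the target string from per-position candidate lists,
    already ordered left to right."""
    return "".join(vote_position(c) for c in position_candidates)
```

Summing confidences rather than counting votes lets two confident detections outweigh three marginal ones, which is the stated rationale for weighting by confidence.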
In the embodiment of the present application, an image detection apparatus acquires a plurality of images to be detected of a detection object; performs image detection processing on each image to be detected to obtain a corresponding detection result, where the detection result includes a character string; determines an edit distance matrix corresponding to the plurality of images according to the character strings; and finally determines a target character string corresponding to the detection object according to the edit distance matrix. Because the target character string is determined from the edit distance matrix over multiple images to be detected, that is, the detection combines multiple images, the accuracy of image recognition can be improved.
With this image detection method, the problem of low character recognition accuracy caused by poor imaging quality (for example, when the camera is far from the detection object or the lighting of the shooting environment is complex) can be effectively mitigated. The technique can therefore be applied in complex scenes such as logistics loading/unloading docks to automatically detect vehicle license plates and improve vehicle scheduling and management.
Referring to fig. 3, fig. 3 is another schematic flow chart of an image detection method according to an embodiment of the present application, and a specific flow of the method may be as follows:
301. the image detection equipment acquires a video to be detected of the license plate.
The object to be detected can be a license plate, a doorplate, or any other object whose character string needs to be recognized from an image.
In some embodiments, a camera may be used to capture a video to be detected of the detection object, an SSD (Single Shot MultiBox Detector) is used to continuously detect the license plate in the video, and each detected license plate image and its corresponding confidence are stored in a queue.
302. The image detection equipment acquires a plurality of images to be detected from a video to be detected.
Specifically, in some embodiments, the confidence of each frame of the image to be detected in the video to be detected and the target pixel area of each frame of the image to be detected can be determined; and then determining a plurality of images to be detected from each frame of images to be detected according to the confidence coefficient of each frame of images to be detected and the target pixel area of each frame of images to be detected.
After the confidence and target pixel area of each frame to be detected are determined from the queue, each frame's confidence is multiplied by its target pixel area to serve as an evaluation score for that frame. The frames are then ranked by score, and the N highest-scoring license plate images are taken as the plurality of images to be detected of the detection object and normalized to a uniform size, where N may be any number from 6 to 10.
303. The image detection equipment performs image detection processing on each of the multiple images to be detected to obtain a detection result corresponding to each image to be detected.
The detection result includes, for each image to be detected, the corresponding character string, the individual characters, the frame corresponding to each character, and the confidence of each character.
In this embodiment, the character string is a complete license plate number; for example, the character string of a certain image to be detected is "yue B8866 x". A character is the character at each position in that license plate number, for example "B", "8", "6", and "x". The frame corresponding to a character is the bounding-box of that character, and the confidence of a character may reflect its clarity or reliability.
Specifically, the embodiment of the application can perform image detection processing on the multiple images to be detected respectively based on a deep learning character string recognition method, and store detailed detection results.
304. The image detection equipment calculates the edit distance between every pair of the license plates, obtaining the edit distance between any two license plates.
That is, the edit distance (Levenshtein distance) is calculated between every two character strings (license plates) of the plurality of images to be detected. The plurality of images to be detected in the embodiment of the application may be N images, whose character strings are plateStr_i (i = 0, 1, …, N-1); the edit distance between string i and string j is then L_ij = levenshteinDistance(plateStr_i, plateStr_j).
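The Levenshtein distance used here can be computed with the classic dynamic-programming recurrence; a minimal sketch:

```python
def levenshtein_distance(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning
    a into b, computed row by row over the DP table."""
    if len(a) < len(b):
        a, b = b, a  # iterate over the longer string
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution (0 if equal)
        prev = cur
    return prev[-1]
```
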
305. The image detection device determines an edit distance matrix from the edit distance.
After the editing distance calculation is carried out on every two character strings in the N character strings, the editing distance between every two character strings can be combined into an editing distance matrix.
The edit distance matrix is a symmetric square matrix with a main diagonal line of 0, and the number of zero elements in each row represents the number of times that the corresponding character string in the row appears in the N detection results (character strings).
306. The image detection device determines the number of zero elements in each row in the edit distance matrix.
After the edit distance matrix is determined, the number of zero elements in each row in the edit distance matrix is counted.
If all the detection results (license plates) are the same, every row is all zeros, and the edit distance matrix L is a zero matrix.
If L is a zero matrix, all detection results are identical; the result is output directly, reducing the amount of computation. Otherwise, the number of zero elements in each row of the matrix L must be counted.
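The matrix construction and per-row zero counting of steps 305-306 can be sketched as follows (the `dist` argument would be the Levenshtein distance in practice; the test below substitutes a simple Hamming-style distance):

```python
def edit_distance_matrix(strings, dist):
    """Pairwise distance matrix L over the N detected strings; L is a
    symmetric square matrix with a zero main diagonal."""
    n = len(strings)
    return [[dist(strings[i], strings[j]) for j in range(n)] for i in range(n)]

def zero_counts(matrix):
    """Zero elements per row: row i counts how many detection results
    are identical to string i (including itself)."""
    return [row.count(0) for row in matrix]
```

If every count equals N, the matrix is all zeros and the shared string can be output immediately, as described above.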
307. The image detection device determines the maximum zero element number with the maximum value from the zero element numbers of each line.
After the number of zero elements in each row of the edit distance matrix is determined, the rows of L are sorted by their zero-element counts to find the row with the most zero elements, and the number of zero elements in that row (i.e., the maximum number of zero elements) is determined. This count represents how many detection results are identical; for example, if the row with the most zero elements contains n zeros, the maximum number of zero elements is n.
308. The image detection equipment determines whether the maximum number of zero elements is greater than a number threshold; if yes, go to step 309, and if not, go to steps 310 and 311.
After the maximum number of zero elements n of the edit distance matrix L is obtained, it is determined whether it is greater than a number threshold, where the number threshold may be ceil(N/2), i.e., N/2 rounded up to the nearest integer.
309. The image detection equipment determines the character string corresponding to the zero elements in the maximum zero element number as the target license plate.
If n is greater than the number threshold, most elements of the row with the most zero elements are zero, and the character string corresponding to the zero elements in that row can be directly determined as the target license plate (the finally confirmed license plate).
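The threshold check and direct-output branch of steps 308-309 can be sketched as follows; `counts` is the per-row zero count and the function name `direct_fusion` is illustrative:

```python
import math

def direct_fusion(strings, counts):
    """Return the dominant string when the row with the most zero
    elements has more than ceil(N/2) zeros; otherwise return None,
    meaning the per-character voting of steps 310-311 is still needed."""
    n = len(strings)
    best = max(range(n), key=lambda i: counts[i])
    if counts[best] > math.ceil(n / 2):
        return strings[best]
    return None
```
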
310. The image detection equipment determines a target character corresponding to each character position according to the characters, the frames corresponding to the characters, and the confidences of the characters.
If n is not greater than the number threshold, the target license plate is determined from the individual characters in the images to be detected, specifically:
Because the N images to be detected have been normalized, and the detection result further includes each character in each image to be detected, the border of each character (i.e., bounding-box), and the confidence of each character, the characters are first aligned based on this information, and the characters at each character position are then voted on and ranked to obtain the optimal character (i.e., target character) for that position. The character alignment step is as follows:
According to the bounding-box of each character in the detection result, a vertex coordinate x and a character width w can be obtained, and the horizontal coordinate of the centre point of the corresponding character is then calculated from them:

x_center = x + w/2
Then one of the N detection results (each containing all characters of one image to be detected) is taken as the reference; for example, if the detection result of one image contains 8 characters, those 8 characters serve as the reference. Each character in the remaining N-1 detection results is then compared with the reference from left to right. Specifically, a distance threshold is set, and if the minimum distance between the horizontal coordinate of a character's centre point and that of a reference character is smaller than the threshold, the character is classified to that reference character.
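The alignment step can be sketched as follows, representing each character by its centre x-coordinate (x + w/2 from its bounding box); the default threshold value is an assumption for illustration:

```python
def align_characters(ref_centers, other_results, threshold=10.0):
    """Bucket every character from the non-reference results under the
    nearest reference character position, by horizontal centre distance.

    ref_centers: centre x-coordinates of the reference characters.
    other_results: per-image lists of (center_x, char, confidence).
    Returns one list of (char, confidence) per reference position.
    """
    buckets = [[] for _ in ref_centers]
    for result in other_results:
        for cx, ch, conf in result:
            # nearest reference position for this character
            k = min(range(len(ref_centers)), key=lambda i: abs(ref_centers[i] - cx))
            if abs(ref_centers[k] - cx) < threshold:  # only keep close matches
                buckets[k].append((ch, conf))
    return buckets
```
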
When all characters in the N-1 detection results have been classified to the reference detection result (that is, when the character corresponding to each character position has been determined), character voting is carried out for each character position from left to right according to the character positions in the reference detection result. Suppose m characters (C_1, C_2, …, C_m) exist at a certain character position; for a character C_i (i = 1, 2, …, m) that occurs l times, the character score is:

score_{C_i} = confidence_1 + confidence_2 + … + confidence_l

That is, the confidences of all occurrences of character C_i are added to obtain its final score, and the character with the highest score at that position is taken as the optimal character (target character) for that position.
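The per-position confidence voting can be sketched as:

```python
def vote_position(candidates):
    """Sum the confidences of each distinct character seen at one
    position and return the highest-scoring character."""
    scores = {}
    for ch, conf in candidates:
        scores[ch] = scores.get(ch, 0.0) + conf
    return max(scores, key=scores.get)
```

Applying `vote_position` to each position's bucket from left to right and joining the winning characters yields the fused license plate.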
311. The image detection equipment determines the target license plate according to the target character corresponding to each character position.
The optimal characters of all the character positions are ordered from left to right to obtain the target license plate (i.e., the fused detection result).
In the embodiment of the application, the image detection equipment acquires a plurality of images to be detected of a detection object; then, image detection processing is carried out on the multiple images to be detected respectively to obtain a detection result corresponding to each image to be detected, wherein the detection result comprises a character string; determining an edit distance matrix corresponding to a plurality of images to be detected according to the character string; and finally, determining a target license plate corresponding to the detection object according to the editing distance matrix. In the embodiment of the application, the target license plate corresponding to the detection object is determined by using the edit distance matrix corresponding to the multiple images to be detected, and the image detection is performed by combining the multiple images to be detected, so that the accuracy of image identification can be improved.
That is, in the image detection method of the embodiment of the present application, the character strings corresponding to the detection objects are first compared pairwise to obtain edit distances, from which an edit distance matrix is constructed. The zero-element properties of the matrix are then used to simplify the calculation in specific cases and obtain a fused detection result directly. If the edit distance matrix is an all-zero matrix, the detection result is output directly. If it is not, the detection result is determined from the row with the most zero elements: if the number of zero elements in that row is greater than ceil(N/2), the character strings corresponding to those zero elements are fused and the detection result is output; if not, a voting strategy based on character positions is applied to the normalized images to obtain the optimal character for each character position, and the final detection result is then obtained from the optimal character of each character position.
In order to better implement the image detection method provided by the embodiment of the present application, an embodiment of the present application further provides an image detection device, and the image detection device may be specifically integrated in a server. The meanings of the terms are the same as those in the image detection method above, and specific implementation details can refer to the description in the method embodiment.
Referring to fig. 4, fig. 4 is a schematic structural diagram of an image detection apparatus according to an embodiment of the present application, where the image detection apparatus includes: the acquisition unit 401, the processing unit 402, the first determination unit 403, and the second determination unit 404 are as follows:
an acquiring unit 401, configured to acquire a plurality of images to be detected of a detection object;
the processing unit 402 is configured to perform image detection processing on the multiple images to be detected respectively to obtain a detection result corresponding to each image to be detected, where the detection result includes a character string;
a first determining unit 403, configured to determine, according to the character string, an edit distance matrix corresponding to the multiple images to be detected;
a second determining unit 404, configured to determine, according to the edit distance matrix, a target character string corresponding to the detection object.
In some embodiments, the first determining unit 403 is specifically configured to:
respectively carrying out editing distance calculation on any two character strings in the character strings to obtain the editing distance between any two character strings;
and determining the editing distance matrix according to the editing distance.
Referring to fig. 5, in some embodiments, the second determining unit 404 includes:
a first determining subunit 4041, configured to determine the number of zero elements in each row in the edit distance matrix;
a second determining subunit 4042, configured to determine, from the number of zero elements in each row, the maximum number of zero elements with the largest numerical value;
a third determining subunit 4043, configured to determine whether the maximum number of zero elements is greater than a number threshold;
a fourth determining subunit 4044, configured to determine, when the maximum number of zero elements is greater than the number threshold, a character string corresponding to a zero element in the maximum number of zero elements as the target character string.
In some embodiments, the detection result further includes a character, a frame corresponding to the character, and a confidence of the character, and the apparatus further includes:
a third determining unit 405, configured to determine, when the maximum number of zero elements is not greater than the number threshold, a target character corresponding to each character position according to the character, a frame corresponding to the character, and a confidence of the character;
a fourth determining unit 406, configured to determine the target character string according to the target character corresponding to each character position.
In some embodiments, the third determining unit 405 is specifically configured to:
determining a character position corresponding to each character according to the frame corresponding to the character;
determining characters corresponding to each character position;
determining the score of each character in each character position according to the character corresponding to each character position and the confidence coefficient of the character;
and determining the character with the highest score in each character position as the target character corresponding to each character position.
In some embodiments, the obtaining unit 401 is specifically configured to:
acquiring a video to be detected of a detected object;
and acquiring the plurality of images to be detected from the video to be detected.
In some embodiments, the obtaining unit 401 is further specifically configured to:
determining the confidence coefficient of each frame of image to be detected in the video to be detected and the target pixel area of each frame of image to be detected;
and determining the multiple images to be detected from each frame of image to be detected according to the confidence coefficient of each frame of image to be detected and the target pixel area of each frame of image to be detected.
In the embodiment of the present application, the obtaining unit 401 obtains a plurality of images to be detected of a detection object; then, the processing unit 402 performs image detection processing on the multiple images to be detected respectively to obtain a detection result corresponding to each image to be detected, wherein the detection result comprises a character string; the first determining unit 403 determines an edit distance matrix corresponding to a plurality of images to be detected according to the character string; finally, the second determining unit 404 determines a target character string corresponding to the detection object according to the edit distance matrix. In the embodiment of the application, the target character strings corresponding to the detection objects are determined by using the edit distance matrixes corresponding to the multiple images to be detected, and the image detection is performed by combining the multiple images to be detected, so that the accuracy of image identification can be improved.
Referring to fig. 6, embodiments of the present application provide a server 600, which may include one or more processors 601 with one or more processing cores, a memory 602 comprising one or more computer-readable storage media, a Radio Frequency (RF) circuit 603, a power supply 604, an input unit 605, and a display unit 606. Those skilled in the art will appreciate that the server architecture shown in fig. 6 is not limiting and may include more or fewer components than those shown, combine some components, or use a different arrangement of components. Wherein:
the processor 601 is a control center of the server, connects various parts of the entire server using various interfaces and lines, and performs various functions of the server and processes data by running or executing software programs and/or modules stored in the memory 602 and calling data stored in the memory 602, thereby performing overall monitoring of the server. Optionally, processor 601 may include one or more processing cores; preferably, the processor 601 may integrate an application processor, which mainly handles operating systems, user interfaces, application programs, etc., and a modem processor, which mainly handles wireless communications. It will be appreciated that the modem processor described above may not be integrated into the processor 601.
The memory 602 may be used to store software programs and modules, and the processor 601 executes various functional applications and data processing by operating the software programs and modules stored in the memory 602.
The RF circuitry 603 may be used for receiving and transmitting signals during the process of transmitting and receiving information.
The server also includes a power supply 604 (e.g., a battery) for powering the various components, which is preferably logically coupled to the processor 601 via a power management system that manages charging, discharging, and power consumption.
The server may also include an input unit 605, and the input unit 605 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.
The server may also include a display unit 606, and the display unit 606 may be used to display information input by the user or provided to the user, as well as various graphical user interfaces of the server, which may be made up of graphics, text, icons, video, and any combination thereof. Specifically, in this embodiment, the processor 601 in the server loads the executable file corresponding to the process of one or more application programs into the memory 602 according to the following instructions, and the processor 601 runs the application programs stored in the memory 602, thereby implementing various functions as follows:
acquiring a plurality of images to be detected of a detection object;
respectively carrying out image detection processing on the multiple images to be detected to obtain a detection result corresponding to each image to be detected, wherein the detection result comprises a character string;
determining an editing distance matrix corresponding to the plurality of images to be detected according to the character string;
and determining a target character string corresponding to the detection object according to the editing distance matrix.
As can be seen from the above, in the embodiment of the present application, the image detection device obtains a plurality of images to be detected of the detection object; then, image detection processing is carried out on the multiple images to be detected respectively to obtain a detection result corresponding to each image to be detected, wherein the detection result comprises a character string; determining an edit distance matrix corresponding to a plurality of images to be detected according to the character string; and finally, determining a target character string corresponding to the detection object according to the editing distance matrix. In the embodiment of the application, the target character strings corresponding to the detection objects are determined by using the edit distance matrixes corresponding to the multiple images to be detected, and the image detection is performed by combining the multiple images to be detected, so that the accuracy of image identification can be improved.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, the present application provides a storage medium, in which a plurality of instructions are stored, where the instructions can be loaded by a processor to execute the steps in any one of the image detection methods provided in the present application. For example, the instructions may perform the steps of:
acquiring a plurality of images to be detected of a detection object;
respectively carrying out image detection processing on the multiple images to be detected to obtain a detection result corresponding to each image to be detected, wherein the detection result comprises a character string;
determining an editing distance matrix corresponding to the plurality of images to be detected according to the character string;
and determining a target character string corresponding to the detection object according to the editing distance matrix.
The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.
Wherein the storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the storage medium can execute the steps in any image detection method provided in the embodiments of the present application, beneficial effects that can be achieved by any image detection method provided in the embodiments of the present application can be achieved, which are detailed in the foregoing embodiments and will not be described herein again.
The foregoing detailed description is directed to an image detection method, an image detection apparatus, and a storage medium provided in the embodiments of the present application, and specific examples are applied in the present application to explain the principles and implementations of the present application, and the descriptions of the foregoing embodiments are only used to help understand the methods and core ideas of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. An image detection method, comprising:
acquiring a plurality of images to be detected of a detection object;
respectively carrying out image detection processing on the multiple images to be detected to obtain a detection result corresponding to each image to be detected, wherein the detection result comprises a character string;
determining an editing distance matrix corresponding to the plurality of images to be detected according to the character string;
and determining a target character string corresponding to the detection object according to the editing distance matrix.
2. The method according to claim 1, wherein the determining the edit distance matrix corresponding to the plurality of images to be detected according to the character string comprises:
respectively carrying out editing distance calculation on any two character strings in the character strings to obtain the editing distance between any two character strings;
and determining the editing distance matrix according to the editing distance.
3. The method according to claim 1, wherein the determining a target character string corresponding to the detection object according to the edit distance matrix comprises:
determining the number of zero elements of each row in the editing distance matrix;
determining the maximum zero element number with the maximum numerical value from the zero element numbers of each line;
determining whether the maximum number of zero elements is greater than a number threshold;
and if the number of the zero elements is larger than the number threshold, determining the character string corresponding to the zero element in the maximum zero element number as the target character string.
4. The method of claim 3, wherein the detection result further includes a character, a frame corresponding to the character, and a confidence of the character, and after determining whether the maximum number of zero elements is greater than a number threshold, the method further includes:
if the number of the characters is not larger than the number threshold, determining a target character corresponding to each character position according to the characters, the frames corresponding to the characters and the confidence degrees of the characters;
and determining the target character string according to the target character corresponding to each character position.
5. The method according to claim 4, wherein the determining a target character corresponding to each character position according to the character, a frame corresponding to the character, and a confidence level of the character comprises:
determining a character position corresponding to each character according to the frame corresponding to the character;
determining characters corresponding to each character position;
determining the score of each character in each character position according to the character corresponding to each character position and the confidence coefficient of the character;
and determining the character with the highest score in each character position as the target character corresponding to each character position.
6. The method according to any one of claims 1 to 5, wherein the acquiring a plurality of images to be detected of the detection object comprises:
acquiring a video to be detected of a detected object;
and acquiring the plurality of images to be detected from the video to be detected.
7. The method according to claim 6, wherein the obtaining the plurality of images to be detected from the video to be detected comprises:
determining the confidence coefficient of each frame of image to be detected in the video to be detected and the target pixel area of each frame of image to be detected;
and determining the multiple images to be detected from each frame of image to be detected according to the confidence coefficient of each frame of image to be detected and the target pixel area of each frame of image to be detected.
8. An image detection apparatus, characterized by comprising:
the device comprises an acquisition unit, a detection unit and a processing unit, wherein the acquisition unit is used for acquiring a plurality of images to be detected of a detection object;
the processing unit is used for respectively carrying out image detection processing on the multiple images to be detected to obtain a detection result corresponding to each image to be detected, and the detection result comprises a character string;
the first determining unit is used for determining an editing distance matrix corresponding to the multiple images to be detected according to the character strings;
and the second determining unit is used for determining a target character string corresponding to the detection object according to the editing distance matrix.
9. A computer program product comprising instructions which, when run on a computer, cause the computer to perform the steps in the image detection method according to any one of claims 1 to 7.
10. A storage medium storing a plurality of instructions adapted to be loaded by a processor to perform the steps of the image detection method according to any one of claims 1 to 7.
CN201910300190.1A 2019-04-15 2019-04-15 Image detection method, device and storage medium Pending CN111832554A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910300190.1A CN111832554A (en) 2019-04-15 2019-04-15 Image detection method, device and storage medium

Publications (1)

Publication Number Publication Date
CN111832554A true CN111832554A (en) 2020-10-27

Family

ID=72914502

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910300190.1A Pending CN111832554A (en) 2019-04-15 2019-04-15 Image detection method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111832554A (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2003223610A (en) * 2002-01-28 2003-08-08 Toshiba Corp Character recognizing device and character recognizing method
JP2011065646A (en) * 2009-09-18 2011-03-31 Fujitsu Ltd Apparatus and method for recognizing character string
CN102750379A (en) * 2012-06-25 2012-10-24 华南理工大学 Fast character string matching method based on filtering type
US20130314755A1 (en) * 2012-05-23 2013-11-28 Andrew C. Blose Image capture device for extracting textual information
CN103428307A (en) * 2013-08-09 2013-12-04 中国科学院计算机网络信息中心 Method and equipment for detecting counterfeit domain names
CN103493067A (en) * 2011-12-26 2014-01-01 华为技术有限公司 Method and apparatus for recognizing a character of a video
CN103996021A (en) * 2014-05-08 2014-08-20 华东师范大学 Fusion method of multiple character identification results
CN104464736A (en) * 2014-12-15 2015-03-25 北京百度网讯科技有限公司 Error correction method and device for voice recognition text
CN105930836A (en) * 2016-04-19 2016-09-07 北京奇艺世纪科技有限公司 Identification method and device of video text
CN106203425A (en) * 2016-07-01 2016-12-07 北京旷视科技有限公司 Character identifying method and device
CN106847288A (en) * 2017-02-17 2017-06-13 上海创米科技有限公司 The error correction method and device of speech recognition text
CN107220639A (en) * 2017-04-14 2017-09-29 北京捷通华声科技股份有限公司 The correcting method and device of OCR recognition results
RU2673015C1 (en) * 2017-12-22 2018-11-21 Общество с ограниченной ответственностью "Аби Продакшн" Methods and systems of optical recognition of image series characters
CN108920580A (en) * 2018-06-25 2018-11-30 腾讯科技(深圳)有限公司 Image matching method, device, storage medium and terminal

Similar Documents

Publication Publication Date Title
CN108898086B (en) Video image processing method and device, computer readable medium and electronic equipment
CN110020592B (en) Object detection model training method, device, computer equipment and storage medium
US8792722B2 (en) Hand gesture detection
EP3806064B1 (en) Method and apparatus for detecting parking space usage condition, electronic device, and storage medium
US20120027263A1 (en) Hand gesture detection
CN112052186B (en) Target detection method, device, equipment and storage medium
CN103617432A (en) Method and device for recognizing scenes
CN112381104A (en) Image identification method and device, computer equipment and storage medium
CN109116129B (en) Terminal detection method, detection device, system and storage medium
CN111767908B (en) Character detection method, device, detection equipment and storage medium
CN111160202A (en) AR equipment-based identity verification method, AR equipment-based identity verification device, AR equipment-based identity verification equipment and storage medium
CN108229494B (en) Network training method, processing method, device, storage medium and electronic equipment
EP4113376A1 (en) Image classification model training method and apparatus, computer device, and storage medium
CN111461105A (en) Text recognition method and device
CN113902944A (en) Model training and scene recognition method, device, equipment and medium
CN114299546A (en) Method and device for identifying pet identity, storage medium and electronic equipment
CN113158773B (en) Training method and training device for living body detection model
CN112532884B (en) Identification method and device and electronic equipment
WO2021138893A1 (en) Vehicle license plate recognition method and apparatus, electronic device, and storage medium
CN112488054A (en) Face recognition method, face recognition device, terminal equipment and storage medium
CN113269730B (en) Image processing method, image processing device, computer equipment and storage medium
EP4332910A1 (en) Behavior detection method, electronic device, and computer readable storage medium
CN111832554A (en) Image detection method, device and storage medium
CN112214639B (en) Video screening method, video screening device and terminal equipment
CN117671548A (en) Abnormal sorting detection method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination