CN115601293A - Object detection method and device, electronic equipment and readable storage medium - Google Patents

Object detection method and device, electronic equipment and readable storage medium

Info

Publication number
CN115601293A
CN115601293A (application CN202210816299.2A)
Authority
CN
China
Prior art keywords
image
feature
characteristic
sample
test
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210816299.2A
Other languages
Chinese (zh)
Inventor
邱瀚
陈晓炬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Xurui Software Technology Co ltd
Original Assignee
Nanjing Xurui Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Xurui Software Technology Co ltd filed Critical Nanjing Xurui Software Technology Co ltd
Priority to CN202210816299.2A
Publication of CN115601293A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the application discloses an object detection method, an object detection device, electronic equipment, and a readable storage medium, wherein the method comprises the following steps: performing feature extraction on an acquired image to be detected to obtain a first feature image, wherein the image to be detected comprises an object; inputting the first feature image into a first detection model, and reconstructing the first feature image to obtain a second feature image; determining a feature difference value of the first feature image and the second feature image; and, when the feature difference value is smaller than a preset threshold, determining that the object is a defect-free object, where the preset threshold represents the upper limit of the feature difference value corresponding to a defect-free object. The embodiment of the application can accurately detect whether the object in the image to be detected is defective.

Description

Object detection method and device, electronic equipment and readable storage medium
Technical Field
The present application belongs to the field of information processing technologies, and in particular, to an object detection method, an object detection device, an electronic apparatus, and a readable storage medium.
Background
At present, with the continuous development of artificial intelligence, neural network models are widely applied to product inspection. For example, products on a production line are inspected so that defective products can be identified automatically, reducing manual labour.
Since products are of many kinds, it is difficult to collect sample images of every kind of product. Moreover, a product may contain unknown defects, for which no sample images can be collected at all. Owing to this lack of training samples, the accuracy of object detection in the image to be detected is currently low.
Disclosure of Invention
The embodiment of the application provides an object detection method, an object detection device, an electronic device, and a readable storage medium, which can alleviate the problem that the accuracy of object detection is low.
In a first aspect, an embodiment of the present application provides an object detection method, where the method includes:
performing feature extraction on the acquired image to be detected to obtain a first feature image, wherein the image to be detected comprises an object;
inputting the first feature image into a first detection model, and reconstructing the first feature image to obtain a second feature image;
determining a feature difference value of the first feature image and the second feature image;
and, when the feature difference value is smaller than a preset threshold, determining that the object is a defect-free object, where the preset threshold represents the upper limit of the feature difference value corresponding to a defect-free object.
In a second aspect, an embodiment of the present application provides an object detection apparatus, including:
an extraction module, configured to perform feature extraction on the acquired image to be detected to obtain a first feature image, wherein the image to be detected comprises an object;
an input module, configured to input the first feature image into the first detection model and reconstruct the first feature image to obtain a second feature image;
a determining module, configured to determine a feature difference value of the first feature image and the second feature image;
the determining module being further configured to determine that the object is a defect-free object when the feature difference value is smaller than a preset threshold, where the preset threshold represents the upper limit of the feature difference value corresponding to a defect-free object.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory storing computer program instructions; the processor, when executing the computer program instructions, performs the method as in the first aspect or any possible implementation of the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which computer program instructions are stored, and when executed by a processor, the computer program instructions implement the method as in the first aspect or any possible implementation manner of the first aspect.
In the embodiment of the application, a first feature image is obtained by performing feature extraction on the acquired image to be detected, which includes an object; the first feature image is input into a first detection model and reconstructed to obtain a second feature image; and the feature difference value of the first feature image and the second feature image is determined. Because an image containing a defective object and an image containing a defect-free object differ in details and features, the two correspond to different feature difference values, so the feature difference value can be used to judge whether the object in the image to be detected is defect-free. When the feature difference value is smaller than the preset threshold, the object is determined to be a defect-free object. This method can effectively detect objects of unknown types and can quickly and accurately determine whether the object in the image to be detected is defective, thereby improving detection efficiency.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments are briefly described below; other drawings can be derived from them by those skilled in the art without creative effort.
Fig. 1 is a schematic diagram of an object detection method according to an embodiment of the present disclosure;
fig. 2 is a flowchart of an object detection method provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of an object detection apparatus according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Features of various aspects and exemplary embodiments of the present application are described in detail below. To make the objects, technical solutions, and advantages of the present application clearer, the application is further described with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit it. It will be apparent to one skilled in the art that the present application may be practiced without some of these specific details. The following description of the embodiments is merely intended to provide a better understanding of the present application by illustrating examples thereof.
It is noted that, herein, relational terms such as first and second may be used solely to distinguish one entity or action from another without necessarily requiring or implying any actual such relationship or order between them. Also, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to it. Without further limitation, an element preceded by "comprises a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
First, technical terms related to embodiments of the present application will be described.
Multi-target detection network (Faster R-CNN), a common target detection model. It is the third-generation model of the region-based convolutional neural network (R-CNN) family and integrates the four basic steps of target detection (candidate region generation, feature extraction, classification, and position refinement) into one deep network framework.
The multi-target detection network can be divided into 4 main parts: a CNN backbone network (e.g., VGG, ResNet, DenseNet), a region proposal network (RPN), a region-of-interest pooling layer (RoI Pooling), and a classification network.
Faster R-CNN uses the RPN, which takes image features as input and outputs a set of object proposals, each with an objectness score.
As a CNN-based target detection method, Faster R-CNN first extracts a feature image from the input image; this feature image is used by the subsequent RPN layer and fully connected layers. Specifically, the network layers of a VGG model (Visual Geometry Group network) may be used.
The RPN is mainly used to generate object proposals. It first generates anchor boxes; after cropping and filtering, it judges whether each anchor belongs to the foreground or the background, i.e., whether it is an object, realizing a binary classification. Meanwhile, another branch refines the anchor boxes, forming more accurate proposals.
The region-of-interest pooling network uses the proposals generated by the RPN and the feature image obtained from the last layer of VGG16 to produce fixed-size proposal feature images, on which fully connected operations then perform target recognition and localization.
The classification network applies fully connected operations to the fixed-size feature images produced by the RoI Pooling layer, performs category classification with Softmax, and obtains the precise position of the object through regression.
Convolutional neural network (CNN), a class of feed-forward neural networks with convolution computation and a deep structure, and one of the representative algorithms of deep learning. On the one hand, receptive fields and weight sharing reduce the number of parameters the network needs to train; on the other hand, layer-by-layer convolution yields mappings of image features at different levels. The output of each convolutional layer is called a feature map; deeper convolutional layers typically contain richer spatial and semantic information.
An auto-encoder (AE), also referred to herein as a self-encoder, is an artificial neural network that learns an efficient representation of input data through unsupervised learning. This efficient representation, called the encoding, typically has a much smaller dimension than the input data, so the self-encoder can be used for dimensionality reduction. More importantly, the self-encoder can serve as a powerful feature detector and be applied to the pre-training of deep neural networks.
The object detection method provided by the embodiment of the present application can be applied to at least the following application scenarios, which are described below.
With the continuous development of artificial intelligence, objects can be detected by a detection model to identify defective items on a production line. Take syringes as an example: because syringe defects on a production line are of many kinds, it is difficult to collect a sample image of each kind, and syringes may also have unknown defects for which sample images cannot be collected at all. Owing to insufficient training samples, the accuracy and efficiency of object detection are currently low.
Based on the application scenario, the object detection method provided in the embodiment of the present application is described in detail below.
First, the object detection method provided in the embodiments of the present application is explained as a whole.
Fig. 1 is a schematic diagram of an object detection method provided in an embodiment of the present application, and as shown in fig. 1, first, feature extraction is performed on an acquired image to be detected 110 to obtain a first feature image 120, where the image to be detected includes an object.
Then, the first feature image 120 is input into the second detection model 130, which detects the object category information 140 of the image to be detected. When the object category information matches preset category information, the object category information of the image to be detected is output. Here, a match indicates that the object in the image to be detected belongs to a known defect category, so its category information can be output directly.
Next, when the object category information does not match the preset category information, the first feature image 120 is input into the first detection model 150 and reconstructed to obtain a second feature image 160. Here, a mismatch indicates that the object in the image to be detected belongs to an unknown defect category and needs to be examined by the first detection model. A feature difference 170 between the first feature image and the second feature image is determined. Finally, when the feature difference is smaller than a preset threshold, the object is determined to be a defect-free object; the preset threshold represents the upper limit of the feature difference corresponding to a defect-free object.
Therefore, the first detection model and the second detection model can detect unknown and known defect types respectively, ensuring the comprehensiveness and accuracy of object detection.
The object detection method provided by the embodiment of the present application is described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.
Fig. 2 is a flowchart of an object detection method according to an embodiment of the present disclosure.
As shown in fig. 2, the object detection method may include steps 210 to 240, and the method is applied to an object detection apparatus, and is specifically as follows:
step 210, performing feature extraction on the obtained image to be detected to obtain a first feature image, wherein the image to be detected comprises an object.
Step 220, inputting the first feature image into the first detection model, and reconstructing the first feature image to obtain a second feature image.
In step 230, a feature difference value between the first feature image and the second feature image is determined.
And 240, determining that the object is a defect-free object when the feature difference value is smaller than a preset threshold, where the preset threshold represents the upper limit of the feature difference value corresponding to a defect-free object.
In the embodiment of the application, a first feature image is obtained by performing feature extraction on the acquired image to be detected, which includes an object; the first feature image is input into a first detection model and reconstructed to obtain a second feature image; and the feature difference value of the first feature image and the second feature image is determined. Because an image containing a defective object and an image containing a defect-free object differ in details and features, the two correspond to different feature difference values, so the feature difference value can be used to judge whether the object in the image to be detected is defect-free. When the feature difference value is smaller than the preset threshold, the object is determined to be a defect-free object. This method can effectively detect objects of unknown types and can quickly and accurately determine whether the object in the image to be detected is defective, thereby improving detection efficiency.
The following describes the contents of steps 210 to 240, respectively:
step 210 is involved.
Feature extraction is performed on the acquired image to be detected to obtain the first feature image, specifically as follows:
The image to be detected is input into the feature extraction network DenseNet to obtain the first feature image.
In a conventional convolutional neural network with L layers, there are L connections; in DenseNet there are L(L + 1)/2 connections, i.e., the input of each layer comes from the outputs of all previous layers.
This application may specifically use DenseNet-161 as the feature extraction network to perform feature extraction on the acquired image to be detected and obtain the first feature image; the network can fully learn the features of images with different backgrounds so as to accurately extract feature information from different images. DenseNet establishes dense connections from all previous layers to each subsequent layer, and its name derives from this.
DenseNet enables feature reuse by concatenating features along the channel dimension. This property lets DenseNet achieve better performance with fewer parameters and lower computational cost, which is why DenseNet is chosen here to perform feature extraction on the acquired image to be detected and obtain the first feature image.
The DenseNet network uses a structure of dense blocks (DenseBlock) and transition layers. A DenseBlock is a module comprising several layers whose feature maps all have the same size, densely connected to one another. A Transition module connects two adjacent DenseBlocks and reduces the feature-map size through pooling. DenseNet comprises 4 DenseBlocks connected together by Transitions.
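The dense connectivity described above can be sketched in PyTorch. The channel counts, growth rate, and layer count below are illustrative assumptions, not the DenseNet-161 configuration:

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Toy dense block: each layer receives the concatenation of the
    outputs of all preceding layers, giving L(L + 1)/2 connections."""
    def __init__(self, in_channels: int, growth: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList()
        for i in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(in_channels + i * growth),
                nn.ReLU(inplace=True),
                nn.Conv2d(in_channels + i * growth, growth,
                          kernel_size=3, padding=1, bias=False),
            ))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = [x]
        for layer in self.layers:
            # Dense connectivity: the input is everything produced so far,
            # concatenated along the channel dimension.
            out = layer(torch.cat(features, dim=1))
            features.append(out)
        return torch.cat(features, dim=1)

block = DenseBlock(in_channels=8, growth=4, num_layers=3)
y = block(torch.randn(1, 8, 16, 16))  # output channels: 8 + 3 * 4 = 20
```

Feature-map size is preserved inside the block; in DenseNet proper, pooling in the Transition modules between blocks reduces it.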
In a possible embodiment, before step 210, the method further includes:
identifying the acquired original image and determining an image area corresponding to the object;
and according to the image area corresponding to the object, cutting the original image to obtain at least one image to be detected.
In an actual scene, the captured original image may contain multiple objects, so the acquired original image needs to be recognized to determine the image area corresponding to each object.
The image area corresponding to at least one object can be determined from the original image; the original image is then processed and cropped to obtain at least one image to be detected, each of which can contain the image area corresponding to the relevant object.
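A minimal sketch of this cropping step, assuming the recognition stage yields axis-aligned bounding boxes in (x1, y1, x2, y2) form; the function name and box format are illustrative assumptions:

```python
import numpy as np

def crop_objects(original, boxes):
    """Crop each detected object region (x1, y1, x2, y2) out of the
    original image, yielding one image-to-be-detected per object."""
    return [original[y1:y2, x1:x2] for (x1, y1, x2, y2) in boxes]

# Example: a 100x200 RGB original image with two detected object regions.
img = np.zeros((100, 200, 3), dtype=np.uint8)
crops = crop_objects(img, [(10, 20, 60, 80), (100, 0, 180, 50)])
# crops[0] is 60x50 pixels, crops[1] is 50x80 pixels
```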
Step 220 is involved.
In a possible embodiment, before step 220, the method further includes:
performing feature extraction on the obtained multiple first sample images to obtain multiple first sample feature images, wherein each first sample image comprises a first sample object;
inputting the first sample feature image to an auto-encoder, the auto-encoder comprising an encoder and a decoder;
reconstructing the first sample feature image through the encoder to obtain a second sample feature image;
restoring the second sample feature image through the decoder to obtain a third sample feature image;
calculating a reconstruction loss value according to the first sample feature image and the third sample feature image;
and training the self-encoder according to the reconstruction loss value until the self-encoder meets a first preset training condition, so as to obtain the first detection model.
The self-encoder is an unsupervised learning model. In essence, it uses a neural network to produce a low-dimensional representation of a high-dimensional input. The self-encoder is similar to principal component analysis but, by using a non-linear activation function, overcomes the linearity limitation of principal component analysis.
The self-encoder comprises two main parts: an encoder and a decoder. The role of the encoder is to find a compressed representation of the given data, and the decoder reconstructs the original input. During training, the decoder forces the encoder to select the most informative features, which are ultimately stored in the compressed representation.
The main role of the self-encoder in the present application is to perform compression reconstruction on the first feature image.
During training, the whole self-encoder network is trained with the first sample feature images; the training objective is to minimize the feature reconstruction loss value.
The first sample feature image is input into the self-encoder and reconstructed by the encoder to obtain a second sample feature image.
Then the second sample feature image is restored by the decoder to obtain a third sample feature image; the goal of restoration is to make the third sample feature image as close to the first sample feature image as possible. When self-encoder training is complete, the third sample feature image should closely approximate the first sample feature image.
A reconstruction loss value is calculated from the first sample feature image and the third sample feature image; the self-encoder is trained according to the reconstruction loss value until it meets a first preset training condition, at which point training stops and the first detection model is obtained.
At inference time, only the encoder part is used: it compresses the first feature image, retaining the most informative features and discarding the less informative ones, to obtain the second feature image.
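A minimal PyTorch sketch of the self-encoder training and inference flow described above; the channel counts, bottleneck size, optimizer, and number of steps are illustrative assumptions rather than the patent's configuration:

```python
import torch
import torch.nn as nn

class FeatureAutoencoder(nn.Module):
    """Minimal convolutional self-encoder over feature images: the
    encoder compresses, the decoder restores, and training minimizes
    the reconstruction loss between input and restored output."""
    def __init__(self, channels: int = 32, bottleneck: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(channels, bottleneck, 3, stride=2, padding=1),
            nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(bottleneck, channels, 4, stride=2, padding=1),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

model = FeatureAutoencoder()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

sample = torch.randn(4, 32, 16, 16)   # stand-in for first sample feature images
for _ in range(5):                    # a few training steps for illustration
    recon = model(sample)             # restored (third) sample feature image
    loss = loss_fn(recon, sample)     # reconstruction loss value
    opt.zero_grad(); loss.backward(); opt.step()

# At inference, the encoder alone compresses a first feature image,
# keeping only the most informative features.
second_feature = model.encoder(sample)  # shape (4, 8, 8, 8)
```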
In one possible embodiment, step 220 includes:
inputting the first feature image into a second detection model, and detecting the object category information of the image to be detected;
and, when the object category information does not match preset category information, inputting the first feature image into the first detection model and reconstructing it to obtain the second feature image.
If the object category information matches the preset category information, the object in the image to be detected belongs to a known defect category. Therefore, when the object category information matches the preset category information, the object category information of the image to be detected can be output directly.
When the object category information does not match the preset category information, the first feature image is input into the first detection model and reconstructed to obtain the second feature image. Here, a mismatch indicates that the object in the image to be detected belongs to an unknown defect category and needs to be detected by the first detection model.
The first detection model may be a fully convolutional self-encoder model, in which the fully convolutional self-encoder reconstructs the entire image, i.e., the input is three-dimensional data.
In a possible embodiment, the step of inputting the first feature image into the second detection model and detecting the object class information of the image to be detected includes:
extracting at least one candidate region from the first feature image;
and under the condition that the candidate area comprises the object, carrying out defect detection to obtain object type information.
The second detection model includes an RPN, which can perform region extraction on the first feature image with a 3×3 sliding window to obtain at least one candidate region corresponding to the original image. Each point on the first feature image can be mapped to a region (i.e., a receptive field) of the image to be detected.
For the image region corresponding to each point of the first feature image, transformations of different shapes and sizes yield boxes of different shapes and sizes, i.e., anchor boxes. The sizes and scales of these anchor boxes are preset. Each point on the first feature image is the center of its anchor boxes in the image to be detected.
To widen coverage and detect more objects while keeping detection efficient, the second detection model may use anchor boxes at 3 different sizes and 3 different aspect ratios, so that 9 anchor boxes are obtained in the image to be detected for each point of the first feature image. Anchor boxes extending beyond the image boundary are removed; all remaining anchors are sent to a classification layer to judge whether each is foreground or background, and simultaneously to a regression layer that adjusts each anchor toward the label value.
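The anchor generation and boundary filtering described above can be sketched as follows; the concrete scales and aspect ratios are assumed for illustration, since the text only fixes their counts (3 each):

```python
import numpy as np

def make_anchors(cx, cy, scales=(32, 64, 128), ratios=(0.5, 1.0, 2.0)):
    """3 scales x 3 aspect ratios -> 9 anchor boxes centred on (cx, cy),
    returned as (x1, y1, x2, y2) rows with roughly constant area per scale."""
    anchors = []
    for s in scales:
        for r in ratios:
            w = s * np.sqrt(r)
            h = s / np.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(anchors)

def filter_to_image(anchors, width, height):
    """Discard anchors that extend beyond the image boundary."""
    keep = (anchors[:, 0] >= 0) & (anchors[:, 1] >= 0) & \
           (anchors[:, 2] <= width) & (anchors[:, 3] <= height)
    return anchors[keep]

a = make_anchors(100, 100)          # 9 anchors for one feature-map point
kept = filter_to_image(a, 200, 200) # all survive in a 200x200 image
```

The surviving anchors would then go to the classification (foreground/background) and regression branches.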
Before the step of inputting the first feature image into the second detection model and detecting the object category information of the image to be detected, the method comprises the following steps:
obtaining a plurality of sample data, wherein each sample data comprises a corresponding second sample image and preset object type information, and the second sample image comprises a second sample object;
inputting the second sample image into the multi-target detection model to obtain detected object category information;
and training the multi-target detection model according to the preset object type information and the detection object type information until the multi-target detection model meets a second preset training condition to obtain a second detection model.
Each sample data comprises a corresponding second sample image and preset object category information, i.e., the second sample object in the second sample image corresponds to a known defect category. The second sample image is input into the multi-target detection model, which predicts the defect category to obtain the detected object category information. A loss value is calculated from the preset object category information and the detected object category information; the multi-target detection model is trained according to this loss value until it meets a second preset training condition, yielding the second detection model.
Referring now to step 230.
And determining the feature difference value of the first feature image and the second feature image.
The first characteristic image and the second characteristic image may each take the form of a multi-dimensional feature matrix. Subtracting the second characteristic image from the first characteristic image yields a multi-dimensional difference matrix, which is then converted into a one-dimensional feature difference value.
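As a concrete illustration of this reduction (the mean-squared reduction is an assumed choice; the application does not fix the exact conversion from difference matrix to scalar):

```python
import numpy as np

def feature_difference(first_feature, second_feature):
    """Subtract two multi-dimensional feature matrices and convert the
    multi-dimensional difference matrix into a single scalar value."""
    diff = (np.asarray(first_feature, dtype=np.float64)
            - np.asarray(second_feature, dtype=np.float64))
    return float(np.mean(diff ** 2))  # one value summarizing the difference
```

Identical feature images give a difference of zero; the worse the reconstruction, the larger the value.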
Referring now to step 240.
In a possible embodiment, before step 240, the method further includes:
acquiring a plurality of test images, each test image including a defect-free test object;
performing feature extraction on the plurality of test images to obtain a plurality of first test feature images;
inputting the plurality of first test characteristic images into a first detection model, and performing reconstruction processing on the first test characteristic images to obtain a plurality of second test characteristic images;
determining a test feature difference value of the first test feature image and the second test feature image for each test image;
and determining a preset threshold according to the plurality of test characteristic difference values.
Feature extraction is performed on the plurality of test images to obtain a plurality of first test feature images, and the extracted first test feature images are reconstructed by the first detection model to obtain a plurality of second test feature images; that is, the input of the first detection model is a first test feature image and its output is the corresponding second test feature image. Subtracting the output of the first detection model from its input then yields a test feature difference value.
A preset threshold is determined from the test feature difference values corresponding to the test images. The preset threshold represents the upper limit of the feature difference value corresponding to a non-defective object, and is used to judge whether the object in the image to be detected is a defective object.
In this way, detection and interception of possible defects of unknown type can be performed using only test images that include defect-free test objects, improving detection efficiency and accuracy.
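The calibration loop described above can be sketched as follows; `extract` and `model` are hypothetical stand-ins for the feature extractor and the trained first detection model, and the mean-squared difference is an assumed reduction:

```python
import numpy as np

def collect_test_differences(extract, model, test_images):
    """For each defect-free test image: extract the first test feature image,
    reconstruct it into the second test feature image, and record the
    difference between the model's input and output."""
    diffs = []
    for img in test_images:
        f1 = extract(img)   # first test feature image
        f2 = model(f1)      # second test feature image (reconstruction)
        diffs.append(float(np.mean((f1 - f2) ** 2)))
    return diffs
```

Because every test object is defect-free, these differences characterize the normal reconstruction error, and the preset threshold is then chosen as an upper bound on them.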
Wherein, the step of determining the preset threshold value according to the plurality of test feature difference values comprises:
generating a reconstruction error distribution map according to the plurality of test feature difference values;
and extracting a preset threshold value from the reconstruction error distribution map.
The reconstruction error distribution map may be a scatter diagram. Extracting the preset threshold from the reconstruction error distribution map may specifically comprise determining which test feature difference value occurs most frequently and taking that value as the preset threshold; alternatively, the median of the test feature difference values, or their average, may be taken as the preset threshold.
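The three extraction rules just mentioned (most-frequent value, median, mean) might be implemented along these lines; binning the float-valued differences so that "most frequent" is well defined is an added assumption:

```python
import numpy as np

def threshold_from_distribution(test_diffs, method="mode"):
    """Extract a preset threshold from the reconstruction-error distribution:
    the most frequent (binned) difference, the median, or the mean."""
    diffs = np.asarray(test_diffs, dtype=np.float64)
    if method == "mode":
        # histogram the differences and take the centre of the fullest bin
        hist, edges = np.histogram(diffs, bins=10)
        k = int(np.argmax(hist))
        return float((edges[k] + edges[k + 1]) / 2)
    if method == "median":
        return float(np.median(diffs))
    return float(np.mean(diffs))
```

Which rule is appropriate depends on how tolerant the application should be: the maximum-frequency and median rules are robust to a few outlying test errors, while the mean is pulled toward them.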
When the characteristic difference value is smaller than the preset threshold, the object is determined to be a defect-free object, the preset threshold representing the upper limit of the characteristic difference value corresponding to a defect-free object. Conversely, when the characteristic difference value is not smaller than the preset threshold, the object is determined to be a defective object.
In the embodiment of the application, a first characteristic image is obtained by extracting features from the acquired image to be detected, which includes an object; the first characteristic image is input to the first detection model and reconstructed to obtain a second characteristic image; and the feature difference value of the first and second characteristic images is determined. Because an image containing a defective object differs in details and features from an image containing a non-defective object, the two correspond to different feature difference values, so whether the object in the image to be detected is defect-free can be judged from the feature difference value: when the feature difference value is smaller than the preset threshold, the object is determined to be a non-defective object. The method can thus effectively detect objects of unknown type and quickly and accurately determine whether the object in the image to be detected is defective, improving detection efficiency.
Based on the object detection method shown in fig. 2, an embodiment of the present application further provides an object detection apparatus, as shown in fig. 3, the apparatus 300 may include:
the extracting module 310 is configured to perform feature extraction on the acquired image to be detected to obtain a first feature image, where the image to be detected includes an object.
The input module 320 is configured to input the first feature image to the first detection model, and perform reconstruction processing on the first feature image to obtain a second feature image.
A determining module 330 is configured to determine a feature difference value between the first feature image and the second feature image.
The determining module 330 is further configured to determine that the object is a non-defective object when the feature difference is smaller than a preset threshold, where the preset threshold is used to represent an upper limit value of the feature difference corresponding to the non-defective object.
In a possible embodiment, the apparatus 300 may further include:
the device comprises a first acquisition module, a second acquisition module and a third acquisition module, wherein the first acquisition module is used for acquiring a plurality of test images, and each test image comprises a defect-free test object.
The first extraction module is used for extracting the features of the plurality of test images to obtain a plurality of first test feature images.
And the first input module is used for inputting the plurality of first test characteristic images into the first detection model, and reconstructing the first test characteristic images to obtain a plurality of second test characteristic images.
And the first determining module is used for determining the test characteristic difference value of the first test characteristic image and the second test characteristic image for each test image.
And the second determining module is used for determining a preset threshold according to the plurality of test characteristic difference values.
In a possible embodiment, the second determining module is specifically configured to: generating a reconstruction error distribution map according to the plurality of test feature difference values;
and extracting a preset threshold value from the reconstruction error distribution map.
In a possible embodiment, the apparatus 300 may further include:
the second extraction module is used for performing feature extraction on the obtained multiple first sample images to obtain multiple first sample feature images, and each first sample image comprises a first sample object.
And the second input module is used for inputting the first sample characteristic image to the self-encoder, and the self-encoder comprises an encoder and a decoder.
And the reconstruction module is used for reconstructing the first sample characteristic image through the encoder to obtain a second sample characteristic image.
And the restoring module is used for restoring the second sample characteristic image through a decoder to obtain a third sample characteristic image.
And the calculation module is used for calculating a reconstruction loss value according to the first sample characteristic image and the third sample characteristic image.
And the training module is used for training the self-encoder according to the reconstruction loss value until the self-encoder meets a first preset training condition to obtain a first detection model.
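The encode → decode → reconstruction-loss → train cycle performed by these modules can be illustrated with a toy linear auto-encoder; the dimensions, learning rate, and plain gradient-descent training are illustrative assumptions, since the application does not fix the network architecture:

```python
import numpy as np

def train_linear_autoencoder(X, code=3, lr=0.05, steps=200, seed=0):
    """Encode each first sample feature vector into a lower-dimensional
    second sample feature, decode it back into a third sample feature,
    and train on the reconstruction loss between the first and third."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1]
    W_e = rng.normal(0, 0.1, (code, dim))  # encoder weights
    W_d = rng.normal(0, 0.1, (dim, code))  # decoder weights
    losses = []
    for _ in range(steps):
        Z = X @ W_e.T                      # "second sample feature image"
        X_hat = Z @ W_d.T                  # "third sample feature image"
        err = X_hat - X
        losses.append(float(np.mean(err ** 2)))  # reconstruction loss value
        # gradient descent on the mean-squared reconstruction loss
        grad_d = 2 * err.T @ Z / X.size
        grad_e = 2 * (err @ W_d).T @ X / X.size
        W_d -= lr * grad_d
        W_e -= lr * grad_e
    return losses
```

A real implementation would use a deep (typically convolutional) auto-encoder, but the stopping criterion — "until the self-encoder meets a first preset training condition" — plays the same role as the fixed step count here.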
In a possible embodiment, the apparatus 300 may further include:
and the identification module is used for identifying the acquired original image and determining the image area corresponding to the object.
And the cutting module is used for cutting the original image according to the image area corresponding to the object to obtain at least one image to be detected.
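The identify-then-cut flow of these two modules can be sketched as follows, assuming the identified object regions are given as `(x1, y1, x2, y2)` boxes in pixel coordinates (an assumed representation):

```python
import numpy as np  # used here only to represent images as arrays

def cut_images_to_detect(original, object_regions):
    """Cut the original image (an H x W or H x W x C array) into one
    image-to-be-detected per identified object region."""
    return [original[y1:y2, x1:x2] for (x1, y1, x2, y2) in object_regions]
```

Each cropped region then goes through feature extraction independently, so one original image may yield several images to be detected.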
In a possible embodiment, the input module 320 is specifically configured to:
inputting the first characteristic image into a second detection model, and detecting object category information of the image to be detected;
and under the condition that the object category information is not matched with the preset category information, inputting the first characteristic image into the first detection model, and reconstructing the first characteristic image to obtain a second characteristic image.
In a possible embodiment, the apparatus 300 may further include:
the second acquisition module is used for acquiring a plurality of sample data, each sample data comprises a corresponding second sample image and preset object type information, and the second sample image comprises a second sample object.
And the third input module is used for inputting the second sample image to the multi-target detection model to obtain the class information of the detection object.
And the second training module is used for training the multi-target detection model according to the preset object type information and the detection object type information until the multi-target detection model meets a second preset training condition, so as to obtain a second detection model.
In the embodiment of the application, a first characteristic image is obtained by extracting features from the acquired image to be detected, which includes an object; the first characteristic image is input to the first detection model and reconstructed to obtain a second characteristic image; and the feature difference value of the first and second characteristic images is determined. Because an image containing a defective object differs in details and features from an image containing a non-defective object, the two correspond to different feature difference values, so whether the object in the image to be detected is defect-free can be judged from the feature difference value: when the feature difference value is smaller than the preset threshold, the object is determined to be a non-defective object. The apparatus can thus effectively detect objects of unknown type and quickly and accurately determine whether the object in the image to be detected is defective, improving detection efficiency.
Fig. 4 shows a schematic structural diagram of an electronic device provided in an embodiment of the present application.
The electronic device may comprise a processor 601 and a memory 602 in which computer program instructions are stored.
Specifically, the processor 601 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
Memory 602 may include mass storage for data or instructions. By way of example, and not limitation, memory 602 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 602 may include removable or non-removable (or fixed) media, where appropriate. The memory 602 may be internal or external to the electronic device, where appropriate. In a particular embodiment, the memory 602 is a non-volatile solid-state memory. In a particular embodiment, the memory 602 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these.
The processor 601 implements any one of the object detection methods in the illustrated embodiments by reading and executing computer program instructions stored in the memory 602.
In one example, the electronic device may also include a communication interface 603 and a bus 610. As shown in fig. 4, the processor 601, the memory 602, and the communication interface 603 are connected via a bus 610 to complete communication therebetween.
The communication interface 603 is mainly used for implementing communication between modules, apparatuses, units and/or devices in the embodiments of the present application.
The bus 610 includes hardware, software, or both coupling the components of the electronic device to one another. By way of example, and not limitation, the bus may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a low pin count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), another suitable bus, or a combination of two or more of these. Bus 610 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The electronic device may perform the object detection method in the embodiment of the present application, thereby implementing the method described in conjunction with fig. 1 to 2.
In addition, in combination with the methods in the foregoing embodiments, the embodiments of the present application may be implemented by providing a computer-readable storage medium. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement the object detection method of fig. 1-2.
It is to be understood that the present application is not limited to the particular arrangements and instrumentality described above and shown in the attached drawings. A detailed description of known methods is omitted herein for the sake of brevity. In the above embodiments, several specific steps are described and shown as examples. However, the method processes of the present application are not limited to the specific steps described and illustrated, and those skilled in the art can make various changes, modifications, and additions or change the order between the steps after comprehending the spirit of the present application.
The functional blocks shown in the above-described structural block diagrams may be implemented as hardware, software, firmware, or a combination thereof. When implemented in hardware, it may be, for example, an electronic circuit, an Application Specific Integrated Circuit (ASIC), suitable firmware, plug-in, function card, or the like. When implemented in software, the elements of the present application are the programs or code segments used to perform the required tasks. The program or code segments may be stored in a machine-readable medium or transmitted by a data signal carried in a carrier wave over a transmission medium or a communication link. A "machine-readable medium" may include any medium that can store or transfer information. Examples of a machine-readable medium include an electronic circuit, a semiconductor memory device, a ROM, a flash memory, an Erasable ROM (EROM), a floppy disk, a CD-ROM, an optical disk, a hard disk, an optical fiber medium, a Radio Frequency (RF) link, and so forth. The code segments may be downloaded via computer networks such as the internet, intranet, etc.
It should also be noted that the exemplary embodiments mentioned in this application describe some methods or systems based on a series of steps or devices. However, the present application is not limited to the order of the above steps, that is, the steps may be performed in the order mentioned in the embodiments, may be performed in an order different from the order in the embodiments, or may be performed at the same time.
As described above, only the specific embodiments of the present application are provided, and it can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the system, the module and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art can easily conceive various equivalent modifications or substitutions within the technical scope of the present application, and these modifications or substitutions should be covered within the scope of the present application.

Claims (10)

1. An object detection method, characterized in that the method comprises:
performing feature extraction on an obtained image to be detected to obtain a first feature image, wherein the image to be detected comprises an object;
inputting the first characteristic image into a first detection model, and performing reconstruction processing on the first characteristic image to obtain a second characteristic image;
determining a feature difference value of the first feature image and the second feature image;
and determining that the object is a defect-free object under the condition that the characteristic difference value is smaller than a preset threshold value, wherein the preset threshold value is used for representing the upper limit value of the characteristic difference value corresponding to the defect-free object.
2. The method according to claim 1, wherein before determining that the object is a non-defective object if the characteristic difference value is smaller than a preset threshold value, the method further comprises:
acquiring a plurality of test images, each of the test images including the defect-free test object;
performing feature extraction on the plurality of test images to obtain a plurality of first test feature images;
inputting the plurality of first test characteristic images into the first detection model, and performing reconstruction processing on the first test characteristic images to obtain a plurality of second test characteristic images;
for each test image, determining a test feature difference value of the first test feature image and the second test feature image;
and determining the preset threshold value according to the plurality of test characteristic difference values.
3. The method of claim 2, wherein determining the preset threshold value from the plurality of test feature difference values comprises:
generating a reconstruction error distribution map according to the plurality of test feature difference values;
and extracting the preset threshold value from the reconstruction error distribution map.
4. The method according to claim 1, wherein before the first feature image is input to a first detection model and the first feature image is subjected to reconstruction processing to obtain a second feature image, the method further comprises:
performing feature extraction on the obtained multiple first sample images to obtain multiple first sample feature images, wherein each first sample image comprises a first sample object;
inputting the first sample feature image to an auto-encoder, the auto-encoder comprising an encoder and a decoder;
reconstructing the first sample characteristic image through the encoder to obtain a second sample characteristic image;
restoring the second sample characteristic image through the decoder to obtain a third sample characteristic image;
calculating a reconstruction loss value according to the first sample characteristic image and the third sample characteristic image;
and training the self-encoder according to the reconstruction loss value until the self-encoder meets a first preset training condition to obtain the first detection model.
5. The method according to claim 1, wherein before the performing feature extraction on the acquired image to be detected to obtain the first feature image, the method further comprises:
identifying the acquired original image and determining an image area corresponding to the object;
and according to the image area corresponding to the object, cutting the original image to obtain at least one image to be detected.
6. The method according to claim 1, wherein the inputting the first feature image into a first detection model, and performing reconstruction processing on the first feature image to obtain a second feature image comprises:
inputting the first characteristic image into a second detection model, and detecting object category information of the image to be detected;
and under the condition that the object category information is not matched with preset category information, inputting the first characteristic image into the first detection model, and reconstructing the first characteristic image to obtain a second characteristic image.
7. The method according to claim 6, wherein before the inputting the first feature image into a second detection model and detecting the object class information of the image to be detected, the method further comprises:
obtaining a plurality of sample data, wherein each sample data comprises a corresponding second sample image and preset object type information, and the second sample image comprises a second sample object;
inputting the second sample image into a multi-target detection model to obtain the class information of the detected object;
and training the multi-target detection model according to the preset object type information and the detection object type information until the multi-target detection model meets a second preset training condition to obtain a second detection model.
8. An object detection apparatus, characterized in that the apparatus comprises:
the extraction module is used for extracting the characteristics of the acquired image to be detected to obtain a first characteristic image, wherein the image to be detected comprises an object;
the input module is used for inputting the first characteristic image to a first detection model and carrying out reconstruction processing on the first characteristic image to obtain a second characteristic image;
a determining module, configured to determine a feature difference value of the first feature image and the second feature image;
the determining module is further configured to determine that the object is a non-defective object when the feature difference is smaller than a preset threshold, where the preset threshold is used to represent an upper limit of a feature difference corresponding to the non-defective object.
9. An electronic device, characterized in that the device comprises: a processor and a memory storing computer program instructions; the processor, when executing the computer program instructions, implements an object detection method as claimed in any of claims 1-7.
10. A readable storage medium, characterized in that the computer readable storage medium has stored thereon computer program instructions which, when executed by a processor, implement the object detection method of any one of claims 1-7.
CN202210816299.2A 2022-07-12 2022-07-12 Object detection method and device, electronic equipment and readable storage medium Pending CN115601293A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210816299.2A CN115601293A (en) 2022-07-12 2022-07-12 Object detection method and device, electronic equipment and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210816299.2A CN115601293A (en) 2022-07-12 2022-07-12 Object detection method and device, electronic equipment and readable storage medium

Publications (1)

Publication Number Publication Date
CN115601293A true CN115601293A (en) 2023-01-13

Family

ID=84843590

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210816299.2A Pending CN115601293A (en) 2022-07-12 2022-07-12 Object detection method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN115601293A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117291914A (en) * 2023-11-24 2023-12-26 南昌江铃华翔汽车零部件有限公司 Automobile part defect detection method, system, computer and storage medium
CN117291914B (en) * 2023-11-24 2024-02-09 南昌江铃华翔汽车零部件有限公司 Automobile part defect detection method, system, computer and storage medium

Similar Documents

Publication Publication Date Title
CN111797890A (en) Method and system for detecting defects of power transmission line equipment
US11715190B2 (en) Inspection system, image discrimination system, discrimination system, discriminator generation system, and learning data generation device
US20210096530A1 (en) System and method for identifying manufacturing defects
CN107341508B (en) Fast food picture identification method and system
CN112766110A (en) Training method of object defect recognition model, object defect recognition method and device
CN114862838A (en) Unsupervised learning-based defect detection method and equipment
CN113781391A (en) Image defect detection method and related equipment
CN112862770A (en) Defect analysis and diagnosis system, method and device based on artificial intelligence
CN111695620A (en) Method and system for detecting and correcting abnormal data of time sequence of power system
CN112529109A (en) Unsupervised multi-model-based anomaly detection method and system
CN115601293A (en) Object detection method and device, electronic equipment and readable storage medium
CN115239672A (en) Defect detection method and device, equipment and storage medium
CN117455917B (en) Establishment of false alarm library of etched lead frame and false alarm on-line judging and screening method
CN117237683B (en) Chip defect intelligent detection system based on improved neural network
CN113516652A (en) Battery surface defect and adhesive detection method, device, medium and electronic equipment
CN112836724A (en) Object defect recognition model training method and device, electronic equipment and storage medium
CN114119562B (en) Brake disc outer surface defect detection method and system based on deep learning
CN112990350B (en) Target detection network training method and target detection network-based coal and gangue identification method
CN115345806A (en) Object detection method and device, electronic equipment and readable storage medium
CN115661042A (en) Hierarchical classification defect detection method based on attention mechanism guidance
CN114708247A (en) Cigarette case packaging defect identification method and device based on deep learning
CN117911796B (en) Intelligent data processing system and method based on image recognition
CN116644351B (en) Data processing method and system based on artificial intelligence
CN116894965A (en) Teacher data collection method and collection device
CN116206122A (en) Image recognition method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination