CN112561889A - Target detection method and device, electronic equipment and storage medium - Google Patents

Target detection method and device, electronic equipment and storage medium

Info

Publication number
CN112561889A
CN112561889A CN202011510808.6A
Authority
CN
China
Prior art keywords
picture
standard
candidate frame
frame
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011510808.6A
Other languages
Chinese (zh)
Inventor
吴晓东 (Wu Xiaodong)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Saiante Technology Service Co Ltd
Original Assignee
Shenzhen Saiante Technology Service Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Saiante Technology Service Co Ltd filed Critical Shenzhen Saiante Technology Service Co Ltd
Priority to CN202011510808.6A priority Critical patent/CN112561889A/en
Publication of CN112561889A publication Critical patent/CN112561889A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/187 Segmentation; Edge detection involving region growing; involving region merging; involving connected component labelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20112 Image segmentation details
    • G06T 2207/20132 Image cropping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V 2201/07 Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to the field of image detection, and discloses a target object detection method, which comprises the following steps: performing size standardization processing on a picture to be detected to obtain a standard picture, and extracting picture features from the standard picture through a feature extraction network to obtain a feature picture; performing target detection on the feature picture by using a region generation network to generate candidate frames, and pooling the candidate frames to a fixed size by using a region feature aggregation algorithm to obtain standard candidate frames; performing regression and classification on the standard candidate frames to obtain target object candidate frames; and performing coordinate mapping on the picture to be detected according to the target object candidate frames, and marking the target object detection result in the picture to be detected. The invention also relates to blockchain technology: the picture to be detected can be stored in a blockchain node. The invention further provides a target object detection device, an electronic device and a storage medium. The embodiment of the invention addresses the problem of inaccurate target detection results when the target object is blurred.

Description

Target detection method and device, electronic equipment and storage medium
Technical Field
The present invention relates to the field of image detection technologies, and in particular, to a method and an apparatus for detecting a target object, an electronic device, and a computer-readable storage medium.
Background
The emergency lane is a special lane reserved for vehicles handling emergencies, such as engineering rescue, medical rescue, and police performing urgent official duties. Traffic law stipulates that, except under special circumstances, a motor vehicle may not occupy the emergency lane while driving; violations are penalized. At present, emergency lane detection can adopt deep learning methods based on region convolutional neural networks. Although such deep learning methods achieve high accuracy in simple scenes, their accuracy drops considerably in scenes with haze, rain, night-time conditions, or blurred emergency lane markings.
Disclosure of Invention
The invention provides a target detection method, a target detection device, electronic equipment and a computer readable storage medium, and mainly aims to improve the accuracy of target detection.
In order to achieve the above object, the present invention provides a target detection method, including:
acquiring a picture to be detected, performing size standardization processing on the picture to be detected to obtain a standard picture, and extracting picture features from the standard picture through a pre-constructed feature extraction network to obtain a feature picture;
performing target detection on the feature picture by using a region generation network, generating a candidate frame according to a detection result, and pooling the candidate frame into a fixed size by using a region feature aggregation algorithm to obtain a standard candidate frame;
performing regression and classification on the standard candidate frame to obtain a target object candidate frame;
and performing coordinate mapping on the picture to be detected according to the target object candidate frame, and marking a target object detection result in the picture to be detected.
Optionally, the performing size standardization processing on the picture to be detected to obtain a standard picture includes:
judging whether the size of the picture to be detected is larger than the size of a standard picture input by a user;
when the size of the picture to be detected is larger than the size of the standard picture, performing cutting processing on the picture to be detected according to the size of the standard picture to obtain a standard picture;
and when the size of the picture to be detected is smaller than the size of the standard picture, performing filling processing on the picture to be detected according to the size of the standard picture to obtain the standard picture.
Optionally, the performing target detection on the feature picture by using the region generation network, and generating candidate frames according to the detection result, includes:
generating a preset number of anchor frames with different scales and aspect ratios for each point on the feature picture;
inputting the anchor frames into a detection frame classification layer of the region generation network for classification, and judging whether the feature map in each anchor frame belongs to the foreground or the background;
inputting the anchor frames into a detection frame regression layer of the region generation network to obtain the coordinate information of the anchor frames;
and selecting anchor frames whose feature maps belong to the foreground as candidate frames, and displaying the candidate frames on the feature picture according to the corresponding coordinate values.
Optionally, the pooling of the candidate frames into a fixed size by using a region feature aggregation algorithm to obtain standard candidate frames includes:
dividing each of the candidate boxes into n x n fixed-size cells;
determining sampling points in each unit according to a preset rule, calculating pixel values of the sampling points by using a bilinear interpolation method, and performing maximum pooling operation on the pixel values of the sampling points to select pixel points with maximum pixel values in the sampling points;
and obtaining a standard candidate frame corresponding to each candidate frame according to the selected pixel points.
Optionally, the performing regression and classification on the standard candidate frame to obtain a target candidate frame includes:
obtaining an offset predicted value of the standard candidate frame relative to the actual position by using a frame regression function so as to correct the standard candidate frame;
inputting the standard candidate frame into a fully-connected layer and a softmax function in a pre-trained neural network, calculating the category to which the feature map in the standard candidate frame belongs, outputting the score of the category, and obtaining the target detection frame according to the score.
Optionally, before the picture features in the standard picture are extracted through the pre-constructed feature extraction network to obtain the feature picture, the method further includes:
constructing a first convolution layer according to convolution operation, normalization operation and activation operation;
constructing a second convolution layer by using the combination function and the addition function;
and constructing the feature extraction network according to the first convolution layer and the second convolution layer.
Optionally, the target object is an emergency lane.
In order to solve the above problems, the present invention also provides a target detection apparatus, comprising:
the image feature extraction module is used for performing size standardization processing on the image to be detected to obtain a standard image, and extracting image features in the standard image through a pre-constructed feature extraction network to obtain a feature image;
the candidate frame generation module is used for carrying out target detection on the feature picture by utilizing a region generation network, generating a candidate frame according to a detection result, and pooling the candidate frame into a fixed size by utilizing a region feature aggregation algorithm to obtain a standard candidate frame;
the classification regression module is used for performing regression and classification on the standard candidate frame to obtain a target object candidate frame;
and the candidate frame mapping module is used for executing coordinate mapping on the picture to be detected according to the target object candidate frame and marking a target object detection result in the picture to be detected.
In order to solve the above problem, the present invention also provides an electronic device, including:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores computer program instructions executable by the at least one processor, so that the at least one processor can perform the target detection method described above.
In order to solve the above problem, the present invention further provides a computer-readable storage medium comprising a storage data area and a storage program area, wherein the storage data area stores created data, and the storage program area stores a computer program; wherein the computer program, when executed by a processor, implements a method of object detection as described above.
The embodiment of the invention extracts features from the picture to be detected through a pre-constructed feature extraction network, which enhances feature expression capability in difficult scenes and thereby improves the overall accuracy of target object detection, such as emergency lane detection. At the same time, the candidate frames generated by the region generation network and the region feature aggregation algorithm are regressed and classified, which effectively alleviates the problem of pixel deviation and improves the regression positioning of candidate frames for target object detection, further improving the overall accuracy. Therefore, the target detection method, the target detection device and the computer-readable storage medium provided by the embodiments of the invention can improve the accuracy of target detection.
Drawings
Fig. 1 is a schematic flow chart of a target detection method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a detailed implementation process of one step in the method for detecting a target object provided in FIG. 1;
FIG. 3 is a schematic diagram illustrating a detailed implementation of another step in the method for detecting a target object provided in FIG. 1;
FIG. 4 is a schematic view of another detailed implementation of another step in the method for detecting a target object provided in FIG. 1;
FIG. 5 is a schematic view of another detailed implementation of another step in the method for detecting a target object provided in FIG. 1;
fig. 6 is a schematic block diagram of a target detection apparatus according to an embodiment of the present invention;
fig. 7 is a schematic diagram of an internal structure of an electronic device for implementing a target detection method according to an embodiment of the present invention;
the implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The embodiment of the application provides a target object detection method. The execution subject of the target object detection method includes, but is not limited to, at least one electronic device, such as a server or a terminal, that can be configured to execute the method provided by the embodiments of the present application. In other words, the target object detection method may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server includes, but is not limited to: a single server, a server cluster, a cloud server, a cloud server cluster, and the like.
Fig. 1 is a schematic flow chart of a target detection method according to an embodiment of the present invention. In this embodiment, the target detection method includes:
S1, obtaining a picture to be detected, performing size standardization processing on the picture to be detected to obtain a standard picture, and extracting picture features from the standard picture through a pre-constructed feature extraction network to obtain a feature picture.
In the embodiment of the invention, various optical camera devices can be used to acquire the picture to be detected. For example, a traffic monitoring camera photographs the emergency lane at specified time intervals and uploads the captured emergency lane pictures to a database; the emergency lane pictures in the database are then collected to obtain the pictures to be detected.
In detail, referring to fig. 2, the performing of the size normalization process on the picture to be detected includes:
s10, judging whether the size of the picture to be detected is larger than the size of the standard picture input by the user;
s11, when the size of the picture to be detected is larger than the size of the standard picture, cutting the picture to be detected according to the size of the standard picture to obtain the standard picture;
and S12, when the size of the picture to be detected is smaller than the size of the standard picture, performing filling processing on the picture to be detected according to the size of the standard picture to obtain the standard picture.
For example, the standard picture size may be set to 1000 × 600. When the size of the picture to be detected is 1200 × 1200, a standard picture can be obtained by center cropping, taking the picture center as the origin and cropping to a length of 1000 and a width of 600. When the size of the picture to be detected is 800 × 400, edge-expansion filling is used: taking the picture frame as the boundary, the picture is expanded outward with a preset pixel value until the expanded size reaches a length of 1000 and a width of 600, thereby obtaining a standard picture.
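As an illustrative sketch of this step (the center-crop and edge-pad conventions here are assumptions drawn from the example above, not the patent's reference implementation), the size standardization may be written as:

```python
import numpy as np

STD_H, STD_W = 600, 1000  # standard picture size 1000 x 600 (width x height)

def standardize(img: np.ndarray, pad_value: int = 0) -> np.ndarray:
    """Center-crop oversized dimensions, edge-pad undersized ones."""
    h, w = img.shape[:2]
    # Center crop any dimension that exceeds the standard size.
    if h > STD_H:
        top = (h - STD_H) // 2
        img = img[top:top + STD_H]
    if w > STD_W:
        left = (w - STD_W) // 2
        img = img[:, left:left + STD_W]
    # Pad any dimension that falls short, expanding outward from the frame.
    h, w = img.shape[:2]
    pad_h, pad_w = STD_H - h, STD_W - w
    if pad_h > 0 or pad_w > 0:
        pads = ((pad_h // 2, pad_h - pad_h // 2),
                (pad_w // 2, pad_w - pad_w // 2)) + ((0, 0),) * (img.ndim - 2)
        img = np.pad(img, pads, mode="constant", constant_values=pad_value)
    return img
```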
In an embodiment of the present invention, the feature extraction network may be a DarkNet63 network. Further, in one embodiment, before extracting the picture features from the standard picture through the pre-constructed feature extraction network to obtain the feature picture, the method further includes constructing the feature extraction network.
In detail, the feature extraction network is constructed by the following method: constructing a first convolution layer from a convolution operation (Conv), a normalization operation (BN) and an activation operation (Leaky ReLU); combining first convolution layers using a combination function (Concat) and an addition function (Add) to construct a second convolution layer; and constructing the feature extraction network from the first convolution layer and the second convolution layer.
The convolution operation is a 2D convolution used to obtain feature maps from the standard picture by convolving with 2D convolution kernels of different effects; the normalization operation uses a normalization function to reduce the pixel values of the pixel points in the feature map; the activation operation uses an activation function to reduce the area of the feature map; the combination function is used to connect two or more first convolution layers, and the addition function adds the outputs of first convolution layers element-wise into the processing flow.
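For illustration, a minimal PyTorch sketch of the two layer types follows; the channel counts, kernel sizes and the residual-style wiring are assumptions, since the text specifies only the operations involved:

```python
import torch
import torch.nn as nn

class FirstConvLayer(nn.Module):
    """Convolution (Conv) + normalization (BN) + activation (Leaky ReLU)."""
    def __init__(self, c_in, c_out, k=3, s=1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, k, s, padding=k // 2, bias=False),
            nn.BatchNorm2d(c_out),
            nn.LeakyReLU(0.1, inplace=True),
        )
    def forward(self, x):
        return self.block(x)

class SecondConvLayer(nn.Module):
    """Combines first convolution layers with the Add and Concat functions."""
    def __init__(self, channels):
        super().__init__()
        self.branch_a = FirstConvLayer(channels, channels // 2, k=1)
        self.branch_b = FirstConvLayer(channels // 2, channels)
    def forward(self, x):
        y = self.branch_b(self.branch_a(x))
        out = x + y                      # Add: element-wise addition of layer outputs
        return torch.cat([x, out], 1)    # Concat: channel-wise combination
```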
And S2, performing target detection on the feature picture by using the region generation network, generating candidate frames according to the detection result, and pooling the candidate frames into a fixed size by using a region feature aggregation algorithm to obtain standard candidate frames.
In the embodiment of the present invention, an RPN (Region Proposal Network) is configured to select candidate frames from the feature picture.
Referring to fig. 3, the performing target detection on the feature picture by using the region generation network and generating candidate frames according to the detection result includes:
s20, generating a preset number of anchor boxes with different scales and aspect ratios for each point on the feature picture;
s21, inputting the anchor boxes into the detection frame classification layer of the region generation network for classification, and judging whether the feature map in each anchor box belongs to the foreground or the background;
s22, inputting the anchor boxes into the detection frame regression layer of the region generation network to obtain the coordinate information of the anchor boxes;
s23, selecting anchor boxes whose feature maps belong to the foreground as candidate frames, and displaying the candidate frames on the feature picture according to the corresponding coordinate values.
In the embodiment of the present invention, for each point on the feature picture, 9 anchor boxes with different scales and aspect ratios may be generated. The 9 anchor boxes are obtained from 3 different sizes and 3 different ratios, for example: the 3 sizes are 8, 16 and 32 (other sizes may be set), and the 3 ratios are 1:1, 1:2 and 2:1 (other ratios may be set), so that the resulting 9 anchor boxes are (8 × 8, 8 × 16, 16 × 8, 16 × 16, 16 × 32, 32 × 16, 32 × 32, 32 × 64 and 64 × 32).
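A short sketch of this anchor enumeration (the sizes and ratios are taken from the example above; the function name is illustrative):

```python
def make_anchors(sizes=(8, 16, 32), ratios=((1, 1), (1, 2), (2, 1))):
    """Enumerate the 3 sizes x 3 aspect ratios = 9 anchor boxes per point."""
    anchors = []
    for s in sizes:
        for rw, rh in ratios:
            anchors.append((s * rw, s * rh))  # (width, height)
    return anchors

print(make_anchors())
# [(8, 8), (8, 16), (16, 8), (16, 16), (16, 32), (32, 16),
#  (32, 32), (32, 64), (64, 32)]
```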
Further, in the embodiment of the present invention, the detection frame classification layer and the detection frame regression layer of the region generation network classify and identify the feature map content framed by all anchor boxes, judge whether it belongs to the foreground or the background, and obtain the coordinate information of each anchor box; anchor boxes whose feature maps belong to the foreground are then selected as candidate frames and displayed on the feature picture according to their coordinate values.
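A hedged sketch of the two heads of the region generation network described here; the channel counts and the example feature-map size are assumptions:

```python
import torch
import torch.nn as nn

k = 9                                          # anchor boxes per feature-map point
rpn_conv = nn.Conv2d(512, 512, 3, padding=1)   # shared 3x3 conv (channel count assumed)
cls_layer = nn.Conv2d(512, 2 * k, 1)           # classification layer: fg/bg score per anchor
reg_layer = nn.Conv2d(512, 4 * k, 1)           # regression layer: (x, y, w, h) offsets per anchor

feat = torch.randn(1, 512, 38, 63)             # e.g. a 1000 x 600 input at stride 16
h = torch.relu(rpn_conv(feat))
fg_bg_scores, coords = cls_layer(h), reg_layer(h)
```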
Further, referring to fig. 4, in an embodiment of the present invention, the pooling of the candidate frames into a fixed size by using a region feature aggregation algorithm to obtain standard candidate frames includes:
s24, dividing each candidate frame into n x n units with fixed size;
s25, determining sampling points in each unit according to a preset rule, calculating the pixel values of the sampling points by bilinear interpolation, and performing a maximum pooling operation on each unit to select the pixel point with the maximum pixel value among its sampling points;
and S26, obtaining a standard candidate frame corresponding to each candidate frame according to the selected pixel points.
In the embodiment of the invention, the maximum pooling operation is performed on each unit to select the pixel point with the maximum pixel value among the sampling points; candidate frames containing such maximum-value pixel points are retained, and candidate frames without them are eliminated, yielding the standard candidate frames.
Bilinear interpolation performs linear interpolation in two directions respectively; the intersection points produced by the two linear interpolations serve as the sampling points, whose pixel values are thereby determined.
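The following numpy sketch illustrates this pooling under stated assumptions (a 2D single-channel feature map, sampling points on a fixed grid inside each unit, and points lying within the map); it is not the patent's reference implementation:

```python
import numpy as np

def bilinear(feat, y, x):
    """Pixel value at fractional (y, x) via linear interpolation in two directions."""
    y0, x0 = int(np.floor(y)), int(np.floor(x))
    y1, x1 = min(y0 + 1, feat.shape[0] - 1), min(x0 + 1, feat.shape[1] - 1)
    dy, dx = y - y0, x - x0
    return (feat[y0, x0] * (1 - dy) * (1 - dx) + feat[y0, x1] * (1 - dy) * dx
            + feat[y1, x0] * dy * (1 - dx) + feat[y1, x1] * dy * dx)

def region_feature_aggregation(feat, box, n=7, samples=2):
    """Pool one candidate frame into a fixed n x n output."""
    x1, y1, x2, y2 = box                   # candidate frame on the feature map
    ch, cw = (y2 - y1) / n, (x2 - x1) / n  # unit height / width
    out = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            vals = [bilinear(feat,
                             y1 + (i + (a + 0.5) / samples) * ch,
                             x1 + (j + (b + 0.5) / samples) * cw)
                    for a in range(samples) for b in range(samples)]
            out[i, j] = max(vals)  # max pooling over the unit's sampling points
    return out
```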
And S3, performing regression and classification on the standard candidate frame to obtain a target object candidate frame.
In detail, referring to fig. 5, the performing regression and classification on the standard candidate frame to obtain the target candidate frame includes:
s30, obtaining a predicted offset value of the standard candidate frame relative to the actual position by using a frame regression function so as to correct the standard candidate frame;
s31, inputting the standard candidate frame into a fully-connected layer and a softmax function in a pre-trained neural network, and calculating the category to which the feature map in the standard candidate frame belongs to obtain the target detection frame.
In the embodiment of the present invention, the standard candidate frame is generally represented by a four-dimensional vector (x, y, w, h), denoting its center coordinates (x, y), width w and height h respectively. Using A to represent the standard candidate frame, the embodiment of the present invention seeks a transformation F through the frame regression function, such that the standard candidate frame is corrected to approach the actual candidate frame G, namely:
Given A = (Ax, Ay, Aw, Ah) and G = (Gx, Gy, Gw, Gh), find F such that F(Ax, Ay, Aw, Ah) = (Gx, Gy, Gw, Gh);
F(A) = G is achieved by translation and scaling;
Translation: Gx = Ax + Aw·dx(A), Gy = Ay + Ah·dy(A);
Scaling: Gw = Aw·exp(dw(A)), Gh = Ah·exp(dh(A));
In the embodiment of the invention, dx(A), dy(A), dw(A) and dh(A) are obtained through the frame regression function, thereby correcting the standard candidate frame.
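A minimal sketch of applying the learned offsets to correct a standard candidate frame, following the translation and scaling formulas above:

```python
import numpy as np

def apply_deltas(A, d):
    """Correct standard candidate frame A with regression offsets d(A)."""
    ax, ay, aw, ah = A          # center x, center y, width, height
    dx, dy, dw, dh = d          # offsets predicted by the regression layer
    gx = ax + aw * dx           # translation
    gy = ay + ah * dy
    gw = aw * np.exp(dw)        # scaling
    gh = ah * np.exp(dh)
    return gx, gy, gw, gh
```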
Further, the corrected standard candidate frame is input into a fully-connected layer and a softmax function in a pre-trained neural network, and the category to which the feature map in the standard candidate frame belongs is calculated to obtain the target detection frame. For example, in one embodiment of the present invention, the feature maps in the standard candidate frames are classified into types such as automobile, street lamp, indicator, normal driving lane and emergency lane, and the standard candidate frames classified as emergency lane are taken as the target detection frames.
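For illustration, a sketch of such a classification head; the category list, layer sizes and pooled-feature shape are assumptions:

```python
import torch
import torch.nn as nn

CLASSES = ["car", "street lamp", "indicator", "normal lane", "emergency lane"]

head = nn.Sequential(nn.Flatten(), nn.Linear(256 * 7 * 7, len(CLASSES)))
roi_feat = torch.randn(1, 256, 7, 7)           # pooled standard candidate frame
scores = torch.softmax(head(roi_feat), dim=1)  # per-category scores
best = CLASSES[int(scores.argmax())]           # e.g. keep frames scored "emergency lane"
```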
And S4, performing coordinate mapping of the picture to be detected according to the target object candidate frame, and marking a target object detection result in the picture to be detected.
The embodiment of the invention performs coordinate mapping to map the target object candidate frame into the picture to be detected, so as to mark the identified target, such as an emergency lane, in the picture to be detected.
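A minimal sketch of this coordinate mapping, assuming a single feature-map stride (the actual factor depends on the feature extraction network):

```python
def map_to_image(box, stride=16):
    """Scale a frame from feature-map coordinates back to the picture to be detected."""
    x1, y1, x2, y2 = box
    return x1 * stride, y1 * stride, x2 * stride, y2 * stride
```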
The embodiment of the invention extracts features from the picture to be detected through a pre-constructed feature extraction network, which enhances feature expression capability in difficult scenes and thereby improves the overall accuracy of target object detection, such as emergency lane detection. At the same time, the candidate frames generated by the region generation network and the region feature aggregation algorithm are regressed and classified, which effectively alleviates the problem of pixel deviation and improves the regression positioning of candidate frames for target object detection, further improving the overall accuracy. Therefore, the embodiment of the invention can improve the accuracy of target detection.
Fig. 6 is a schematic block diagram of a target detection device according to the present invention.
The object detection device 100 according to the present invention may be installed in an electronic apparatus. According to the implemented functions, the object detection device 100 may include a picture feature extraction module 101, a candidate frame generation module 102, a classification regression module 103, and a candidate frame mapping module 104. A module according to the present invention, which may also be referred to as a unit, refers to a series of computer program segments that can be executed by a processor of an electronic device, can perform a fixed function, and are stored in a memory of the electronic device.
In the present embodiment, the functions regarding the respective modules/units are as follows:
the picture feature extraction module 101 is configured to perform size standardization processing on the picture to be detected to obtain a standard picture, and extract picture features in the standard picture through a pre-constructed feature extraction network to obtain a feature picture.
In the embodiment of the present invention, the picture feature extraction module 101 may use various optical cameras to obtain the picture to be detected. For example, a traffic monitoring camera photographs the emergency lane at specified time intervals and uploads the captured emergency lane pictures to a database; the emergency lane pictures in the database are then collected to obtain the pictures to be detected.
In detail, the picture feature extraction module 101 performs size normalization on the picture to be detected by:
step A, judging whether the size of the picture to be detected is larger than the size of a standard picture input by a user;
b, when the size of the picture to be detected is larger than the size of the standard picture, performing cutting processing on the picture to be detected according to the size of the standard picture to obtain the standard picture;
and C, when the size of the picture to be detected is smaller than the size of the standard picture, filling the picture to be detected according to the size of the standard picture to obtain the standard picture.
For example, the standard picture size may be set to 1000 × 600. When the size of the picture to be detected is 1200 × 1200, the picture feature extraction module 101 may obtain a standard picture by center cropping, taking the picture center as the origin and cropping to a length of 1000 and a width of 600; when the size of the picture to be detected is 800 × 400, the module may use edge-expansion filling, taking the picture frame as the boundary and expanding outward with a preset pixel value until the expanded size reaches a length of 1000 and a width of 600, thereby obtaining a standard picture.
In an embodiment of the present invention, the feature extraction network may be a DarkNet63 network. Further, in one embodiment, before extracting the picture features from the standard picture through the pre-constructed feature extraction network to obtain the feature picture, the method further includes constructing the feature extraction network.
In detail, the feature extraction network is constructed by the following method: constructing a first convolution layer from a convolution operation (Conv), a normalization operation (BN) and an activation operation (Leaky ReLU); combining first convolution layers using a combination function (Concat) and an addition function (Add) to construct a second convolution layer; and constructing the feature extraction network from the first convolution layer and the second convolution layer.
The convolution operation is a 2D convolution used to obtain feature maps from the standard picture by convolving with 2D convolution kernels of different effects; the normalization operation uses a normalization function to reduce the pixel values of the pixel points in the feature map; the activation operation uses an activation function to reduce the area of the feature map; the combination function is used to connect two or more first convolution layers, and the addition function adds the outputs of first convolution layers element-wise into the processing flow.
The candidate frame generation module 102 is configured to perform target detection on the feature picture by using a region generation network, generate a candidate frame according to a detection result, and pool the candidate frame into a fixed size by using a region feature aggregation algorithm to obtain a standard candidate frame.
In the embodiment of the present invention, an RPN (Region Proposal Network) is configured to select candidate frames from the feature picture.
In detail, the candidate frame generation module 102 performs target detection on the feature picture and generates candidate frames according to the detection result through the following operations:
step a, generating a preset number of anchor boxes with different scales and aspect ratios for each point on the feature picture;
step b, inputting the anchor boxes into the detection frame classification layer of the region generation network for classification, and judging whether the feature map in each anchor box belongs to the foreground or the background;
step c, inputting the anchor boxes into the detection frame regression layer of the region generation network to obtain the coordinate information of the anchor boxes;
step d, selecting anchor boxes whose feature maps belong to the foreground as candidate frames, and displaying the candidate frames on the feature picture according to the corresponding coordinate values.
In this embodiment of the present invention, for each point on the feature picture, the candidate frame generation module 102 may generate 9 anchor boxes with different scales and aspect ratios. The 9 anchor boxes are obtained from 3 different sizes and 3 different ratios, for example: the 3 sizes are 8, 16 and 32 (other sizes may be set), and the 3 ratios are 1:1, 1:2 and 2:1 (other ratios may be set), so that the resulting 9 anchor boxes are (8 × 8, 8 × 16, 16 × 8, 16 × 16, 16 × 32, 32 × 16, 32 × 32, 32 × 64 and 64 × 32).
Further, in the embodiment of the present invention, the candidate frame generation module 102 classifies and identifies the feature map content framed by all anchor boxes through the detection frame classification layer and the detection frame regression layer of the region generation network, judges whether it belongs to the foreground or the background, and obtains the coordinate information of each anchor box; it then selects anchor boxes whose feature maps belong to the foreground as candidate frames and displays the candidate frames on the feature picture according to their coordinate values.
Further, in this embodiment of the present invention, the pooling of the candidate frames into a fixed size by using a region feature aggregation algorithm to obtain standard candidate frames includes:
step e, dividing each candidate frame into n × n units with fixed size;
step f, determining sampling points in each unit according to a preset rule, calculating the pixel values of the sampling points by bilinear interpolation, and performing a maximum pooling operation on each unit to select the pixel point with the maximum pixel value among its sampling points;
and g, obtaining a standard candidate frame corresponding to each candidate frame according to the selected pixel points.
In the embodiment of the invention, the maximum pooling operation is performed on each unit to select the pixel point with the maximum pixel value among the sampling points; candidate frames containing such maximum-value pixel points are retained, and candidate frames without them are eliminated, yielding the standard candidate frames.
Bilinear interpolation performs linear interpolation in two directions respectively; the intersection points produced by the two linear interpolations serve as the sampling points, whose pixel values are thereby determined.
The classification regression module 103 is configured to perform regression and classification on the standard candidate frames to obtain target candidate frames.
In detail, the classification regression module 103 performs regression and classification on the standard candidate frames through the following method to obtain the target candidate frames: obtaining a predicted offset value of the standard candidate frame relative to the actual position by using a frame regression function, so as to correct the standard candidate frame; and inputting the standard candidate frame into a fully-connected layer and a softmax function in a pre-trained neural network, and calculating the category to which the feature map in the standard candidate frame belongs to obtain the target detection frame.
In the embodiment of the present invention, the standard candidate frame is generally represented by a four-dimensional vector (x, y, w, h), denoting its center coordinates (x, y), width w and height h respectively. Using A to represent the standard candidate frame, the classification regression module 103 seeks a transformation F through the frame regression function, such that the standard candidate frame is corrected to approach the actual candidate frame G, namely:
Given A = (Ax, Ay, Aw, Ah) and G = (Gx, Gy, Gw, Gh), find F such that F(Ax, Ay, Aw, Ah) = (Gx, Gy, Gw, Gh);
F(A) = G is achieved by translation and scaling;
Translation: Gx = Ax + Aw·dx(A), Gy = Ay + Ah·dy(A);
Scaling: Gw = Aw·exp(dw(A)), Gh = Ah·exp(dh(A));
The classification regression module 103 obtains dx(A), dy(A), dw(A) and dh(A) through the frame regression function, thereby correcting the standard candidate frame.
Further, the classification regression module 103 inputs the corrected standard candidate frame into a fully-connected layer and a softmax function in a pre-trained neural network, and calculates the category to which the feature map in the standard candidate frame belongs to obtain the target detection frame. For example, in one embodiment of the present invention, the feature maps in the standard candidate frames are classified into types such as automobile, street lamp, indicator, normal driving lane and emergency lane, and the standard candidate frames classified as emergency lane are taken as the target detection frames.
The candidate frame mapping module 104 is configured to perform coordinate mapping on the picture to be detected according to the target object candidate frame, and mark a target object detection result in the picture to be detected.
The candidate frame mapping module 104 according to the embodiment of the present invention performs coordinate mapping to map the target candidate frame to the to-be-detected picture, so as to mark an identified target, such as an emergency lane, in the to-be-detected picture.
Fig. 7 is a schematic structural diagram of an electronic device for implementing the target detection method according to the present invention.
The electronic device 1 may comprise a processor 10, a memory 11 and a bus, and may further comprise a computer program, such as an object detection program 12, stored in the memory 11 and executable on the processor 10.
The memory 11 includes at least one type of readable storage medium, which includes flash memory, removable hard disk, multimedia card, card-type memory (e.g., SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, etc. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a removable hard disk of the electronic device 1. The memory 11 may also be an external storage device of the electronic device 1 in other embodiments, such as a plug-in mobile hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only to store application software installed in the electronic device 1 and various types of data, such as a code of the object detection program 12, but also to temporarily store data that has been output or is to be output.
The processor 10 may be composed of an integrated circuit in some embodiments, for example, a single packaged integrated circuit, or may be composed of a plurality of integrated circuits packaged with the same or different functions, including one or more Central Processing Units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control unit of the electronic device: it connects the various components of the electronic device by using various interfaces and lines, and executes various functions of the electronic device 1 and processes its data by running or executing programs or modules stored in the memory 11 (for example, executing the object detection program) and calling data stored in the memory 11.
The bus may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. The bus is arranged to enable connection communication between the memory 11 and at least one processor 10 or the like.
Fig. 7 only shows an electronic device with components, and it will be understood by a person skilled in the art that the structure shown in fig. 7 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or a combination of certain components, or a different arrangement of components.
For example, although not shown, the electronic device 1 may further include a power supply (such as a battery) for supplying power to each component, and preferably, the power supply may be logically connected to the at least one processor 10 through a power management device, so as to implement functions of charge management, discharge management, power consumption management, and the like through the power management device. The power supply may also include any component of one or more dc or ac power sources, recharging devices, power failure detection circuitry, power converters or inverters, power status indicators, and the like. The electronic device 1 may further include various sensors, a bluetooth module, a Wi-Fi module, and the like, which are not described herein again.
Further, the electronic device 1 may further include a network interface, and optionally, the network interface may include a wired interface and/or a wireless interface (such as a WI-FI interface, a bluetooth interface, etc.), which are generally used for establishing a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may further comprise a user interface, which may be a Display (Display), an input unit (such as a Keyboard), and optionally a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display, which may also be referred to as a display screen or display unit, is suitable for displaying information processed in the electronic device 1 and for displaying a visualized user interface, among other things.
It is to be understood that the described embodiments are for purposes of illustration only and that the scope of the appended claims is not limited to such structures.
The object detection program 12 stored in the memory 11 of the electronic device 1 is a combination of instructions that, when executed by the processor 10, can implement:
acquiring a picture to be detected, performing size standardization processing on the picture to be detected to obtain a standard picture, and extracting picture features from the standard picture through a pre-constructed feature extraction network to obtain a feature picture;
performing target detection on the feature picture by using a region generation network, generating a candidate frame according to a detection result, and pooling the candidate frame into a fixed size by using a region feature aggregation algorithm to obtain a standard candidate frame;
performing regression and classification on the standard candidate frame to obtain a target object candidate frame;
and performing coordinate mapping on the picture to be detected according to the target object candidate frame, and marking a target object detection result in the picture to be detected.
Specifically, the specific implementation method of the processor 10 for the instruction may refer to the description of the relevant steps in the embodiments corresponding to fig. 1 to fig. 5, which is not repeated herein.
Further, if the integrated modules/units of the electronic device 1 are implemented in the form of software functional units and sold or used as separate products, they may be stored in a computer-readable storage medium. The computer-readable storage medium may be volatile or non-volatile. For example, the computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, or a Read-Only Memory (ROM).
The present invention also provides a computer-readable storage medium, storing a computer program which, when executed by a processor of an electronic device, may implement:
acquiring a picture to be detected, performing size standardization processing on the picture to be detected to obtain a standard picture, and extracting picture features from the standard picture through a pre-constructed feature extraction network to obtain a feature picture;
performing target detection on the feature picture by using a region generation network, generating a candidate frame according to a detection result, and pooling the candidate frame into a fixed size by using a region feature aggregation algorithm to obtain a standard candidate frame;
performing regression and classification on the standard candidate frame to obtain a target object candidate frame;
and performing coordinate mapping on the picture to be detected according to the target object candidate frame, and marking a target object detection result in the picture to be detected.
In the embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method can be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, each data block containing information on a batch of network transactions, used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. A plurality of units or means recited in the system claims may also be implemented by one unit or means in software or hardware. The terms first, second, etc. are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A method of detecting a target, the method comprising:
acquiring a picture to be detected, performing size standardization processing on the picture to be detected to obtain a standard picture, and extracting picture features from the standard picture through a pre-constructed feature extraction network to obtain a feature picture;
performing target detection on the feature picture by using a region generation network, generating a candidate frame according to a detection result, and pooling the candidate frame into a fixed size by using a region feature aggregation algorithm to obtain a standard candidate frame;
performing regression and classification on the standard candidate frame to obtain a target object candidate frame;
and performing coordinate mapping on the picture to be detected according to the target object candidate frame, and marking a target object detection result in the picture to be detected.
2. The method for detecting the target object according to claim 1, wherein the step of performing size normalization processing on the picture to be detected to obtain a standard picture comprises:
judging whether the size of the picture to be detected is larger than the size of a standard picture input by a user;
when the size of the picture to be detected is larger than the size of the standard picture, performing cutting processing on the picture to be detected according to the size of the standard picture to obtain a standard picture;
and when the size of the picture to be detected is smaller than the size of the standard picture, performing filling processing on the picture to be detected according to the size of the standard picture to obtain the standard picture.
3. The method for detecting the target object according to claim 1, wherein the performing target detection on the feature picture by using the region generation network and generating candidate frames according to the detection result comprises:
generating a preset number of anchor frames with different scales and aspect ratios for each point on the feature picture;
inputting the anchor frames into a detection frame classification layer of the region generation network for classification, and judging whether the feature map in each anchor frame belongs to the foreground or the background;
inputting the anchor frames into a detection frame regression layer of the region generation network to obtain the coordinate information of the anchor frames;
and selecting anchor frames whose feature maps belong to the foreground as candidate frames, and displaying the candidate frames on the feature picture according to the corresponding coordinate values.
4. The method of claim 3, wherein the pooling of the candidate frames into a fixed size by using a region feature aggregation algorithm to obtain standard candidate frames comprises:
dividing each of the candidate boxes into n x n fixed-size cells;
determining sampling points in each unit according to a preset rule, calculating pixel values of the sampling points by using a bilinear interpolation method, and performing maximum pooling operation on the pixel values of the sampling points to select pixel points with maximum pixel values in the sampling points;
and obtaining a standard candidate frame corresponding to each candidate frame according to the selected pixel points.
5. The method of claim 1, wherein the step of performing regression and classification on the standard candidate frames to obtain target candidate frames comprises:
obtaining an offset predicted value of the standard candidate frame relative to the actual position by using a frame regression function so as to correct the standard candidate frame;
inputting the standard candidate frame into a fully-connected layer and a softmax function in a pre-trained neural network, calculating the category to which the feature map in the standard candidate frame belongs, outputting the score of the category, and obtaining the target detection frame according to the score.
6. The method for detecting the target object according to any one of claims 1 to 5, wherein before the extracting the picture features in the standard picture through the pre-constructed feature extraction network to obtain the feature picture, the method further comprises:
constructing a first convolution layer according to convolution operation, normalization operation and activation operation;
constructing a second convolution layer by using the combination function and the addition function;
and constructing the feature extraction network according to the first convolution layer and the second convolution layer.
7. The target object detection method according to any one of claims 1 to 5, wherein the target object is an emergency lane.
8. A target object detection apparatus, characterized by comprising:
the image feature extraction module is used for performing size standardization processing on the image to be detected to obtain a standard image, and extracting image features in the standard image through a pre-constructed feature extraction network to obtain a feature image;
the candidate frame generation module is used for carrying out target detection on the feature picture by utilizing a region generation network, generating a candidate frame according to a detection result, and pooling the candidate frame into a fixed size by utilizing a region feature aggregation algorithm to obtain a standard candidate frame;
the classification regression module is used for performing regression and classification on the standard candidate frame to obtain a target object candidate frame;
and the candidate frame mapping module is used for executing coordinate mapping on the picture to be detected according to the target object candidate frame and marking a target object detection result in the picture to be detected.
9. An electronic device, characterized in that the electronic device comprises:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, to enable the at least one processor to perform the target detection method according to any one of claims 1 to 7.
10. A computer-readable storage medium comprising a storage data area and a storage program area, wherein the storage data area stores created data, and the storage program area stores a computer program; wherein the computer program, when executed by a processor, implements a method of object detection as claimed in any one of claims 1 to 7.
CN202011510808.6A 2020-12-18 2020-12-18 Target detection method and device, electronic equipment and storage medium Pending CN112561889A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011510808.6A CN112561889A (en) 2020-12-18 2020-12-18 Target detection method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011510808.6A CN112561889A (en) 2020-12-18 2020-12-18 Target detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112561889A (en) 2021-03-26

Family

ID=75030490

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011510808.6A Pending CN112561889A (en) 2020-12-18 2020-12-18 Target detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112561889A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113420684A (en) * 2021-06-29 2021-09-21 深圳壹账通智能科技有限公司 Report recognition method and device based on feature extraction, electronic equipment and medium


Similar Documents

Publication Publication Date Title
CN111652845A (en) Abnormal cell automatic labeling method and device, electronic equipment and storage medium
WO2021151277A1 (en) Method and apparatus for determining severity of damage on target object, electronic device, and storage medium
CN112699775A (en) Certificate identification method, device and equipment based on deep learning and storage medium
CN112137591B (en) Target object position detection method, device, equipment and medium based on video stream
CN111476225B (en) In-vehicle human face identification method, device, equipment and medium based on artificial intelligence
CN111311010A (en) Vehicle risk prediction method and device, electronic equipment and readable storage medium
CN112132216B (en) Vehicle type recognition method and device, electronic equipment and storage medium
CN112200189B (en) Vehicle type recognition method and device based on SPP-YOLOv and computer readable storage medium
CN111931729B (en) Pedestrian detection method, device, equipment and medium based on artificial intelligence
CN112541902A (en) Similar area searching method, similar area searching device, electronic equipment and medium
CN112528908A (en) Living body detection method, living body detection device, electronic apparatus, and storage medium
CN113487621A (en) Medical image grading method and device, electronic equipment and readable storage medium
CN111985449A (en) Rescue scene image identification method, device, equipment and computer medium
CN114708461A (en) Multi-modal learning model-based classification method, device, equipment and storage medium
CN114723636A (en) Model generation method, device, equipment and storage medium based on multi-feature fusion
CN113887439A (en) Automatic early warning method, device, equipment and storage medium based on image recognition
CN113190703A (en) Intelligent retrieval method and device for video image, electronic equipment and storage medium
CN112561889A (en) Target detection method and device, electronic equipment and storage medium
CN112528903A (en) Face image acquisition method and device, electronic equipment and medium
CN117197227A (en) Method, device, equipment and medium for calculating yaw angle of target vehicle
CN111652226B (en) Picture-based target identification method and device and readable storage medium
CN112434601B (en) Vehicle illegal detection method, device, equipment and medium based on driving video
CN115049836A (en) Image segmentation method, device, equipment and storage medium
CN114463685A (en) Behavior recognition method and device, electronic equipment and storage medium
CN113627394A (en) Face extraction method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination