CN117036482B - Target object positioning method, device, shooting equipment, chip, equipment and medium - Google Patents

Target object positioning method, device, shooting equipment, chip, equipment and medium

Info

Publication number
CN117036482B
CN117036482B
Authority
CN
China
Prior art keywords
target object
target
image
detected
detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311064453.6A
Other languages
Chinese (zh)
Other versions
CN117036482A (en)
Inventor
张永波
郑哲
袁福生
崔文朋
张京晶
李海涛
龚向锋
陶友水
孙天锋
蔡雨露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Smartchip Microelectronics Technology Co Ltd
Original Assignee
Beijing Smartchip Microelectronics Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Smartchip Microelectronics Technology Co Ltd filed Critical Beijing Smartchip Microelectronics Technology Co Ltd
Priority to CN202311064453.6A
Publication of CN117036482A
Application granted
Publication of CN117036482B
Active legal status: Current
Anticipated expiration legal status

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/10Image acquisition
    • G06V10/16Image acquisition using multiple overlapping images; Image stitching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30241Trajectory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a target object positioning method and apparatus, a shooting device, a chip, a computer device and a medium. First, a background image shot for a target scene and a video file to be detected comprising a plurality of frames of images to be detected are obtained. Then, a stitching operation is performed on the background image and each frame of image to be detected to obtain a stitched image corresponding to each frame. Next, target detection and object comparison processing are performed on the stitched images to obtain, for the target object, an object class to be determined (a suspected left-behind object or a suspected lost object) and a detection position in each frame of image to be detected. Finally, a tracking result of the target object is determined based on its detection position in each frame, and positioning detection is performed on the target object according to its object class to be determined and the tracking result to obtain a positioning result of the target object. The embodiment gives left-behind and lost item detection a high detection rate and good timeliness.

Description

Target object positioning method, device, shooting equipment, chip, equipment and medium
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to a method and apparatus for locating a target object, a shooting device, a chip, a device, and a medium.
Background
Detection of left-behind and lost items is an important part of intelligent monitoring systems: it helps to find unknown left-behind items or to retrieve lost items, thereby addressing potential safety hazards.
In the related art, background modeling may be performed on the monitored scene, the background model is compared with the real-time feed to find suspicious objects, and the target position is then determined through subsequent morphological operations.
However, background modeling is susceptible to changes in illumination and the environment, and its detection accuracy needs improvement.
Disclosure of Invention
The embodiments of the present specification aim to solve at least one of the technical problems in the related art to some extent. For this reason, the embodiment of the present specification proposes a target object positioning method, apparatus, photographing device, chip, device, and medium.
The embodiment of the specification provides a target object positioning method, which comprises the following steps:
Acquiring a background image and a video file to be detected, which are shot aiming at a target scene; the video file to be detected comprises a plurality of frames of images to be detected; the target object is a legacy object left in the target scene or a lost object lost from the target scene;
Performing stitching operation by using the background image and each frame of to-be-inspected image to obtain a stitched image corresponding to each frame of to-be-inspected image;
performing target detection and object comparison processing based on the spliced images to obtain detection results; the detection result comprises an object category to be determined of the target object and a detection position of the target object in each frame of to-be-detected image; the object category to be determined is used for representing that the target object is identified as a suspected left-over object or a suspected lost object;
Determining a tracking result of the target object based on the detection position of the target object in each frame of to-be-detected image;
And carrying out positioning detection on the target object according to the object type to be determined of the target object and the tracking result to obtain a positioning result of the target object.
In one embodiment, before the determining the tracking result of the target object based on the detected position of the target object in the each frame of to-be-detected image, the method further includes:
For any frame of to-be-detected image, carrying out position prediction according to the detection position of the target object in the to-be-detected image before the any frame of to-be-detected image to obtain the predicted position of the target object in the any frame of to-be-detected image;
the determining the tracking result of the target object based on the detection position of the target object in each frame of the image to be detected comprises the following steps:
And carrying out association processing according to the detection position and the prediction position of the target object in any frame of to-be-detected image, and determining the tracking result of the target object.
In one embodiment, the determining the tracking result of the target object according to the correlation processing of the detected position and the predicted position of the target object in the to-be-detected image of any frame includes:
Performing association processing according to the cross ratio cost matrix between the boundary frame corresponding to the detection position and the boundary frame corresponding to the prediction position, and determining a tracking result of the target object; or alternatively
And carrying out association processing according to the color information cost matrix between the boundary box corresponding to the detection position and the boundary box corresponding to the prediction position, and determining the tracking result of the target object.
In one embodiment, the positioning detection of the target object according to the object type to be determined of the target object in the each frame of to-be-detected image and the tracking result, to obtain a positioning result of the target object, includes:
And if the object class to be determined is a suspected left-behind object, judging that the target object is not lost from the target scene and the target object does not move according to the tracking result, and taking the detection position as a positioning result of the target object.
In one embodiment, the method further comprises:
if the object class to be determined is a suspected legacy object, judging that the target object is lost from the target scene according to the tracking result, and determining a subsequent image to be detected in a first preset time period after the target object is lost from the target scene;
and continuously executing the step of judging whether the target object is lost from the target scene or not based on the tracking result corresponding to the follow-up to-be-detected image.
In one embodiment, the positioning detection of the target object according to the object type to be determined of the target object in the each frame of to-be-detected image and the tracking result, to obtain a positioning result of the target object, includes:
If the object class to be determined is a suspected lost object, judging that the target object is out of bounds from the target scene according to the tracking result, and taking the detection position of the target object in the background image as a positioning result of the target object; or alternatively
If the object class to be determined is a suspected lost object, judging that the target object is out of bounds from the target scene according to the tracking result, and determining a final position to-be-detected image corresponding to the target object; taking the detection position of the target object in the final position to-be-detected image as a positioning result of the target object; the last position to-be-detected image is an image to be detected corresponding to the last moment when the target object appears in the target scene.
In one embodiment, the method further comprises:
If the object class to be determined is a suspected lost object, and the target object is judged to be not out of bounds from the target scene according to the tracking result, the step of judging whether the target object is out of bounds from the target scene is continuously executed.
In one embodiment, the background image comprises a plurality of reference objects; the target detection and object comparison processing are carried out based on the spliced image to obtain a detection result, and the method comprises the following steps:
Performing target detection based on the spliced image to obtain a reference object included in the background image and a detection object included in each frame of image to be detected; wherein the detected object is the target object;
and comparing the reference object with the target object to obtain the object class to be determined of the target object.
In one embodiment, the suspected legacy object is an object that is not originally present in the target scene but subsequently appears in the target scene;
The suspected missing object is an object that initially exists in the target scene but does not appear in the target scene later.
The embodiment of the specification provides a shooting device, wherein the shooting device comprises an image acquisition unit and a heterogeneous chip; the heterogeneous chip comprises an embedded neural network processor and a central processing unit;
The image acquisition unit is used for shooting a target scene to obtain a background image and a video file to be detected; the video file to be detected comprises a plurality of frames of images to be detected;
The embedded neural network processor is used for performing splicing operation on the background image and each frame of to-be-detected image to obtain a spliced image corresponding to each frame of to-be-detected image; performing target detection and object comparison processing based on the spliced images to obtain detection results; the detection result comprises an object category to be determined of a target object and a detection position of the target object in each frame of to-be-detected image; the object category to be determined is used for representing that the target object is identified as a suspected left-over object or a suspected lost object; the target object is a legacy object left in the target scene or a lost object lost from the target scene;
the central processing unit is used for determining a tracking result of the target object based on the detection position of the target object in each frame of to-be-detected image; and carrying out positioning detection on the target object according to the object type to be determined of the target object and the tracking result to obtain a positioning result of the target object.
The present specification provides a target object positioning apparatus, the apparatus including:
The target scene shooting module is used for acquiring a background image and a video file to be detected, which are shot aiming at a target scene; the video file to be detected comprises a plurality of frames of images to be detected; the target object is a legacy object left in the target scene or a lost object lost from the target scene;
The spliced image determining module is used for performing splicing operation on the background image and each frame of to-be-detected image to obtain a spliced image corresponding to each frame of to-be-detected image;
the detection result determining module is used for carrying out target detection and object comparison processing based on the spliced image to obtain a detection result; the detection result comprises an object category to be determined of the target object and a detection position of the target object in each frame of to-be-detected image; the object category to be determined is used for representing that the target object is identified as a suspected left-over object or a suspected lost object;
The tracking result determining module is used for determining a tracking result of the target object based on the detection position of the target object in each frame of to-be-detected image;
and the positioning result determining module is used for carrying out positioning detection on the target object according to the object type to be determined of the target object and the tracking result to obtain the positioning result of the target object.
In one embodiment, the tracking result determining module is further configured to, for any frame of to-be-detected image, perform position prediction according to a detection position of the target object in the to-be-detected image before the any frame of to-be-detected image, to obtain a predicted position of the target object in the any frame of to-be-detected image; and carrying out association processing according to the detection position and the prediction position of the target object in any frame of to-be-detected image, and determining the tracking result of the target object.
In one embodiment, the positioning result determining module is further configured to determine, if the object class to be determined is a suspected legacy object, and according to the tracking result, that the target object is not lost from the target scene and that the target object does not move, and use the detection position as the positioning result of the target object.
In one embodiment, the positioning result determining module is further configured to determine, if the object class to be determined is a suspected legacy object, determine that the target object is lost from the target scene according to the tracking result, and determine a subsequent image to be detected within a first preset duration after the target object is lost from the target scene; and continuously executing the step of judging whether the target object is lost from the target scene or not based on the tracking result corresponding to the follow-up to-be-detected image.
In one embodiment, the positioning result determining module is further configured to determine, if the object class to be determined is a suspected lost object, that the target object subsequently exits from the target scene according to the tracking result, and take a detection position of the target object in the background image as a positioning result of the target object; or if the object class to be determined is suspected lost object, judging that the target object is out of bounds from the target scene according to the tracking result, and determining a final position to-be-detected image corresponding to the target object; taking the detection position of the target object in the final position to-be-detected image as a positioning result of the target object; the last position to-be-detected image is an image to be detected corresponding to the last moment when the target object appears in the target scene.
In one embodiment, the positioning result determining module is further configured to, if the object class to be determined is a suspected lost object, determine that the target object is not subsequently out of bounds from the target scene according to the tracking result, and continuously execute the step of determining whether the target object is subsequently out of bounds from the target scene.
In one embodiment, the detection result determining module is further configured to perform target detection based on the stitched image to obtain a reference object included in the background image and a detected object included in each frame of image to be detected, wherein the detected object is the target object; and to compare the reference object with the target object to obtain the object class to be determined of the target object.

The present specification embodiments provide a computer device, including: a memory, and one or more processors communicatively coupled to the memory; the memory has stored therein instructions executable by the one or more processors to cause the one or more processors to implement the steps of the method of any of the embodiments described above.
The present description provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the method according to any of the above embodiments.
The present description provides a computer program product comprising instructions which, when executed by a processor of a computer device, enable the computer device to perform the steps of the method of any one of the embodiments described above.
The present description embodiments provide a heterogeneous chip comprising an embedded neural network processor NPU, a central processing unit CPU, a memory, and a computer program stored in the memory and configured to be executed by the central processing unit CPU and the embedded neural network processor NPU, the central processing unit CPU and the embedded neural network processor NPU implementing the method according to any of the embodiments described above when executing the computer program.
In the above embodiments of the specification, a background image shot for a target scene and a video file to be detected comprising a plurality of frames of images to be detected are first obtained; a stitching operation is then performed on the background image and each frame of image to be detected to obtain a stitched image corresponding to each frame; target detection and object comparison processing are then performed on the stitched images to obtain the object class to be determined of the target object (a suspected left-behind object or a suspected lost object) and the detection position of the target object in each frame of image to be detected; finally, a tracking result of the target object is determined based on its detection position in each frame, and positioning detection is performed on the target object according to its object class to be determined and the tracking result to obtain a positioning result. In this way, the problem of detecting left-behind and lost items is converted into a neural-network target detection problem: the neural network detects the multiple frames of images to be detected in the video file to obtain a tracking result for suspected left-behind or suspected lost objects; based on the tracking result, the object class to be determined is verified and confirmed as a left-behind object or a lost object, and the corresponding positioning result is output. Because "suspected left-behind object" and "suspected lost object" are the two classes of the detection model, the method generalizes to object types that did not take part in model training, and false detections caused by factors such as illumination changes and environmental changes can be reduced.
Drawings
Fig. 1a is a schematic view of a scenario of a target object positioning method according to an embodiment of the present disclosure;
fig. 1b is a schematic flow chart of a target object positioning method according to an embodiment of the present disclosure;
fig. 1c is a schematic diagram of a model for determining a detection result according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a tracking result of a determined target object according to an embodiment of the present disclosure;
fig. 3 is a schematic flow chart of whether a suspected legacy object is lost from a target scene according to an embodiment of the present disclosure;
fig. 4 is a schematic flow chart of determining an object class to be determined of a target object according to an embodiment of the present disclosure;
Fig. 5 is a schematic flow chart of a target object positioning method according to an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a target object positioning method according to an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a target object positioning device according to an embodiment of the present disclosure;
fig. 8 is an internal configuration diagram of a computer device according to an embodiment of the present disclosure.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative and intended to explain the present invention and should not be construed as limiting the invention.
Detection of left-behind and lost items is an important function in intelligent monitoring systems. In crowded places such as highways, tunnels, subway stations, high-speed rail stations and stadiums, it helps to discover unknown left-behind items or to recover lost items, thereby addressing potential safety hazards.
Left-behind and lost item detection is a fundamental function of intelligent video surveillance. In the related art, background modeling (for example, Gaussian mixture modeling, or dual-background modeling with two learning rates) may be performed on the monitored scene by different methods; the background model is compared with the real-time feed to find suspicious objects, and the target position is then determined through subsequent morphological operations. Alternatively, optical flow may be computed for the monitored scene to identify moving objects, which are then tracked continuously: when a moving object changes from moving to stationary it may be judged a left-behind item, and when it changes from stationary to moving it may be judged a lost item. A further approach treats left-behind and lost items as specific targets, learns their features automatically with a convolutional neural network and regresses the target position, thereby detecting left-behind and lost items.
However, background modeling is susceptible to changes in illumination and the environment, leading to false positives and missed detections. Common neural-network-based schemes cannot handle scenes containing unknown target classes and are difficult to run in real time on edge devices.
On this basis, the present embodiment provides a target object positioning method. First, a background image shot for a target scene and a video file to be detected comprising a plurality of frames of images to be detected are obtained; then, a stitching operation is performed on the background image and each frame of image to be detected to obtain a stitched image corresponding to each frame; next, target detection and object comparison processing are performed on the stitched images to obtain the object class to be determined of the target object (a suspected left-behind object or a suspected lost object) and the detection position of the target object in each frame of image to be detected; finally, a tracking result of the target object is determined based on its detection position in each frame, and positioning detection is performed on the target object according to its object class to be determined and the tracking result to obtain a positioning result of the target object.
This embodiment gives left-behind and lost item detection a high detection rate and good timeliness. If the model is trained with image data captured under different illumination and in different environments, its robustness improves and false detections caused by occlusion, illumination, shadow changes, camera shake and similar factors can be reduced. Further, using "suspected left-behind object" and "suspected lost object" as the two categories of the detection model lets the method generalize to items that did not take part in training, improving the generalization of the model. Confirming the tracking result and the positioning result is separated from detecting the left-behind and lost objects themselves, so the problem of detecting left-behind and lost items is converted into an ordinary neural-network detection problem.
The present embodiments provide a scenario example of the target object positioning method. A shooting device shoots scene A and obtains a background image B and a video file V. The video file V includes images to be detected P1, P2, P3, P4 and P5. The background image B includes an item object M and an item object N. The shooting device includes an image acquisition unit and a heterogeneous chip, and the heterogeneous chip includes an NPU core (Neural-network Processing Unit, an embedded neural network processor) and a CPU core (Central Processing Unit). A neural network detection model is deployed on the NPU core and performs target detection on the background image B and the images to be detected in the video file V; its detection categories include suspected left-behind objects and suspected lost objects. Specifically, referring to fig. 1a, the background image B and the image to be detected P1 are stitched to obtain a stitched image BP1. Target detection and object comparison processing are performed on the stitched image BP1 to obtain a detection result D1. The detection result D1 includes: the detection category (suspected lost object) and detection position BM1 of the detected object M in the image to be detected P1, and the detection category and detection position BN1 of the detected object N.
The background image B and the image to be detected P2 are stitched to obtain a stitched image BP2, and target detection and object comparison processing are performed on it to obtain a detection result D2, which includes: the detection category (suspected left-behind object) and detection position BL2 of the detected object L in the image to be detected P2, the detection category (suspected lost object) and detection position BM2 of the detected object M, and the detection position BN1 of the detected object N.
The background image B and the image to be detected P3 are stitched to obtain a stitched image BP3, and target detection and object comparison processing are performed on it to obtain a detection result D3, which includes: the detection category (suspected left-behind object) and detection position BL2 of the detected object L in the image to be detected P3, the detection category (suspected lost object) and detection position BM3 of the detected object M, and the detection position BN1 of the detected object N.
The background image B and the image to be detected P4 are stitched to obtain a stitched image BP4, and target detection and object comparison processing are performed on it to obtain a detection result D4, which includes: the detection category (suspected left-behind object) and detection position BL2 of the detected object L in the image to be detected P4, and the detection position BN1 of the detected object N.
The background image B and the image to be detected P5 are stitched to obtain a stitched image BP5, and target detection and object comparison processing are performed on it to obtain a detection result D5, which includes: the detection category (suspected left-behind object) and detection position BL2 of the detected object L in the image to be detected P5, and the detection position BN1 of the detected object N.
The detected object M is a suspected lost object. Prediction is performed based on the detection positions BM1 and BM2 to obtain the predicted position BM3' of the detected object M. Association tracking is performed based on the detection position BM3 of the detected object M in the image to be detected P3 and the predicted position BM3'; the same operation on the images to be detected P4 and P5 yields the tracking result of the detected object M. Positioning detection is performed on the detected object M based on its tracking result and its detection category (suspected lost object), and the detection position BM3 of the detected object M, or the position BM0 of the item object M in the background image B, can be output.
The detected object L is a suspected left-behind object. Prediction is performed based on the detection position BL2 in the image to be detected P2 and the detection position BL2 in the image to be detected P3 to obtain the predicted position BL4' of the detected object L. Association tracking is performed based on the detection position BL2 of the detected object L in the image to be detected P4 and the predicted position BL4'; the same operation on the images to be detected P4 and P5 yields the tracking result of the detected object L. Positioning detection is performed on the detected object L based on its tracking result and its detection category (suspected left-behind object), and the detection position BL2 of the detected object L is output.
The embodiment of the present disclosure provides a target object positioning method, referring to fig. 1b, the method may include the following steps:
S110, acquiring a background image shot for a target scene and a video file to be detected.
The video file to be detected comprises a plurality of frames of images to be detected. The target object is a legacy object that is left in the target scene or a lost object that is lost from the target scene. The background image may be an image captured for the target scene without the object, or may be an image captured for the target scene including at least one object. The shooting time of the background image may be earlier than the shooting time of the video file to be detected.
Specifically, a server locally stores a plurality of initial video files obtained by shooting the target scene, acquires a background image of the target scene from the initial video files, and clips the initial video files to obtain the video file to be detected. It should be noted that the shooting time of the background image may be earlier than that of the video file to be detected. In other embodiments, a shooting device may shoot the target scene to obtain the background image and the video file to be detected. The captured background image and video file to be detected may be stored locally on the shooting device, or transmitted to a server for storage, so that the background image and video file to be detected shot for the target scene are obtained. The shooting device may be, for example, at least one of a video camera and a fisheye camera.
And S120, performing stitching operation on the background image and each frame of to-be-inspected image to obtain a stitched image corresponding to each frame of to-be-inspected image.
S130, performing target detection and object comparison processing based on the spliced image to obtain a detection result.
The detection result includes the object class to be determined of the target object and the detection position of the target object in each frame of image to be detected. The object class to be determined is used to represent that the target object is identified as a suspected left-behind object or a suspected lost object. The suspected left-behind object may be a target object that does not exist in the background image but appears in the image to be detected. The suspected lost object may be a target object that exists in the background image but disappears or changes position in the image to be detected. The detection position is the position data of the target object in the image to be detected.
In some cases, by performing target detection and object comparison processing on the stitched image and determining whether the target object is a suspected left-behind object or a suspected lost object, the method can generalize to items that did not take part in training. Through the comparison processing, it can be determined which objects in the background image and the image to be detected are the same target object, which target objects newly appear in the image to be detected, and which target objects have disappeared from or moved within the image to be detected, thereby obtaining the object class to be determined of the target object.
Specifically, the background image and each frame of image to be detected are stitched to obtain a stitched image corresponding to each frame. The stitched image corresponding to each frame is used as the input of the target detection and object comparison processing model, which outputs the detection position of the target object in each frame of image to be detected and whether the target object is a suspected left-behind object or a suspected lost object. It should be noted that steps S120 and S130 are implemented with a neural network detection model, so that the problem of detecting left-behind and lost items is converted into an ordinary neural-network detection problem.
For example, referring to fig. 1c, the background image 102 may be an image with three channels (R, G, B), height h and width w, i.e. 3×h×w. The image to be detected 104 may likewise be a 3×h×w image. The background image 102 and the image to be detected 104 are stitched through a concat (channel concatenation) operation to obtain the 6×h×w stitched image 106 corresponding to the image to be detected 104. The target detection and object comparison processing model may be a YOLOv8-nano-based detection model. The 6×h×w stitched image is used as the input of the YOLOv8-nano detection model 108 to obtain a detection result 110. The detection result 110 may include the detection position a1 of object a in the image to be detected 104 together with object a being a suspected left-behind object, and the detection position B1 of object B in the background image 102 together with object B being a suspected lost object.
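As a hedged illustration of the stitching step above, the following Python sketch concatenates a 3×h×w background image and a 3×h×w frame into the 6×h×w input described in fig. 1c; the array layout and the `detector` call are assumptions introduced for illustration, not taken from the patent.

```python
import numpy as np

def stitch_background_and_frame(background: np.ndarray, frame: np.ndarray) -> np.ndarray:
    """Channel-wise concat of a 3xHxW background image and a 3xHxW frame into a 6xHxW input."""
    assert background.shape == frame.shape and background.shape[0] == 3
    return np.concatenate([background, frame], axis=0)

# Usage (illustrative): feed the stitched tensor to a detector whose first
# convolution accepts 6 input channels, e.g. a YOLOv8-nano variant retrained
# with the two classes "suspected left-behind" and "suspected lost".
# stitched = stitch_background_and_frame(background_chw, frame_chw)
# detections = detector(stitched[None, ...])   # add a batch dimension
```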
S140, determining a tracking result of the target object based on the detection position of the target object in each frame of the image to be detected.
And S150, positioning and detecting the target object according to the object type to be determined and the tracking result of the target object to obtain a positioning result of the target object.
The tracking result may be a sequence composed of detection positions of the target object in each frame of the image to be detected based on time sequence. The motion path of the target object can be determined according to the tracking result. The positioning result may be a detection position of the suspected legacy object in the last frame of the to-be-detected image in the to-be-detected video file, and may be a detection position of the suspected lost object in the background image. After verifying the object class to be determined (suspected legacy object or suspected missing object) of the target object based on the tracking result, a positioning result of the target object being a legacy object and/or a positioning result of the target object being a missing object may be output.
In some cases, analysis tracking of the target object can be achieved according to the tracking result of the target object, and the motion path of the target object is determined. It is further verified whether the target object is a legacy object or a missing object by the motion path of the target object.
Specifically, the detection positions of the target object in each frame of image to be detected are arranged in time order to form a position sequence, which can be taken as the tracking result of the target object. Positioning detection is performed on the target object using its object class to be determined and the tracking result: when the final result of the target object is determined to be a left-behind object, its detection position is taken as the positioning result; when the final result of the target object is determined to be a lost object, its position in the background image is taken as the positioning result.
The detection position of the target object in each frame of the image to be detected is tracked by utilizing a multi-target tracking module, so that a tracking result of the target object is obtained. And then, inputting the tracking result of the target object into a state machine logic application module, and carrying out positioning detection on the target object by the state machine logic application module according to the object type to be determined and the tracking result of the target object to obtain the positioning result of the target object.
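The division of work between the detection model, the multi-target tracking module and the state-machine logic module can be pictured with the minimal sketch below; all module names and interfaces here are assumptions used only to show the data flow, not the patent's actual implementation.

```python
def locate_target_objects(background, frames, detector, tracker, state_machine):
    """Per-frame loop: detect on the stitched image, track, then apply state-machine logic."""
    positioning_results = []
    for frame in frames:
        stitched = stitch_background_and_frame(background, frame)
        detections = detector(stitched)                  # classes + boxes for this frame
        tracks = tracker.update(detections)              # associate with predicted positions
        positioning_results.extend(state_machine.step(tracks))  # confirm left-behind / lost
    return positioning_results
```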
In the above embodiment, the problem of detecting left-behind and lost items is converted into a neural-network target detection problem: the neural network detects the multiple frames of images to be detected in the video file to obtain a tracking result for suspected left-behind or suspected lost objects; based on the tracking result, the object class to be determined is verified and confirmed as a left-behind object or a lost object, and the corresponding positioning result is output. Because "suspected left-behind object" and "suspected lost object" are the two classes of the detection model, the method generalizes to object types that did not take part in model training, and false detections caused by factors such as illumination changes and environmental changes can be reduced.
In some embodiments, referring to fig. 2, before determining the tracking result of the target object based on the detected position of the target object in each frame of the image to be inspected, the method may include the steps of:
S210, carrying out position prediction on any frame of to-be-detected image according to the detection position of the target object in the to-be-detected image before any frame of to-be-detected image, and obtaining the predicted position of the target object in any frame of to-be-detected image.
Specifically, any frame of image to be inspected is determined in the video file. And predicting the position of the target object in the to-be-detected image according to the detection position of the target object in each frame of to-be-detected image before the to-be-detected image aiming at the to-be-detected image, so as to obtain the predicted position of the target object in the to-be-detected image.
Illustratively, the video file to be detected includes five frames of images to be detected. For the third frame, position prediction is performed according to the detection position L1 of the target object in the first frame and the detection position L2 of the target object in the second frame, obtaining the predicted position L3' of the target object in the third frame.
In some embodiments, the predicted position of the target object in any frame of image to be detected is obtained through Kalman filtering, based on the detection positions of the target object in the images to be detected before that frame. It should be noted that if the image to be detected is the first frame of the video file to be detected, the detection position of the target object in that frame may be taken as its predicted position.
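A minimal constant-velocity Kalman predict step is sketched below to illustrate how the detection positions of earlier frames can yield a predicted position; the state layout (box centre plus velocity) and the noise value are assumptions, since the text only states that Kalman filtering is used.

```python
import numpy as np

# State x = [cx, cy, vx, vy]: box centre plus per-frame velocity.
F = np.array([[1., 0., 1., 0.],
              [0., 1., 0., 1.],
              [0., 0., 1., 0.],
              [0., 0., 0., 1.]])   # constant-velocity transition

def kalman_predict(x: np.ndarray, P: np.ndarray, q: float = 1e-2):
    """Predict the target's position in the next frame from its previous state."""
    x_pred = F @ x                        # predicted centre and velocity
    P_pred = F @ P @ F.T + q * np.eye(4)  # inflate uncertainty by process noise
    return x_pred, P_pred
```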
Determining a tracking result of the target object based on the detection position of the target object in each frame of to-be-detected image comprises the following steps:
s220, performing association processing according to the detection position and the prediction position of the target object in any frame of to-be-detected image, and determining the tracking result of the target object.
The association processing may be processing and prediction performed by finding the correlation or interdependence between the detection position and the predicted position in the image to be detected.
In some cases, the accuracy of the tracking result of the target object can be improved by performing association processing according to the detection position and the prediction position of the target object in any frame of the image to be detected.
Specifically, the association processing is performed according to the detection position and the prediction position of the target object in any frame of the to-be-detected image, and whether the association relationship exists between the detection position and the prediction position of the target object in the to-be-detected image is determined. When the association relation exists between the detection position and the prediction position of the target object in the image to be detected, the detection position of the target object in the image to be detected is used as a tracking result. And continuing to perform association processing according to the detection position and the prediction position of the target object in the to-be-detected image of the next frame of the to-be-detected image of any frame, and determining the tracking result of the target object. And arranging the obtained tracking results of the target object to construct a position sequence which can be used as the tracking result of the target object.
Illustratively, the detection position of the target object Q in the image to be detected H1 is Q1, and its predicted position in H1 is Q1'. Association processing of the detection position Q1 and the predicted position Q1' determines that an association exists between them, and the detection position Q1 is taken as part of the tracking result of the target object. In the next frame H2, the detection position of the target object Q is Q2 and its predicted position is Q2'; association processing determines that an association also exists between Q2 and Q2', and Q2 is taken as part of the tracking result. Therefore, [Q1, Q2] can be taken as the tracking result of the target object Q.
In the above embodiment, for any frame of to-be-detected image, position prediction is performed according to a detection position of the target object in the to-be-detected image before any frame of to-be-detected image, a predicted position of the target object in any frame of to-be-detected image is obtained, association processing is performed according to the detection position and the predicted position of the target object in any frame of to-be-detected image, and a tracking result of the target object is determined. The detection precision between the detection position and the prediction position can be further improved, and an accurate data basis is provided for the next step of positioning detection of the target object according to the object type to be determined and the tracking result of the target object, and the positioning result of the target object is obtained.
In some embodiments, the determining the tracking result of the target object according to the detection position and the prediction position of the target object in any frame of the to-be-detected image through association processing may include: and carrying out association processing according to the cross ratio cost matrix between the boundary box corresponding to the detection position and the boundary box corresponding to the prediction position, and determining the tracking result of the target object. Or carrying out association processing according to the color information cost matrix between the boundary box corresponding to the detection position and the boundary box corresponding to the prediction position, and determining the tracking result of the target object.
The intersection-over-union (IoU) cost matrix records the IoU value between each detected bounding box and each predicted bounding box and uses it as a cost; with this cost matrix, the cost between the bounding box corresponding to each detection position and the bounding box corresponding to each predicted position can be computed and the best match found. The color information cost matrix is a tool for association processing that measures the degree of matching between colors by comparing their similarity; when it is used for association, the color of an object is compared with the colors of candidate boxes and matching is performed according to the degree of similarity. Association processing refers to processing and analysing the relationship and correlation between different data.
Specifically, the overlapping area between the bounding box corresponding to each detection position and the bounding box corresponding to each predicted position is computed, an IoU cost matrix is constructed from these overlaps, and the association between detection and prediction bounding boxes is determined by minimizing the total cost. When the bounding box of the target object's detection position is determined to be associated with the bounding box of its predicted position, the detection position of the target object is taken as part of its tracking result.
And calculating the similarity between each color and other colors to form a color information cost matrix between the boundary frame corresponding to the detection position and the boundary frame corresponding to the prediction position. Then, the minimum weight matching algorithm is used to find the optimal color matching scheme in the color information cost matrix so as to realize the association processing between the bounding box corresponding to the detection position and the bounding box corresponding to the prediction position. And when the boundary box corresponding to the target object position is determined to be associated with the boundary box corresponding to the predicted position, taking the detection position of the target object as the tracking result of the target object.
For example, the association between the bounding box corresponding to the detection position and the bounding box corresponding to the predicted position, based on either the IoU cost matrix or the color information cost matrix, may be implemented using the Hungarian algorithm.
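The IoU-cost association with the Hungarian algorithm mentioned above might look like the sketch below; the (x1, y1, x2, y2) box format and the 1 − IoU cost are conventional assumptions, and a color-histogram distance could be substituted as the cost in the same way.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment  # Hungarian algorithm

def iou(a, b):
    """Intersection over union of two boxes in (x1, y1, x2, y2) format."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(detected_boxes, predicted_boxes):
    """Match detections to predicted track positions by minimising 1 - IoU."""
    cost = np.array([[1.0 - iou(d, p) for p in predicted_boxes] for d in detected_boxes])
    det_idx, pred_idx = linear_sum_assignment(cost)
    return list(zip(det_idx.tolist(), pred_idx.tolist()))
```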
In the above embodiment, the tracking result of the target object is determined by performing association processing according to the intersection ratio cost matrix or the color information cost matrix between the bounding box corresponding to the detection position and the bounding box corresponding to the prediction position, so that the detection precision between the detection position and the prediction position can be further improved, and an accurate data basis is provided for positioning and detecting the target object according to the object category to be determined and the tracking result of the target object in the next step, so as to obtain the positioning result of the target object.
In some embodiments, performing positioning detection on the target object according to the object type to be determined and the tracking result of the target object in each frame of to-be-detected image to obtain a positioning result of the target object may include: and if the object class to be determined is a suspected left-behind object, judging that the target object is not lost from the target scene and the target object does not move according to the tracking result, and taking the detection position as a positioning result of the target object.
Specifically, when the object class to be determined of the target object is a suspected legacy object, it is determined from the tracking result that the target object can always be associated in all the to-be-detected images of the to-be-detected video file after the to-be-detected image in which the target object was determined to be a suspected legacy object, so it is determined that the target object is not lost from the target scene. It is further determined from the tracking result that the detection position of the target object never moves, and any detection position in the tracking result of the target object is taken as the positioning result of the target object.
Illustratively, the to-be-detected video file includes five frames of to-be-detected images. The object class to be determined of the target object L is determined to be a suspected legacy object in the third frame of to-be-detected image, and the tracking result of the target object L may be [L1, L1, L1]. According to the tracking result, the target object L is always associated in the remaining to-be-detected images and its detection position is always L1; therefore, it can be determined that the target object L is not lost from the target scene and has not moved. The detection position L1 in the tracking result is taken as the positioning result of the target object L.
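The rule applied in this example can be condensed into a few lines. The sketch below assumes that the tracking result is a list of per-frame detection positions, with None marking a frame in which the target object could not be associated; this representation is an illustrative assumption.

def locate_left_behind(tracking_result):
    # Lost from the scene at some point: no positioning result from this rule.
    if any(pos is None for pos in tracking_result):
        return None
    # The object moved, so it is not treated as a stationary legacy object.
    if len(set(tracking_result)) != 1:
        return None
    # Not lost and never moved: any (identical) detection position will do.
    return tracking_result[0]

# For the example above, locate_left_behind(["L1", "L1", "L1"]) returns "L1".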
In the above embodiment, if the object class to be determined is a suspected legacy object, and it is determined according to the tracking result that the target object is not lost from the target scene and has not moved, the detection position is taken as the positioning result of the target object. By determining the positioning result of the legacy object, the location of the target object can be determined so that the owner of the legacy object can be found later.
In some embodiments, referring to fig. 3, positioning and detecting the target object according to the object type to be determined and the tracking result in each frame of the image to be detected, to obtain the positioning result of the target object may include the following steps:
S310, if the object class to be determined is a suspected legacy object, and it is determined according to the tracking result that the target object is lost from the target scene, determining the subsequent to-be-detected images within a first preset duration after the target object is lost from the target scene.
S320, continuing to execute the step of judging whether the target object is lost from the target scene or not based on the tracking result corresponding to the subsequent to-be-detected image.
The first preset duration may be customized according to actual requirements; for example, it may be five minutes or ten minutes.
Specifically, when the object class to be determined of the target object is a suspected legacy object, it is determined from the tracking result that the detection positions in the tracking result are all the same and that the to-be-detected image corresponding to the last detection position is not the last frame of the to-be-detected video file; it can therefore be determined that the target object is no longer associated in the to-be-detected images after the one corresponding to the last detection position. When the target object is no longer associated, the state of the target object in the to-be-detected image corresponding to the last detection position is set to lost, and the target object is judged to be lost from the target scene. The subsequent to-be-detected images within the first preset duration after the target object is lost from the target scene are then determined according to the frame rate of the to-be-detected video file. Association processing is performed between the tracking results corresponding to these subsequent to-be-detected images and the detection position that was set to the lost state. If the target object is not associated in any subsequent to-be-detected image within the first preset duration, the target object is considered to be lost from the target scene. If the target object is associated again within the first preset duration, whether it has moved is judged from its tracking result in the subsequent to-be-detected images: if the target object has not moved, it is considered not to be lost from the target scene and remains a suspected legacy object; if it has moved, it is judged to be lost from the target scene. This operation is repeated until the last frame among the subsequent to-be-detected images within the first preset duration.
In the above embodiment, if the object type to be determined is a suspected legacy object, and the target object is determined to be lost from the target scene according to the tracking result, a subsequent to-be-detected image within a first preset duration after the target object is lost from the target scene is determined, and the step of determining whether the target object is lost from the target scene is continuously performed based on the tracking result corresponding to the subsequent to-be-detected image. By determining whether the target object is lost from the target scene, the accuracy of detecting suspected legacy objects can be improved.
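One possible reading of the re-check described above is sketched below. It assumes that the number of subsequent frames to inspect is derived from the video frame rate and the first preset duration, and that the subsequent tracking result is again a list of per-frame positions with None for frames in which the object is not associated; the condensed decision rule here is an assumption, not the only behavior covered by the embodiment.

def recheck_suspected_left_behind(subsequent_positions, last_position,
                                  frame_rate, preset_seconds):
    # Number of subsequent to-be-detected frames inside the first preset duration.
    window = int(frame_rate * preset_seconds)
    positions = subsequent_positions[:window]
    if all(pos is None for pos in positions):
        return "lost"          # never re-associated within the window
    if any(pos is not None and pos != last_position for pos in positions):
        return "lost"          # re-associated but moved: judged lost from the scene
    return "left_behind"       # re-associated without moving: still a suspected legacy object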
In some embodiments, performing positioning detection on the target object according to the object type to be determined and the tracking result of the target object in each frame of to-be-detected image to obtain a positioning result of the target object may include:
if the object class to be determined is a suspected lost object, judging that the target object is out of range from the target scene according to the tracking result, and taking the detection position of the target object in the background image as the positioning result of the target object. Or alternatively
If the object class to be determined is a suspected lost object, and it is determined according to the tracking result that the target object moves out of bounds from the target scene, determining the last-position to-be-detected image corresponding to the target object, and taking the detection position of the target object in the last-position to-be-detected image as the positioning result of the target object.
The last position to-be-detected image is the to-be-detected image corresponding to the last moment when the target object appears in the target scene.
Specifically, when the object class to be determined of the target object is a suspected lost object, it is determined according to the tracking result that the target object is not associated in the last frame of to-be-detected image of the to-be-detected video file, and the target object is therefore judged to have moved out of bounds from the target scene. When the target object is out of bounds, the detection position of the target object in the background image is taken as the positioning result of the target object.

Alternatively, when the object class to be determined of the target object is a suspected lost object and the target object is judged, according to the tracking result, to have moved out of bounds from the target scene, the to-be-detected image in which the target object was last associated (the last-position to-be-detected image) is determined, and the detection position of the target object in that image is taken as the positioning result of the target object.
In the above embodiment, when the target object is out of bounds, either the detection position of the target object in the background image or the detection position of the target object in the last-position to-be-detected image may be taken as the positioning result of the target object, so that the lost object can be retrieved later.
In some embodiments, performing positioning detection on the target object according to the object class to be determined and the tracking result of the target object in each frame of to-be-detected image to obtain the positioning result of the target object may include: if the object class to be determined is a suspected lost object, and it is determined according to the tracking result that the target object has not yet moved out of bounds from the target scene, continuing to perform the step of determining whether the target object subsequently moves out of bounds from the target scene.
Specifically, when the object class to be determined of the target object is a suspected lost object, it is determined according to the tracking result that the target object has not been associated up to the current to-be-detected image. It can also be determined from the tracking result that the to-be-detected video file still contains uninspected to-be-detected images after the current one, so the target object cannot yet be judged to have moved out of bounds. The next frame after the current to-be-detected image is then taken as the subsequent to-be-detected image, the target object continues to be tracked based on that image, and whether the target object has moved out of bounds is determined from the tracking result. If the target object has not moved out of bounds, this operation is repeated until the subsequent to-be-detected image is the last frame of the to-be-detected video file.
In the above embodiment, if the object type to be determined is a suspected lost object, and it is determined that the target object is not subsequently out of bounds from the target scene according to the tracking result, the step of determining whether the target object is subsequently out of bounds from the target scene is continuously performed. By determining whether the suspected missing object is out of bounds, the final object class of the target object may be determined.
In some embodiments, referring to fig. 4, a background image includes a number of reference objects. Performing target detection and object comparison processing based on the spliced image to obtain a detection result, and may include the following steps:
S410, performing target detection based on the stitched image to obtain the reference objects contained in the background image and the detected objects contained in each frame of to-be-detected image.
S420, comparing the reference object with the target object to obtain the object class to be determined of the target object.
The reference object is an article object detected from the background image, and the detected object is an article object detected from the to-be-detected image. The target object is determined from the result of comparing the reference objects with the detected objects. For example, the target object is an object that does not appear in both the background image and the to-be-detected image: an object present in the background image but absent from the to-be-detected image is a target object, and an object present in the to-be-detected image but absent from the background image is also a target object.
Specifically, target detection is performed on the stitched image corresponding to any frame of to-be-detected image, the bounding boxes corresponding to the reference objects are output, and the reference objects included in the background image are determined from these bounding boxes. Likewise, the bounding boxes corresponding to the detected objects are output, and the detected objects included in that frame of to-be-detected image are determined from these bounding boxes. At the same time, the target detection on the stitched image outputs the reference position of each reference object and the detection position of each detected object. The reference objects included in the background image are then compared and matched with the detected objects included in the frame to find their similarities and differences and to determine whether a reference object and a detected object are the same object. When a reference object and a detected object are the same object but the reference position differs from the detection position, that is, the object has moved, the object can be regarded as a suspected lost object. When a reference object is not matched to any detected object, the reference object can be regarded as a suspected lost object. When a detected object is not matched to any reference object, the detected object can be regarded as a suspected legacy object.
For example, target detection is performed on the stitched image corresponding to a given frame of to-be-detected image, so that the reference objects Y1 and Y2 included in the background image and the detected objects Y3 and Y4 included in that frame can be determined. At the same time, the reference position W1 of the reference object Y1, the reference position W2 of the reference object Y2, the detected position W3 of the detected object Y3, and the detected position W4 of the detected object Y4 can be determined. Comparing the reference objects Y1 and Y2 with the detected objects Y3 and Y4, it can be determined that the reference object Y1 and the detected object Y3 are the same object. When the reference position W1 of the reference object Y1 differs from the detected position W3 of the detected object Y3, the object can be regarded as a suspected lost object. The reference object Y2 is not matched to any detected object, so the reference object Y2 can be regarded as a suspected lost object. The detected object Y4 does not match any reference object, so the detected object Y4 can be regarded as a suspected legacy object.
In some embodiments, features of the reference object and the detected object, such as color, shape, size, are extracted. The similarity between the reference object and the detected object is calculated from their features. Items with high similarity can be matched by using a similarity measurement method (such as cosine similarity and Euclidean distance), and a matching result is output.
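A minimal sketch of this comparison is given below, assuming each article is described by a small feature vector (for example, mean color plus box width and height) and that a cosine similarity above a threshold counts as the same object; the feature choice and the 0.9 threshold are illustrative assumptions.

import numpy as np

def cosine_similarity(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0

def match_reference_to_detected(reference_feats, detected_feats, threshold=0.9):
    # Returns (reference_index, detected_index) pairs judged to be the same object.
    matches = []
    for i, ref in enumerate(reference_feats):
        sims = [cosine_similarity(ref, det) for det in detected_feats]
        if sims and max(sims) >= threshold:
            matches.append((i, int(np.argmax(sims))))
    # Unmatched reference objects are suspected lost objects; unmatched detected
    # objects are suspected legacy objects, as in the Y1-Y4 example above.
    return matches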
In the above embodiment, target detection is performed based on the stitched image to obtain the reference objects included in the background image and the detected objects included in each frame of to-be-detected image, and the reference objects are compared with the target object to obtain the object class to be determined of the target object. Determining the object class to be determined of the target object makes it possible to subsequently track the target object and determine its positioning result.
In some implementations, a suspected legacy object is an object that was not originally present in the target scene but subsequently appears in the target scene; a suspected lost object is an object that initially exists in the target scene but does not subsequently appear in the target scene.

Illustratively, the target scene initially includes an item object X1; an item object X2 subsequently appears in the target scene and the item object X1 disappears. The object X1 may be considered a suspected lost object, and the object X2 may be considered a suspected legacy object.
The embodiment of the specification also provides shooting equipment, wherein the shooting equipment comprises an image acquisition unit and a heterogeneous chip; the heterogeneous chip comprises an embedded neural network processor and a central processing unit;
The image acquisition unit is used for shooting the target scene to obtain a background image and a video file to be detected; the video file to be detected comprises a plurality of frames of images to be detected.
The embedded neural network processor is used for performing splicing operation on the background image and each frame of to-be-detected image to obtain a spliced image corresponding to each frame of to-be-detected image; performing target detection and object comparison processing based on the spliced image to obtain a detection result; the detection result comprises an object category to be determined of the target object and a detection position of the target object in each frame of image to be detected; the object class to be determined is used for representing that the target object is identified as a suspected left-over object or a suspected lost object; the target object is a legacy object that is left in the target scene or a lost object that is lost from the target scene.
The central processing unit is used for determining a tracking result of the target object based on the detection position of the target object in each frame of to-be-detected image; and carrying out positioning detection on the target object according to the object type to be determined and the tracking result of the target object to obtain the positioning result of the target object.
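The stitching (splicing) operation is not pinned down further in the text; a minimal sketch is given below under the assumption that the background image and the current to-be-detected frame are concatenated along the channel axis to form the dual input of the detection model. The channel-wise layout is an illustrative assumption; a side-by-side spatial concatenation would serve the same purpose.

import numpy as np

def stitch(background, frame):
    # Stack an HxWx3 background and an HxWx3 frame into a single HxWx6 input.
    assert background.shape == frame.shape, "background and frame must share a size"
    return np.concatenate([background, frame], axis=-1)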
The embodiment of the specification also provides a target object positioning method. For example, referring to fig. 5, the target object positioning method may include the steps of:
S502, obtaining a background image and a video file to be detected, wherein the background image and the video file to be detected are shot aiming at a target scene.
The video file to be detected comprises a plurality of frames of images to be detected; the target object is a legacy object left in the target scene or a lost object lost from the target scene; suspected legacy objects are objects that were not originally present in the target scene but subsequently appear in the target scene; a suspected missing object is an object that initially exists in the target scene but does not subsequently appear in the target scene.
S504, performing stitching operation by using the background image and each frame of to-be-inspected image to obtain a stitched image corresponding to each frame of to-be-inspected image.
S506, performing target detection and object comparison processing based on the spliced image to obtain a detection result.
The detection result comprises an object category to be determined of the target object and a detection position of the target object in each frame of image to be detected; the object class to be determined is used for representing that the target object is identified as a suspected legacy object or a suspected lost object.
Specifically, target detection is performed based on the stitched image, so as to obtain a reference object included in the background image and a detected object included in each frame of to-be-detected image. Wherein the detected object is a target object; and comparing the reference object with the target object to obtain the object class to be determined of the target object.
S508, aiming at any frame of to-be-detected image, carrying out position prediction according to the detection position of the target object in the to-be-detected image before any frame of to-be-detected image, and obtaining the predicted position of the target object in any frame of to-be-detected image.
S510, performing association processing according to the intersection-over-union (IoU) cost matrix between the bounding box corresponding to the detection position and the bounding box corresponding to the prediction position to determine the tracking result of the target object.
S512, if the object class to be determined is a suspected legacy object, and it is determined according to the tracking result that the target object is not lost from the target scene and has not moved, taking the detection position as the positioning result of the target object.
S514, if the object class to be determined is a suspected legacy object, and it is determined according to the tracking result that the target object is lost from the target scene, determining the subsequent to-be-detected images within the first preset duration after the target object is lost from the target scene.
S516, continuing to execute the step of judging whether the target object is lost from the target scene or not based on the tracking result corresponding to the subsequent to-be-detected image.
S518, if the object class to be determined is a suspected lost object, and it is determined according to the tracking result that the target object moves out of bounds from the target scene, taking the detection position of the target object in the background image as the positioning result of the target object.
The last position to-be-detected image is the to-be-detected image corresponding to the last moment when the target object appears in the target scene.
S520, if the object class to be determined is the suspected lost object, and the target object is determined to be not out of bounds from the target scene according to the tracking result, the step of determining whether the target object is out of bounds from the target scene is continuously executed.
Referring to fig. 6, the article left-behind and loss detection algorithm mainly comprises three modules: a dual-input neural network detection model, a multi-target tracking module, and a state machine logic application module. The dual-input neural network detection model takes an image frame from the acquired video file and a video background frame as its two inputs, extracts feature information, and preliminarily obtains the position information of suspected legacy objects and suspected lost objects. The multi-target tracking module analyzes and tracks the suspected legacy object and suspected lost object data to determine the object paths. The state machine logic application module adds a searching state, an unknown state, and other states on top of the multi-target tracking, further improves the detection accuracy for suspected legacy and suspected lost objects, and determines the final positions of the legacy objects and the lost objects.
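The state machine logic can be sketched as a small per-object state set. The text names a searching state and an unknown state; the remaining state names and the transition rule below are illustrative assumptions.

from enum import Enum, auto

class ObjectState(Enum):
    TRACKED = auto()     # associated in the current frame
    SEARCHING = auto()   # not associated, still within the first preset duration
    UNKNOWN = auto()     # ambiguous evidence, e.g. occlusion; not exercised below
    CONFIRMED = auto()   # final left-behind / lost decision reached

def update_state(associated, frames_missed, max_missed):
    # One transition step of the per-object state machine.
    if associated:
        return ObjectState.TRACKED
    if frames_missed <= max_missed:
        return ObjectState.SEARCHING   # keep looking during the preset duration
    return ObjectState.CONFIRMED       # never reappeared: confirm the decision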
Referring to fig. 7, a target object positioning device 700 is provided in the embodiment of the present disclosure, the target object positioning device 700 includes: the system comprises a target scene shooting module 710, a spliced image determining module 720, a detection result determining module 730, a tracking result determining module 740 and a positioning result determining module 750.
The target scene shooting module 710 is configured to obtain a background image obtained by shooting a target scene and a video file to be detected; the video file to be detected comprises a plurality of frames of images to be detected; the target object is a legacy object left in the target scene or a lost object lost from the target scene;
The stitched image determining module 720 is configured to perform stitching operation with the background image and each frame of to-be-inspected image, so as to obtain a stitched image corresponding to each frame of to-be-inspected image;
The detection result determining module 730 is configured to perform target detection and object comparison processing based on the stitched image, to obtain a detection result; the detection result comprises an object category to be determined of the target object and a detection position of the target object in each frame of to-be-detected image; the object category to be determined is used for representing that the target object is identified as a suspected left-over object or a suspected lost object;
a tracking result determining module 740, configured to determine a tracking result of the target object based on a detection position of the target object in the each frame of to-be-detected image;
and the positioning result determining module 750 is configured to perform positioning detection on the target object according to the object type to be determined of the target object and the tracking result, so as to obtain a positioning result of the target object.
In some embodiments, the tracking result determining module is further configured to, for any frame of to-be-detected image, perform position prediction according to a detection position of the target object in the to-be-detected image before the any frame of to-be-detected image, to obtain a predicted position of the target object in the any frame of to-be-detected image; and carrying out association processing according to the detection position and the prediction position of the target object in any frame of to-be-detected image, and determining the tracking result of the target object.
In some embodiments, the positioning result determining module is further configured to determine, if the object class to be determined is a suspected legacy object, and according to the tracking result, that the target object is not lost from the target scene and that the target object does not move, and use the detection position as a positioning result of the target object.
In some embodiments, the positioning result determining module is further configured to determine, if the object class to be determined is a suspected legacy object, determine that the target object is lost from the target scene according to the tracking result, and determine a subsequent image to be inspected within a first preset duration after the target object is lost from the target scene; and continuously executing the step of judging whether the target object is lost from the target scene or not based on the tracking result corresponding to the follow-up to-be-detected image.
In some embodiments, the positioning result determining module is further configured to determine, if the object class to be determined is a suspected lost object, that the target object subsequently exits from the target scene according to the tracking result, and take a detection position of the target object in the background image as a positioning result of the target object; or if the object class to be determined is suspected lost object, judging that the target object is out of bounds from the target scene according to the tracking result, and determining a final position to-be-detected image corresponding to the target object; taking the detection position of the target object in the final position to-be-detected image as a positioning result of the target object; the last position to-be-detected image is an image to be detected corresponding to the last moment when the target object appears in the target scene.
In some embodiments, the positioning result determining module is further configured to, if the object class to be determined is a suspected lost object, determine that the target object is not subsequently out of bounds from the target scene according to the tracking result, and continuously execute the step of determining whether the target object is subsequently out of bounds from the target scene.
In some embodiments, the detection result determining module is further configured to perform target detection based on the stitched image, to obtain a reference object included in the background image and a detected object included in the image to be detected of each frame; wherein the detected object is the target object; and comparing the reference object with the target object to obtain the object class to be determined of the target object.
For a specific description of the target object positioning apparatus, reference may be made to the description of the target object positioning method hereinabove, and the description thereof will not be repeated here.
In some embodiments, a computer device is provided, which may be a terminal, and an internal structure diagram thereof may be as shown in fig. 8. The computer device includes a processor, a memory, a communication interface, a display screen, and an input device connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a target object localization method. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 8 is merely a block diagram of a portion of the structure associated with the aspects disclosed herein and is not limiting of the computer device to which the aspects disclosed herein apply, and in particular, the computer device may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.
In some embodiments, a computer device is provided, comprising a memory in which a computer program is stored, and a processor which, when executing the computer program, carries out the method steps of the above embodiments.
The present description embodiments provide a heterogeneous chip comprising an embedded neural network processor NPU, a central processing unit CPU, a memory, and a computer program stored in the memory and configured to be executed by the central processing unit CPU and the embedded neural network processor NPU, the central processing unit CPU and the embedded neural network processor NPU implementing the method of any of the embodiments described above when executing the computer program.
The present description embodiment provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the steps of the method of any of the above embodiments.
An embodiment of the present specification provides a computer program product comprising instructions which, when executed by a processor of a computer device, enable the computer device to perform the steps of the method of any one of the embodiments described above.
It should be noted that the logic and/or steps represented in the flowcharts or otherwise described herein, for example, may be considered as an ordered listing of executable instructions for implementing logical functions, and may be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable medium may even be paper or other suitable medium upon which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.

Claims (14)

1. A method of locating a target object, the method comprising:
Acquiring a background image and a video file to be detected, which are shot aiming at a target scene; the video file to be detected comprises a plurality of frames of images to be detected; the target object is a legacy object left in the target scene or a lost object lost from the target scene;
Performing stitching operation by using the background image and each frame of to-be-inspected image to obtain a stitched image corresponding to each frame of to-be-inspected image;
Performing target detection and object comparison processing based on the spliced image through a neural network detection model to obtain a detection result; the detection result comprises an object category to be determined of the target object and a detection position of the target object in each frame of to-be-detected image; the object class to be determined of the target object is a detection class output by the neural network detection model and is used for representing that the target object is identified as a suspected legacy object or a suspected lost object; the object category to be determined of the target object is obtained by performing target detection based on the spliced image, obtaining a reference object included in the background image and a detected object included in each frame of image to be detected, and comparing the reference object with the detected object; the suspected left-over object and the suspected lost object are used as two categories of the neural network detection model to generalize the object categories applicable to the object categories which do not participate in model training;
Determining a tracking result of the target object based on the detection position of the target object in each frame of to-be-detected image;
Positioning and detecting the target object according to the object type to be determined of the target object and the tracking result to obtain a positioning result of the target object; if the object category to be determined is a suspected left object, and the target object is determined not to be lost from the target scene and the target object does not move according to the tracking result, the detection position is used as a positioning result of the target object; if the object class to be determined is a suspected lost object, judging that the target object is out of bounds from the target scene according to the tracking result, and determining a final position to-be-detected image corresponding to the target object; taking the detection position of the target object in the final position to-be-detected image as a positioning result of the target object; the last position to-be-detected image is an image to be detected corresponding to the last moment when the target object appears in the target scene.
2. The method according to claim 1, wherein before the determination of the tracking result of the target object based on the detected position of the target object in the each frame of the image to be inspected, the method further comprises:
For any frame of to-be-detected image, carrying out position prediction according to the detection position of the target object in the to-be-detected image before the any frame of to-be-detected image to obtain the predicted position of the target object in the any frame of to-be-detected image;
the determining the tracking result of the target object based on the detection position of the target object in each frame of the image to be detected comprises the following steps:
And carrying out association processing according to the detection position and the prediction position of the target object in any frame of to-be-detected image, and determining the tracking result of the target object.
3. The method according to claim 2, wherein the determining the tracking result of the target object according to the detected position and the predicted position of the target object in the to-be-detected image of any frame includes:
and carrying out association processing according to the intersection-over-union (IoU) cost matrix between the boundary box corresponding to the detection position and the boundary box corresponding to the prediction position, and determining the tracking result of the target object.
4. The method according to claim 2, wherein the determining the tracking result of the target object according to the detected position and the predicted position of the target object in the to-be-detected image of any frame includes:
and carrying out association processing according to the color information cost matrix between the boundary box corresponding to the detection position and the boundary box corresponding to the prediction position, and determining the tracking result of the target object.
5. The method according to any one of claims 1 to 4, further comprising:
if the object class to be determined is a suspected legacy object, judging that the target object is lost from the target scene according to the tracking result, and determining a subsequent image to be detected in a first preset time period after the target object is lost from the target scene;
and continuously executing the step of judging whether the target object is lost from the target scene or not based on the tracking result corresponding to the follow-up to-be-detected image.
6. The method according to any one of claims 1 to 4, further comprising:
If the object class to be determined is a suspected lost object, and the target object is judged to be not out of bounds from the target scene according to the tracking result, the step of judging whether the target object is out of bounds from the target scene is continuously executed.
7. A photographing apparatus, wherein the photographing apparatus includes an image capturing unit and a heterogeneous chip; the heterogeneous chip comprises an embedded neural network processor and a central processing unit;
The image acquisition unit is used for shooting a target scene to obtain a background image and a video file to be detected; the video file to be detected comprises a plurality of frames of images to be detected;
The embedded neural network processor is used for performing splicing operation on the background image and each frame of to-be-detected image to obtain a spliced image corresponding to each frame of to-be-detected image; performing target detection and object comparison processing based on the spliced image through a neural network detection model to obtain a detection result; the detection result comprises an object category to be determined of a target object and a detection position of the target object in each frame of to-be-detected image; the object class to be determined of the target object is a detection class output by the neural network detection model and is used for representing that the target object is identified as a suspected legacy object or a suspected lost object; the object category to be determined of the target object is obtained by performing target detection based on the spliced image, obtaining a reference object included in the background image and a detected object included in each frame of image to be detected, and comparing the reference object with the detected object; the suspected left-over object and the suspected lost object are used as two categories of the neural network detection model to generalize the object categories applicable to the object categories which do not participate in model training; the target object is a legacy object left in the target scene or a lost object lost from the target scene;
the central processing unit is used for determining a tracking result of the target object based on the detection position of the target object in each frame of to-be-detected image; positioning and detecting the target object according to the object type to be determined of the target object and the tracking result to obtain a positioning result of the target object; if the object category to be determined is a suspected left object, and the target object is determined not to be lost from the target scene and the target object does not move according to the tracking result, the detection position is used as a positioning result of the target object; if the object class to be determined is a suspected lost object, judging that the target object is out of bounds from the target scene according to the tracking result, and determining a final position to-be-detected image corresponding to the target object; taking the detection position of the target object in the final position to-be-detected image as a positioning result of the target object; the last position to-be-detected image is an image to be detected corresponding to the last moment when the target object appears in the target scene.
8. A target object positioning device, the device comprising:
The target scene shooting module is used for acquiring a background image and a video file to be detected, which are shot aiming at a target scene; the video file to be detected comprises a plurality of frames of images to be detected; the target object is a legacy object left in the target scene or a lost object lost from the target scene;
The spliced image determining module is used for performing splicing operation on the background image and each frame of to-be-detected image to obtain a spliced image corresponding to each frame of to-be-detected image;
The detection result determining module is used for detecting a model through a neural network, and performing target detection and object comparison processing based on the spliced image to obtain a detection result; the detection result comprises an object category to be determined of the target object and a detection position of the target object in each frame of to-be-detected image; the object class to be determined of the target object is a detection class output by the neural network detection model and is used for representing that the target object is identified as a suspected legacy object or a suspected lost object; the object category to be determined of the target object is obtained by performing target detection based on the spliced image, obtaining a reference object included in the background image and a detected object included in each frame of image to be detected, and comparing the reference object with the detected object; the suspected left-over object and the suspected lost object are used as two categories of the neural network detection model to generalize the object categories applicable to the object categories which do not participate in model training;
The tracking result determining module is used for determining a tracking result of the target object based on the detection position of the target object in each frame of to-be-detected image;
the positioning result determining module is used for performing positioning detection on the target object according to the object type to be determined of the target object and the tracking result to obtain a positioning result of the target object; if the object category to be determined is a suspected left object, and the target object is determined not to be lost from the target scene and the target object does not move according to the tracking result, the detection position is used as a positioning result of the target object; if the object class to be determined is a suspected lost object, judging that the target object is out of bounds from the target scene according to the tracking result, and determining a final position to-be-detected image corresponding to the target object; taking the detection position of the target object in the final position to-be-detected image as a positioning result of the target object; the last position to-be-detected image is an image to be detected corresponding to the last moment when the target object appears in the target scene.
9. The apparatus of claim 8, wherein the tracking result determining module is further configured to perform, for any frame of the image to be inspected, position prediction according to a detected position of the target object in the image to be inspected before the frame of the image to be inspected, to obtain a predicted position of the target object in the frame of the image to be inspected; and carrying out association processing according to the detection position and the prediction position of the target object in any frame of to-be-detected image, and determining the tracking result of the target object.
10. The apparatus of claim 8, wherein the positioning result determining module is further configured to determine, if the object class to be determined is a suspected legacy object, and determine that the target object is lost from the target scene according to the tracking result, a subsequent image to be inspected within a first preset duration after the target object is lost from the target scene; and continuously executing judgment on whether the target object is lost from the target scene or not based on the tracking result corresponding to the follow-up to-be-detected image.
11. The apparatus of claim 8, wherein the positioning result determining module is further configured to, if the object class to be determined is a suspected missing object, determine that the target object is not subsequently out of bounds from the target scene according to the tracking result, and continue to perform determining whether the target object is subsequently out of bounds from the target scene.
12. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 6 when the computer program is executed.
13. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1 to 6.
14. A heterogeneous chip comprising an embedded neural network processor NPU, a central processing unit CPU, a memory, and a computer program stored in the memory and configured to be executed by the central processing unit CPU and the embedded neural network processor NPU, the central processing unit CPU and the embedded neural network processor NPU implementing the method of any of claims 1 to 6 when executing the computer program.
CN202311064453.6A 2023-08-22 2023-08-22 Target object positioning method, device, shooting equipment, chip, equipment and medium Active CN117036482B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311064453.6A CN117036482B (en) 2023-08-22 2023-08-22 Target object positioning method, device, shooting equipment, chip, equipment and medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311064453.6A CN117036482B (en) 2023-08-22 2023-08-22 Target object positioning method, device, shooting equipment, chip, equipment and medium

Publications (2)

Publication Number Publication Date
CN117036482A (en) 2023-11-10
CN117036482B (en) 2024-06-14

Family

ID=88631532

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311064453.6A Active CN117036482B (en) 2023-08-22 2023-08-22 Target object positioning method, device, shooting equipment, chip, equipment and medium

Country Status (1)

Country Link
CN (1) CN117036482B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107527009A (en) * 2017-07-11 2017-12-29 浙江汉凡软件科技有限公司 A kind of remnant object detection method based on YOLO target detections
WO2021189641A1 (en) * 2020-03-25 2021-09-30 上海商汤临港智能科技有限公司 Left-behind subject detection

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109409315B (en) * 2018-11-07 2022-01-11 浩云科技股份有限公司 Method and system for detecting remnants in panel area of ATM (automatic Teller machine)
WO2020118430A1 (en) * 2018-12-12 2020-06-18 Avigilon Corporation Object left behind detection
KR20210004556A (en) * 2019-07-05 2021-01-13 (주)엠아이웨어 Detection System for Lost Object Using artificial intelligence in Vehicle and Detection Method Using the Same
CN111062273B (en) * 2019-12-02 2023-06-06 青岛联合创智科技有限公司 Method for tracing, detecting and alarming remaining articles
CN111652114B (en) * 2020-05-29 2023-08-25 深圳市商汤科技有限公司 Object detection method and device, electronic equipment and storage medium
CN112597892B (en) * 2020-12-23 2022-09-23 重庆邮电大学 System and method for detecting remnants in automobile cabin
WO2023039781A1 (en) * 2021-09-16 2023-03-23 华北电力大学扬中智能电气研究中心 Method for detecting abandoned object, apparatus, electronic device, and storage medium
CN114495006A (en) * 2022-01-26 2022-05-13 京东方科技集团股份有限公司 Detection method and device for left-behind object and storage medium
CN114821551A (en) * 2022-04-27 2022-07-29 深圳元戎启行科技有限公司 Method, apparatus and storage medium for legacy detection and model training
CN114926764A (en) * 2022-05-09 2022-08-19 国电汉川发电有限公司 Method and system for detecting remnants in industrial scene
CN114842322A (en) * 2022-05-11 2022-08-02 中科创达软件股份有限公司 Method and device for detecting remnant, electronic device and computer-readable storage medium
CN115619710A (en) * 2022-08-29 2023-01-17 苏州飞搜科技有限公司 Method, device and equipment for counting quantity of articles based on step-by-step tracking detection
CN116129328A (en) * 2023-02-20 2023-05-16 北京百度网讯科技有限公司 Method, device, equipment and storage medium for detecting carryover
CN116524400A (en) * 2023-04-21 2023-08-01 杭州萤石软件有限公司 Method and device for nursing articles in front of door, intelligent equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107527009A (en) * 2017-07-11 2017-12-29 浙江汉凡软件科技有限公司 A kind of remnant object detection method based on YOLO target detections
WO2021189641A1 (en) * 2020-03-25 2021-09-30 上海商汤临港智能科技有限公司 Left-behind subject detection

Also Published As

Publication number Publication date
CN117036482A (en) 2023-11-10

Similar Documents

Publication Publication Date Title
CN109727275B (en) Object detection method, device, system and computer readable storage medium
Piciarelli et al. Surveillance-oriented event detection in video streams
US20220301333A1 (en) Method and apparatus for recognizing id card
CN111079621B (en) Method, device, electronic equipment and storage medium for detecting object
Rafique et al. Optimized real-time parking management framework using deep learning
Du et al. Dynamic pavement distress image stitching based on fine‐grained feature matching
CN113971795A (en) Violation inspection system and method based on self-driving visual sensing
CN113505704B (en) Personnel safety detection method, system, equipment and storage medium for image recognition
CN112794274B (en) Safety monitoring method and system for oil filling port at bottom of oil tank truck
Alam et al. A vision-based system for traffic light detection
CN117036482B (en) Target object positioning method, device, shooting equipment, chip, equipment and medium
CN111951328A (en) Object position detection method, device, equipment and storage medium
CN115546824B (en) Taboo picture identification method, apparatus and storage medium
Bahrami et al. An HRCR-CNN framework for automated security seal detection on the shipping container
Feng et al. RTDOD: A large-scale RGB-thermal domain-incremental object detection dataset for UAVs
CN115953744A (en) Vehicle identification tracking method based on deep learning
CN108268813B (en) Lane departure early warning method and device and electronic equipment
Wu et al. Feature-based background registration in wide-area motion imagery
CN113792569B (en) Object recognition method, device, electronic equipment and readable medium
Alomari et al. Smart real-time vehicle detection and tracking system using road surveillance cameras
CN114596239A (en) Loading and unloading event detection method and device, computer equipment and storage medium
Wang et al. Deep Learning–Based Detection of Vehicle Axle Type with Images Collected via UAV
US12024193B2 (en) Violation inspection system based on visual sensing of self-driving vehicle and method thereof
Ramajo Ballester et al. Deep learning for robust vehicle identification
Gonthina et al. Parking Slot Detection Using Yolov8

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant