CN112990187B - Target position information generation method based on handheld terminal image - Google Patents


Info

Publication number
CN112990187B
CN112990187B · CN202110436206.9A · CN202110436206A
Authority
CN
China
Prior art keywords
object obj
camera
vehicle
person
center point
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110436206.9A
Other languages
Chinese (zh)
Other versions
CN112990187A (en)
Inventor
孙敏 (Sun Min)
黄翔 (Huang Xiang)
楼夏寅 (Lou Xiayin)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN202110436206.9A
Publication of CN112990187A
Application granted
Publication of CN112990187B
Legal status: Active

Classifications

    • G06V 10/17 — Image or video recognition or understanding; image acquisition using hand-held instruments
    • G06N 3/04 — Computing arrangements based on biological models; neural networks; architecture, e.g. interconnection topology
    • G06N 3/08 — Computing arrangements based on biological models; neural networks; learning methods
    • G06V 20/52 — Scenes; scene-specific elements; surveillance or monitoring of activities, e.g. for recognising suspicious objects
    • G06V 2201/07 — Indexing scheme relating to image or video recognition or understanding; target detection
    • G06V 2201/08 — Indexing scheme relating to image or video recognition or understanding; detecting or categorising vehicles


Abstract

The invention provides a target position information generation method based on a handheld terminal image, which comprises the following steps: when a suspicious target is found, a camera of the handheld terminal captures an image of the target scene to obtain a monitoring image; the server identifies the person objects and/or vehicle objects in the monitoring image and estimates, for each person object, its azimuth and its actual distance from the camera center point, and likewise for each vehicle object; the server then generates informative text information. By distinguishing targets into person objects and vehicle objects and applying a different distance-estimation algorithm to each, the invention effectively improves the accuracy of target distance estimation.

Description

Target position information generation method based on handheld terminal image
Technical Field
The invention belongs to the technical field of target identification, and particularly relates to a target position information generation method based on a handheld terminal image.
Background
In the fields of public security, military, emergency rescue and outdoor exploration, handheld terminals (such as mobile phones) have become ubiquitous, so the discovery of specific outdoor targets and the acquisition of information about them can be accomplished entirely with such portable, general-purpose devices. In particular, in industries concerned with intelligence collection and analysis — for example tracking criminal suspects, or discovering, recording and reporting misconduct — combining images with geographic information for comprehensive analysis gives relevant institutions or command centers reliable intelligence for decision-making, while keeping collection simple and transmission fast.
Existing intelligence-collection systems, however, acquire the geographic position of a target with low accuracy, which limits their popularization and use.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides a target position information generation method based on a handheld terminal image, which can effectively solve the above problems.
The technical scheme adopted by the invention is as follows:
The invention provides a target position information generation method based on a handheld terminal image, which comprises the following steps:
step 1, when a suspicious target is found, a camera of the handheld terminal performs image acquisition on a target scene to obtain a monitoring image tu(A); meanwhile, the handheld terminal obtains the camera pose information at the moment the monitoring image tu(A) is acquired, comprising: the position coordinates O(x_0, y_0) of the camera center point O, the azimuth β of the camera main optical axis, and the pitch angle k of the camera main optical axis; the azimuth β of the camera main optical axis is the angle between the main optical axis and due north;
step 2, the handheld terminal uploads the monitoring image tu(A) and the camera pose information to a server through a wireless communication module;
step 3, the server performs object recognition on the monitoring image tu(A) and detects whether a person object obj(r) and/or a vehicle object obj(c) exists in the monitoring image tu(A); if not, no suspicious target exists in the monitoring image tu(A) and the flow ends; if yes, step 4 is executed;
step 4, the server identifies the person object obj(r) and/or the vehicle object obj(c) in the monitoring image tu(A); if it is a person object obj(r), steps 5-6 are adopted to estimate the azimuth α_r of the person object obj(r) and its actual distance S_r from the camera center point O;
if it is a vehicle object obj(c), steps 7-8 are adopted to estimate the azimuth α_c of the vehicle object obj(c) and its actual distance S_c from the camera center point O;
step 5, estimating the azimuth α_r of the person object obj(r), the method comprising:
step 5.1, the server analyzes the monitoring image tu(A) and obtains the pixel distance x_r, on the monitoring image tu(A), between the person object obj(r) and the image center point;
step 5.2, obtaining the azimuth α_r of the person object obj(r) according to the following formula:
α_r = arctan(x_r / f) - β
wherein:
the azimuth α_r of the person object obj(r) is the angle between due north and the line connecting the person object obj(r) with the camera center point O; that is, the deflection angle of the person object obj(r) relative to due north, taking the camera center point O as the reference;
step 6, estimating the actual distance S_r between the person object obj(r) and the camera center point O, the method comprising: the server reads the pitch angle k of the camera main optical axis; if k is smaller than the person-object pitch angle threshold k_max, step 6.1 is executed; otherwise, step 6.2 is executed;
step 6.1, the server analyzes the monitoring image tu(A), recognizes the head pixel height m of the person object obj(r) on the monitoring image tu(A), and, according to the relation between m and the head pixel threshold m_min, obtains the distance D_r from the projection point of the person object obj(r) on the camera main optical axis to the camera center point O according to the following formula, and then performs step 6.3:
if m > m_min: D_r = H_2 f² sin(arctan(x_r / f)) / (x_r m)
if m ≤ m_min: D_r = H_1 f² sin(arctan(x_r / f)) / (x_r M)
wherein:
f is the focal length of the camera;
M is the pixel value of the person's height, namely: the pixel value, in the height direction, of the minimum circumscribed rectangle of the person object obj(r) on the monitoring image tu(A), obtained by analyzing the monitoring image tu(A);
H_1 is the general actual height value of a person and is a preset fixed value;
H_2 is the general actual head height value of a person and is a preset fixed value;
step 6.2, obtaining the distance D_r from the projection point of the person object obj(r) on the camera main optical axis to the camera center point O according to the following formula, and then performing step 6.3:
D_r = cos k · H_1 f² sin(arctan(x_r / f)) / (x_r M)
step 6.3, obtaining the actual distance S_r between the person object obj(r) and the camera center point O according to the following formula:
S_r = D_r / cos δ_r
wherein: δ_r is the angle between the camera main optical axis and the line connecting the person object obj(r) with the camera center point O; δ_r = α_r + β;
step 7, estimating the azimuth α_c of the vehicle object obj(c), the method comprising:
step 7.1, the server analyzes the monitoring image tu(A) and obtains the pixel distance x_c, on the monitoring image tu(A), between the vehicle object obj(c) and the image center point;
step 7.2, obtaining the azimuth α_c of the vehicle object obj(c) according to the following formula:
α_c = arctan(x_c / f) - β
wherein:
the azimuth α_c of the vehicle object obj(c) is the angle between due north and the line connecting the vehicle object obj(c) with the camera center point O; that is, the deflection angle of the vehicle object obj(c) relative to due north, taking the camera center point O as the reference;
step 8, estimating the actual distance S_c between the vehicle object obj(c) and the camera center point O, the method comprising:
step 8.1, the server analyzes the monitoring image tu(A) and identifies the minimum circumscribed rectangle of the vehicle object obj(c) on the monitoring image tu(A); the height of the minimum circumscribed rectangle is the vehicle pixel height h;
if the vehicle pixel height h is larger than λf, where λ is a proportionality coefficient and a known fixed value, step 8.2 is executed; otherwise, step 8.3 is executed;
step 8.2, this case indicates that the actual distance S_c between the vehicle object obj(c) and the camera center point O is very small, i.e. S_c ≈ 0; that is, the position of the vehicle object obj(c) is approximately the position of the camera center point O; then step 9 is performed;
step 8.3, obtaining the distance D_c from the projection point of the vehicle object obj(c) on the camera main optical axis to the camera center point O, using the general actual height value l_1 of the vehicle as the reference dimension when the camera main optical axis pitch angle k ≤ k_min, and the general actual width value l_2 of the vehicle as the reference dimension when k > k_min; then step 8.4 is performed;
wherein:
l_2 is the general actual width value of the vehicle and is a preset fixed value;
k_min is the pitch angle threshold for a vehicle object;
h is the pixel height of the vehicle object obj(c), namely: the pixel value, in the height direction, of the minimum circumscribed rectangle of the vehicle object obj(c) on the monitoring image tu(A), obtained by analyzing the monitoring image tu(A);
l_1 is the general actual height value of the vehicle and is a preset fixed value;
step 8.4, obtaining the actual distance S_c between the vehicle object obj(c) and the camera center point O according to the following formula:
S_c = D_c / cos δ_c
wherein: δ_c is the angle between the camera main optical axis and the line connecting the vehicle object obj(c) with the camera center point O; δ_c = α_c + β;
then step 9 is performed;
step 9, for the person object obj(r), according to its azimuth α_r and its actual distance S_r from the camera center point O, combined with the position coordinates O(x_0, y_0), the position coordinates of the person object obj(r) are obtained;
for the vehicle object obj(c), according to its azimuth α_c and its actual distance S_c from the camera center point O, combined with the position coordinates O(x_0, y_0), the position coordinates of the vehicle object obj(c) are obtained;
step 10, the server generates informative text information, which comprises the position coordinates of the identified person object obj(r) and/or the position coordinates of the identified vehicle object obj(c).
Preferably, in step 3, the server performs object recognition on the monitoring image tu(A), specifically:
the server adopts a trained machine learning network to perform object recognition on the monitoring image tu(A).
Preferably, the server adopts a trained machine learning network to perform object recognition on the monitoring image tu(A), specifically:
if the server recognizes that a person object obj(r) exists in the monitoring image tu(A), the age of the person is further recognized; the general actual height value H_1 of a person and the general actual head height value H_2 of a person are determined according to the person's age;
if the server recognizes that a vehicle object obj(c) exists in the monitoring image tu(A), the vehicle type is further recognized; the general actual width value l_2 of the vehicle and the general actual height value l_1 of the vehicle are determined according to the vehicle type.
Preferably, after step 10, the method further comprises:
step 11, after the position coordinates of the person object obj(r) and/or the vehicle object obj(c) are obtained, the server, based on the target position, uses a map service and a buffer analysis module to obtain by buffer analysis the scene geographic information around the person object obj(r) and/or the vehicle object obj(c), and fuses the person object obj(r) and/or the vehicle object obj(c) with the scene geographic information to generate the informative text information.
The target position information generation method based on the handheld terminal image has the following advantages:
according to the invention, the object is distinguished into the person object and the vehicle object, and different distance recognition algorithms are respectively adopted for the person object and the vehicle object, so that the accuracy of target object distance recognition is effectively improved.
Drawings
FIG. 1 is a flow chart of a target position information generating method based on a handheld terminal image provided by the invention;
FIG. 2 is a schematic diagram of the method for generating target position information based on a handheld terminal image;
FIG. 3 is a horizontal projection view of a person object geographic coordinate estimation;
FIG. 4 is a graph of the relationship between the actual height of a person's head and the height of its imaged head pixels in the vertical direction;
FIG. 5 is a schematic diagram of the relationship between the actual height of the target and its imaging height in the vertical direction;
FIG. 6 is a schematic diagram of distance calculation considering the pitch angle k of the camera main optical axis.
Detailed Description
In order to make the technical problems to be solved, the technical scheme and the beneficial effects of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to illustrate the invention and are not intended to limit its scope.
The invention lies at the intersection of mobile terminal technology, digital map technology, automatic identification technology, public security, military applications and other technical fields. Its main aim is, in the course of public-security and military reconnaissance, to record the geographic position of a sensitive ground target or person with a portable device such as a handheld terminal and to upload it rapidly through a wireless transmission module to a remote server, where intelligence is generated automatically for relevant personnel to analyze and assess.
Referring to FIGS. 1 and 2, the invention provides a target position information generation method based on a handheld terminal image, comprising the following steps:
step 1, when a suspicious target is found, the camera of the handheld terminal performs image acquisition on a target scene to obtain a monitoring image tu(A); meanwhile, the handheld terminal obtains the camera pose information at the moment the monitoring image tu(A) is acquired, comprising: the position coordinates O(x_0, y_0) of the camera center point O, the azimuth β of the camera main optical axis, and the pitch angle k of the camera main optical axis; the azimuth β of the camera main optical axis is the angle between the main optical axis and due north;
The handheld terminal includes, but is not limited to, terminal devices such as camera-equipped mobile phones and tablet computers. The handheld terminal is provided with position and attitude sensor modules for obtaining the camera pose information when the monitoring image tu(A) is collected. Specifically, through a built-in or external satellite positioning module the handheld terminal can obtain the position coordinates of the current shooting point, namely the position coordinates O(x_0, y_0) of the camera center point O; through built-in or external attitude and gyro sensors it can obtain the current orientation and attitude of the camera, namely the azimuth β of the camera main optical axis and the pitch angle k of the camera main optical axis;
step 2, the handheld terminal uploads the monitoring image tu(A) and the camera pose information to a server through a wireless communication module;
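As an illustration of steps 1-2, the following minimal client-side sketch packages the monitoring image and the camera pose for upload; the "/observations" endpoint and the field layout are assumptions made for illustration only, since the invention only requires that both items reach the server over a wireless link.

```python
import json
import requests  # assumed HTTP client; any equivalent mechanism works

def upload_observation(server_url, image_path, x0, y0, beta, k):
    """Send the monitoring image tu(A) together with the camera pose
    (x0, y0, beta, k). Endpoint name and field names are hypothetical."""
    pose = {"x0": x0, "y0": y0, "beta": beta, "k": k}
    with open(image_path, "rb") as fh:
        resp = requests.post(server_url + "/observations",
                             files={"image": fh},
                             data={"pose": json.dumps(pose)},
                             timeout=10)
    resp.raise_for_status()
```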
step 3, the server performs object recognition on the monitoring image tu(A) and detects whether a person object obj(r) and/or a vehicle object obj(c) exists in the monitoring image tu(A); if not, no suspicious target exists in the monitoring image tu(A) and the flow ends; if yes, step 4 is executed;
specifically, considering that personnel and vehicles belong to primary targets of public safety and military detection, the invention mainly recognizes the two types of targets, but the targets which can be recognized by the invention are not limited to personnel and vehicles.
In this step, the server performs object recognition on the monitor image tu (a), specifically:
the server adopts a trained machine learning network to carry out object recognition on the monitoring image tu (A). For example, a machine learning algorithm is first used to train a target recognition neural network through a data set of a large number of people and vehicles, and after obtaining an ideal target recognition neural network, the target recognition neural network is used to recognize a target in a shooting scene of a handheld terminal. The trained target recognition neural network can better recognize targets in a scene, such as whether people in the scene are adults or children, and also can recognize the types of vehicles in the scene, and belongs to cars or trucks.
Specifically, if the server recognizes that the person object obj (r) exists in the monitor image tu (a), the age of the person is further recognized; determining a general actual height value H of a person based on the person's age 1 A general actual head height value H of a person 2
If the server recognizes that the vehicle object obj (c) exists in the monitor image tu (a), the vehicle type is further recognized; determining a prevalent actual width value l of a vehicle according to the type of the vehicle 2 And a prevalent actual height value of the vehicle l 1
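A minimal sketch of this recognition step follows, using an off-the-shelf COCO detector from torchvision (version 0.13 or later assumed) as a stand-in for the trained network described above; the patent does not name a specific model. The returned bounding boxes are what supply the minimum circumscribed rectangles used in the later distance steps; unlike the patent's own network, this COCO model does not classify age or vehicle type.

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# Stand-in detector for the patent's trained recognition network.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

PERSON, CAR, TRUCK = 1, 3, 8  # COCO category ids

def detect_objects(pil_image, score_thr=0.5):
    """Return [(kind, (x1, y1, x2, y2)), ...] for person and vehicle hits."""
    with torch.no_grad():
        out = model([to_tensor(pil_image)])[0]
    hits = []
    for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
        if score < score_thr:
            continue
        if label.item() == PERSON:
            hits.append(("person", tuple(box.tolist())))
        elif label.item() in (CAR, TRUCK):
            hits.append(("vehicle", tuple(box.tolist())))
    return hits
```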
Step 4, the server identifies a person object obj (r) and/or a vehicle object obj (c) in the monitored image tu (a); if the object is the person object obj (r), adopting the steps 5-6 to estimate the azimuth alpha of the person object obj (r) r And the actual distance S of the person object obj (r) from the camera center point O r
If the vehicle object obj (c) is the vehicle object obj (c), estimating the azimuth alpha of the vehicle object obj (c) by adopting the steps 7-8 c And the actual distance S of the vehicle object obj (c) from the camera center point O c
In the invention, when the actual distance of the person object obj(r) or the vehicle object obj(c) from the camera center point O is estimated, it must be considered that the pixel size of a target in the captured monitoring image differs with the target's actual distance from the camera center point O, and also changes with the attitude and azimuth of the handheld terminal at shooting time. The invention therefore proposes the following method for accurately estimating the geographic positions of the person object obj(r) and the vehicle object obj(c).
step 5, estimating the azimuth α_r of the person object obj(r), the method comprising:
step 5.1, the server analyzes the monitoring image tu(A) and obtains the pixel distance x_r, on the monitoring image tu(A), of the imaging point of the person object obj(r) from the main optical axis direction of the monitoring image tu(A);
wherein the imaging point of the person object obj(r) on the monitoring image tu(A) is denoted G in FIG. 1;
step 5.2, obtaining the azimuth α_r of the person object obj(r) according to the following formula:
α_r = arctan(x_r / f) - β
wherein:
the azimuth α_r of the person object obj(r) is the angle between due north and the line connecting the person object obj(r) with the camera center point O; that is, the deflection angle of the person object obj(r) relative to due north, taking the camera center point O as the reference;
As shown in FIG. 3, a horizontal projection view of the geographic coordinate estimation of the person object is given: the person object obj(r) is denoted A and the image center point is denoted O; the image plane of the person object obj(r) after imaging through the camera is denoted B; the Z axis is the direction of the camera main optical axis and the Y axis is the direction perpendicular to the focal plane; the X axis is determined by the right-hand rule from the Z axis to the Y axis;
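In code, step 5 is a direct transcription of the formula; the sketch below assumes the focal length f is expressed in pixels (so that x_r / f is dimensionless) and that x_r is the signed pixel offset of the imaging point from the image center, with all angles in radians.

```python
import math

def object_azimuth(x_r, f, beta):
    """Step 5.2: azimuth relative to due north, alpha_r = arctan(x_r/f) - beta."""
    return math.atan(x_r / f) - beta
```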
step 6, estimating the actual distance S_r between the person object obj(r) and the camera center point O, i.e. the distance from point A to point O in FIG. 1; the method is: the server reads the pitch angle k of the camera main optical axis; if k is smaller than the person-object pitch angle threshold k_max, step 6.1 is executed; otherwise, step 6.2 is executed;
the invention estimates the actual distance S between the person object obj (r) and the center point O of the camera r When in use, the main conception is as follows:
firstly, considering whether the pitch angle k of the main optical axis of the camera is smaller than the pitch angle of a person object or not, setting a threshold k max If less, indicating that the target and observer positions are considered to be at approximately the same level, then step 6.1 is performed; otherwise, a larger difference in height between the observer position and the target position is indicated, e.g. the observer is located higherThe position, in which the target is photographed, is performed in step 6.2, taking into account the influence of the pitch angle k of the main optical axis on the distance estimation for accurate estimation.
Step 6.1, the server analyzes the monitor image tu (A), and recognizes the head pixel height m of the person object obj (r) on the monitor image tu (A), and sets a threshold value m according to the head pixel height m and the head pixel min The relation between the projection point of the character object obj (r) in the direction of the main optical axis of the camera and the distance D between the projection point of the character object obj (r) and the center point O of the camera is obtained according to the following formula r Then step 6.3 is performed;
wherein:
f is the focal length of the camera;
m is the pixel value of the height of the human body, namely: on top of the monitor image tu (a), the pixel value of the smallest circumscribed rectangle of the person object obj (r) in the height direction is obtained by analyzing the monitor image tu (a);
H 1 the general actual height value of the person is a preset fixed value; for example, 1.7 meters;
H 2 the general actual head height value of the person is a preset fixed value; for example, 0.56 m;
the implementation concept of the step 6.1 is as follows:
when the object is a person object obj (r), a threshold value m is set according to the head pixel height m and the head pixel min The relation between the character object obj (r) and the camera is determined;
specific:
if m > m min The representative person obj (r) is closer to the camera, and the common actual head height H of the person is adopted 2 And the head pixel height m estimates its distance from the photographer, the principle is: when the person object is close to the camera, the common actual head height value H of different person objects 2 The difference is not great, the actual objects of different figures can be ignoredHead height value, let H 2 The preset fixed value of the device meets the precision requirement; meanwhile, the head pixel height m is larger, and the requirement of accurate measurement on an image is met.
And when m is less than or equal to m min At this time, the representative person obj (r) is far from the camera, and the general actual height value H of the person is adopted 1 And the pixel value M of the height of the person estimates the distance between the person and the photographer, and the principle is as follows: when the person object is far from the camera, the imaging size of the person image on the image becomes smaller, so that the value of the pixel height M of the head with smaller size cannot be accurately measured at this time, and therefore, in order to ensure the precision requirement, the pixel value M of the person height with larger pixel is taken as the calculation object.
In practical application, the head pixel sets the threshold m min Is determined by the following means:
referring to fig. 4, a graph of the relationship between the actual height of the person's head and the height of the head pixels imaged in the vertical direction is shown. Wherein the head pixel height is represented by m, and the actual height of the head of the person is represented by H t And (3) representing. The actual height of the head of an adult is generally 54-58 cm, and the median value is 56cm. If the head is too small in the image, recognition is difficult, so the minimum height of the head on the image is empirically taken as 5 pixels, namely, when the height of the head pixel is less than 5 pixels, the head is not independently processed, and the whole height is directly taken.
Since D_r = H_1 f² sin(arctan(x_r / f)) / (x_r M) and D_r = H_2 f² sin(arctan(x_r / f)) / (x_r m) are derived by the same reasoning, only the derivation of D_r = H_1 f² sin(arctan(x_r / f)) / (x_r M) is described below:
1) It should be emphasized that in the invention FIG. 3 is a schematic diagram of the target position and its image in the horizontal direction, i.e. in the plane of the X axis and the Z axis, while FIG. 5 is a schematic diagram of the relationship between the actual height of the target and its imaging height in the vertical direction.
When the target moves from its original position parallel onto the main optical axis, the change ratio in the horizontal direction is the same as the change ratio in the vertical direction.
If the actual height of the target is known and is assumed to be the general actual height value H_1 of a person, its equivalent height on the main optical axis is H'_1; thus the following formula (1) holds:
H'_1 = D_r H_1 / S_r    (1)
namely: the equivalent height on the main optical axis equals the original height multiplied by the ratio D_r / S_r.
2) From the horizontal-direction schematic of FIG. 3 it can be seen that:
D_r / S_r = f / (x_r / sin δ_r)    (2)
3) Combining formula (1) and formula (2) yields the following formula (3):
H'_1 = D_r H_1 / S_r = f H_1 sin δ_r / x_r    (3)
4) Irrespective of the camera pose, the following geometric relationship holds:
M / H'_1 = f / D_r    (4)
thus the following formula (5):
D_r = f H'_1 / M    (5)
5) Combining formula (5) and formula (3) yields formula (6):
D_r = f² H_1 sin δ_r / (x_r M)    (6)
Also, since δ_r = arctan(x_r / f) in FIG. 3, the following relation is obtained:
D_r = H_1 f² sin(arctan(x_r / f)) / (x_r M)    (7)
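Formula (7) can be sanity-checked numerically with a synthetic pinhole setup: place a person of height H_1 at a known distance, generate the image quantities x_r and M from the geometry of formulas (1)-(5), and confirm that formula (7) recovers D_r. The focal length and pose values below are arbitrary test inputs, not values from the patent.

```python
import math

f, H1 = 3000.0, 1.7            # focal length (px, assumed) and body height (m)
delta_r = math.radians(10.0)   # sight-line angle off the main optical axis
S_r = 25.0                     # true camera-to-person distance (m)

D_r = S_r * math.cos(delta_r)  # projection onto the main optical axis
x_r = f * math.tan(delta_r)    # image offset, so delta_r = arctan(x_r / f)
H1_prime = H1 * D_r / S_r      # formula (1): equivalent height on the main axis
M = f * H1_prime / D_r         # formulas (4)-(5): imaged body height (px)

D_est = H1 * f**2 * math.sin(math.atan(x_r / f)) / (x_r * M)  # formula (7)
print(abs(D_est - D_r) < 1e-9)  # True: formula (7) recovers D_r exactly
```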
step 6.2, obtaining the distance D_r from the projection point of the person object obj(r) on the camera main optical axis to the camera center point O according to the following formula, and then performing step 6.3:
D_r = cos k · H_1 f² sin(arctan(x_r / f)) / (x_r M)
as shown in fig. 6, a schematic diagram is calculated in consideration of the distance when the pitch angle k of the main optical axis of the camera is considered. In fig. 6, the distance calculation is performed using the height of the person as a reference. As can be seen from fig. 6, D is considered when the pitch angle k of the main optical axis of the camera r Equal to D when the pitch angle k of the main optical axis of the camera is not considered r Multiplied by cosk.
Step 6.3, obtaining the actual distance S between the person object obj (r) and the center point O of the camera according to the following formula r
S r =D r /cosδ r
Wherein: delta r The connecting line is the connecting line of the character object obj (r) and the center point O of the camera, and forms an included angle with the main optical axis of the camera; delta r =α r +β;
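Putting steps 5-6 together, the person branch can be sketched as below. The values m_min = 5 px, H_1 = 1.7 m and H_2 = 0.56 m come from the description; the text does not fix k_max numerically, so the 30° used here is purely an assumed placeholder.

```python
import math

def person_position_params(x_r, f, m, M, k, beta,
                           m_min=5.0, k_max=math.radians(30.0),  # k_max assumed
                           H1=1.7, H2=0.56):
    """Steps 5-6 for a person object: return (alpha_r, S_r). x_r must be
    nonzero; a target exactly on the main optical axis would need the
    limiting form of formula (7)."""
    alpha_r = math.atan(x_r / f) - beta        # step 5.2
    delta_r = alpha_r + beta                   # angle to the main optical axis
    common = f**2 * math.sin(math.atan(x_r / f)) / x_r
    if k < k_max:                              # step 6.1: near-level shot
        D_r = (H2 / m if m > m_min else H1 / M) * common
    else:                                      # step 6.2: elevated shot
        D_r = math.cos(k) * (H1 / M) * common
    S_r = D_r / math.cos(delta_r)              # step 6.3
    return alpha_r, S_r
```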
Step 7, estimating the azimuth α of the vehicle object obj (c) c The method comprises the following steps:
step 7.1, the server analyzes the monitor image tu (A), and obtains the pixel distance x between the vehicle object obj (c) and the center point of the monitor image tu (A) on the monitor image tu (A) c
Step 7.2, obtaining the azimuth α of the vehicle object obj (c) according to the following formula c
α c =arctan(x c /f)-β
Wherein:
orientation alpha of vehicle object obj (c) c The method comprises the following steps: the connecting line of the vehicle object obj (c) and the center point O of the camera forms an included angle with the north direction; that is, with the camera center point O as a reference, the vehicle object obj (c) is deviated from the north direction;
step 8, estimating the actual distance S between the vehicle object obj (c) and the camera center point O c The method comprises the following steps:
step 8.1, the server analyzes the monitoring image tu (A), and identifies the minimum circumscribed rectangle of the vehicle object obj (c) on the monitoring image tu (A), wherein the height of the minimum circumscribed rectangle is the vehicle pixel height h;
if the vehicle pixel height h is larger than λf, wherein λ is a proportionality coefficient and is a known fixed value, executing step 8.2; otherwise, executing the step 8.3;
when the invention identifies the distance of the vehicle object obj (c), the main conception is as follows:
determining the distance of the vehicle object obj (c) from the camera by determining the lambda value:
specific:
if the vehicle pixel height h > λf, the actual distance S of the vehicle object obj (c) from the camera center point O is indicated c When the distance is too small, the vehicle height may be limited by the shooting angle and the state, and the vehicle height is not easily recognized, and the position of the vehicle object obj (c) is approximately located at the position of the camera center point O according to the approximation.
If the vehicle pixel height h is less than or equal to λf, then the actual distance S of the vehicle object obj (c) from the camera center point O is indicated c Further, this case further distinguishes between two cases:
first, the camera main optical axis pitch angle k is less, namely: k is less than or equal to k min At this time, the target and observer positions are regarded as being substantially in the same horizontal plane, and the value of the general actual height of the vehicle l is adopted 1 As a reference in distance calculation;
in the second case, when the pitch angle k of the main optical axis of the camera is larger, namely: k > k min In this case, it is indicated that there is a large difference in height between the observer position and the target position, for example, the observer is located at a high position, and the target is photographed in a plane view, and in this case, since the distance is long, the entire vehicle body can be photographed, and therefore, the value l of the general actual width of the vehicle is adopted 2 As a reference in distance calculation.
Wherein lambda may be 0.15.λ can be determined by:
1) Assume that a distance D from a projection point of a vehicle object obj (c) in a camera main optical axis direction to a camera center point O c Less than 10 meters, namely: when D is c If the distance is less than 10, the distance is considered to be too short, the shooting angle and the state are possibly limited, the vehicle height is not easy to identify, and the method is carried out according to the approximate processing:
due to H/H' c =f/D c
Wherein:
h is the vehicle pixel height;
H′ c the height of the vehicle in the direction of the main optical axis is 1.5 m according to the calculation of a conventional vehicle;
2) Thus, it is possible to obtain: h > 0.15f, namely: lambda was 0.15.
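The arithmetic behind λ is compact enough to verify directly; note that the cut-off pixel height scales with f, so λ itself is independent of the focal length:

```python
f = 3000.0                          # any focal length in pixels (example value)
H_c_prime, D_c_max = 1.5, 10.0      # vehicle height on the axis (m), cut-off (m)
h_cutoff = f * H_c_prime / D_c_max  # from h / H'_c = f / D_c -> 450 px here
print(h_cutoff / f)                 # 0.15 = lambda, for every f
```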
Step 8.2, which indicates the actual distance S of the vehicle object obj (c) from the camera center point O c Very small, i.e.: s is S c Approximately 0, that is, the position of the vehicle object obj (c) is approximately at the position of the camera center point O; then step 9 is performed;
step 8.3, obtaining the distance D from the projection point of the vehicle object obj (c) in the direction of the main optical axis of the camera to the center point O of the camera according to the following formula c Then step 8.4 is performed;
wherein:
l 2 the universal actual width value of the vehicle is a preset fixed value; for example, 1.7 meters;
k min setting a threshold value for a pitch angle of a vehicle object; in practical application, the angle is 15 degrees.
h is the pixel height of the vehicle object obj (c), namely: on top of the monitor image tu (a), the pixel value in the height direction of the smallest circumscribed rectangle of the vehicle object obj (c) is obtained by analyzing the monitor image tu (a);
l 1 the general actual height value of the vehicle is a preset fixed value;
step 8.4, obtaining the actual distance S between the vehicle object obj (c) and the camera center point O according to the following formula c
S c =D c /cosδ c
Wherein: delta c The included angle between the connecting line of the vehicle object obj (c) and the center point O of the camera and the main optical axis of the camera; delta c =α c +β;
Then step 9 is performed;
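A sketch of the vehicle branch follows. The patent's formula image for step 8.3 is not reproduced in the text, so the expression below is reconstructed by analogy with the person case — the reference dimension l_1 or l_2 takes the place of H_1 and the vehicle pixel height h takes the place of M — and the cos k correction in the elevated case is likewise an assumption carried over from step 6.2. The values l_1 = 1.5 m, l_2 = 1.7 m, k_min = 15° and λ = 0.15 come from the description.

```python
import math

def vehicle_position_params(x_c, f, h, k, beta,
                            lam=0.15, k_min=math.radians(15.0),
                            l1=1.5, l2=1.7):
    """Steps 7-8 for a vehicle object: return (alpha_c, S_c). The step 8.3
    expression is a reconstruction by analogy, not quoted from the patent."""
    alpha_c = math.atan(x_c / f) - beta        # step 7.2
    delta_c = alpha_c + beta
    if h > lam * f:                            # step 8.2: vehicle at the camera
        return alpha_c, 0.0
    ref = l1 if k <= k_min else l2             # step 8.3: reference dimension
    D_c = ref * f**2 * math.sin(math.atan(x_c / f)) / (x_c * h)
    if k > k_min:
        D_c *= math.cos(k)                     # assumed pitch correction
    S_c = D_c / math.cos(delta_c)              # step 8.4
    return alpha_c, S_c
```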
step 9, for the person object obj(r), according to its azimuth α_r and its actual distance S_r from the camera center point O, combined with the position coordinates O(x_0, y_0), the position coordinates of the person object obj(r) are obtained;
for the vehicle object obj(c), according to its azimuth α_c and its actual distance S_c from the camera center point O, combined with the position coordinates O(x_0, y_0), the position coordinates of the vehicle object obj(c) are obtained;
step 10, the server generates informative text information, which comprises the position coordinates of the identified person object obj(r) and/or the position coordinates of the identified vehicle object obj(c).
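Step 9 then reduces to a polar-to-Cartesian offset once the azimuth α and distance S are known. The sketch below assumes a local projected frame with +y pointing due north and +x pointing east, with azimuths measured clockwise from north; the patent does not spell out this convention, so it is an assumption.

```python
import math

def target_coordinates(x0, y0, alpha, S):
    """Step 9: offset the camera position O(x0, y0) by distance S along
    azimuth alpha (radians clockwise from due north; frame assumed)."""
    return x0 + S * math.sin(alpha), y0 + S * math.cos(alpha)
```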
Step 11, after obtaining the position coordinates of the person object obj (r) and/or the vehicle object obj (c), the server uses a map servlet to obtain scene geographic information of the person object obj (r) and/or the vehicle object obj (c) by buffer analysis through a buffer analysis module based on the target position, and fuses the person object obj (r) and/or the vehicle object obj (c) with the scene geographic information to generate information text information.
It should be emphasized that the above intelligence-generation flow is placed on the server only because of the limited computing capability of present handheld terminals; if a handheld terminal has sufficiently strong data processing capability, the intelligence can be generated directly on the handheld terminal, and the generated intelligence and the monitoring image uploaded to the server afterwards. The invention is not limited in this regard.
In practical application, if the intelligence is to be generated on the server, the user may be required, before the handheld terminal shoots and uploads, to specify the sensitive target on the image through a simple interaction; when the server then runs the target recognition algorithm, it extracts only the targets related to the specified position on the image and ignores the others, so that the generated intelligence is more focused.
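The interaction just described amounts to keeping the detection nearest the user's tap; a minimal sketch, assuming detections in the (kind, box) form produced by the recognition step above:

```python
def nearest_detection(detections, tap_xy):
    """Keep only the detection whose bounding-box center lies closest to the
    user-specified image point tap_xy = (x, y)."""
    def center(box):                     # box = (x1, y1, x2, y2)
        return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)
    return min(detections,
               key=lambda d: (center(d[1])[0] - tap_xy[0]) ** 2
                           + (center(d[1])[1] - tap_xy[1]) ** 2)
```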
The target position information generation method based on the handheld terminal image integrates the capabilities of the handheld terminal in photography, positioning, orientation and attitude measurement; combines them with the massive existing geographic information and the map-analysis functions of geographic information systems, and with the target recognition, text fusion and other functions of existing machine learning methods; and thereby forms an easy-to-use intelligent information generation system, making it convenient for relevant departments to collect information on sensitive targets or points of interest through the mobile phones of information collectors and even ordinary users.
The target position information generation method based on the handheld terminal image provided by the invention has the following advantages:
according to the invention, the object is distinguished into the person object and the vehicle object, and different distance recognition algorithms are respectively adopted for the person object and the vehicle object, so that the accuracy of target object distance recognition is effectively improved.
The foregoing is merely a preferred embodiment of the present invention; it should be noted that those skilled in the art may make modifications and adaptations without departing from the principles of the invention, and such modifications and adaptations are also to be regarded as falling within the scope of the invention.

Claims (4)

1. A target position information generation method based on a handheld terminal image, characterized by comprising the following steps:
step 1, when a suspicious target is found, a camera of the handheld terminal performs image acquisition on a target scene to obtain a monitoring image tu(A); meanwhile, the handheld terminal obtains the camera pose information at the moment the monitoring image tu(A) is acquired, comprising: the position coordinates O(x_0, y_0) of the camera center point O, the azimuth β of the camera main optical axis, and the pitch angle k of the camera main optical axis; the azimuth β of the camera main optical axis is the angle between the main optical axis and due north;
step 2, the handheld terminal uploads the monitoring image tu(A) and the camera pose information to a server through a wireless communication module;
step 3, the server performs object recognition on the monitoring image tu(A) and detects whether a person object obj(r) and/or a vehicle object obj(c) exists in the monitoring image tu(A); if not, no suspicious target exists in the monitoring image tu(A) and the flow ends; if yes, step 4 is executed;
step 4, the server identifies the person object obj(r) and/or the vehicle object obj(c) in the monitoring image tu(A); if it is a person object obj(r), steps 5-6 are adopted to estimate the azimuth α_r of the person object obj(r) and its actual distance S_r from the camera center point O;
if it is a vehicle object obj(c), steps 7-8 are adopted to estimate the azimuth α_c of the vehicle object obj(c) and its actual distance S_c from the camera center point O;
step 5, estimating the azimuth α_r of the person object obj(r), the method comprising:
step 5.1, the server analyzes the monitoring image tu(A) and obtains the pixel distance x_r, on the monitoring image tu(A), between the person object obj(r) and the image center point;
step 5.2, obtaining the azimuth α_r of the person object obj(r) according to the following formula:
α_r = arctan(x_r / f) - β
wherein:
the azimuth α_r of the person object obj(r) is the angle between due north and the line connecting the person object obj(r) with the camera center point O; that is, the deflection angle of the person object obj(r) relative to due north, taking the camera center point O as the reference;
step 6, estimating the actual distance S_r between the person object obj(r) and the camera center point O, the method comprising: the server reads the pitch angle k of the camera main optical axis; if k is smaller than the person-object pitch angle threshold k_max, step 6.1 is executed; otherwise, step 6.2 is executed;
step 6.1, the server analyzes the monitoring image tu(A), recognizes the head pixel height m of the person object obj(r) on the monitoring image tu(A), and, according to the relation between m and the head pixel threshold m_min, obtains the distance D_r from the projection point of the person object obj(r) on the camera main optical axis to the camera center point O according to the following formula, and then performs step 6.3:
if m > m_min: D_r = H_2 f² sin(arctan(x_r / f)) / (x_r m)
if m ≤ m_min: D_r = H_1 f² sin(arctan(x_r / f)) / (x_r M)
wherein:
f is the focal length of the camera;
M is the pixel value of the person's height, namely: the pixel value, in the height direction, of the minimum circumscribed rectangle of the person object obj(r) on the monitoring image tu(A), obtained by analyzing the monitoring image tu(A);
H_1 is the general actual height value of a person and is a preset fixed value;
H_2 is the general actual head height value of a person and is a preset fixed value;
step 6.2, obtaining the distance D_r from the projection point of the person object obj(r) on the camera main optical axis to the camera center point O according to the following formula, and then performing step 6.3:
D_r = cos k · H_1 f² sin(arctan(x_r / f)) / (x_r M)
step 6.3, obtaining the actual distance S_r between the person object obj(r) and the camera center point O according to the following formula:
S_r = D_r / cos δ_r
wherein: δ_r is the angle between the camera main optical axis and the line connecting the person object obj(r) with the camera center point O; δ_r = α_r + β;
step 7, estimating the azimuth α_c of the vehicle object obj(c), the method comprising:
step 7.1, the server analyzes the monitoring image tu(A) and obtains the pixel distance x_c, on the monitoring image tu(A), between the vehicle object obj(c) and the image center point;
step 7.2, obtaining the azimuth α_c of the vehicle object obj(c) according to the following formula:
α_c = arctan(x_c / f) - β
wherein:
the azimuth α_c of the vehicle object obj(c) is the angle between due north and the line connecting the vehicle object obj(c) with the camera center point O; that is, the deflection angle of the vehicle object obj(c) relative to due north, taking the camera center point O as the reference;
step 8, estimating the actual distance S_c between the vehicle object obj(c) and the camera center point O, the method comprising:
step 8.1, the server analyzes the monitoring image tu(A) and identifies the minimum circumscribed rectangle of the vehicle object obj(c) on the monitoring image tu(A); the height of the minimum circumscribed rectangle is the vehicle pixel height h;
if the vehicle pixel height h is larger than λf, where λ is a proportionality coefficient and a known fixed value, step 8.2 is executed; otherwise, step 8.3 is executed;
step 8.2, this case indicates that the actual distance S_c between the vehicle object obj(c) and the camera center point O is very small, i.e. S_c ≈ 0; that is, the position of the vehicle object obj(c) is approximately the position of the camera center point O; then step 9 is performed;
step 8.3, obtaining the distance D_c from the projection point of the vehicle object obj(c) on the camera main optical axis to the camera center point O, using the general actual height value l_1 of the vehicle as the reference dimension when the camera main optical axis pitch angle k ≤ k_min, and the general actual width value l_2 of the vehicle as the reference dimension when k > k_min; then step 8.4 is performed;
wherein:
l_2 is the general actual width value of the vehicle and is a preset fixed value;
k_min is the pitch angle threshold for a vehicle object;
h is the pixel height of the vehicle object obj(c), namely: the pixel value, in the height direction, of the minimum circumscribed rectangle of the vehicle object obj(c) on the monitoring image tu(A), obtained by analyzing the monitoring image tu(A);
l_1 is the general actual height value of the vehicle and is a preset fixed value;
step 8.4, obtaining the actual distance S_c between the vehicle object obj(c) and the camera center point O according to the following formula:
S_c = D_c / cos δ_c
wherein: δ_c is the angle between the camera main optical axis and the line connecting the vehicle object obj(c) with the camera center point O; δ_c = α_c + β;
then step 9 is performed;
step 9, for the person object obj(r), according to its azimuth α_r and its actual distance S_r from the camera center point O, combined with the position coordinates O(x_0, y_0), the position coordinates of the person object obj(r) are obtained;
for the vehicle object obj(c), according to its azimuth α_c and its actual distance S_c from the camera center point O, combined with the position coordinates O(x_0, y_0), the position coordinates of the vehicle object obj(c) are obtained;
step 10, the server generates informative text information, which comprises the position coordinates of the identified person object obj(r) and/or the position coordinates of the identified vehicle object obj(c).
2. The target position information generation method based on the handheld terminal image according to claim 1, characterized in that in step 3 the server performs object recognition on the monitoring image tu(A), specifically:
the server adopts a trained machine learning network to perform object recognition on the monitoring image tu(A).
3. The target position information generation method based on the handheld terminal image according to claim 2, characterized in that the server adopts a trained machine learning network to perform object recognition on the monitoring image tu(A), specifically:
if the server recognizes that a person object obj(r) exists in the monitoring image tu(A), the age of the person is further recognized; the general actual height value H_1 of a person and the general actual head height value H_2 of a person are determined according to the person's age;
if the server recognizes that a vehicle object obj(c) exists in the monitoring image tu(A), the vehicle type is further recognized; the general actual width value l_2 of the vehicle and the general actual height value l_1 of the vehicle are determined according to the vehicle type.
4. The target position information generation method based on the handheld terminal image according to claim 1, characterized by further comprising, after step 10:
step 11, after the position coordinates of the person object obj(r) and/or the vehicle object obj(c) are obtained, the server, based on the target position, uses a map service and a buffer analysis module to obtain by buffer analysis the scene geographic information around the person object obj(r) and/or the vehicle object obj(c), and fuses the person object obj(r) and/or the vehicle object obj(c) with the scene geographic information to generate the informative text information.
CN202110436206.9A — priority date 2021-04-22, filing date 2021-04-22 — Target position information generation method based on handheld terminal image — Active — CN112990187B (en)

Publications (2)

Publication Number | Publication Date
CN112990187A | 2021-06-18
CN112990187B | 2023-10-20



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant