CN108881811B - Method and device for determining prompt information in video monitoring - Google Patents


Info

Publication number
CN108881811B
CN108881811B
Authority
CN
China
Prior art keywords
triangle
predefined
orientation
image
determining
Prior art date
Legal status
Active
Application number
CN201710331968.6A
Other languages
Chinese (zh)
Other versions
CN108881811A (en)
Inventor
Shen Wenzhong (沈文忠)
Current Assignee
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN201710331968.6A priority Critical patent/CN108881811B/en
Priority to CN202010480817.9A priority patent/CN111800601A/en
Publication of CN108881811A publication Critical patent/CN108881811A/en
Application granted granted Critical
Publication of CN108881811B publication Critical patent/CN108881811B/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/18 Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/02 Affine transformations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components, by matching or filtering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4882 Data services, e.g. news ticker for displaying messages, e.g. warnings, reminders
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N5/00 Details of television systems
    • H04N5/222 Studio circuitry; Studio devices; Studio equipment
    • H04N5/262 Studio circuits, e.g. for mixing, switching-over, change of character of image, other special effects; Cameras specially adapted for the electronic generation of special effects
    • H04N5/265 Mixing

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Signal Processing (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Closed-Circuit Television Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application discloses a method and a device for determining prompt information in video monitoring, and belongs to the field of video processing. The method includes: acquiring a video picture; and selecting, from a plurality of predefined orientations in one-to-one correspondence with a plurality of predefined orientation images, a predefined orientation whose orientation deviation from a target orientation is smaller than a preset orientation deviation, where the target orientation is the capture orientation of the video picture. An orientation deviation smaller than the preset orientation deviation indicates that the target orientation is essentially close to the predefined orientation, so the prompt information set for the selected predefined orientation can be determined as the prompt information to be superimposed on the video picture. Because the state parameters of the target orientation and the predefined orientation do not need to be completely consistent for this determination to be made, the flexibility of determining prompt information in video monitoring is improved.

Description

Method and device for determining prompt information in video monitoring
Technical Field
The present application relates to the field of video processing, and in particular, to a method and an apparatus for determining prompt information in video monitoring.
Background
At present, video monitoring is mainly implemented by capturing video pictures in different orientations with a rotatable camera, so as to monitor in different directions. Because a rotatable camera can dynamically change its viewing-angle area, it can monitor any orientation within a spatial range. When a user wants to monitor a target object of interest through the rotatable camera, the camera needs to be rotated toward the orientation of the target object. During rotation, to help the user quickly find the position of the target object, prompt information may be superimposed on the video pictures captured in some orientations. The prompt information is used to indicate an object present in the video picture or the current orientation of the rotatable camera; for example, the prompt information may be "west passenger station" or "the current orientation is due south".
In the related art, a plurality of preset orientations are set in advance for a rotatable camera, and a state parameter and prompt information of each preset orientation are stored in the camera, where the state parameter includes a spatial direction vector and a zoom factor of the preset orientation. While the rotatable camera rotates, it detects the state parameter of its current orientation in real time. When the detected state parameter is consistent with the state parameter of any one of the plurality of preset orientations, the prompt information set for that preset orientation is superimposed on the currently captured video picture, and the video picture with the superimposed prompt information is sent to a monitoring system for viewing, so that the monitoring system displays it.
In this process, only when the state parameter of the current orientation is completely consistent with the state parameter of a preset orientation does the rotatable camera superimpose, on the current video picture, the prompt information set for that preset orientation. That is, only when both the spatial direction vector and the zoom factor of the current orientation are completely consistent with those of the preset orientation does the rotatable camera determine the prompt information set for the preset orientation as the prompt information to be superimposed on the current video picture. Consequently, when a target object appears in the video picture captured at the current orientation but the state parameter of the current orientation is not completely consistent with that of any preset orientation, the rotatable camera does not superimpose the prompt information on the current video picture, and the monitoring system therefore displays no prompt information.
Disclosure of Invention
In the related art, when a target object appears in the video picture captured by a rotatable camera at its current orientation but the state parameter of the current orientation is not completely consistent with that of a preset orientation, the rotatable camera cannot superimpose prompt information on the current video picture, so the monitoring system cannot display the prompt information. To solve this problem, embodiments of the present application provide a method and a device for determining prompt information in video monitoring. The technical solutions are as follows:
in a first aspect, a method for determining prompt information in video monitoring is provided, where the method includes:
acquiring a video picture;
determining an orientation deviation between a target orientation and each of a plurality of predefined orientations based on the video picture and a plurality of pre-stored predefined orientation images, where the target orientation is the capture orientation of the video picture and the predefined orientation images correspond to the predefined orientations one to one;
selecting, from the plurality of predefined orientations, a predefined orientation whose orientation deviation from the target orientation is smaller than a preset orientation deviation; and
determining the prompt information set for the selected predefined orientation as the prompt information to be superimposed on the video picture.
In the embodiments of the present invention, a video picture is acquired, and a predefined orientation whose orientation deviation from the target orientation is smaller than a preset orientation deviation is selected from a plurality of predefined orientations in one-to-one correspondence with a plurality of predefined orientation images. Because the target orientation is the capture orientation of the video picture, an orientation deviation smaller than the preset orientation deviation indicates that the target orientation is essentially close to the predefined orientation. The rotatable camera can therefore determine the prompt information set for the selected predefined orientation as the prompt information to be superimposed on the video picture, without requiring the state parameters of the target orientation and the predefined orientation to be completely consistent.
Optionally, the determining an orientation deviation between the target orientation and each of a plurality of predefined orientations based on the video picture and a plurality of pre-stored predefined orientation images includes:
for each predefined orientation image in the plurality of predefined orientation images, acquiring a plurality of pairs of matched local feature points in the predefined orientation image and the video picture, where each pair of local feature points is a matched edge point or corner point in the video picture and the predefined orientation image; and
when the number of the pairs of local feature points is greater than a preset number, determining, according to the pairs of local feature points, the orientation deviation between the predefined orientation corresponding to the predefined orientation image and the target orientation.
In other words, the rotatable camera may determine the orientation deviation between a predefined orientation and the target orientation based on the matched local feature points in the predefined orientation image and the video picture.
Optionally, the predefined orientation image is an image including first-order edge features; and
the acquiring the matched pairs of local feature points in the predefined orientation image and the video picture includes:
determining a background image of the video picture by a Gaussian mixture background modeling method;
processing the background image of the video picture to obtain a processed image, where the processed image is an image including first-order edge features; and
matching the feature points in the processed image with the feature points in the predefined orientation image through a scale-invariant feature transform (SIFT) algorithm to obtain the matched pairs of local feature points.
Further, to facilitate finding the matched local feature points in the predefined orientation image and the video picture, the video picture may be processed to obtain a processed image, and the matched pairs of local feature points are then determined from the processed image and the predefined orientation image.
Optionally, the determining, according to the pairs of local feature points, an orientation deviation between the predefined orientation corresponding to the predefined orientation image and the target orientation includes:
selecting at least three pairs of local feature points from the pairs of local feature points;
constructing at least one pair of triangles from any three pairs of the at least three pairs of local feature points, where each pair of triangles includes a first triangle and a second triangle, the first triangle is constructed from the local feature points in the video picture, the second triangle is constructed from the local feature points in the predefined orientation image, each side length of the first triangle and of the second triangle is greater than a preset side length, and each interior angle of the first triangle and of the second triangle is greater than a preset angle;
for each pair of triangles in the at least one pair of triangles, determining a sum of squares of differences between interior angles corresponding to matched local feature points in a first triangle and a second triangle comprised by the pair of triangles;
when the sum of squares is larger than a preset value, performing spatial transformation on the first triangle so that the sum of squares of differences between interior angles corresponding to matched local feature points in the spatially transformed triangle and the second triangle is smaller than the preset value, and the size of the spatially transformed triangle is the same as that of the second triangle;
determining a ratio between the field-of-view deflection angle of the spatially transformed triangle and that of the second triangle, where the field-of-view deflection angle of a triangle is the angle formed at the focal point of the rotatable camera between the line connecting the focal point to the centroid of the triangle and the line connecting the focal point to the center of the image containing the triangle; and
determining, according to the at least one ratio so determined, the orientation deviation between the predefined orientation corresponding to the predefined orientation image and the target orientation.
Specifically, the orientation deviation between the predefined orientation and the target orientation may be determined from the pairs of local feature points by constructing at least one pair of triangles and then working from the constructed triangles.
Optionally, the spatially transforming the first triangle includes:
performing affine transformation on the first triangle so that the sum of squares of the differences between the interior angles corresponding to the matched local feature points in the affine-transformed triangle and the second triangle is smaller than the preset value;
determining the average of the ratios of the corresponding side lengths of the affine-transformed triangle and the second triangle;
when the average is greater than 1, scaling down the affine-transformed triangle by a factor equal to the average, to obtain the spatially transformed triangle; and
when the average is not greater than 1, scaling up the affine-transformed triangle by a factor equal to the reciprocal of the average, to obtain the spatially transformed triangle.
The first triangle is spatially transformed so that the spatially transformed triangle is as similar to the second triangle as possible.
Optionally, the determining a ratio between the field-of-view deflection angle of the spatially transformed triangle and that of the second triangle includes:
determining the distance between the centroid of the spatially transformed triangle and the center of the video picture containing it, to obtain a first distance;
determining the distance between the centroid of the second triangle and the center of the image containing it, to obtain a second distance; and
determining the ratio between the first distance and the second distance as the ratio between the field-of-view deflection angle of the spatially transformed triangle and that of the second triangle.
Because the field-of-view deflection angle is not easily measured directly, the ratio between the two deflection angles can be determined by the indirect method described above.
Optionally, the determining, according to the determined at least one ratio, an orientation deviation between a predefined orientation corresponding to the predefined orientation image and the target orientation includes:
performing a linear clustering calculation on the at least one ratio to obtain a discrete value of each ratio;
selecting, from the at least one ratio, the ratios whose discrete values are smaller than a preset discrete value; and
determining the absolute value of the difference between the average of the selected ratios and 1 as the orientation deviation between the predefined orientation corresponding to the predefined orientation image and the target orientation.
To avoid the influence of outliers, the more discrete ratios among the at least one ratio may be discarded.
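As an illustration of this selection step, the following Python sketch (not part of the patent) computes the orientation deviation from a set of per-triangle ratios; interpreting the "discrete value" of a ratio as its absolute deviation from the mean, and the threshold value max_dispersion, are assumptions:

```python
import numpy as np

def orientation_deviation(ratios, max_dispersion=0.1):
    """Estimate the orientation deviation from the per-triangle ratios.

    The 'discrete value' of a ratio is interpreted here as its absolute
    deviation from the mean of all ratios (an assumption); ratios whose
    dispersion exceeds max_dispersion are discarded as outliers.
    """
    ratios = np.asarray(ratios, dtype=float)
    dispersion = np.abs(ratios - ratios.mean())  # per-ratio discrete value
    kept = ratios[dispersion < max_dispersion]   # drop the more discrete ratios
    if kept.size == 0:                           # all too scattered: keep everything
        kept = ratios
    return abs(kept.mean() - 1.0)                # deviation of the average from 1
```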
Optionally, before the determining an orientation deviation between the target orientation and each of a plurality of predefined orientations based on the video picture and a plurality of pre-stored predefined orientation images, the method further includes:
when a predefined orientation setting instruction is detected, setting the current orientation as a predefined orientation, and setting the video picture of the current orientation as the video picture of the predefined orientation;
determining a background image of the video picture of the predefined orientation by a Gaussian mixture background modeling method;
processing the background image of the video picture of the predefined orientation to obtain a processed image, where the processed image is an image including first-order edge features; and
determining the processed image as the predefined orientation image corresponding to the predefined orientation, and storing the predefined orientation image.
In the embodiments of the present invention, the predefined orientation images may be set in advance and stored.
In a second aspect, an apparatus for determining prompt information in video monitoring is provided. The apparatus has the function of implementing the behavior of the method provided in the first aspect, and includes at least one module, where the at least one module is configured to implement the method provided in the first aspect.
In a third aspect, an apparatus for determining prompt information in video monitoring is provided. The apparatus structurally includes a processor and a memory, where the memory is configured to store a program that supports the apparatus in executing the method provided in the first aspect, and to store data involved in implementing that method. The processor is configured to execute the program stored in the memory. The apparatus may further include a communication bus for establishing a connection between the processor and the memory.
In a fourth aspect, a computer-readable storage medium is provided. The storage medium stores instructions that, when run on a computer, cause the computer to execute the method provided in the first aspect.
In a fifth aspect, a computer program product containing instructions is provided. When the instructions are run on a computer, the computer is caused to perform the method provided in the first aspect.
The technical effects obtained by the above second, third, fourth and fifth aspects are similar to the technical effects obtained by the corresponding technical means in the first aspect, and are not described herein again.
The beneficial effects brought by the technical solutions provided in this application are as follows: a video picture is acquired, and a predefined orientation whose orientation deviation from the target orientation is smaller than a preset orientation deviation is selected from a plurality of predefined orientations in one-to-one correspondence with a plurality of predefined orientation images. Because the target orientation is the capture orientation of the video picture, an orientation deviation smaller than the preset orientation deviation indicates that the target orientation is essentially close to the predefined orientation. The prompt information set for the selected predefined orientation can then be determined as the prompt information to be superimposed on the video picture, so that the monitoring system displays the prompt information. Because the state parameters of the target orientation and the predefined orientation do not need to be completely consistent, the flexibility of determining prompt information in video monitoring is improved.
Drawings
Fig. 1 is a schematic diagram of a video monitoring system according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a rotatable camera according to an embodiment of the present invention;
Fig. 3A is a flowchart of a method for determining prompt information in video monitoring according to an embodiment of the present invention;
Fig. 3B is a schematic diagram of a pixel point and the distribution of the pixel points around it according to an embodiment of the present invention;
Fig. 4A is a flowchart of another method for determining prompt information in video monitoring according to an embodiment of the present invention;
Fig. 4B is a schematic diagram of a first triangle and a second triangle according to an embodiment of the present invention;
Fig. 4C is a schematic diagram of an affine-transformed triangle and a second triangle according to an embodiment of the present invention;
Fig. 4D is a schematic diagram of a field-of-view deflection angle according to an embodiment of the present invention;
Fig. 5 is a flowchart of another method for determining prompt information in video monitoring according to an embodiment of the present invention;
Fig. 6A is a block diagram of an apparatus for determining prompt information in video monitoring according to an embodiment of the present invention; and
Fig. 6B is a block diagram of another apparatus for determining prompt information in video monitoring according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
Before the embodiments of the present invention are explained in detail, an application scenario is described. To help a user quickly find the position of a target object through a rotatable camera, prompt information can be displayed in the video pictures captured while the rotatable camera rotates, so that the user knows the approximate orientation of the rotatable camera. For example, the prompt message "the current orientation is due south" is displayed in the video picture captured at the current orientation; when the user sees this message, the user knows that the current orientation of the rotatable camera is due south. The method for determining prompt information in video monitoring provided in the embodiments of the present invention is applied to this scenario, in which the rotatable camera displays prompt information during rotation.
Fig. 1 is a schematic diagram of a video monitoring system according to an embodiment of the present invention. As shown in Fig. 1, the video monitoring system includes a rotatable camera 101 and a monitoring system 102, and the monitoring system 102 includes a monitoring server 1021 and a monitoring client 1022. The rotatable camera 101 and the monitoring server 1021 can communicate with each other through a wireless or wired network, as can the rotatable camera 101 and the monitoring client 1022, and the monitoring server 1021 and the monitoring client 1022.
The rotatable camera 101 is configured to capture video pictures in different orientations and send the captured video pictures to the monitoring server 1021 and/or the monitoring client 1022. The monitoring client 1022 is configured to receive the video pictures sent by the rotatable camera 101 or the monitoring server 1021 and display them, so that the user can view the video pictures captured by the rotatable camera 101 through the monitoring client 1022. The monitoring server 1021 is configured to receive a video picture captured by the rotatable camera 101, process it, and send the processed video picture to the monitoring client 1022.
In practical applications, the monitoring client 1022 may be an administrator client. At this time, the administrator may control the rotatable camera 101 to perform certain operations such as rotation through the monitoring client 1022, or set certain information for the rotatable camera through the monitoring client, such as setting a predefined orientation or setting a prompt message.
Fig. 2 is a schematic structural diagram of a rotatable camera according to an embodiment of the present invention. The rotatable camera may be the rotatable camera 101 shown in Fig. 1. In practice, the rotatable camera may be a pan-tilt-zoom (PTZ) camera. Referring to Fig. 2, the rotatable camera includes a video collector 201, a transmitter 202, a receiver 203, a memory 204, and a processor 205.
The video collector 201 is configured to collect video pictures, the transmitter 202 is configured to transmit data and/or signaling, and the receiver 203 is configured to receive data and/or signaling.
The memory 204 may be used to store one or more software programs and/or modules. The memory may be, but is not limited to, a read-only memory (ROM), a random access memory (RAM), an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, and the like), magnetic disk storage media, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
The processor 205 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs in the solutions of the present invention.
In a particular implementation, the rotatable camera may further include an output device and an input device, as one embodiment. An output device, which is in communication with the processor 205, may display information in a variety of ways. For example, the output device may be a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display device, a Cathode Ray Tube (CRT) display device, a projector (projector), or the like. The input device is in communication with the processor 205 and may receive user input in a variety of ways. For example, the input device may be a mouse, a keyboard, a touch screen device, or a sensing device, among others.
The memory 204 is configured to store the program code for executing the solutions of this application, under the control of the processor 205. The processor 205 is configured to execute the program code stored in the memory 204, and the program code may include one or more software modules. The rotatable camera may implement the method provided in the embodiments below through the processor 205 and the one or more software modules in the program code stored in the memory 204.
It should be noted that the method for determining prompt information in video monitoring provided in the embodiments of the present invention mainly includes two aspects: setting a plurality of predefined orientation images for the rotatable camera, and judging, according to a video picture and the plurality of predefined orientation images, whether the prompt information set for a predefined orientation should be determined as the prompt information to be superimposed on the video picture. The following embodiments discuss these two aspects separately.
First, the process of setting a plurality of predefined orientation images for the rotatable camera is described in detail. It should be noted that this process may be performed by the rotatable camera in Fig. 1 or by the monitoring server in Fig. 1. The following embodiment takes the case in which the rotatable camera performs the process as an example.
Fig. 3A shows a method for determining prompt information in video monitoring according to an embodiment of the present invention, applied to the video monitoring system shown in Fig. 1. This embodiment describes in detail the process of presetting a plurality of predefined orientation images for the rotatable camera. Referring to Fig. 3A, the method includes the following steps.
Step 301: when the rotatable camera detects the predefined orientation setting instruction, the current orientation is set as the predefined orientation, and the video picture of the current orientation is set as the video picture of the predefined orientation.
When the administrator wants to set a plurality of predefined orientation images for the rotatable camera through the monitoring client, the administrator can control the rotatable camera to rotate through the monitoring client. That is, the monitoring client sends a rotation request to the rotatable camera; upon receiving the request, the rotatable camera rotates, captures video pictures in real time during rotation, and sends them to the monitoring client, which displays them. When the monitoring client detects a predefined orientation setting instruction while displaying the video pictures, it forwards the instruction to the rotatable camera.
The predefined orientation setting instruction is triggered by the administrator through a preset operation, where the preset operation may be a click operation, a slide operation, or a voice operation.
Step 302: the rotatable camera determines a background image of the video frame at the predefined orientation by a gaussian mixture background modeling method.
The video pictures of the predefined orientation captured by the rotatable camera include both a foreground image and a background image, and the foreground image is typically an image of non-stationary objects, such as moving vehicles. To make the predefined orientation image as consistent as possible with the video pictures captured at the predefined orientation at different times, the rotatable camera needs to obtain the background image of the video picture of the predefined orientation.
For ease of the following description, the Gaussian mixture background modeling method is briefly introduced here.
Because the brightness difference between the foreground image and the background image in the same picture is large, and the brightness of each obeys a Gaussian distribution, the brightness histogram of such an image is bimodal: one peak corresponds to the brightness distribution of the foreground image, and the other to that of the background image. When the rotatable camera captures video pictures at the predefined orientation, it captures multiple frames, so it can obtain the brightness distributions of multiple background images from these frames and build a background brightness distribution model, that is, a Gaussian mixture model. For any frame among the multiple frames, each pixel point in the frame is matched against the Gaussian mixture model: when the brightness of the pixel point matches the model, the pixel point is determined to belong to the background image; otherwise, it is determined to belong to the foreground image. After the rotatable camera performs this operation on all pixel points in the video picture, the pixel points of the background image are obtained; that is, the background image of the video picture of the predefined orientation is determined.
It should be noted that, in this embodiment of the present invention, the background image of the video picture of the predefined orientation may also be determined by other methods, such as a single Gaussian model or an initialization method based on an image sequence; this is not specifically limited herein.
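As a concrete illustration of this background-extraction step, the following Python sketch uses OpenCV's built-in Gaussian-mixture background subtractor (MOG2) as a stand-in for the Gaussian mixture background modeling described above; the function name and parameter values are illustrative, not from the patent:

```python
import cv2

def background_image(frames):
    """Extract a background image from several frames captured at one
    orientation, using OpenCV's Gaussian-mixture model (MOG2)."""
    subtractor = cv2.createBackgroundSubtractorMOG2(history=len(frames),
                                                    detectShadows=False)
    for frame in frames:
        subtractor.apply(frame)              # update the per-pixel mixture model
    return subtractor.getBackgroundImage()   # the modeled background image
```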
Step 303: the rotatable camera processes the background image of the video picture in the predefined orientation to obtain a processed image, wherein the processed image is an image including a first-order edge feature.
Specifically, the background image of the video picture of the predefined orientation is first grayed, that is, each pixel point is represented by its brightness value, to obtain a grayscale image of the background image. Gaussian blur is then applied to the grayscale image to remove local fine features, and a Sobel transformation is applied to the blurred grayscale image to obtain an image including only first-order edge features.
Applying Gaussian blur to the grayscale image means that, for each pixel point in the grayscale image, the gray values of a first preset number of pixel points around it are averaged, and the gray value of the pixel point is replaced by that average.
Applying the Sobel transformation to the blurred grayscale image means that, for each pixel point in the blurred grayscale image, a second preset number of surrounding pixel points are obtained, different weights are assigned to them, and the gray gradient of the pixel point is determined from their gray values. When the gray gradient of a pixel point is greater than a gradient threshold, the gray value changes sharply at that pixel point. The pixel points whose gray gradient is greater than the gradient threshold are selected from all pixel points of the blurred grayscale image, and the image formed by the selected pixel points is determined as the processed image. Because the processed image includes only the pixel points whose gray gradient is greater than the gradient threshold, it approximately represents the outline of the background image of the video picture of the predefined orientation; that is, the processed image is an image including first-order edge features.
The gray gradient of a pixel point is determined from the gray value of the pixel point and the gray values of the second preset number of pixel points around it. In one possible implementation, as shown in Fig. 3B, point P is the pixel point and points P1 to P8 are the 8 pixel points around P. The gray gradient of P in the longitudinal direction can then be expressed as:

(P6′ + 2P7′ + P8′) − (P1′ + 2P2′ + P3′),

where P1′, P2′, P3′, P6′, P7′, and P8′ are the gray values of the pixel points P1, P2, P3, P6, P7, and P8, respectively.
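The whole of step 303 can be illustrated with the following Python sketch (graying, Gaussian blur, Sobel gradients, and gradient thresholding); the blur size and gradient threshold are illustrative stand-ins for the patent's preset values:

```python
import cv2
import numpy as np

def first_order_edge_image(background, blur_size=5, gradient_threshold=100):
    """Turn a background image into the first-order edge image described
    above: grayscale -> Gaussian blur -> Sobel gradients -> keep pixels
    whose gradient magnitude exceeds a threshold."""
    gray = cv2.cvtColor(background, cv2.COLOR_BGR2GRAY)
    gray = cv2.GaussianBlur(gray, (blur_size, blur_size), 0)
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)  # transverse gradient
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)  # longitudinal gradient (the formula above)
    magnitude = np.sqrt(gx ** 2 + gy ** 2)
    edges = np.zeros_like(gray)
    edges[magnitude > gradient_threshold] = 255      # keep strong-gradient pixels only
    return edges
```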
Step 304: the rotatable camera determines the processed image as a predefined orientation image corresponding to the predefined orientation and stores the predefined orientation image.
After storing the predefined orientation image, the rotatable camera may further establish a correspondence between the predefined orientation and the predefined orientation image.
In addition, when the rotatable camera receives the predefined orientation setting instruction sent by the monitoring client, it also receives the prompt information set for the predefined orientation sent by the monitoring client. In this case, the rotatable camera also needs to store the prompt information in the correspondence between the predefined orientation and the predefined orientation image, obtaining a correspondence among the predefined orientation, the predefined orientation image, and the prompt information.
For example, the administrator sets a plurality of predefined orientations through the monitoring client, denoted predefined orientation 1, predefined orientation 2, predefined orientation 3, …. The rotatable camera obtains the predefined orientation image of each of these predefined orientations, namely image A, image B, image C, …, and receives the prompt information set for them by the monitoring client, namely information I, information II, information III, …. The rotatable camera then establishes the correspondence shown in Table 1, where each predefined orientation corresponds to one predefined orientation image and one piece of prompt information.
TABLE 1

Predefined orientation     Predefined orientation image     Prompt information
Predefined orientation 1   Image A                          Information I
Predefined orientation 2   Image B                          Information II
Predefined orientation 3   Image C                          Information III
Optionally, the process of setting the plurality of predefined orientation images may instead be performed by the monitoring server. In this case, while displaying the video pictures captured by the rotatable camera in real time, the monitoring client determines the current orientation as a predefined orientation if it detects a predefined orientation setting instruction triggered by the administrator through a preset operation, and forwards the instruction, carrying the video picture of the current orientation, to the monitoring server. Upon receiving the instruction, the monitoring server determines the predefined orientation image according to steps 302 and 303, thereby obtaining a plurality of predefined orientation images and the prompt information set for each of the plurality of predefined orientations, establishes the correspondence shown in Table 1, and sends the predefined orientation images, the prompt information, and the correspondence to the rotatable camera and the monitoring client, so that both store them.
In this embodiment of the present invention, a plurality of predefined orientations are set for the rotatable camera in advance, and the predefined orientation images and prompt information in one-to-one correspondence with them are determined, so that it can later be judged, according to the predefined orientation images, whether the prompt information set for a predefined orientation should be determined as the prompt information to be superimposed on a video picture.
After the plurality of predefined orientations, together with the predefined orientation images and prompt information in one-to-one correspondence with them, have been set, it is judged according to the video picture and the plurality of predefined orientation images whether the prompt information set for a predefined orientation should be determined as the prompt information to be superimposed on the video picture. The following embodiments illustrate this process in detail.
It should be noted that this judging process may be performed by the rotatable camera in Fig. 1 or by the monitoring server in Fig. 1. The following two embodiments illustrate these two cases in detail.
Fig. 4A shows another method for determining prompt information in video monitoring according to an embodiment of the present invention, applied to the video monitoring system shown in Fig. 1. This embodiment describes in detail the process in which the rotatable camera judges, according to a video picture captured at a target orientation and the plurality of predefined orientation images, whether the prompt information set for a predefined orientation should be determined as the prompt information to be superimposed on the video picture. Referring to Fig. 4A, the method includes the following steps.
Step 401: the rotatable camera acquires video pictures.
Specifically, when the user wants to control the rotatable camera to rotate toward the position of the target object through the monitoring client, the monitoring client sends a rotation request to the rotatable camera. Upon receiving the request, the rotatable camera rotates and captures video pictures in real time during rotation. The rotatable camera may determine a video picture captured in real time as the video picture.
Of course, to reduce the data processing load on the rotatable camera, the rotatable camera may instead, once every preset time period, select one video picture from those captured during the most recent preset time period and determine the selected picture as the video picture.
For ease of the following description, the orientation at which the rotatable camera captures the video picture is referred to as the target orientation; that is, the target orientation is the capture orientation of the video picture.
To determine whether to superimpose preset prompt information on the video picture, the rotatable camera needs to determine whether the target orientation is close to one of the predefined orientations in the embodiment shown in Fig. 3A. Specifically, the rotatable camera may determine an orientation deviation between the target orientation and each of a plurality of predefined orientations based on the video picture and a plurality of pre-stored predefined orientation images, where the predefined orientation images and the predefined orientations correspond one to one. Steps 402 to 408 below describe this determination in detail.
Step 402: for each predefined orientation image in the plurality of predefined orientation images, the rotatable camera acquires a plurality of pairs of matched local feature points in the predefined orientation image and the video picture, where each pair of local feature points is a matched edge point or corner point in the video picture and the predefined orientation image.
As can be seen from the embodiment shown in Fig. 3A, the predefined orientation image is an image including first-order edge features. Therefore, the pairs of matched local feature points in the predefined orientation image and the video picture may be acquired as follows: determining a background image of the video picture by the Gaussian mixture background modeling method; processing the background image of the video picture to obtain a processed image, where the processed image is an image including first-order edge features; and matching the feature points in the processed image with the feature points in the predefined orientation image through the scale-invariant feature transform (SIFT) algorithm to obtain the matched pairs of local feature points.
The process of obtaining the processed image from the video picture is substantially the same as steps 302 and 303 in Fig. 3A, and is not described in detail here.
In addition, matching the feature points in the processed image with the feature points in the predefined orientation image through the SIFT algorithm may specifically be as follows. For each feature point in the processed image and in the predefined orientation image, the transverse and longitudinal gray gradients of the feature point are determined according to the method for determining the longitudinal gray gradient of a pixel point in step 303 of Fig. 3A, and the direction angle of the feature point is represented by the vector formed by its transverse and longitudinal gray gradients. For any feature point in the processed image, the Euclidean distance between the direction angle of that feature point and the direction angle of each feature point in the predefined orientation image is determined, yielding a plurality of Euclidean distances. If a Euclidean distance smaller than a Euclidean distance threshold is found among them, the feature point corresponding to the found distance is determined as the feature point matching that feature point in the processed image. Here, the Euclidean distance is the true distance between two points in a multi-dimensional space.
After obtaining the matched pairs of local feature points, the rotatable camera determines their number. When the number of matched pairs is greater than a preset number, the processed image is similar to the predefined orientation image. The rotatable camera may then determine the orientation deviation between the predefined orientation corresponding to the predefined orientation image and the target orientation from the pairs of local feature points; specifically, it may do so through steps 403 to 408.
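A minimal Python sketch of this matching step is given below. It uses OpenCV's SIFT implementation with Lowe's ratio test in place of the patent's Euclidean-distance threshold on direction-angle vectors, and min_pairs stands in for the preset number; all names and values are illustrative:

```python
import cv2

def matched_local_feature_points(processed, predefined, min_pairs=20):
    """Match SIFT feature points between the processed video picture and a
    predefined orientation image. Returns the matched point pairs, or None
    when there are not more than min_pairs of them."""
    sift = cv2.SIFT_create()
    kp1, desc1 = sift.detectAndCompute(processed, None)
    kp2, desc2 = sift.detectAndCompute(predefined, None)
    if desc1 is None or desc2 is None:
        return None
    matcher = cv2.BFMatcher(cv2.NORM_L2)
    good = []
    for pair in matcher.knnMatch(desc1, desc2, k=2):
        # Lowe's ratio test stands in for the Euclidean-distance threshold.
        if len(pair) == 2 and pair[0].distance < 0.75 * pair[1].distance:
            good.append(pair[0])
    pairs = [(kp1[m.queryIdx].pt, kp2[m.trainIdx].pt) for m in good]
    return pairs if len(pairs) > min_pairs else None
```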
Step 403: the rotatable camera selects at least three pairs of local feature points from the plurality of pairs of local feature points.
Each of the at least three pairs of local feature points should satisfy the conditions in step 404; the specific conditions are detailed in step 404 and are not repeated here.
Step 404: the rotatable camera constructs at least one pair of triangles from any three pairs of the at least three pairs of local feature points, each pair of triangles comprising a first triangle and a second triangle, the first triangle being a triangle constructed from the local feature points in the video picture, the second triangle being a triangle constructed from the local feature points in the predefined orientation image.
In order to facilitate the processing of the first triangle and the second triangle by the rotatable camera, the first triangle and the second triangle should satisfy the following condition: each side length of the first triangle and the second triangle is larger than a preset side length, and each inner angle of the first triangle and the second triangle is larger than a preset angle.
In addition, since the length in the image is generally represented by a pixel value, the preset side length may be a preset pixel value.
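The construction conditions of step 404 can be checked with a short helper like the following sketch, where the preset side length (in pixels) and preset angle are illustrative values:

```python
import numpy as np

def interior_angles(pts):
    """Interior angles (in radians) of the triangle with vertices pts[0..2]."""
    pts = [np.asarray(p, dtype=float) for p in pts]
    angles = []
    for i in range(3):
        a, b, c = pts[i], pts[(i + 1) % 3], pts[(i + 2) % 3]
        u, v = b - a, c - a
        cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
        angles.append(np.arccos(np.clip(cos, -1.0, 1.0)))
    return angles

def valid_triangle(pts, min_side=30.0, min_angle_deg=20.0):
    """Check the conditions above: every side longer than a preset pixel
    length, and every interior angle larger than a preset angle."""
    p = [np.asarray(q, dtype=float) for q in pts]
    sides = [np.linalg.norm(p[i] - p[(i + 1) % 3]) for i in range(3)]
    return (min(sides) > min_side
            and min(interior_angles(p)) > np.radians(min_angle_deg))
```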
Step 405: for each of the at least one pair of triangles, the rotatable camera determines a sum of squares of differences between corresponding interior angles of matched local feature points in a first triangle and a second triangle comprised by the pair of triangles.
Fig. 4B shows a first triangle and a second triangle, where the first triangle is △ABC and the second triangle is △A′B′C′. Vertex A in the first triangle and vertex A′ in the second triangle are a pair of matched local feature points, as are vertices B and B′, and vertices C and C′. The interior angles at vertices A, B, and C of △ABC are α1, α2, and α3, and the interior angles at vertices A′, B′, and C′ of △A′B′C′ are β1, β2, and β3.
As shown in Fig. 4B, the sum of squares of the differences between the interior angles corresponding to the matched local feature points in the first and second triangles can be expressed as:

(α1 − β1)² + (α2 − β2)² + (α3 − β3)²
This sum of squares represents the degree of deviation between the orientation at which the rotatable camera captured the picture containing the first triangle and the orientation at which it captured the image containing the second triangle. That is, when the sum of squares is greater than a preset value, the deviation between the two orientations is large, and the rotatable camera performs step 406; when the sum of squares is not greater than the preset value, the deviation between the two orientations is small, and the rotatable camera may directly perform step 407.
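A sketch of the step-405 computation, reusing interior_angles from the step-404 sketch above; the triangles are passed as vertex lists ordered so that matched feature points correspond:

```python
# interior_angles: see the sketch under step 404.

def angle_deviation(tri1, tri2):
    """Sum of squared differences between corresponding interior angles of
    a matched triangle pair: (a1-b1)^2 + (a2-b2)^2 + (a3-b3)^2."""
    a = interior_angles(tri1)
    b = interior_angles(tri2)
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
```

When angle_deviation(tri1, tri2) exceeds the preset value, step 406 is performed; otherwise step 407 follows directly.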
Step 406: when the sum of squares is greater than the preset value, the rotatable camera spatially transforms the first triangle so that the sum of squares of the differences between the interior angles corresponding to the matched local feature points in the spatially transformed triangle and the second triangle is smaller than the preset value, and the spatially transformed triangle is the same size as the second triangle.
Specifically, affine transformation is performed on the first triangle so that the sum of squares of the differences between the interior angles corresponding to the matched local feature points in the affine-transformed triangle and the second triangle is smaller than the preset value; the average of the ratios of the corresponding side lengths of the affine-transformed triangle and the second triangle is determined; when the average is greater than 1, the affine-transformed triangle is scaled down by a factor equal to the average, to obtain the spatially transformed triangle; and when the average is not greater than 1, the affine-transformed triangle is scaled up by a factor equal to the reciprocal of the average, to obtain the spatially transformed triangle.
The affine transformation of the first triangle may be performed as follows. An affine transformation matrix is determined from the three matched pairs of local feature points in the first and second triangles, and the first triangle is affine-transformed according to this matrix to obtain a first affine-transformed triangle. It is then judged whether the sum of squares of the differences between the interior angles corresponding to the matched local feature points in this triangle and the second triangle is smaller than the preset value; if not, the triangle is affine-transformed again according to the matrix to obtain a second affine-transformed triangle, and so on, until the sum of squares for the nth affine-transformed triangle and the second triangle is smaller than the preset value, at which point the nth affine-transformed triangle is determined as the affine-transformed triangle. For determining the affine transformation matrix from the three matched pairs of local feature points, refer to related affine transformation techniques; this is not elaborated here.
Fig. 4C shows the affine-transformed triangle and the second triangle. As shown in fig. 4C, the affine-transformed triangle is △A″B″C″ and the second triangle is △A′B′C′, where vertex A″ of the affine-transformed triangle and vertex A′ of the second triangle are a pair of matched local feature points, vertex B″ and vertex B′ are a pair of matched local feature points, and vertex C″ and vertex C′ are a pair of matched local feature points. The three side lengths of the affine-transformed triangle △A″B″C″ are L1, L2 and L3, and the corresponding side lengths of the second triangle △A′B′C′ are R1, R2 and R3.
At this time, the average value of the ratio of the side lengths in the affine-transformed triangle and the second triangle may be represented by the following formula:
Ratio=AVG(L1/R1,L2/R2,L3/R3),
where AVG denotes taking the average, and Ratio denotes the average of the ratios of the corresponding side lengths of the affine-transformed triangle and the second triangle.
For example, when Ratio is 2, the affine-transformed triangle is reduced by a factor of two to obtain the spatially transformed triangle; when Ratio is 0.5, the affine-transformed triangle is enlarged by a factor of two to obtain the spatially transformed triangle, so that the spatially transformed triangle is the same size as the second triangle.
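A minimal sketch of this rescaling rule follows, with side lengths taken between consecutive vertices; scaling about the triangle's centroid is an assumption, since the patent does not fix the scaling center, and note that both branches reduce to multiplying by 1/Ratio.

import numpy as np

def side_lengths(tri):
    return np.array([np.linalg.norm(tri[(i + 1) % 3] - tri[i]) for i in range(3)])

def rescale_to_match(affine_tri, second_tri):
    ratio = np.mean(side_lengths(affine_tri) / side_lengths(second_tri))  # AVG(L1/R1, L2/R2, L3/R3)
    centroid = affine_tri.mean(axis=0)
    # ratio > 1: shrink by ratio; ratio <= 1: enlarge by 1/ratio -- both multiply by 1/ratio
    return centroid + (affine_tri - centroid) / ratio

affine_tri = np.array([[0.0, 0.0], [8.0, 0.0], [2.0, 6.0]])
second_tri = np.array([[0.0, 0.0], [4.0, 0.0], [1.0, 3.0]])
spatially_transformed_tri = rescale_to_match(affine_tri, second_tri)  # side lengths are halved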
Step 407: the rotatable camera determines the ratio between the field-of-view declination of the spatially transformed triangle and the field-of-view declination of the second triangle, where the field-of-view declination of a triangle is the angle formed at the focal point of the rotatable camera between the line connecting the focal point to the center of gravity of the triangle and the line connecting the focal point to the center of the image in which the triangle is located.
Since the field-of-view declination of the rotatable camera when capturing a target object represents, to some extent, the orientation of the rotatable camera at capture time, the degree of deviation between the target orientation and the predefined orientation may be determined from the ratio between the field-of-view declination of the spatially transformed triangle and that of the second triangle.
In addition, since the field-of-view declination is difficult to measure directly, the ratio between the declination of the spatially transformed triangle and that of the second triangle may be determined indirectly, as follows: determine the distance between the center of the spatially transformed triangle and the center of the video picture in which it is located, obtaining a first distance; determine the distance between the center of the second triangle and the center of the image in which it is located, obtaining a second distance; and determine the ratio between the first distance and the second distance as the ratio between the field-of-view declination of the spatially transformed triangle and that of the second triangle.
Fig. 4D is a schematic diagram of the field-of-view declination provided by the embodiment of the invention. As shown in diagram (a) of fig. 4D, triangle △A″B″C″ is the spatially transformed triangle, its center is point G, plane α is the video picture in which the spatially transformed triangle is located, the center of plane α is point O, and point J is the focal point of the rotatable camera; the angle θ in diagram (a) is therefore the field-of-view declination of the spatially transformed triangle, and the first distance M1 is the distance between G and O. As shown in diagram (b) of fig. 4D, triangle △A′B′C′ is the second triangle, its center is point G′, plane β is the image in which the second triangle is located, the center of plane β is point O′, and point J is again the focal point of the rotatable camera; the angle θ′ in diagram (b) is the field-of-view declination of the second triangle, and the second distance M2 is the distance between G′ and O′. The ratio between the field-of-view declination of the spatially transformed triangle and that of the second triangle is then determined as M1/M2.
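A sketch of this indirect computation, reusing spatially_transformed_tri and second_tri from the previous sketch and assuming pixel coordinates with known image sizes; the image sizes are hypothetical, and equating the distance ratio with the declination ratio follows the indirect method described above.

import numpy as np

def center_distance(tri, image_w, image_h):
    """Distance from the triangle's centroid to the center of its image."""
    centroid = tri.mean(axis=0)
    return float(np.linalg.norm(centroid - np.array([image_w / 2.0, image_h / 2.0])))

# hypothetical 1920x1080 video picture and predefined orientation image
m1 = center_distance(spatially_transformed_tri, 1920, 1080)  # first distance |GO|
m2 = center_distance(second_tri, 1920, 1080)                 # second distance |G'O'|
ratio = m1 / m2  # taken as the ratio between the two field-of-view declinations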
When the rotatable camera performs the operations of steps 405, 406 and 407 on each pair of triangles in the at least one pair of triangles, at least one ratio is obtained, and step 408 is then performed on the at least one ratio.
Step 408: the rotatable camera determines an orientation deviation between a predefined orientation corresponding to the predefined orientation image and the target orientation based on the determined at least one ratio.
Since the at least one ratio may contain relatively discrete (outlying) values, step 408 may be implemented as follows: perform a linear clustering calculation on the at least one ratio to obtain a discrete value for each ratio; select, from the at least one ratio, the ratios whose discrete values are smaller than a preset discrete value; and determine the absolute value of the difference between the average of the selected ratios and 1 as the orientation deviation between the predefined orientation corresponding to the predefined orientation image and the target orientation. Performing a linear clustering calculation on the at least one ratio means that a center point of the at least one ratio is determined by a K-means (hard clustering) method, and the Euclidean distance from each ratio to the center point is taken as that ratio's discrete value.
For example, the rotatable camera selects from the at least one ratio the ratios whose discrete values are smaller than the preset discrete value, and determines the average of the selected ratios, denoted avg(λ). The difference between avg(λ) and 1 indicates the degree of deviation between the predefined orientation and the target orientation: the closer avg(λ) is to 1, the closer the predefined orientation and the target orientation are; the further avg(λ) deviates from 1, the further the two orientations deviate. The orientation deviation between the predefined orientation and the target orientation may thus be expressed by the following formula:
|avg(λ)-1|
It should be noted that, in the embodiment of the present invention, the rotatable camera may also directly determine the absolute value of the difference between the average of the at least one ratio and 1 as the orientation deviation between the predefined orientation corresponding to the predefined orientation image and the target orientation.
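A sketch of step 408, treating the linear clustering calculation as one-dimensional K-means with K = 1 (an assumption, since the patent does not state K; for K = 1 the center point is simply the mean, and each ratio's discrete value is its distance to that center); the preset discrete value is illustrative.

import numpy as np

def orientation_deviation(ratios, preset_discrete=0.25):
    ratios = np.asarray(ratios, dtype=float)
    center = ratios.mean()                     # K-means center point for K = 1
    discrete = np.abs(ratios - center)         # Euclidean distance to the center, in one dimension
    kept = ratios[discrete < preset_discrete]  # discard the more discrete ratios
    if kept.size == 0:
        kept = ratios                          # fallback: use all ratios directly
    return abs(kept.mean() - 1.0)              # |avg(lambda) - 1|

deviation = orientation_deviation([0.98, 1.03, 1.01, 1.7])  # the outlying ratio 1.7 is discarded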
When the rotatable camera performs the operations of steps 402 to 408 described above for each of the stored plurality of predefined orientation images, it determines, for each of the plurality of predefined orientations in one-to-one correspondence with those images, an orientation deviation between the target orientation and that predefined orientation, resulting in a plurality of orientation deviations in one-to-one correspondence with the plurality of predefined orientations.
Step 409: the rotatable camera selects, from the plurality of predefined orientations, a predefined orientation whose orientation deviation from the target orientation is smaller than a preset orientation deviation.
The preset orientation deviation is a pre-configured deviation threshold. When the orientation deviation between the target orientation and a predefined orientation is not smaller than the preset orientation deviation, the target orientation and that predefined orientation are far apart, and the rotatable camera performs no operation; when the orientation deviation between the target orientation and the predefined orientation is smaller than the preset orientation deviation, the target orientation and the predefined orientation are relatively close, and the rotatable camera may determine the prompt information set for the predefined orientation according to the correspondence shown in table 1 in the embodiment shown in fig. 3A, and determine the prompt information set for the selected predefined orientation as the prompt information to be superimposed in the video picture. Specifically, the rotatable camera may implement this process through steps 410 and 411 described below.
Step 410: the rotatable camera determines the prompt information set for the selected predefined orientation as the prompt information to be superimposed in the video picture.
After the rotatable camera determines the prompt information to be superimposed in the video picture, the prompt information can be superimposed in the video picture through the following two possible implementations.
(1) The rotatable camera superimposes the prompt information set for the selected predefined orientation in the video picture.
Specifically, the rotatable camera may superimpose the prompt information set for the selected predefined orientation in the video picture of the current orientation by an on-screen display (OSD) method. That is, after determining the prompt information set for the predefined orientation, the rotatable camera determines the encoding information of the prompt information and creates an OSD environment, that is, initializes the OSD. The rotatable camera then synchronizes and superimposes the encoding information of the prompt information with the frame data of the video picture, and writes the superimposed data into the created OSD environment through digital signal processing (DSP), obtaining the video picture with the prompt information superimposed.
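Purely as an illustration of the superposition itself (not of the camera's actual DSP/OSD path), a software stand-in using OpenCV might look like the following; the frame, the prompt string and the text position are all hypothetical.

import cv2
import numpy as np

def superimpose_prompt(frame, prompt, org=(20, 40)):
    """Rasterize the prompt text onto a copy of a BGR video frame."""
    out = frame.copy()
    cv2.putText(out, prompt, org, cv2.FONT_HERSHEY_SIMPLEX,
                1.0, (255, 255, 255), 2, cv2.LINE_AA)
    return out

frame = np.zeros((1080, 1920, 3), dtype=np.uint8)  # placeholder video frame
shown = superimpose_prompt(frame, "East gate")     # hypothetical prompt information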
Then, the rotatable camera sends the video picture with the prompt information superimposed to the monitoring client, and the monitoring client displays it. When the monitoring client receives such a video picture from the rotatable camera, it displays it directly; at this point, if the user controls the rotatable camera to rotate through the monitoring client, the user can view the prompt information in the currently displayed video picture and thereby know the approximate orientation of the rotatable camera.
(2) The monitoring client superimposes the prompt information set for the selected predefined orientation in the video picture.
After the rotatable camera determines the prompt information set for the selected predefined orientation as the prompt information to be superimposed in the video picture, the rotatable camera sends the prompt information and an identification of the video picture to the monitoring client. When the monitoring client receives the identification, it determines the video picture from the received video pictures according to the identification, and then superimposes the prompt information on that video picture. The identification of the video picture may be the frame number of the video frame in which the rotatable camera transmits the video picture.
The implementation process of the monitoring client side for superimposing the prompt information in the video picture may refer to the implementation process of the rotatable camera for superimposing the prompt information in the video picture.
Optionally, when the correspondence shown in table 1 above is stored in the monitoring client, after the rotatable camera determines the prompt information set for the selected predefined orientation as the prompt information to be superimposed in the video picture, the rotatable camera sends the identifier of the selected predefined orientation and the identifier of the video picture to the monitoring client. When the monitoring client receives the identifier of the predefined orientation sent by the rotatable camera, it determines, according to that identifier, the prompt information set for the predefined orientation from the correspondence shown in table 1, determines the video picture from the received video pictures according to the identifier of the video picture, and then superimposes the prompt information on the video picture.
After the monitoring client superimposes the prompt information on the video picture, the monitoring client can directly display the video picture after the prompt information is superimposed.
In the embodiment of the invention, the rotatable camera obtains a video picture and selects, from the plurality of predefined orientations in one-to-one correspondence with the plurality of predefined orientation images, a predefined orientation whose orientation deviation is smaller than the preset orientation deviation. Because the target orientation is the collection orientation of the video picture, an orientation deviation smaller than the preset orientation deviation indicates that the target orientation is substantially close to the predefined orientation; the prompt information set for the selected predefined orientation can then be superimposed in the video picture. Since the prompt information can be superimposed without requiring the condition parameters of the target orientation and the predefined orientation to match exactly, the flexibility of determining prompt information in video monitoring is improved.
When the process of determining whether the prompt information set for a predefined orientation should be taken as the prompt information to be superimposed in the video picture is executed by the monitoring server in fig. 1, based on the video picture and the plurality of predefined orientation images, the embodiment of the present invention provides another method for determining prompt information in video monitoring. As shown in fig. 5, the method includes the following steps:
Step 501: the monitoring server acquires a video picture.
Specifically, while the rotatable camera is rotating, it collects video pictures in real time and sends them to the monitoring server in the monitoring system; when the monitoring server receives a video picture collected in real time, it may directly take that picture as the video picture to be processed. Alternatively, at every preset time interval, the monitoring server may select one video picture from those received within the most recent preset time period before the current time, and take the selected picture as the video picture to be processed.
Step 502: the monitoring server determines an orientation deviation between the target orientation and each of a plurality of predefined orientations based on the video picture and a plurality of predefined orientation images stored in advance, wherein the plurality of predefined orientation images and the plurality of predefined orientations are in one-to-one correspondence.
The implementation process of step 502 may refer to the implementation processes of steps 402 to 408 in the embodiment of fig. 4A and is not described in detail here.
Step 503: the monitoring server selects a predefined orientation from the plurality of predefined orientations, the orientation deviation of which from the target orientation is smaller than a preset orientation deviation.
The implementation process of step 503 is substantially the same as that of step 409 in the embodiment of fig. 4A and is not described in detail here.
When the monitoring server selects, from the plurality of predefined orientations, a predefined orientation whose orientation deviation is smaller than the preset orientation deviation, this indicates that the target orientation is close to that predefined orientation. The monitoring server may then determine the prompt information set for the predefined orientation according to the correspondence shown in table 1 in the embodiment shown in fig. 3A, and determine the prompt information set for the selected predefined orientation as the prompt information to be superimposed in the video picture. Specifically, the monitoring server may implement this process through steps 504 and 505 described below.
Step 504: the monitoring server determines the prompt information set for the selected predefined orientation as the prompt information to be superimposed in the video picture.
After the monitoring server determines the prompt information to be superimposed in the video picture, the prompt information may be superimposed in the video picture through the following two possible implementations.
(1) The monitoring server superimposes the prompt information set for the selected predefined orientation in the video picture.
The implementation process of the monitoring server for superimposing the prompt information in the video frame may refer to the implementation process of the rotatable camera in step 410 in fig. 4 for superimposing the prompt information in the video frame.
Then, the monitoring server sends the video picture with the prompt information superimposed to the monitoring client, and the monitoring client displays it.
(2) The monitoring client superimposes the prompt information set for the selected predefined orientation in the video picture.
Specifically, after the monitoring server determines the prompt information set for the selected predefined orientation as the prompt information to be superimposed in the video picture, the monitoring server sends the prompt information and the identifier of the video picture to the monitoring client; the monitoring client superimposes the prompt information on the video picture and displays the video picture with the prompt information superimposed.
Optionally, when the correspondence shown in table 1 is stored in the monitoring client, after the monitoring server determines the prompt information set for the selected predefined orientation as the prompt information to be superimposed in the video picture, the monitoring server sends the identifier of the selected predefined orientation and the identifier of the video picture to the monitoring client; the monitoring client likewise superimposes the prompt information on the video picture and displays the video picture with the prompt information superimposed.
In the embodiment of the invention, the monitoring server obtains a video picture and selects, from the plurality of predefined orientations in one-to-one correspondence with the plurality of predefined orientation images, a predefined orientation whose orientation deviation is smaller than the preset orientation deviation. Because the target orientation is the collection orientation of the video picture, an orientation deviation smaller than the preset orientation deviation indicates that the target orientation is substantially close to the predefined orientation, and the prompt information set for the selected predefined orientation can be superimposed in the video picture. Since this does not require the condition parameters of the target orientation and the predefined orientation to match exactly, the flexibility of determining prompt information in video monitoring is improved.
In this application, in addition to the method for determining prompt information in video monitoring described in the above embodiments, an apparatus for determining prompt information in video monitoring is also provided. The following embodiments describe this apparatus in detail.
Referring to fig. 6A, an embodiment of the present invention provides an apparatus 600 for determining a prompt message in video monitoring, where the apparatus 600 includes an obtaining module 601, a first determining module 602, a selecting module 603, and a second determining module 604.
An obtaining module 601, configured to execute step 401 in the embodiment of fig. 4A or step 501 in the embodiment of fig. 5.
A first determining module 602, configured to determine an orientation deviation between the target orientation and each of a plurality of predefined orientations based on the video frame and a plurality of predefined orientation images stored in advance, where the plurality of predefined orientation images and the plurality of predefined orientations correspond to each other one by one.
A selecting module 603, configured to perform step 409 in the embodiment of fig. 4A or step 503 in the embodiment of fig. 5.
A second determining module 604, configured to perform step 410 in the embodiment of fig. 4A or step 504 in the embodiment of fig. 5.
Optionally, the first determining module 602 includes:
an obtaining unit, configured to perform step 402 in the embodiment of fig. 4A;
a determining unit, configured to determine, according to the plurality of pairs of local feature points, an orientation deviation between a predefined orientation corresponding to the predefined orientation image and a target orientation when the number of the plurality of pairs of local feature points is greater than a preset number.
Optionally, the predefined orientation image is an image comprising first order edge features;
the obtaining unit is specifically configured to:
determining a background image of the video picture by a Gaussian mixture background modeling method;
processing the background image of the video picture to obtain a processed image, wherein the processed image is an image comprising first-order edge features;
and matching the feature points in the processed image with the feature points in the predefined orientation image through a scale invariant feature transform (SIFT) algorithm to obtain the plurality of pairs of matched local feature points, as sketched below.
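A sketch of this obtaining pipeline under stated assumptions: OpenCV with SIFT available (opencv-python 4.4 or later), a Sobel gradient magnitude standing in for the first-order edge features, and the background image recovered by the MOG2 subtractor standing in for Gaussian mixture background modeling over a frame sequence; the function names are illustrative, and the predefined orientation image is assumed to be a grayscale edge image.

import cv2
import numpy as np

def first_order_edges(gray):
    """Gradient-magnitude image, a stand-in for first-order edge features."""
    gx = cv2.Sobel(gray, cv2.CV_32F, 1, 0)
    gy = cv2.Sobel(gray, cv2.CV_32F, 0, 1)
    return cv2.convertScaleAbs(cv2.magnitude(gx, gy))

def matched_local_feature_points(frames, predefined_edge_image):
    mog2 = cv2.createBackgroundSubtractorMOG2()
    for f in frames:  # feed the video frames to the Gaussian mixture model
        mog2.apply(f)
    background = mog2.getBackgroundImage()  # estimated background image
    edges = first_order_edges(cv2.cvtColor(background, cv2.COLOR_BGR2GRAY))

    sift = cv2.SIFT_create()  # scale invariant feature transform
    k1, d1 = sift.detectAndCompute(edges, None)
    k2, d2 = sift.detectAndCompute(predefined_edge_image, None)
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = matcher.match(d1, d2)  # assumes both images yield descriptors
    return [(k1[m.queryIdx].pt, k2[m.trainIdx].pt) for m in matches]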
Optionally, the determining unit includes:
a selection subunit for performing step 403 in the embodiment of FIG. 4A;
a constructing subunit, configured to execute step 404 in the embodiment of fig. 4A, where each side length of the first triangle and the second triangle is greater than the preset side length, and each internal angle of the first triangle and the second triangle is greater than the preset angle.
A first determining subunit, configured to perform step 405 in the embodiment of fig. 4A;
a spatial transform subunit for performing step 406 in the embodiment of fig. 4A;
a second determining subunit, configured to perform step 407 in the embodiment of fig. 4A;
a third determining subunit, configured to perform step 408 in the embodiment of fig. 4A.
Optionally, the spatial transform subunit is specifically configured to:
performing affine transformation on the first triangle to enable the square sum of the difference values between the interior angles corresponding to the matched local feature points in the affine transformed triangle and the second triangle to be smaller than a preset value;
determining the average value of the ratio of the corresponding side lengths in the triangle and the second triangle after affine transformation;
when the average value is larger than 1, reducing the triangle after affine transformation by taking the average value as a reduction ratio to obtain the triangle after space transformation;
and when the average value is not more than 1, magnifying the triangle after affine transformation by taking the reciprocal of the average value as a magnification ratio to obtain the triangle after spatial transformation.
Optionally, the second determining subunit is specifically configured to:
determining the distance between the center of the triangle after the space transformation and the center of the video picture where the triangle after the space transformation is located to obtain a first distance;
determining the distance between the center of the second triangle and the center of the image where the second triangle is located to obtain a second distance;
the ratio between the first distance and the second distance is determined as the ratio between the field of view declination of the spatially transformed triangle and the field of view declination of the second triangle.
Optionally, the third determining subunit is specifically configured to:
performing linear clustering calculation on the at least one ratio to obtain a discrete value of each ratio;
selecting a ratio value of which the discrete value is smaller than a preset discrete value from the at least one ratio value;
the absolute value of the difference between the average of the selected ratios and 1 is determined as the azimuth deviation between the predefined azimuth corresponding to the predefined azimuth image and the target azimuth.
Optionally, referring to fig. 6B, the apparatus 600 further includes a setting module 605, a third determining module 606, a processing module 607, and a storage module 608.
A setting module 605 for executing step 301 in the embodiment of fig. 3A;
a third determining module 606 for performing step 302 in the embodiment of fig. 3A;
a processing module 607 for performing step 303 in the embodiment of fig. 3A;
a storage module 608 for executing step 304 in the embodiment of fig. 3A.
In the embodiment of the invention, a video picture is obtained, and a predefined orientation whose orientation deviation from the target orientation is smaller than the preset orientation deviation is selected from the plurality of predefined orientations in one-to-one correspondence with the plurality of predefined orientation images. The prompt information set for the selected predefined orientation can then be determined as the prompt information to be superimposed in the video picture without requiring the condition parameters of the target orientation and the predefined orientation to match exactly, which improves the flexibility of determining prompt information in video monitoring.
It should be noted that the apparatus for determining prompt information in video monitoring provided in the above embodiment is illustrated only with the division of functional modules described above; in practical applications, these functions may be allocated to different functional modules as needed to complete all or part of the functions described above. In addition, the apparatus for determining prompt information in video monitoring and the method for determining prompt information in video monitoring provided by the above embodiments belong to the same concept; their specific implementation processes are detailed in the method embodiments and are not repeated here.
In the above embodiments, the implementation may be realized wholly or partly by software, hardware, firmware, or any combination thereof. When implemented in software, it may be realized wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in accordance with the embodiments of the invention are produced, in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored on a computer-readable storage medium or transmitted from one computer-readable storage medium to another, for example, from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) means. The computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or data center that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, hard disk, or magnetic tape), an optical medium (e.g., a digital versatile disc (DVD)), or a semiconductor medium (e.g., a solid state disk (SSD)), among others.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above-mentioned embodiments are provided not to limit the present application, and any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (16)

1. A method for determining prompt information in video monitoring is characterized by comprising the following steps:
in the process of searching the direction of the target object, acquiring a video picture;
determining azimuth deviation between a target azimuth and each predefined azimuth in a plurality of predefined azimuths based on the video picture and a plurality of pre-stored predefined azimuth images, wherein the target azimuth is the collection azimuth of the video picture, and the predefined azimuth images correspond to the predefined azimuths in a one-to-one manner;
selecting a predefined orientation from the plurality of predefined orientations having an orientation deviation from the target orientation that is less than a preset orientation deviation;
and determining the prompt message set for the selected predefined orientation as the prompt message to be superposed in the video picture.
2. The method of claim 1, wherein determining an orientation deviation between a target orientation and each of a plurality of predefined orientations based on the video frame and a plurality of pre-stored predefined orientation images comprises:
for each predefined azimuth image in the plurality of predefined azimuth images, acquiring a plurality of pairs of local feature points matched in the predefined azimuth image and the video picture, wherein each pair of local feature points is an edge point or a corner point matched in the video picture and the predefined azimuth image;
and when the number of the plurality of pairs of local feature points is greater than a preset number, determining the azimuth deviation between the predefined azimuth corresponding to the predefined azimuth image and the target azimuth according to the plurality of pairs of local feature points.
3. The method of claim 2, wherein the predefined azimuthal image is an image that includes first order edge features;
the acquiring the matched pairs of local feature points in the predefined azimuth image and the video picture comprises the following steps:
determining a background image of the video picture by a Gaussian mixture background modeling method;
processing the background image of the video picture to obtain a processed image, wherein the processed image is an image comprising first-order edge features;
and matching the feature points in the processed image with the feature points in the predefined azimuth image through a Scale Invariant Feature Transform (SIFT) algorithm to obtain a plurality of pairs of matched local feature points.
4. The method of claim 2 or 3, wherein determining an orientation deviation between a predefined orientation corresponding to the predefined orientation image and the target orientation based on the plurality of pairs of local feature points comprises:
selecting at least three pairs of local feature points from the plurality of pairs of local feature points;
constructing at least one pair of triangles through any three pairs of local feature points in the at least three pairs of local feature points, wherein each pair of triangles comprises a first triangle and a second triangle, the first triangle is a triangle constructed by the local feature points in the video picture, the second triangle is a triangle constructed by the local feature points in the predefined azimuth image, each side length of the first triangle and each side length of the second triangle are greater than a preset side length, and each inner angle of the first triangle and each inner angle of the second triangle are greater than a preset angle;
for each pair of triangles in the at least one pair of triangles, determining a sum of squares of differences between interior angles corresponding to matched local feature points in a first triangle and a second triangle comprised by the pair of triangles;
when the sum of squares is larger than a preset value, performing spatial transformation on the first triangle so that the sum of squares of differences between interior angles corresponding to matched local feature points in the spatially transformed triangle and the second triangle is smaller than the preset value, and the size of the spatially transformed triangle is the same as that of the second triangle;
determining the ratio of the view field deflection angle of the triangle after the space transformation to the view field deflection angle of the second triangle, wherein the view field deflection angle is an included angle formed by connecting lines between the gravity center of the triangle and the center of the image where the triangle is located and the focus of the rotatable camera respectively;
and determining the azimuth deviation between the predefined azimuth corresponding to the predefined azimuth image and the target azimuth according to the determined at least one ratio.
5. The method of claim 4, wherein the spatially transforming the first triangle comprises:
performing affine transformation on the first triangle so that the sum of squares of differences between interior angles corresponding to the matched local feature points in the affine-transformed triangle and the second triangle is smaller than the preset value;
determining an average value of the ratio of the corresponding side lengths in the triangle after affine transformation and the second triangle;
when the average value is larger than 1, reducing the triangle after affine transformation by taking the average value as a reduction ratio to obtain the triangle after space transformation;
and when the average value is not more than 1, amplifying the triangle after affine transformation by taking the reciprocal of the average value as an amplification scale to obtain the triangle after space transformation.
6. The method of claim 5, wherein determining a ratio between the field of view bias angle of the spatially transformed triangle and the field of view bias angle of the second triangle comprises:
determining the distance between the center of the triangle after the spatial transformation and the center of the video picture where the triangle after the spatial transformation is located to obtain a first distance;
determining the distance between the center of the second triangle and the center of the image where the second triangle is located to obtain a second distance;
determining a ratio between the first distance and the second distance as a ratio between a field of view declination of the spatially transformed triangle and a field of view declination of the second triangle.
7. The method of any one of claims 5-6, wherein determining an orientation deviation between a predefined orientation corresponding to the predefined orientation image and the target orientation based on the determined at least one ratio comprises:
performing linear clustering calculation on the at least one ratio to obtain a discrete value of each ratio;
selecting a ratio value of which the discrete value is smaller than a preset discrete value from the at least one ratio value;
determining an absolute value of a difference between the average of the selected ratios and 1 as an orientation deviation between a predefined orientation corresponding to the predefined orientation image and the target orientation.
8. The method of any of claims 1-3 or 5-6, wherein prior to determining an orientation deviation between the target orientation and each of a plurality of predefined orientations based on the video frame and a plurality of pre-stored predefined orientation images, further comprising:
when a predefined orientation setting instruction is detected, setting the current orientation as a predefined orientation, and setting a video picture of the current orientation as a video picture of the predefined orientation;
determining a background image of the video picture of the predefined orientation by a Gaussian mixture background modeling method;
processing the background image of the video picture in the predefined direction to obtain a processed image, wherein the processed image is an image comprising first-order edge features;
and determining the processed image as a predefined azimuth image corresponding to the predefined azimuth, and storing the predefined azimuth image.
9. An apparatus for determining a reminder in video surveillance, the apparatus comprising:
the acquisition module is used for acquiring a video picture in the process of searching the direction of the target object;
the first determination module is used for determining the azimuth deviation between a target azimuth and each predefined azimuth in a plurality of predefined azimuths based on the video picture and a plurality of pre-stored predefined azimuth images, wherein the target azimuth is the collection azimuth of the video picture, and the predefined azimuth images correspond to the predefined azimuths in a one-to-one mode;
a selection module for selecting a predefined orientation from the plurality of predefined orientations having an orientation deviation from the target orientation that is less than a preset orientation deviation;
and the second determination module is used for determining the prompt information set for the selected predefined orientation as the prompt information needing to be superposed in the video picture.
10. The apparatus of claim 9, wherein the first determining module comprises:
an obtaining unit, configured to obtain, for each of the plurality of predefined orientation images, a plurality of pairs of local feature points that are matched in the predefined orientation image and the video frame, where each pair of local feature points is an edge point or a corner point that is matched in the video frame and the predefined orientation image;
a determining unit, configured to determine, according to the plurality of pairs of local feature points, an orientation deviation between a predefined orientation corresponding to the predefined orientation image and the target orientation when the number of the plurality of pairs of local feature points is greater than a preset number.
11. The apparatus of claim 10, wherein the predefined azimuthal image is an image that includes first order edge features;
the obtaining unit is specifically configured to:
determining a background image of the video picture by a Gaussian mixture background modeling method;
processing the background image of the video picture to obtain a processed image, wherein the processed image is an image comprising first-order edge features;
and matching the feature points in the processed image with the feature points in the predefined azimuth image through a Scale Invariant Feature Transform (SIFT) algorithm to obtain a plurality of pairs of matched local feature points.
12. The apparatus of claim 10 or 11, wherein the determining unit comprises:
a selecting subunit, configured to select at least three pairs of local feature points from the plurality of pairs of local feature points;
a constructing subunit, configured to construct at least one pair of triangles through any three pairs of local feature points in the at least three pairs of local feature points, where each pair of triangles includes a first triangle and a second triangle, the first triangle is a triangle constructed by local feature points in the video picture, the second triangle is a triangle constructed by local feature points in the predefined azimuth image, each side length of the first triangle and each side length of the second triangle is greater than a preset side length, and each internal angle of the first triangle and each internal angle of the second triangle is greater than a preset angle;
a first determining subunit, configured to determine, for each pair of triangles in the at least one pair of triangles, a sum of squares of differences between internal angles corresponding to matched local feature points in a first triangle and a second triangle included in the pair of triangles;
the spatial transformation subunit is used for performing spatial transformation on the first triangle when the sum of squares is greater than a preset value, so that the sum of squares of differences between internal angles corresponding to matched local feature points in the spatially transformed triangle and the second triangle is smaller than the preset value, and the spatially transformed triangle and the second triangle have the same size;
the second determining subunit is used for determining the ratio of the view field deflection angle of the triangle after the space transformation to the view field deflection angle of the second triangle, wherein the view field deflection angle is an included angle formed by a connecting line between the gravity center of the triangle and the center of the image where the triangle is located and the focus of the rotatable camera respectively;
and the third determining subunit is used for determining the azimuth deviation between the predefined azimuth corresponding to the predefined azimuth image and the target azimuth according to the determined at least one ratio.
13. The apparatus of claim 12, wherein the spatial transform subunit is specifically configured to:
performing affine transformation on the first triangle so that the sum of squares of differences between interior angles corresponding to the matched local feature points in the affine-transformed triangle and the second triangle is smaller than the preset value;
determining an average value of the ratio of the corresponding side lengths in the triangle after affine transformation and the second triangle;
when the average value is larger than 1, reducing the triangle after affine transformation by taking the average value as a reduction ratio to obtain the triangle after space transformation;
and when the average value is not more than 1, amplifying the triangle after affine transformation by taking the reciprocal of the average value as an amplification scale to obtain the triangle after space transformation.
14. The apparatus of claim 13, wherein the second determining subunit is specifically configured to:
determining the distance between the center of the triangle after the spatial transformation and the center of the video picture where the triangle after the spatial transformation is located to obtain a first distance;
determining the distance between the center of the second triangle and the center of the image where the second triangle is located to obtain a second distance;
determining a ratio between the first distance and the second distance as a ratio between a field of view declination of the spatially transformed triangle and a field of view declination of the second triangle.
15. The apparatus according to any of claims 13 to 14, wherein the third determining subunit is specifically configured to:
performing linear clustering calculation on the at least one ratio to obtain a discrete value of each ratio;
selecting a ratio value of which the discrete value is smaller than a preset discrete value from the at least one ratio value;
determining an absolute value of a difference between the average of the selected ratios and 1 as an orientation deviation between a predefined orientation corresponding to the predefined orientation image and the target orientation.
16. The apparatus of any of claims 9-11 or 13-14, wherein the apparatus further comprises:
the setting module is used for setting the current position as the predefined position and setting the video picture of the current position as the video picture of the predefined position when a predefined position setting instruction is detected;
a third determining module, configured to determine a background image of the video frame at the predefined orientation by a gaussian mixture background modeling method;
the processing module is used for processing the background image of the video picture in the predefined direction to obtain a processed image, and the processed image is an image comprising first-order edge features;
and the storage module is used for determining the processed image as a predefined azimuth image corresponding to the predefined azimuth and storing the predefined azimuth image.
CN201710331968.6A 2017-05-11 2017-05-11 Method and device for determining prompt information in video monitoring Active CN108881811B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201710331968.6A CN108881811B (en) 2017-05-11 2017-05-11 Method and device for determining prompt information in video monitoring
CN202010480817.9A CN111800601A (en) 2017-05-11 2017-05-11 Method and device for determining prompt information in video monitoring

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710331968.6A CN108881811B (en) 2017-05-11 2017-05-11 Method and device for determining prompt information in video monitoring

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202010480817.9A Division CN111800601A (en) 2017-05-11 2017-05-11 Method and device for determining prompt information in video monitoring

Publications (2)

Publication Number Publication Date
CN108881811A CN108881811A (en) 2018-11-23
CN108881811B true CN108881811B (en) 2020-06-16

Family

ID=64319873

Family Applications (2)

Application Number Title Priority Date Filing Date
CN202010480817.9A Pending CN111800601A (en) 2017-05-11 2017-05-11 Method and device for determining prompt information in video monitoring
CN201710331968.6A Active CN108881811B (en) 2017-05-11 2017-05-11 Method and device for determining prompt information in video monitoring

Family Applications Before (1)

Application Number Title Priority Date Filing Date
CN202010480817.9A Pending CN111800601A (en) 2017-05-11 2017-05-11 Method and device for determining prompt information in video monitoring

Country Status (1)

Country Link
CN (2) CN111800601A (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112118390A (en) * 2020-09-14 2020-12-22 Oppo广东移动通信有限公司 Camera control method and device, electronic equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1750644A (en) * 2004-09-13 2006-03-22 珠海亮点电子有限公司 Method for quick trapping image of digital code monitoring system
CN102006456A (en) * 2010-10-28 2011-04-06 北京中星微电子有限公司 Cloud platform camera, cloud platform monitoring system and method for carrying out direction orientation
CN103581614A (en) * 2012-08-01 2014-02-12 通号通信信息集团有限公司 Method and system for tracking targets in video based on PTZ
CN103607540A (en) * 2013-12-02 2014-02-26 南京南自信息技术有限公司 Method for improving presetting bit accuracy of pan-tilt camera

Also Published As

Publication number Publication date
CN108881811A (en) 2018-11-23
CN111800601A (en) 2020-10-20

Similar Documents

Publication Publication Date Title
CN109348119B (en) Panoramic monitoring system
JP5740884B2 (en) AR navigation for repeated shooting and system, method and program for difference extraction
JP7142575B2 (en) Method and apparatus for synthesizing images
CN110555876B (en) Method and apparatus for determining position
JP5460793B2 (en) Display device, display method, television receiver, and display control device
CN112509058B (en) External parameter calculating method, device, electronic equipment and storage medium
CN109120901B (en) Method for switching pictures among cameras
EP4220547A1 (en) Method and apparatus for determining heat data of global region, and storage medium
CN111161138B (en) Target detection method, device, equipment and medium for two-dimensional panoramic image
CN110662015A (en) Method and apparatus for displaying image
WO2020040061A1 (en) Image processing device, image processing method, and image processing program
CN108881811B (en) Method and device for determining prompt information in video monitoring
EP3177005B1 (en) Display control system, display control device, display control method, and program
CN114723923B (en) Transmission solution simulation display system and method
CN109145681B (en) Method and device for judging target rotation direction
CN115222602A (en) Image splicing method, device, equipment and storage medium
CN112598732B (en) Target equipment positioning method, map construction method and device, medium and equipment
JP6306822B2 (en) Image processing apparatus, image processing method, and image processing program
Rosney et al. Automating sports broadcasting using ultra-high definition cameras, neural networks, and classical denoising
CN111950328A (en) Method and device for determining object class in picture
JP6242009B2 (en) Video transfer system, terminal, program, and method for displaying a shooting area frame superimposed on a wide area image
CN117095066B (en) Method and device for marking PTZ camera screen
JP2017010555A (en) Image processing method and image processing apparatus
CN112634460B (en) Outdoor panorama generation method and device based on Haar-like features
WO2024070654A1 (en) Information processing device, information processing method, and program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant