CN109344285B - Monitoring-oriented video map construction and mining method and equipment - Google Patents


Info

Publication number
CN109344285B
Authority
CN
China
Prior art keywords
video
monitoring
monitored
monitored object
acquiring
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811058356.5A
Other languages
Chinese (zh)
Other versions
CN109344285A (en)
Inventor
邹复好
李开
周檬
刘鹏坤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Meitong Technology Co ltd
Original Assignee
Wuhan Meitong Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan Meitong Technology Co ltd filed Critical Wuhan Meitong Technology Co ltd
Priority to CN201811058356.5A priority Critical patent/CN109344285B/en
Publication of CN109344285A publication Critical patent/CN109344285A/en
Application granted granted Critical
Publication of CN109344285B publication Critical patent/CN109344285B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)

Abstract

The embodiment of the invention provides a monitoring-oriented video map construction and mining method and equipment. A graph database and a key-value database serve as the persistent storage of the knowledge map, and a deep convolutional neural network performs face recognition automatically, greatly reducing the manual review workload. A shared convolutional neural network and a deep deconvolution network, combined with automatic attribute labeling of pedestrian objects, convert unstructured video data into structured graph data for storage. The method meets security needs such as person retrieval, trajectory mining, and group-center mining based on reviewing surveillance videos; it can efficiently query the attribute information of nodes and efficiently, visually display the associations between them. The main-information and detailed-information database storage modes each play to their strengths, so data storage and retrieval are efficient.

Description

Monitoring-oriented video map construction and mining method and equipment
Technical Field
The embodiment of the invention relates to the technical field of computer vision, and in particular to a monitoring-oriented video map construction and mining method and device.
Background
With the development of technology, intelligent security has entered many aspects of daily life: residential districts, roads, motor vehicles, banks, and schools are all equipped with surveillance cameras. As intelligent security becomes digitized and large-scale, the volume of surveillance video keeps growing, and the data are large, unstructured, and hard to extract value from. Developing a framework that automatically structures surveillance videos is therefore of great practical significance for big-data analysis of surveillance video.
The knowledge graph is a key branch of artificial intelligence, widely applied in intelligent search, intelligent question answering, personalized recommendation, content distribution, and other fields. There are currently two ways to construct a knowledge graph: top-down and bottom-up. Either way, a knowledge base containing a large amount of knowledge must be built as the structural basis of the graph. Most existing knowledge bases are built on text, a few on pictures, and knowledge bases built on surveillance video are almost blank. A video map stores key information from the original video, such as people, vehicles, and objects, in a more compact form, converting operations on the raw video into operations on the video map and providing a stable, feasible basis for subsequent data analysis and mining. It is therefore valuable to evolve the knowledge graph into a video map oriented toward surveillance video.
Existing relational databases struggle with this mass of unstructured surveillance video data, and relying on relational tables to maintain the information and support subsequent data analysis gives unsatisfactory results.
Disclosure of Invention
Embodiments of the present invention provide a method and apparatus for building and mining a surveillance-oriented video map, which overcome the above problems or at least partially solve the above problems.
According to a first aspect of the embodiments of the present invention, there is provided a monitoring-oriented video map building and mining method, including:
acquiring the same monitored object in different monitored videos based on re-identification, and acquiring attribute information of the monitored object;
storing the object number of the monitored object in a graph database, storing the object number of the monitored object as the row key of a key-value database, storing the attribute information of the monitored object and the corresponding monitored video number under sub-keys of the key-value database, and establishing a monitored-object video map;
and acquiring the track information of the monitored object based on the video map of the monitored object.
Preferably, the monitored object is any object in the monitored video, including a pedestrian and a vehicle.
Preferably, in step S1, the acquiring the same monitored object in different monitored videos based on re-identification specifically includes:
preprocessing all monitoring videos based on a computer cluster, dividing a monitoring video stream into a picture stream according to frames, and acquiring a unique feature vector of a monitoring object in the picture stream containing complete monitoring object information;
and establishing a segment hash index based on the unique characteristic vector, re-identifying the monitored object based on the segment hash index, acquiring the monitored video information containing the monitored object, and obtaining the entering time and the leaving time of the monitored object in the monitored video.
Preferably, the establishing of the segment hash index based on the unique feature vector and the re-identification of the monitored object based on the segment hash index specifically include:
converting the unique feature vector into a hash code, and equally dividing the hash code into m segments, wherein the length of each segment is b/m bits, where b is the length of the hash code;
and inquiring candidate hash codes with the Hamming distance r from the hash codes in the m segments, and judging whether the objects corresponding to the candidate hash codes are the same as the monitored objects or not based on a trained combined Bayesian classifier.
Preferably, in step S1, the obtaining of the attribute information of the monitored object specifically includes:
and performing semantic segmentation and attribute labeling on the monitored object based on the trained neural network model to acquire attribute information of the monitored object.
Preferably, before storing the unique number information of the monitored object in the graph database, the method further includes:
and searching based on the unique feature vector and the attribute information to find whether a record of the same object exists; if so, taking the object number of that record as the number of the monitored object, and if not, allocating a new object number.
Preferably, the step S3 specifically includes:
based on the object number of the monitored object, acquiring the number of the monitored video from the key value type database, and acquiring the entering time and the leaving time of the monitored object in each monitored video;
and taking the mean value of the entering time and the leaving time as the occurrence time of the monitored object in the corresponding monitored video, acquiring the occurrence time of the monitored object in each monitored video, sequencing the occurrence time, and acquiring the track information of the monitored object based on the address of the camera corresponding to each monitored video.
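The trajectory-mining step above can be sketched as follows. The data layout is a hypothetical simplification: `kv_store` maps an object number to per-video (enter time, leave time) pairs, and `camera_addresses` maps a video number to its camera's location.

```python
def mine_trajectory(object_id, kv_store, camera_addresses):
    """Sketch of step S3: for each video the object appears in, take the mean
    of its enter/leave times as the occurrence time, sort by that time, and
    return the camera addresses in order (data layout is illustrative)."""
    appearances = []
    for video_no, (t_in, t_out) in kv_store[object_id].items():
        occurrence = (t_in + t_out) / 2           # mean of enter/leave time
        appearances.append((occurrence, camera_addresses[video_no]))
    appearances.sort()                            # order by occurrence time
    return [addr for _, addr in appearances]      # ordered camera addresses
```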
Preferably, the method further comprises the step of S4, performing suspicious group center mining based on the monitoring object video map:
s401, all monitoring object nodes p connected with each monitoring video node in the graph database0、p1…pnAnd the nodes are sorted from small to large according to the average value of the entering time and the leaving time, and the sorted nodes are
Figure BDA0001796398260000031
S402, the node sequence obtained in step S401 is shifted left by one position and paired with the original sequence to form a node-pair sequence: (p(i0), p(i1)), (p(i1), p(i2)), ..., (p(i(n-1)), p(in));
inputting the node-pair sequence into memory for computation, and for each pair of nodes whose means of entering and leaving times differ by less than a threshold K1, recording the pair of node numbers (p(a), p(b)) as a key-value pair with key (p(a), p(b)) and value 1;
S403, repeating the process of step S402 for each monitored video and merging all key-value pairs; if keys are the same, their values are added, and the keys of the key-value pairs whose value is greater than a threshold K2 are output;
S404, establishing an adjacency matrix for the related monitored-object group obtained in step S403 and counting the degree of each node; the node with the largest degree is the central node of the monitored-object group.
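Steps S401-S404 can be sketched as follows, assuming each video's visits are already available as (node id, mean of enter/leave time) pairs; the function and variable names, and the use of in-memory dictionaries in place of the databases, are illustrative rather than the patent's implementation.

```python
from collections import Counter

def mine_group_center(video_visits, k1, k2):
    """Sketch of S401-S404: video_visits maps each video id to a list of
    (node_id, mean_time) pairs; returns the group's central node."""
    pair_counts = Counter()
    for visits in video_visits.values():
        # S401: sort nodes by the mean of their enter/leave times
        ordered = sorted(visits, key=lambda v: v[1])
        # S402: pair each node with its successor (sequence shifted left by one)
        for (a, ta), (b, tb) in zip(ordered, ordered[1:]):
            if abs(tb - ta) < k1:          # co-occurrence within threshold K1
                pair_counts[frozenset((a, b))] += 1
    # S403: keep node pairs seen together more than K2 times
    related = [p for p, c in pair_counts.items() if c > k2]
    # S404: degree of each node in the relationship graph; max degree = center
    degree = Counter()
    for pair in related:
        for node in pair:
            degree[node] += 1
    return max(degree, key=degree.get) if degree else None
```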
Preferably, the method further comprises the step of S5, displaying the video map of the monitoring object, the track information of the monitoring object and the suspicious group center mining information on a webpage.
According to a second aspect of the embodiments of the present invention, there is provided a surveillance-oriented video atlas construction and mining apparatus, including a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the surveillance-oriented video atlas construction and mining method according to the first aspect of the embodiments of the present invention when executing the program.
The embodiment of the invention provides a monitoring-oriented video atlas construction and mining method and equipment. A graph database and a key-value database are constructed as the persistent storage of the knowledge map; a deep convolutional neural network recognizes human faces automatically, greatly reducing the manual review workload; and a shared convolutional neural network and a deep deconvolution network, combined with automatic attribute labeling of pedestrian objects, convert unstructured video data into structured graph data for storage. The method satisfies security requirements such as person retrieval, trajectory mining, and group-center mining based on reviewing surveillance videos; node attribute information can be queried efficiently, and the associations between nodes can be displayed efficiently and visually. The main-information and detailed-information database storage modes each play to their strengths, so data storage and retrieval are efficient.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a schematic diagram of a monitoring-oriented video graph construction and mining method according to an embodiment of the present invention;
fig. 2 is an effect display diagram of a video map of a monitored object according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating pedestrian re-identification based on segment hash indexing according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating attribute labeling learning according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of video map construction according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of track mining according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
With the rapid development of monitoring technology, video surveillance has become an important component of security systems, and demand for it keeps expanding across society: it is moving from traditional public security and banking into traffic, venues, communities, campuses, housing, and other fields, and is gradually integrating closely with the telecommunications and IT industries, which brings new technical means to video surveillance and provides a template for integrated video surveillance solutions. For example, video monitoring of streets, shopping malls, and other public places allows emergencies to be discovered and handled in real time, and stored surveillance videos can be reviewed retrospectively to find the corresponding details.
The Knowledge map (also called the scientific Knowledge map) originates in library and information science as a visualization of a knowledge domain: a family of graphs that display the development process and structure of knowledge, describe knowledge resources and their carriers with visualization technology, and mine, analyze, construct, draw, and display knowledge and the relations among its elements. Specifically, the knowledge graph combines theories and methods from mathematics, graphics, information visualization, and information science with citation analysis, co-occurrence analysis, and similar methods, using visual graphs to vividly display the core structure, development history, frontier fields, and overall framework of a discipline. It presents complex knowledge fields through data mining, information processing, knowledge measurement, and graph drawing, reveals their dynamic development, and provides practical, valuable references for research; it is therefore valuable to develop the knowledge map into a video map oriented toward surveillance video.
The essence of the video map is data modeling and data analysis on massive unstructured surveillance video data. The videos contain various entities such as pedestrians, vehicles, and articles; the number of entities is huge (taking pedestrians as an example, in a busy place with high pedestrian traffic, a single surveillance camera can capture tens of thousands of pedestrians in one day); and the relationships between entities are varied, including relationships between pedestrians and surveillance cameras and among pedestrians themselves. Existing relational databases struggle with this mass of unstructured surveillance video data, and relying on relational tables to maintain the information and support subsequent data analysis gives unsatisfactory results.
Aiming at the defects in the prior art, embodiments of the invention use a graph database and a key-value database as the persistent storage of the knowledge map in a monitoring-oriented video map construction and mining method and equipment. By combining the storage modes of the two databases, the video map can be built quickly and automatically, and intelligent data mining and data analysis can be performed on top of it. The invention is described below through several embodiments.
The embodiment provides a video map construction and mining method facing monitoring, as shown in fig. 1, including:
s1, acquiring the same monitored object in different monitored videos based on re-identification, and acquiring attribute information of the monitored object;
S2, storing the object number of the monitored object in a graph database, storing the object number of the monitored object as the row key of a key-value database, storing the attribute information of the monitored object and the corresponding monitored video number under sub-keys of the key-value database, and establishing a monitored-object video map;
and S3, acquiring the track information of the monitored object based on the monitored object video map.
In this embodiment, a definition of a video map for a surveillance video is proposed, where the video map includes three basic elements: the video graph comprises nodes, edges and attributes, wherein the nodes of the video graph represent entity objects, the entity objects can be pedestrians, vehicles or other moving objects, the edges represent the relationship between the two entity objects, and the nodes and the edges contain the respective attributes. The method provided by the embodiment of the invention can efficiently inquire the attribute information of the nodes by using the storage mode of the graph database, can efficiently and visually display the association relation between the nodes, and can exert respective advantages by using the storage modes of the main information database and the detailed information database, so that the data storage and retrieval efficiency is high.
Taking a pedestrian object as an example, the technical solution adopted is as follows: each person appearing under a camera is taken as a node of the graph; the face features extracted by deep learning, the person's attributes, and the person's appearance and leaving times are stored as the attributes of the person node in the main-information and detailed-information databases; the large number of person nodes and the existing camera nodes form the nodes of the graph, and each appearance of a person corresponds to an edge between a person node and a camera node. Graph-based mining algorithms can then be converted into graph computations on the graph. The effect of the monitored-object video map is shown in fig. 2.
On the basis of the above embodiment, acquiring the same monitored object in different monitored videos based on re-identification specifically includes:
preprocessing all monitoring videos based on a computer cluster, dividing a monitoring video stream into a picture stream according to frames, and acquiring a unique feature vector of a monitoring object in the picture stream containing complete monitoring object information;
and establishing a segment hash index based on the unique characteristic vector, re-identifying the monitored object based on the segment hash index, acquiring the monitored video information containing the monitored object, and obtaining the entering time and the leaving time of the monitored object in the monitored video.
In this embodiment, the object to be recognized is, for example, a pedestrian or a vehicle; the unique feature vector of a pedestrian is a face feature (iris features, facial features, and the like), and the unique feature vector of a vehicle may be its license plate. Specifically, taking a pedestrian as an example (the pedestrian in the following embodiments may also be replaced by a movable object such as a vehicle), the re-identification includes:
s11, inputting each path of monitoring video into a cluster consisting of a plurality of computers, performing video preprocessing, detecting the faces of pedestrians in the monitoring video and extracting face features;
s12, establishing a segmented index by using the face features obtained in the step S11, and carrying out pedestrian re-identification based on the segmented hash index;
specifically, in this embodiment, MTCNN (Multi-task convolutional neural network) is adopted as a face detection method, and key frames in a video are input to obtain face positions in video frames, and the face is aligned and corrected. The adopted face feature extraction network inputs aligned and corrected faces, the result of the last full connection layer of the network model is taken as the real-value feature of a face picture, if the face picture is a vehicle, the face picture can be corrected by the sum of the unique identification features corresponding to the vehicle, such as a license plate, and the face processing is replaced by the license plate processing.
Specifically, in this embodiment, the step S11 specifically includes:
S111, distributing all monitored video streams to be processed across a computer cluster composed of a plurality of computers, with the video streams evenly distributed to each computer by number of surveillance cameras; dividing the video streams into picture streams by frame, taking every 30th frame as a key frame and the rest as non-key frames;
S112, inputting the key frames obtained in step S111 into the MTCNN network and judging whether a final face window is produced in the output; if not, the recognition ends; if so, extracting the final face window image, correcting and aligning the face to the center, and storing the corrected face window image at a specified resolution;
and S113, extracting the face picture and the real-value feature vector of the mirror image of the picture by using a face feature extraction network.
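The key-frame sampling of step S111 can be illustrated with a minimal sketch; the frames here are placeholders for decoded images, and the detection and feature-extraction stages of S112-S113 are omitted since they depend on the trained MTCNN and feature networks.

```python
def split_key_frames(frames, interval=30):
    """Sketch of S111: every 30th frame of a decoded video stream is treated
    as a key frame; the rest are non-key frames. `frames` may be any sequence
    of decoded images (placeholders suffice for illustration)."""
    return [f for i, f in enumerate(frames) if i % interval == 0]
```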
On the basis of the foregoing embodiments, establishing a segment hash index based on the unique feature vector, and performing re-identification on a monitored object based on the segment hash index specifically includes:
converting the unique feature vector into a hash code, and equally dividing the hash code into m segments, wherein the length of each segment is b/m bits, and b is the length of the hash code;
and inquiring candidate hash codes with the Hamming distance r from the hash codes in the m segments, and judging whether the objects corresponding to the candidate hash codes are the same as the monitored objects or not based on a trained combined Bayesian classifier.
Because the embodiment of the present invention includes multiple cameras, and the number of pedestrians in each camera's surveillance video grows enormously over time, a pedestrian re-identification technique based on segment indexes is used. By building the correspondence between face features and pedestrians offline, segment by segment, the segment index greatly speeds up checking whether the database contains a target face. Step S12 mainly includes the following steps:
s121, converting the unique characteristic vector obtained in the step into a hash code;
S122, assuming the hash code y obtained in step S121 is b bits long, dividing it into m segments, each of length ⌊b/m⌋ or ⌈b/m⌉ bits;
s123, a rough matching stage is carried out, and all hash codes with the Hamming distance r from the hash code y in the step S122 are inquired in the m sections;
s124, combining the m sections of query results to form a candidate set;
S125, performing the exact-matching stage on the candidate set: assuming x1 and x2 respectively represent the real-valued features of the two pedestrian objects to be verified, a joint Bayesian classifier judges whether the two images show the same pedestrian.
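The rough-matching stage (S122-S124) can be sketched as a multi-index lookup over the m segments. This toy version uses bit strings as hash codes and performs only exact per-segment lookups; by the pigeonhole principle, any code within Hamming distance m − 1 of the query must agree exactly on at least one segment, so the union of the m per-segment lookups contains every such code. The structures and names are illustrative.

```python
from collections import defaultdict

def build_segment_index(codes, m):
    """Offline step: cut each b-bit code into m segments and build one
    inverted index per segment, mapping segment value -> ids holding it."""
    b = len(next(iter(codes.values())))
    seg_len = b // m
    index = [defaultdict(set) for _ in range(m)]
    for pid, code in codes.items():
        for s in range(m):
            index[s][code[s * seg_len:(s + 1) * seg_len]].add(pid)
    return index, seg_len

def query_candidates(index, seg_len, code):
    """Rough matching (S123-S124): union of exact matches on each of the m
    segments; the result is the candidate set for the exact-matching stage."""
    candidates = set()
    for s, seg_index in enumerate(index):
        candidates |= seg_index.get(code[s * seg_len:(s + 1) * seg_len], set())
    return candidates
```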
Specifically, in the present embodiment, when two objects belong to the same pedestrian, hypothesis HS holds; otherwise, when the two objects do not belong to the same pedestrian, hypothesis HI holds. Given two features x1 and x2, verification in fact compares which hypothesis is more likely, which can be formalized as the log-likelihood ratio of observing x1 and x2:

r(x1, x2) = log [ P(x1, x2 | HS) / P(x1, x2 | HI) ]    (1)

r(x1, x2) = x1' A x1 + x2' A x2 − 2 x1' G x2    (2)

In the above formulas (1) and (2), when r(x1, x2) > t, x1 and x2 are determined to correspond to the same pedestrian object, otherwise not; r(x1, x2) is the log-likelihood ratio of observing x1 and x2; A and G denote the model parameters obtained by parameter learning; Sμ and Sε denote covariance matrices.
In this embodiment, the segment hash index accelerates retrieval: it can be regarded as a first-layer screening, the joint Bayesian classifier's judgment is the second-layer screening, and the candidate set produced by the first layer is the input data set of the second. When training the joint Bayesian classifier, the solution is iterated continuously over the sample data, and the model parameters A and G are obtained through parameter learning and saved. A suitable threshold t is selected according to the experimental results; a schematic diagram of the algorithm is shown in fig. 3.
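The second-layer score can be sketched directly from the standard joint Bayesian closed form r(x1, x2) = x1ᵀAx1 + x2ᵀAx2 − 2x1ᵀGx2; since the original formula images are not reproduced here, this form is stated as an assumption, and the A, G values below are illustrative placeholders, not trained parameters. With A = G = I the ratio degenerates to the squared Euclidean distance, which the usage below exploits.

```python
import numpy as np

def joint_bayesian_ratio(x1, x2, A, G):
    """Log-likelihood ratio of the joint Bayesian verifier; A and G are the
    learned model parameters (placeholders here)."""
    return float(x1 @ A @ x1 + x2 @ A @ x2 - 2 * x1 @ G @ x2)

def same_person(x1, x2, A, G, t):
    """Second-layer screening: accept the pair when the ratio exceeds t."""
    return joint_bayesian_ratio(x1, x2, A, G) > t
```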
On the basis of the foregoing embodiments, acquiring the attribute information of the monitored object specifically includes:
s13, performing semantic segmentation and attribute labeling on the monitored object based on the trained neural network model, and acquiring attribute information of the monitored object.
In this embodiment, the human body structure is semantically segmented by combining an RPN (Region Proposal Network) and a deep deconvolution network, which specifically includes the following steps:
s131, sending the video key frames obtained in the steps into a public deep convolution neural network, and extracting universal local features and unique abstract features of pedestrians in a forward direction to obtain a feature activation matrix of a scene;
S132, regressing the pedestrian region with a deep RPN network, expressing the region of interest of each dynamic object as a quadruple [x, y, w, h], and spatially pooling the feature activation matrix according to the region of interest to obtain human-body features of consistent size that are independent of one another;
s133, reconstructing the shape distribution of the human body by using a depth deconvolution network which is in mirror symmetry with the depth convolution network;
s134, through a large amount of sample training, in the restoration process, pixel points of the same part of a human body or a vehicle are gradually gathered, pixel points of different parts are gradually far away, and finally contour segmentation at a pixel level is presented;
and S135, in the attribute classification stage, classifying each segmented part with a fully-connected + Softmax classifier to complete the attribute labeling of the pedestrian; repeating the labeling process yields the pedestrian's body characteristics, such as hair length, hair color, coat color, and trousers color. For a vehicle, the attributes may be the vehicle type, brand, color, tire type, and so on.
The RPN is a deep convolutional regressor used to generate proposals. All sliding windows share the RPN; using the anchor mechanism, proposals are generated by marking positive samples in advance and computing the loss value. The deep deconvolution network is stacked in mirror image with the deep convolutional network and can reconstruct the morphological distribution of the human body, thereby semantically segmenting it. The segmentation result serves as a mask when training the Softmax classifier corresponding to each part in attribute classification, yielding the classification result for each part. The whole process of semantic segmentation and attribute labeling is shown in FIG. 4.
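The fully-connected + Softmax attribute head of step S135 can be sketched per body part as follows; the weight matrix `W`, bias `b`, and label set are illustrative placeholders standing in for one trained per-part classifier, not the patent's parameters.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a logit vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

def classify_attribute(features, W, b, labels):
    """One fully-connected layer followed by Softmax (S135): returns the
    attribute label with the highest probability for one segmented part."""
    probs = softmax(W @ features + b)
    return labels[int(np.argmax(probs))]
```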
On the basis of the above embodiments, before storing the unique number information of the monitored object in the graph database, the method further includes:
and searching based on the unique feature vector and the attribute information to determine whether a record of the same object already exists; if such a record is found, taking the object number of that record as the number of the monitored object, and if not, allocating a new object number.
In this embodiment, taking a pedestrian as an example, step S2 specifically includes:
S201, combining the obtained face features (unique feature vectors), hash features and attribute features (information) into structured data;
S202, performing coarse segmented hash indexing followed by exact matching to search whether a record of the same person exists in the face library;
S203, allocating a person number as the unique identifier of the person in the database according to the index result: if the person already exists, the existing number is reused; otherwise a new number is allocated;
S204, storing the person number obtained in step S203 in the graph database as the number of the node;
and S205, using the person number obtained in step S203 as the row key in the key-value database, and storing the hash feature and each attribute as sub-key values under the corresponding sub-key names.
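The coarse segmented hash index of step S202 (detailed further in claim 4) can be sketched with binary hash strings. By the pigeonhole principle, any code within Hamming distance r < m of the query must match it exactly in at least one of the m segments, so exact per-segment lookups yield a candidate set that is then verified by full Hamming distance. All codes and ids below are illustrative:

```python
def split_segments(code, m):
    """Split a binary hash string into m equal-length segments."""
    seg_len = len(code) // m
    return [code[i * seg_len:(i + 1) * seg_len] for i in range(m)]

def build_index(codes, m):
    """Segmented hash index: for each segment position, map the segment
    value to the ids of the codes that contain it."""
    index = [{} for _ in range(m)]
    for pid, code in codes.items():
        for i, seg in enumerate(split_segments(code, m)):
            index[i].setdefault(seg, set()).add(pid)
    return index

def query(index, code, codes, m, r):
    """Collect candidates sharing at least one exact segment with the
    query, then keep those within full Hamming distance r."""
    cands = set()
    for i, seg in enumerate(split_segments(code, m)):
        cands |= index[i].get(seg, set())
    return {pid for pid in cands
            if sum(a != b for a, b in zip(code, codes[pid])) <= r}

db = {"p1": "11110000", "p2": "11111111", "p3": "00001111"}
idx = build_index(db, m=4)
print(query(idx, "11110001", db, m=4, r=2))  # only p1 is within distance 2
```

In the method itself, the surviving candidates would then be passed to the trained joint Bayesian classifier for the final same-person decision.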
In this embodiment, the monitoring-oriented video atlas construction and mining method automatically performs face recognition with a deep convolutional neural network, greatly reducing the manual review workload. Using a shared convolutional neural network and a deep deconvolution network together with automatic attribute annotation of pedestrian objects, unstructured video data is converted into structured graph data for storage, which can well meet security requirements such as person retrieval, trajectory mining and group-center mining based on reviewing surveillance video. With the graph-database storage mode, the attribute information of nodes can be queried efficiently and the associations between nodes displayed efficiently and visually; the main-information/detailed-information two-database storage mode lets each database play to its strengths, so data storage and retrieval are efficient.
Before the pedestrian video atlas is constructed, the relevant information of pedestrians in the video is extracted in structured form. Segmented hash indexing and exact matching determine whether the existing database already contains a detected pedestrian, and the pedestrian numbers allocated by the backend are computed. Main information is stored in the graph database and detailed information in the key-value database, giving full play to the advantages of both: the graph database is good at storing a graph structure but not at storing large amounts of information, while the key-value database has no graph structure but can store large amounts of information. The three tables in the key-value database are the person table, the camera table and the relation table (the relation between persons and cameras, that is, which cameras each person appears in). A schematic diagram of the map construction is shown in FIG. 5.
The specific structure of person_table is shown in Table 1 below:
TABLE 1 person_table structure
RowKey Column
Person_Id Person_hash_feature
Person_Id Person_attribute_1
Person_Id Person_attribute_2
Person_Id Person_cameras
The specific structure of camera_table is as follows:
TABLE 2 camera_table structure
RowKey Column
Camera_Id Camera_producor
Camera_Id Camera_location
Camera_Id Camera_picture
Camera_Id Camera_latitudeAndLongitude
The specific structure of relationship_table is as follows:
TABLE 3 relationship_table structure
RowKey Column
Person_Id+Camera_Id Person_in_time-count
Person_Id+Camera_Id Person_out_time-count
Person_Id+Camera_Id Person_snapshot-count
Person_Id+Camera_Id count
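The three tables can be sketched as in-memory dictionaries in Python; the row-key and column names follow Tables 1-3 above, but all concrete values (ids, locations, times, counts) are illustrative:

```python
# HBase-style stand-in for the key-value store: table[row_key][column] = value.
person_table = {
    "P001": {
        "Person_hash_feature": "11110000",
        "Person_attribute_1": "hair:long",
        "Person_attribute_2": "coat:black",
        "Person_cameras": ["C01", "C02"],
    },
}
camera_table = {
    "C01": {"Camera_location": "Gate A", "Camera_latitudeAndLongitude": (30.5, 114.4)},
    "C02": {"Camera_location": "Hall B", "Camera_latitudeAndLongitude": (30.6, 114.5)},
}
# The relation-table row key is the concatenation Person_Id + Camera_Id.
relationship_table = {
    "P001C01": {"Person_in_time": 100.0, "Person_out_time": 160.0, "count": 3},
    "P001C02": {"Person_in_time": 300.0, "Person_out_time": 340.0, "count": 1},
}

def cameras_of(person_id):
    """Which cameras a person appeared in, looked up via the person table."""
    return person_table[person_id]["Person_cameras"]

print(cameras_of("P001"))
```

This split keeps the graph database holding only node numbers and edges, while bulky per-object detail lives in the key-value rows above.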
On the basis of the foregoing embodiments, acquiring trajectory information of a monitored object based on the monitored object video map specifically includes:
based on the object number of the monitored object, acquiring the number of the monitored video from the key value type database, and acquiring the entering time and the leaving time of the monitored object in each monitored video;
and taking the mean of the entering time and the leaving time as the appearance time of the monitored object in the corresponding monitoring video, obtaining the appearance time of the monitored object in each monitoring video, sorting these appearance times, and obtaining the trajectory information of the monitored object from the addresses of the cameras corresponding to the monitoring videos.
Specifically, in this embodiment, after the object-based video map is constructed, the specified pedestrian or vehicle trajectory may be mined on the monitoring object video map, and the step S3 specifically includes:
S301, according to the person number p, obtaining the camera numbers C1, C2, C3 from the person table of the key-value database;
S302, combining the person number p with each camera number C1, C2, C3 obtained in step S301 into new row keys, querying the entering-time and leaving-time columns in the relation table, and averaging the two times to obtain the times t1, t2, t3 at which the person appears in front of each camera;
S303, sorting the times t1, t2, t3 obtained in step S302 and outputting the addresses of the corresponding cameras by querying the camera table.
Specifically, in this embodiment, each pedestrian captured in front of a camera is automatically associated with that camera, with corresponding records in both the graph database and the key-value database. For a given target pedestrian, each associated camera and the corresponding appearance time can therefore be obtained, yielding a trajectory of camera locations strung together along the time line.
As shown in FIG. 6, the camera table holds the information related to each camera; this information is static and essentially never changes. Obtaining the numbers of the cameras where the pedestrian appears in time order connects the pedestrian's tracks over a period of time.
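Steps S301-S303 can be sketched over in-memory stand-ins for the tables; the layouts mirror Tables 1-3 and all ids, times and addresses are hypothetical. The sketch sorts the appearance times chronologically to string the camera locations along the time line:

```python
person_table = {"P001": {"Person_cameras": ["C01", "C02", "C03"]}}
camera_table = {"C01": "Gate A", "C02": "Hall B", "C03": "Exit C"}
relation_table = {  # row key = person id + camera id
    "P001C01": (100.0, 160.0),  # (entering time, leaving time)
    "P001C02": (300.0, 340.0),
    "P001C03": (200.0, 220.0),
}

def trajectory(person_id):
    """S301: cameras from the person table; S302: mean of entering and
    leaving time per camera from the relation table; S303: sort by that
    appearance time and output the camera addresses."""
    times = []
    for cam in person_table[person_id]["Person_cameras"]:
        t_in, t_out = relation_table[person_id + cam]
        times.append(((t_in + t_out) / 2, cam))
    return [camera_table[cam] for _, cam in sorted(times)]

print(trajectory("P001"))  # Gate A -> Exit C -> Hall B
```

With real stores, the dictionary lookups would become a graph-database query for the person node and key-value gets on the three tables.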
On the basis of the above embodiments, the method further includes:
s4, performing suspicious group center mining based on the monitoring object video map:
S401, all person nodes p0, p1, …, pn connected to each camera node in the graph database are sorted in ascending order by the mean of their entering time and leaving time, giving the sorted node sequence (p_k0, p_k1, …, p_kn);
S402, the node sequence obtained in step S401 is shifted left by one position and combined with the original sequence to form the node-pair sequence ((p_k0, p_k1), (p_k1, p_k2), …, (p_k(n-1), p_kn)); the node-pair sequence is loaded into memory for computation, and each pair of nodes whose difference between the means of entering and leaving times is less than a threshold K1 is counted and recorded as a key-value pair of the form ((p_a, p_b), count);
S403, repeating the process of step S402 for each camera and merging all key-value pairs: if two keys are equal, their values are added; the keys whose merged values are greater than a threshold K2 are output;
s404, establishing an adjacency matrix for the person group with the relationship in the step S403, and counting the degree of each node, wherein the node with the maximum degree is the central node of the person group.
The algorithm has two stages. The first stage searches the video map for pairs of nodes with a co-occurrence relation: the data is partitioned by camera number so that the pedestrians in each group all appeared in front of the same camera, each group is sorted by time, the adjacent pedestrian targets in each group that satisfy the time-difference condition are counted, and the identical pedestrian pairs across all groups are then merged. The second stage mines the central persons of the groups with co-occurrence relations; here the degree of each node is computed by an algorithm on the graph, and the node with the maximum degree is obtained. The first stage can be computed iteratively in memory to improve efficiency, and the second stage can be replaced by other algorithms to further improve speed.
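Both stages can be sketched on toy data in Python; the person ids, times and thresholds are illustrative, and the degree count stands in for the adjacency-matrix computation of step S404:

```python
from collections import Counter, defaultdict

def mine_group_center(appearances, k1, k2):
    """Sketch of S401-S404. `appearances` maps camera id -> list of
    (person, mean appearance time); k1 is the co-occurrence time
    threshold, k2 the minimum co-occurrence count. Returns the node of
    maximum degree in the resulting relation graph."""
    pairs = Counter()
    for cam, obs in appearances.items():
        obs = sorted(obs, key=lambda x: x[1])        # S401: sort by time
        for (a, ta), (b, tb) in zip(obs, obs[1:]):   # S402: left-shift pairing
            if tb - ta < k1:
                pairs[tuple(sorted((a, b)))] += 1
    degree = defaultdict(int)                        # S403 merge + S404 degree
    for (a, b), n in pairs.items():
        if n > k2:
            degree[a] += 1
            degree[b] += 1
    return max(degree, key=degree.get) if degree else None

cams = {
    "C01": [("p1", 10), ("p2", 12), ("p3", 50)],
    "C02": [("p1", 100), ("p2", 103), ("p3", 105)],
    "C03": [("p2", 200), ("p3", 202)],
}
print(mine_group_center(cams, k1=5, k2=1))  # p2 co-occurs with both others
```

The Counter plays the role of the in-memory key-value merge, so the first stage stays iterative and cheap, as noted above.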
On the basis of the above embodiments, the method further includes:
s5, displaying the map established in the steps S2-S4 and the mining result on a webpage.
S501, querying the numbers of all nodes on the graph database with the corresponding query language, including the person nodes p0, p1, …, pn and the camera nodes C0, C1, …, Cn;
S502, finding the basic information of all people and the basic information of the camera nodes in a key value type database to form a data file in a json format;
and S503, the web front end parses the json-format data file of step S502 using D3.js and displays the content of the video map on the webpage.
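The json assembly of step S502 can be sketched as follows; the "nodes"/"links" field names are a common D3.js force-layout convention assumed here, not a format specified by the method:

```python
import json

def build_graph_json(person_nodes, camera_nodes, relations):
    """Assemble a D3.js-friendly node/link document from the queried
    person nodes, camera nodes and person-camera relations."""
    nodes = [{"id": p, "type": "person"} for p in person_nodes]
    nodes += [{"id": c, "type": "camera"} for c in camera_nodes]
    links = [{"source": p, "target": c} for p, c in relations]
    return json.dumps({"nodes": nodes, "links": links})

doc = build_graph_json(["p0", "p1"], ["C0"], [("p0", "C0"), ("p1", "C0")])
print(doc)
```

The front end only needs to fetch this document and bind `nodes` and `links` to a force layout.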
The embodiment discloses a monitoring-oriented video atlas construction and mining device, which comprises a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the program, the processor performs the steps of the monitoring-oriented video atlas construction and mining method described in the method embodiments. Examples include:
acquiring the same monitored object in different monitored videos based on re-identification, and acquiring attribute information of the monitored object;
storing the object number of the monitored object based on a graph database, storing the object number of the monitored object based on a row key of a key value type database, storing the attribute information of the monitored object and the corresponding monitored video number based on a sub key of the key value type database, and establishing a monitored object video map;
and acquiring the track information of the monitored object based on the video map of the monitored object.
The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to execute the steps of the monitoring-oriented video atlas construction and mining method according to the above method embodiments, for example:
acquiring the same monitored object in different monitored videos based on re-identification, and acquiring attribute information of the monitored object;
storing the object number of the monitored object based on a graph database, storing the object number of the monitored object based on a row key of a key value type database, storing the attribute information of the monitored object and the corresponding monitored video number based on a sub key of the key value type database, and establishing a monitored object video map;
and acquiring the track information of the monitored object based on the video map of the monitored object.
The present embodiment further provides a non-transitory computer-readable storage medium, which stores computer instructions, where the computer instructions cause the computer to perform the steps of the monitoring-oriented video map building and mining method described in the foregoing method embodiments, for example, the steps include:
acquiring the same monitored object in different monitored videos based on re-identification, and acquiring attribute information of the monitored object;
storing the object number of the monitored object based on a graph database, storing the object number of the monitored object based on a row key of a key value type database, storing the attribute information of the monitored object and the corresponding monitored video number based on a sub key of the key value type database, and establishing a monitored object video map;
and acquiring the track information of the monitored object based on the video map of the monitored object.
To sum up, the embodiments of the present invention provide a monitoring-oriented video map construction and mining method and device. A graph database and a key-value database serve as the persistent storage of the knowledge map; a deep convolutional neural network performs face recognition automatically, greatly reducing the manual review workload; and a shared convolutional neural network and a deep deconvolution network, combined with automatic attribute annotation of pedestrian objects, convert unstructured video data into structured map data for storage. The method can well meet security requirements such as person retrieval, trajectory mining and group-center mining based on reviewing surveillance video; the attribute information of nodes can be queried efficiently and the associations between nodes visualized efficiently; and the main-information/detailed-information two-database storage mode plays to the strengths of each database, so data storage and retrieval are efficient.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A monitoring-oriented video map construction and mining method is characterized by comprising the following steps:
s1, acquiring the same monitored object in different monitored videos based on re-identification, and acquiring attribute information of the monitored object;
s2, storing the object number of the monitored object based on the graph database, storing the object number of the monitored object based on the row key of the key-value type database, storing the attribute information of the monitored object and the corresponding monitored video number based on the sub-key of the key-value type database, and establishing a monitored object video map;
s3, acquiring track information of the monitored object based on the monitored object video map;
the step S3 includes:
S301, according to the person number p, obtaining the camera numbers C1, C2, C3 from the person table of the key-value database;
S302, combining the person number p with each camera number C1, C2, C3 obtained in step S301 into new row keys, querying the entering-time and leaving-time columns in the relation table, and averaging the two times to obtain the times t1, t2, t3 at which the person appears in front of each camera;
S303, sorting the times t1, t2, t3 obtained in step S302 and outputting the addresses of the corresponding cameras by querying the camera table.
2. The surveillance-oriented video atlas building and mining method of claim 1, wherein the monitored object is any object in the surveillance video, including pedestrians and vehicles.
3. The monitoring-oriented video graph building and mining method according to claim 1, wherein in step S1, obtaining the same monitoring object in different monitoring videos based on re-recognition specifically comprises:
preprocessing all monitoring videos based on a computer cluster, dividing a monitoring video stream into a picture stream according to frames, and acquiring a unique feature vector of a monitoring object in the picture stream containing complete monitoring object information;
and establishing a segment hash index based on the unique characteristic vector, re-identifying the monitored object based on the segment hash index, acquiring the monitored video information containing the monitored object, and obtaining the entering time and the leaving time of the monitored object in the monitored video.
4. The monitoring-oriented video graph building and mining method according to claim 3, wherein a segment hash index is built based on the unique feature vector, and monitoring object re-identification is performed based on the segment hash index, specifically comprising:
converting the unique feature vector into a hash code, and dividing the hash code equally into m segments, the length of each segment being b/m bits, where b is the length of the hash code;
and inquiring candidate hash codes with the Hamming distance r from the hash codes in the m segments, and judging whether the objects corresponding to the candidate hash codes are the same as the monitored objects or not based on a trained combined Bayesian classifier.
5. The monitoring-oriented video graph constructing and mining method according to claim 3, wherein in step S1, acquiring attribute information of the monitoring object specifically includes:
and performing semantic segmentation and attribute labeling on the monitored object based on the trained neural network model to acquire attribute information of the monitored object.
6. The surveillance-oriented video atlas building and mining method of claim 3, before storing the unique number information of the surveillance object based on a graph database, further comprising:
and searching based on the unique feature vector and the attribute information to determine whether a record of the same object already exists; if such a record is found, taking the object number of that record as the number of the monitored object, and if not, allocating a new object number.
7. The monitoring-oriented video atlas constructing and mining method of claim 3, wherein the step S3 specifically comprises:
based on the object number of the monitored object, acquiring the number of the monitored video from the key value type database, and acquiring the entering time and the leaving time of the monitored object in each monitored video;
and taking the mean value of the entering time and the leaving time as the occurrence time of the monitored object in the corresponding monitored video, acquiring the occurrence time of the monitored object in each monitored video, sequencing the occurrence time, and acquiring the track information of the monitored object based on the address of the camera corresponding to each monitored video.
8. The monitoring-oriented video graph building and mining method according to claim 7, further comprising S4, performing suspicious group center mining based on the monitoring-object video graph:
S401, all monitored-object nodes p0, p1, …, pn connected to each monitoring-video node in the graph database are sorted in ascending order by appearance time, giving the sorted node sequence (p_k0, p_k1, …, p_kn);
S402, the node sequence obtained in step S401 is shifted left by one position and combined with the original sequence to form the node-pair sequence ((p_k0, p_k1), (p_k1, p_k2), …, (p_k(n-1), p_kn)); the node-pair sequence is loaded into memory for computation, and each pair of nodes whose difference between the means of entering and leaving times is less than a threshold K1 is counted and recorded as a key-value pair of the form ((p_a, p_b), count);
S403, repeating the process of step S402 for each monitoring video and merging all key-value pairs: if two keys are equal, their values are added; the keys whose merged values are greater than a threshold K2 are output;
s404, establishing an adjacency matrix for the monitored-object group with the relationship in step S403 and counting the degree of each node, wherein the node with the maximum degree is the central node of the monitored-object group.
9. The monitoring-oriented video graph building and mining method according to claim 8, further comprising S5, displaying the monitoring object video graph, the track information of the monitoring object, and the suspicious group center mining information on a webpage.
10. A surveillance-oriented video atlas construction and mining apparatus comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the steps of the surveillance-oriented video atlas construction and mining method according to any of claims 1 to 9.
CN201811058356.5A 2018-09-11 2018-09-11 Monitoring-oriented video map construction and mining method and equipment Active CN109344285B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811058356.5A CN109344285B (en) 2018-09-11 2018-09-11 Monitoring-oriented video map construction and mining method and equipment


Publications (2)

Publication Number Publication Date
CN109344285A CN109344285A (en) 2019-02-15
CN109344285B true CN109344285B (en) 2020-08-07

Family

ID=65304963

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811058356.5A Active CN109344285B (en) 2018-09-11 2018-09-11 Monitoring-oriented video map construction and mining method and equipment

Country Status (1)

Country Link
CN (1) CN109344285B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109948447B (en) * 2019-02-21 2023-08-25 山东科技大学 Character network relation discovery and evolution presentation method based on video image recognition
CN109918184B (en) * 2019-03-01 2023-09-26 腾讯科技(深圳)有限公司 Picture processing system, method and related device and equipment
CN110187678B (en) * 2019-04-19 2021-11-05 广东省智能制造研究所 Information storage and digital application system of processing equipment in manufacturing industry
CN110245259B (en) * 2019-05-21 2021-09-21 北京百度网讯科技有限公司 Video labeling method and device based on knowledge graph and computer readable medium
CN110188224A (en) * 2019-05-27 2019-08-30 南京工程学院 A kind of general cognitive method based on the identification of emphasis human target
CN110210387B (en) * 2019-05-31 2021-08-31 华北电力大学(保定) Method, system and device for detecting insulator target based on knowledge graph
CN110532923A (en) * 2019-08-21 2019-12-03 深圳供电局有限公司 A kind of personage's trajectory retrieval method and its system
CN111314605A (en) * 2020-02-19 2020-06-19 杭州涂鸦信息技术有限公司 Merging method and system for face recognition among multiple equipment terminals
CN112148938B (en) * 2020-10-16 2023-05-26 成都中科大旗软件股份有限公司 Cross-domain heterogeneous data retrieval system and retrieval method
CN111968264A (en) * 2020-10-21 2020-11-20 东华理工大学南昌校区 Sports event time registration device
CN112861670B (en) * 2021-01-27 2022-11-08 华北电力大学(保定) Transmission line hardware detection method and system
CN113761221B (en) * 2021-06-30 2022-02-15 中国人民解放军32801部队 Knowledge graph entity alignment method based on graph neural network
CN113688251B (en) * 2021-07-27 2024-02-13 广东师大维智信息科技有限公司 Knowledge graph construction method and system in indoor sports event security field
CN113704487A (en) * 2021-07-29 2021-11-26 湖南五凌电力科技有限公司 Knowledge graph generation method and device, computer equipment and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102207966A (en) * 2011-06-01 2011-10-05 华南理工大学 Video content quick retrieving method based on object tag
CN103955494A (en) * 2014-04-18 2014-07-30 大唐联智信息技术有限公司 Searching method and device of target object and terminal
CN104778224A (en) * 2015-03-26 2015-07-15 南京邮电大学 Target object social relation identification method based on video semantics
CN107609497A (en) * 2017-08-31 2018-01-19 武汉世纪金桥安全技术有限公司 The real-time video face identification method and system of view-based access control model tracking technique

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105373626A (en) * 2015-12-09 2016-03-02 深圳融合永道科技有限公司 Distributed face recognition track search system and method
CN107370983B (en) * 2016-05-13 2019-12-17 腾讯科技(深圳)有限公司 method and device for acquiring track of video monitoring system


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Fast large-scale image retrieval method based on multiple hashing algorithms; Tang Xiaoman, Wang Yunfei, Zou Fuhao, Zhou Ke; Computer Engineering and Science; 2016-07-31; full text *
Research on key issues in microblog network propagation behavior; Xiong Xiaobing; China Doctoral Dissertations Full-text Database; 2014-01-15; full text *

Also Published As

Publication number Publication date
CN109344285A (en) 2019-02-15

Similar Documents

Publication Publication Date Title
CN109344285B (en) Monitoring-oriented video map construction and mining method and equipment
Sindagi et al. Cnn-based cascaded multi-task learning of high-level prior and density estimation for crowd counting
CN109858390B (en) Human skeleton behavior identification method based on end-to-end space-time diagram learning neural network
CN110298404B (en) Target tracking method based on triple twin Hash network learning
CN111259786B (en) Pedestrian re-identification method based on synchronous enhancement of appearance and motion information of video
CN108960184B (en) Pedestrian re-identification method based on heterogeneous component deep neural network
CN109993100B (en) Method for realizing facial expression recognition based on deep feature clustering
US11640714B2 (en) Video panoptic segmentation
Rabiee et al. Crowd behavior representation: an attribute-based approach
CN115860152B (en) Cross-modal joint learning method for character military knowledge discovery
CN110751027A (en) Pedestrian re-identification method based on deep multi-instance learning
CN114005085A (en) Dense crowd distribution detection and counting method in video
CN114332473A (en) Object detection method, object detection device, computer equipment, storage medium and program product
CN111159475B (en) Pedestrian re-identification path generation method based on multi-camera video image
CN112084895A (en) Pedestrian re-identification method based on deep learning
CN108052680A (en) Image data target identification Enhancement Method based on data collection of illustrative plates, Information Atlas and knowledge mapping
CN113553975A (en) Pedestrian re-identification method, system, equipment and medium based on sample pair relation distillation
CN111496784B (en) Space environment identification method and system for robot intelligent service
CN112906557A (en) Multi-granularity characteristic aggregation target re-identification method and system under multiple visual angles
CN104200222B (en) Object identifying method in a kind of picture based on factor graph model
CN112306985A (en) Digital retina multi-modal feature combined accurate retrieval method
CN111160077A (en) Large-scale dynamic face clustering method
CN114299342B (en) Unknown mark classification method in multi-mark picture classification based on deep learning
CN110717068A (en) Video retrieval method based on deep learning
Fofana et al. Optimal Flame Detection of Fires in Videos Based on Deep Learning and the Use of Various Optimizers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant