CN113495975A - Video processing, display and completion method, device, system and storage medium


Info

Publication number
CN113495975A
CN113495975A
Authority
CN
China
Prior art keywords
target
movable object
behavior
track
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010202455.7A
Other languages
Chinese (zh)
Inventor
郑卫东
求大位
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd filed Critical Alibaba Group Holding Ltd
Priority to CN202010202455.7A
Publication of CN113495975A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 - Information retrieval of video data
    • G06F16/73 - Querying
    • G06F16/732 - Query formulation
    • G06F16/7335 - Graphical querying, e.g. query-by-region, query-by-sketch, query-by-trajectory, GUIs for designating a person/face/object as a query predicate
    • G06F16/738 - Presentation of query results
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 - Television systems
    • H04N7/18 - Closed-circuit television [CCTV] systems, i.e. systems in which the video signal is not broadcast
    • H04N7/181 - Closed-circuit television [CCTV] systems for receiving images from a plurality of remote sources

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present application provide a video processing method, a track display method, a track completion method, and corresponding devices, systems, and storage media. In the embodiments of the present application, video data shot by a plurality of cameras in a target space is merged in units of the movable objects that enter the target space, to obtain the track of each movable object, and the track of each movable object is associated with data objects related to that object's behavior in the target space. This creates the conditions for video processing in units of movable objects and simplifies video processing operations. Furthermore, during information display, the track of a movable object can be displayed while the data objects related to its behavior in the target space are displayed in association with it, so that the relevant information of the movable object in the target space is presented in a multi-dimensional manner, which can improve video processing efficiency and reduce video processing cost.

Description

Video processing, display and completion method, device, system and storage medium
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a method, device, system, and storage medium for video processing, display, and completion.
Background
In real life, cameras are deployed in many scenarios for security monitoring. For example, in an unattended retail setting, multiple cameras are typically deployed to monitor the shopping behavior of users on site. As another example, in a home setting, cameras are installed in different areas such as the living room, bedroom, and kitchen to monitor the activities of elderly people and children at home and ensure their safety.
In a camera-monitored scenario, when a specific situation occurs, such as a missed payment or an elderly person or child falling and getting injured, the investigation is usually carried out by backtracking through the video footage shot by all the on-site cameras. This video backtracking approach is costly and inefficient.
Disclosure of Invention
Aspects of the present application provide a video processing, display and completion method, device, system and storage medium, so as to simplify a video-based troubleshooting operation, improve troubleshooting efficiency and reduce troubleshooting cost.
An embodiment of the present application provides a video processing method, including: acquiring video data shot by a plurality of cameras in a target space, wherein the target space comprises at least one movable object; merging the video data shot by the cameras by taking the movable object as a unit to obtain the track of the at least one movable object; and respectively associating the track of the at least one movable object with at least one data object, wherein the at least one data object is related to the behavior of the movable object in the target space.
The embodiment of the present application further provides a track display method, including: responding to the first query operation, and sending a first data request to the server-side equipment to request a target track; the target trajectory is a trajectory of a target movable object within a target space; receiving a target track returned by the server-side equipment and at least one data object associated with the target track; and in the target track display process, the at least one data object is displayed in an associated mode, and the at least one data object is related to the behavior of the target movable object in the target space.
The embodiment of the present application further provides a track completion method, including: displaying a target track, wherein the target track is a track of a target movable object in a target space, and the target track has a missing part; determining, according to the direction of movement of the target movable object before the missing part, candidate cameras in the target space that may have captured the missing part; and performing completion processing on the target track according to the video content, shot by the candidate cameras, that corresponds to the missing time period; the missing time period is the time period corresponding to the missing part.
An embodiment of the present application provides a video processing system, including: the system comprises a plurality of cameras, server equipment and a display terminal, wherein the cameras, the server equipment and the display terminal are deployed in a target space; the target space allows movable objects to come in and go out; the cameras are used for shooting video data in respective view field ranges and uploading the shot video data to the server side equipment; the server-side equipment is used for merging the video data of the cameras by taking the movable objects as units to obtain the track of each movable object entering the target space, and associating the track of each movable object with at least one data object respectively, wherein the at least one data object is related to the behavior of the movable object in the target space; the display terminal is used for acquiring a target track and at least one data object associated with the target track from the server-side equipment according to the query operation, and displaying the at least one data object associated with the target track in an associated manner in the process of displaying the target track; the target trajectory is a trajectory of a target movable object within a target space.
The embodiment of the application provides a server device, including: a memory and a processor; the memory for storing a computer program; the processor, coupled with the memory, to execute the computer program to: acquiring video data shot by a plurality of cameras in a target space, wherein the target space comprises at least one movable object; merging the video data shot by the cameras by taking the movable object as a unit to obtain the track of the at least one movable object; and respectively associating the track of the at least one movable object with at least one data object, wherein the at least one data object is related to the behavior of the movable object in the target space.
An embodiment of the present application provides a display terminal, including: a memory, a processor, a communication component, and a display; the memory for storing a computer program; the processor, coupled with the memory, to execute the computer program to: responding to the first query operation, and sending a first data request to the server-side equipment to request a target track; the target trajectory is a trajectory of a target movable object within a target space; receiving a target track returned by the server-side equipment and at least one data object associated with the target track; and in the target track display process, the at least one data object is displayed in an associated mode, and the at least one data object is related to the behavior of the target movable object in the target space.
An embodiment of the present application further provides a display terminal, including: a memory, a processor, a communication component, and a display; the memory for storing a computer program; the processor, coupled with the memory, to execute the computer program to: display a target track, wherein the target track is a track of a target movable object in a target space, and the target track has a missing part; determine, according to the direction of movement of the target movable object before the missing part, candidate cameras in the target space that may have captured the missing part; and perform completion processing on the target track according to the video content, shot by the candidate cameras, that corresponds to the missing time period; the missing time period is the time period corresponding to the missing part.
Embodiments of the present application provide a computer-readable storage medium storing a computer program, which, when executed by one or more processors, causes the one or more processors to implement the steps in the video processing method, the track display method, or the track completion method provided by the embodiments of the present application.
In the embodiments of the present application, for video data shot by a plurality of cameras in a target space, the video data is merged in units of the movable objects entering the target space to obtain the track of each movable object, and the track of each movable object is associated with data objects related to that object's behavior in the target space. This creates the conditions for video processing in units of movable objects and simplifies video processing operations. Furthermore, during information display, the track of a movable object can be displayed while the data objects related to its behavior in the target space are displayed in association, so that the relevant information of the movable object in the target space is presented in a multi-dimensional manner, improving video processing efficiency and reducing video processing cost.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
fig. 1 is a schematic structural diagram of a video processing system according to an exemplary embodiment of the present application;
fig. 2 is a schematic structural diagram of a video processing system applied to an offline store according to an exemplary embodiment of the present application;
FIG. 3a is a schematic diagram of a first query interface provided by an exemplary embodiment of the present application;
FIG. 3b is another schematic diagram of a first query interface provided by an exemplary embodiment of the present application;
figs. 4a-4d are schematic diagrams of states of a display interface provided by an exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of a camera orientation displayed on a store map as provided by an exemplary embodiment of the present application;
FIG. 6a is a schematic diagram of adding mark information related to a second behavior to a track according to an exemplary embodiment of the present application;
FIG. 6b is a schematic diagram of a second query interface provided by an exemplary embodiment of the present application;
fig. 7a is a schematic flowchart of a video processing method according to an exemplary embodiment of the present application;
fig. 7b is a schematic flowchart of another video processing method according to an exemplary embodiment of the present application;
FIG. 8 is a flowchart illustrating a track display method according to an exemplary embodiment of the present disclosure;
FIG. 9 is a flowchart illustrating a track completion method according to an exemplary embodiment of the present disclosure;
fig. 10a is a schematic structural diagram of a server device according to an exemplary embodiment of the present application;
fig. 10b is a schematic structural diagram of a display terminal according to an exemplary embodiment of the present application;
fig. 10c is a schematic structural diagram of another display terminal according to an exemplary embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail and completely with reference to the following specific embodiments of the present application and the accompanying drawings. It should be apparent that the described embodiments are only some of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
To address the technical problems of high cost and low efficiency in existing video investigation scenarios, in the embodiments of the present application, the video data shot by a plurality of cameras in a target space is merged in units of the movable objects entering the target space to obtain the track of each movable object, and the track of each movable object is associated with data objects related to that object's behavior in the target space. This creates the conditions for track processing (such as investigation, clipping, or beautification) in units of movable objects and helps simplify such track processing operations. Furthermore, during information display, the track of a movable object can be displayed while the data objects related to its behavior in the target space are displayed in association, so that the relevant information of the movable object in the target space is presented in a multi-dimensional manner, improving track processing efficiency and reducing track processing cost.
The technical solutions provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings.
Fig. 1 is a schematic structural diagram of a video processing system according to an exemplary embodiment of the present application. As shown in fig. 1, the system includes: a plurality of cameras 11 deployed in the target space 10, a server device 12, and a display terminal 13. The server device 12 is in communication connection with the plurality of cameras 11 and the display terminal 13.
In the present embodiment, the target space 10 refers to a physical space that can accommodate the movable object 14 and allows it to come in and go out. The movable object 14 broadly refers to any object that can move, including objects that can move autonomously, such as users, robots, or unmanned vehicles, as well as objects that do not move autonomously, such as vehicles driven by humans. The specific implementation of the target space 10 varies from application to application, and accordingly, so do the movable objects entering and exiting the target space 10. The following examples illustrate:
in some application scenarios, the target space 10 is an offline store such as a mall or a supermarket; accordingly, the movable objects are users who enter or exit the offline store, where the users include consumers and/or store staff, and may also include autonomous shopping carts or robots that can move freely within the offline store.
In other application scenarios, the target space 10 is a home space; accordingly, the movable objects are users who can enter and exit the home space, where the users include family members and/or outside visitors, and may also include home service robots that move freely in the home environment, such as autonomously movable scrubbers, sweeping robots, etc.
In still other application scenarios, the target space 10 is a bus stop, a train station, an airport, or another public place; accordingly, the movable objects include passengers, venue staff, takeaway personnel, people picking up passengers, taxi drivers, and other personnel who may enter and exit these places, and may also include private cars, taxis, and the like.
In any application scenario, some movable objects are always in and out of the target space 10, and in order to meet the monitoring requirement, a plurality of cameras 11 may be installed in the target space 10. The plurality of cameras 11 are installed at different positions in the target space 10, have different field of view ranges, and can capture video data within the respective field of view ranges. Optionally, the coverage of the target space 10 by the plurality of cameras 11 may be different according to the monitoring requirement and the monitoring strength. Preferably, the fields of view of the plurality of cameras 11 may overlap or seamlessly join each other, so as to provide complete and seamless coverage of the target space 10, but is not limited thereto.
In the present embodiment, the type of the camera 11 is not limited. For example, from the viewpoint of transmission signals, the camera 11 may be an analog camera or a digital camera. From the viewpoint of picture quality, the camera 11 may be a standard definition camera or a high definition camera. From the appearance, the camera 11 may be a spherical camera, a hemispherical camera, or a gun camera. From the view angle range, the camera 11 may be a wide-angle camera or a standard camera. From the number of lenses, the camera 11 may be a monocular camera or a binocular camera.
In this embodiment, the plurality of cameras 11 may upload captured video data to the server device 12 through the communication connection between them and the server device 12. Optionally, the cameras 11 may upload the shot video data to the server device 12 in real time, or according to a certain uploading policy, for example periodically or at scheduled times. The video data captured by each camera 11 records the movable objects entering its field of view, the relevant behavior of those movable objects in the target space 10, and the time, track, position, etc. of the movable objects in the target space 10.
In this embodiment, the implementation form of the server device 12 is not limited; it may be a conventional server, a cloud server, or a server array. The server device 12 is mainly responsible for receiving the video data uploaded by the plurality of cameras 11 and storing the video data for subsequent processing. Further, the server device 12 is also configured to merge the video data captured by the plurality of cameras 11 in units of movable objects, obtaining the trajectory of each movable object that enters the target space 10. The trajectory of a movable object mainly describes its movement through the target space; in implementation form, the trajectory may be a video (a track video for short), a moving picture (a track moving picture for short), or an image (a track picture for short). One trajectory corresponds to one movable object, which facilitates trajectory-related processing in units of movable objects. Optionally, if the trajectory of the movable object is a track video, the trajectory-related processing may be video processing, including but not limited to video investigation, video clipping, or video beautification; since the video of a given movable object is obtained without screening the video content shot by all the cameras, video processing efficiency can be improved and video processing cost reduced. Preferably, in the embodiments of the present application, the track of the movable object is a track video. In figs. 1, 2, and 4a to 6a, the track video is used as an example for illustration, but the present application is not limited thereto.
Further, the server device 12 may associate the track of each movable object with at least one data object respectively. At least one of the data objects is related to the behavior of the movable object in the target space, in other words, the data objects reflect the behavior of the movable object in the target space to a certain extent. In this embodiment, the data objects used are not limited, and the data objects that reflect the behavior of the movable object in the target space may be different according to the application scene, the target space, and the movable object. The following examples illustrate:
for example, in some application scenarios, the target space is an offline store and the movable objects are users entering and exiting the offline store. In an offline retail scenario, an offline store typically includes items that users can purchase, such as fresh goods, daily necessities, and clothing, as well as cosmetics, home appliances, electronic products, and the like. The behavior of a user in the offline store mainly includes: entering the store, walking around the store, browsing items, selecting items, purchasing items, paying, and the like. Data objects that can reflect the user's behavior in the offline store include, but are not limited to: the times at which the user enters and exits the store, the details of the user's orders in the offline store, and/or a map of the offline store. From the times at which the user enters and exits the offline store, the duration of the user's stay in the store can be obtained. From the details of the user's orders in the offline store, information such as which items the user selected and purchased, which items were paid for, the payment time, and the POS machine used for payment can be obtained. From the map of the offline store, information such as the store's internal layout and passage positions can be obtained, and further the approximate positions at which the user walked, browsed, selected, purchased, and paid within the offline store.
For another example, in other application scenarios, the target space is a home space and the movable objects are family members. In a home scenario, the home space typically contains household appliances for the family members, such as televisions, sweeping robots, microwave ovens, air conditioners, and electric lamps, as well as food, beverages, and the like. The behavior of family members in the home space mainly includes: entering or exiting the home space, and using the household appliances for housework or entertainment, such as watching television, turning on the air conditioner, turning on a lamp, cooking food on the stove, heating food in the microwave oven, and the like. Data objects that reflect the behavior of family members within the home environment include, but are not limited to: the times at which family members enter and exit the home space, the usage times of the household appliances, the remaining amounts of food and beverages, and/or a map of the home environment. From the times at which a family member enters and exits the home space, the duration of that member's stay in the home space can be obtained. From detail information in the home environment, such as the usage times of household appliances and the remaining amounts of food and beverages, what the family members did in the home space and when they did it can be obtained. From the map of the home space, the positions of the various household appliances can be obtained, and further the approximate positions at which family members did various things in the home environment.
For another example, in still other application scenarios, the target space is a public place such as a bus stop, a train station, or an airport, and the movable objects include passengers who can enter and exit the public place. Such public places usually include various service windows, such as ticket sales/refund windows and baggage check-in windows, as well as self-service equipment such as ticket dispensers and gates. The behavior of passengers in these public places mainly includes: buying/refunding tickets, collecting tickets, checking in baggage, passing through gates, etc. Data objects that reflect passenger behavior in public places include, but are not limited to: the times at which the passenger enters and exits the public place, the service information of the passenger at the service windows, the passenger's operation information on the self-service equipment, a map of the public place, and the like. From the times at which a passenger enters and exits the public place, the duration of the passenger's wait in the public place can be obtained. From the service information at the service windows, the operation information on the self-service equipment, and the like, what the passenger did in the public place and when can be obtained. From the map of the public place, the positions of the service windows, self-service equipment, and so on can be obtained, and further the approximate positions at which the passenger did various things in the public place.
In an alternative embodiment, the data objects capable of reflecting the behavior of a movable object in the target space can be roughly summarized into the following types. The first type: time data of the movable object entering and exiting the target space. The second type: behavior detail data of the movable object within the target space. The third type: map data of the target space.
In this alternative embodiment, at least one of the three types of data objects may be used, that is, the server device 12 may associate the trajectory of the movable object with the time data of the movable object entering or exiting the target space, the behavior detail data of the movable object in the target space, and/or the map data of the target space, but is not limited thereto. In addition, which data objects are used can be flexibly set according to monitoring requirements.
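To make the three types of data objects and their association with a trajectory concrete, here is a minimal sketch in Python; the class and field names are illustrative assumptions, not the patent's own data model:

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class EntryExitTime:
    """First type: time data of the movable object entering/exiting the target space."""
    enter_ts: float   # timestamp of entering the space
    exit_ts: float    # timestamp of leaving the space

@dataclass
class BehaviorDetail:
    """Second type: behavior detail data within the target space (e.g. an order)."""
    behavior_object: str                  # e.g. an item purchased or an appliance used
    behavior_position: Optional[str] = None
    behavior_time: Optional[float] = None

@dataclass
class SpaceMap:
    """Third type: map data of the target space."""
    image_path: str
    scale: float = 1.0                    # metres per pixel, if known

@dataclass
class Trajectory:
    object_id: str                        # identity of the movable object
    video_path: str                       # the trajectory (track video)
    data_objects: List[object] = field(default_factory=list)

# Association is then simply:
# traj.data_objects += [EntryExitTime(...), BehaviorDetail(...), SpaceMap(...)]
```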
After the track of a movable object and the at least one data object related to it are obtained, the track of any movable object can be processed on this data basis according to video processing requirements, for example investigated, clipped, or beautified. For convenience of description and distinction, in the embodiments of the present application, a movable object whose trajectory needs to be processed is referred to as a target movable object; the target movable object may be any movable object that enters the target space. In the present embodiment, the trajectory of the target movable object may be processed through the display terminal 13.
Specifically, the display terminal 13 may obtain a target trajectory and at least one data object associated therewith from the server device 12 according to the query operation, where the target trajectory refers to a trajectory of a target movable object in a target space. Then, the display terminal 13 displays the target track, and in the process of displaying the target track, at least one data object associated with the target track is displayed in an associated manner. Optionally, if the target track is a track video, the display terminal 13 may play the track video, and in the process of playing the track video, associate and display at least one data object associated with the track video. In this way, the relevant processing personnel can know the target track and the relevant content thereof, and further can process the track of the target movable object. For example, in a video investigation scene, on the basis of knowing the track of the target movable object and at least one data object associated with the track, whether abnormal behavior or specified behavior exists in the target movable object or not can be investigated.
Because the track of each movable object is associated with at least one data object capable of reflecting that object's behavior in the target space, during track-related processing the track can be displayed while other information related to it is displayed at the same time, in a multi-dimensional manner. This facilitates multi-dimensional analysis, improves track-related processing efficiency (such as video troubleshooting efficiency), and saves cost.
In this embodiment of the present application, the server device 12 at least needs to perform a merging process and an association operation. The merging process is short for "merging the video data shot by the plurality of cameras 11 in units of movable objects to obtain the track of each movable object entering the target space"; the association operation is short for "associating the track of each movable object with at least one data object, respectively". In the embodiments of the present application, the detailed implementations of the merging process and the association operation are not limited, and any implementation that can achieve the above objectives is applicable. Detailed embodiments of the merging process and the association operation are exemplified below.
Exemplary description of the "merge processing" embodiment:
in some exemplary embodiments of the present application, the server device 12 merges the video data captured by the multiple cameras 11 by using a Computer Vision (CV) technology and a pedestrian re-identification (REID) technology, so as to obtain a trajectory of each movable object entering the target space. In the embodiment of the present application, the CV technique is mainly used to track the movable object captured by each camera 11; REID technology is mainly used to identify whether a movable object shot by another camera exists in video data shot by one camera.
Specifically, the server device 12 may perform target tracking on video data captured by each of the multiple cameras 11 to obtain a movable object captured by each of the multiple cameras 11; REID processing is performed on the movable objects photographed by the plurality of cameras 11, respectively, to obtain a trajectory of at least one movable object. For the cameras 11 that shoot the same movable object, the track segments that contain the movable object and are shot by the cameras 11 can be spliced according to the time sequence of shooting the movable object by the cameras, so as to obtain the track of the movable object.
The above-mentioned target tracking of the video data shot by each of the plurality of cameras 11 may be implemented in various ways, which is not limited to this. The following examples illustrate:
in alternative a1, for each camera, a target tracking operation is performed on every frame of video data; that is, for each frame of video data captured by the camera, it is determined whether that frame contains a movable object tracked in the previous frame. This approach has finer tracking granularity, reduces the probability of missed tracking, and tracks the movable objects appearing in the target space more comprehensively.
In alternative a2, for each camera, a target tracking operation is performed every N frames of video data; that is, in each target tracking operation, N frames of video data are combined to determine whether the previously tracked movable object is tracked, where N is an integer of 2 or more. In this alternative embodiment, rather than making a tracking decision between each pair of consecutive frames, a delayed-confirmation technique is used to improve the accuracy of the tracking result, in view of problems such as body occlusion, similar clothing, and light reflection. Delayed confirmation means that a judgment is not made in real time for every frame of video data; instead, the decision is made over 2 or more frames, and the more complete track information obtained after the delay (such as capturing more face shots or full-body shots) improves the accuracy of tracking and classification.
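The delayed-confirmation idea can be sketched as follows; `detect` and `confirm_identity` stand in for whatever per-frame detector and window-level classifier are actually used, and are assumptions for illustration:

```python
def track_with_delayed_confirmation(frames, detect, confirm_identity, n=5):
    """Accumulate detections over a window of n frames and make one
    tracking/classification decision per window, instead of deciding on
    every single frame; a better view (face, full body) within the window
    can then override a noisy single-frame match."""
    window, decisions = [], []
    for frame in frames:
        window.append(detect(frame))                    # per-frame detections
        if len(window) == n:
            decisions.append(confirm_identity(window))  # one decision per window
            window = []
    if window:                                          # flush a partial final window
        decisions.append(confirm_identity(window))
    return decisions
```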
The REID processing on the movable object shot by each of the plurality of cameras 11 may be implemented in various ways, which is not limited to the above. The following examples illustrate:
in the alternative b1, REID processing is performed on the movable object captured by each of the plurality of cameras 11 in combination with the overlapping fields of view between the plurality of cameras 11 to obtain the trajectory of at least one movable object.
In the alternative b2, REID processing is performed on the movable objects photographed by the plurality of cameras 11, respectively, in combination with the spatiotemporal information of the movable objects, to obtain the trajectory of at least one movable object.
In an alternative b3, REID processing is performed on the movable objects photographed by the plurality of cameras 11, respectively, in combination with feature information of the movable objects acquired by other means, to obtain the trajectory of at least one movable object.
In the alternative b4, REID processing is performed on the movable object photographed by each of the plurality of cameras 11 in combination with the overlapping fields of view between the plurality of cameras 11 and the spatiotemporal information of the movable object to obtain the trajectory of at least one movable object.
In an optional mode b5, REID processing is performed on the movable objects captured by the plurality of cameras 11, respectively, in combination with the overlapping fields of view among the plurality of cameras 11 and feature information of the movable objects acquired by other means, to obtain the trajectory of at least one movable object.
In an optional mode b6, REID processing is performed on the movable objects photographed by the respective cameras 11 in combination with the spatiotemporal information of the movable objects and the feature information of the movable objects acquired by other means to obtain the trajectory of at least one movable object.
In an optional mode b7, REID processing is performed on the movable objects captured by the plurality of cameras 11, respectively, in combination with the overlapping fields of view among the plurality of cameras 11, the spatiotemporal information of the movable objects, and the feature information of the movable objects acquired by other means, to obtain the trajectory of at least one movable object.
In the above optional embodiments, by using the overlapping fields of view of the cameras, movable objects that are not clearly visible can be associated based on the positions at which different cameras photographed them, which improves tracking precision and accuracy and makes the trajectory of the same movable object more complete. By using the spatiotemporal information of the movable objects, a filtering decision is made on the associations between movable objects, so that associations that clearly do not satisfy the spatiotemporal constraints can be deleted, reducing erroneous merging and improving trajectory accuracy. The spatiotemporal information is obtained by establishing a coordinate system on the map of the target space and judging whether it is plausible for a movable object to be at a given position at a given time. Combining feature information of a movable object acquired by other means with the features of the movable object photographed by the cameras allows comprehensive merging, improving merging accuracy and splicing the track segments of the same movable object into a single trajectory as far as possible. Optionally, if the track is a track video, the track segments are video segments shot by the cameras.
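A minimal sketch of the spatiotemporal filtering decision, assuming each observation carries a timestamp and a position in map coordinates (the field names and the speed bound are illustrative assumptions):

```python
import math

def spatiotemporally_plausible(obs_a, obs_b, max_speed_mps=3.0, overlap_tol_m=1.0):
    """Reject a cross-camera association when the object would have had to
    move implausibly fast between the two observations."""
    dt = abs(obs_b["ts"] - obs_a["ts"])
    dist = math.hypot(obs_b["x"] - obs_a["x"], obs_b["y"] - obs_a["y"])
    if dt == 0:
        # same instant (e.g. overlapping fields of view): positions must agree
        return dist <= overlap_tol_m
    return dist / dt <= max_speed_mps
```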
Depending on the application scenario, the other means of acquiring feature information of a movable object, and the feature information acquired, differ. Taking an offline store as an example, acquiring the feature information of a movable object by other means includes at least one of the following: 1. acquiring a face image of the movable object (mainly a user who comes to the store to shop) captured by a POS machine in the offline store; 2. acquiring the face image in the electronic account corresponding to the movable object. The electronic account refers to an account the movable object has registered online, and the face image in the account refers to the face image bound to the account during registration.
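A sketch of merging with such externally acquired features, e.g. a POS face capture; `similarity` is an assumed embedding-comparison function supplied by the caller, not an API named by the patent:

```python
def match_track_by_external_face(track_features, external_face_feature,
                                 similarity, threshold=0.8):
    """Compare a face feature obtained by other means (a POS capture, or the
    face image bound to an online account) against the appearance feature of
    each candidate track; return the best-matching track id, if any."""
    best_id, best_score = None, threshold
    for track_id, feature in track_features.items():
        score = similarity(feature, external_face_feature)
        if score >= best_score:
            best_id, best_score = track_id, score
    return best_id
```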
Exemplary description of the "Association operations" embodiments:
depending on the application scenario, the target space and the movable object, the behavior of the movable object in the target space may differ, and accordingly, the at least one data object that is capable of reflecting the behavior of the movable object in the target space may also differ. The embodiment of associating the trajectory of the movable object with at least one data object may also vary from data object to data object. In the following embodiments, at least one data object comprises: the embodiment of the association operation is exemplified by time data of the movable object entering and exiting the target space, behavior detail data of the movable object in the target space, and/or map data of the target space.
In the following embodiments, a first movable object is taken as an example; the first movable object may be any movable object entering or exiting the target space.
In alternative c1, the trajectory of the first movable object may be associated with temporal data of the first movable object entering or exiting the target space.
In alternative c2, the trajectory of the first movable object may be associated with behavioral detail data of the first movable object within the target space.
In alternative c3, the trajectory of the first movable object may be associated with map data of the target space.
In the alternative c4, the trajectory of the first movable object may be associated with time data of the first movable object going in and out of the target space and behavior detail data of the first movable object within the target space.
In the alternative c5, the trajectory of the first movable object may be associated with time data of the first movable object entering or exiting the target space and map data of the target space.
In alternative c6, the trajectory of the first movable object may be associated with the behavioral detail data of the first movable object within the target space and the map data of the target space.
In the alternative c7, the trajectory of the first movable object may be associated with time data of the first movable object going in and out of the target space, behavior detail data of the first movable object within the target space, and map data of the target space.
In either alternative, one alternative embodiment of associating the trajectory of the first movable object with temporal data of the first movable object in and out of the target space includes: time axis information of a trajectory of the first movable object is generated based on time data of the first movable object entering and exiting the target space. That is, time data of the first movable object entering and exiting the target space is represented on the time axis of the trajectory of the first movable object. For example, the start time of the time axis may be set to the time when the first movable object enters the target space and the end time of the time axis may be set to the time when the first movable object leaves the target space.
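As a minimal sketch of this time-axis generation (the normalisation to [0, 1] and the function name are illustrative assumptions, not the patent's own scheme):

```python
def build_time_axis(enter_ts: float, exit_ts: float):
    """Return a mapping from an absolute timestamp to a position on the
    trajectory's time axis, where 0.0 is the moment the object entered the
    target space and 1.0 the moment it left."""
    span = exit_ts - enter_ts
    return lambda ts: (ts - enter_ts) / span

# e.g. axis = build_time_axis(1000.0, 1600.0); axis(1300.0) -> 0.5
```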
In either alternative, one optional embodiment of associating the trajectory of the first movable object with map data of the target space includes: adding a dynamic icon to the map data of the target space according to the trajectory of the first movable object, wherein the dynamic icon is linked with the first movable object. In other words, as the first movable object moves along the trajectory, the dynamic icon moves on the map, displaying the position of the first movable object in the target space in real time. The dynamic icon may be any movable image capable of identifying the first movable object; for example, it may be the avatar the first movable object registered online, or an icon bearing the first movable object's name or ID.
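Linking the dynamic icon to the moving object can be sketched as interpolating the object's map position at the current playback time from timestamped track samples; the (ts, x, y) sample format is an assumption for illustration:

```python
def icon_position(samples, t):
    """Interpolate the object's map position at playback time t from
    (ts, x, y) samples, so the dynamic icon follows the object in real time."""
    samples = sorted(samples)                 # sort by timestamp
    if t <= samples[0][0]:
        return samples[0][1:]
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            a = (t - t0) / (t1 - t0)          # linear interpolation weight
            return (x0 + a * (x1 - x0), y0 + a * (y1 - y0))
    return samples[-1][1:]
```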
In either alternative, one alternative embodiment of associating the trajectory of the first movable object with the behavioral detail data of the first movable object within the target space includes: determining a behavior object, a behavior position and/or a behavior time of a first movable object when the first movable object performs a first behavior according to behavior detail data of the first movable object in a target space; and establishing a corresponding relation between the behavior object, the behavior position and/or the behavior time of the first behavior and the track segment of the first behavior in the track of the first movable object. In the embodiments of the present application, "and/or" means at least one of the objects connected by "and/or".
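Establishing this correspondence can be sketched as finding the track segments whose time range covers the behavior time; the dictionary field names are illustrative assumptions:

```python
def attach_behavior(segments, behavior):
    """Link a behavior record (object, position, time) to every track
    segment whose time range covers the behavior time."""
    return [
        {"segment_id": seg["id"],
         "behavior_object": behavior.get("object"),
         "behavior_position": behavior.get("position"),
         "behavior_time": behavior["time"]}
        for seg in segments
        if seg["start_ts"] <= behavior["time"] <= seg["end_ts"]
    ]
```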
In the embodiment of the present application, the behavior detail data of the first movable object in the target space refers to detail data of a first behavior of the first movable object in the target space, and may reflect or embody information such as a behavior object, a behavior position, and/or a behavior time involved in the first behavior of the first movable object in the target space. The first behavior may be defined differently depending on the application scenario, the target space, and the first movable object. The following examples illustrate:
for example, taking the target space as an offline store and the first movable object as a user who enters the store for shopping, the first behavior mainly refers to the user purchasing items in the store; accordingly, the behavior detail data of the first movable object in the target space is the order information formed by the user purchasing items from the offline store. The order information generally includes: the names, prices, and quantities of the items the user selected, the payment settlement time, the POS machine used for payment settlement, and the address and name of the offline store. The item name can directly represent the item the user selected (i.e., the behavior object) and, since the placement of items in the store is known, can indirectly represent the position where the user made the selection (i.e., the behavior position). The payment settlement time can, to a certain extent, reflect the time at which the user purchased the items (i.e., the behavior time).
For example, if the target space is a public place such as a train station or an airport and the first movable object is a traveler entering the public place, the first behavior may be the traveler passing through a gate in preparation for boarding, checking in baggage, having tickets checked, and so on, and may be flexibly defined according to application requirements. Accordingly, the behavior detail data of the first movable object in the target space is the traveler's ticket purchase information, which generally embodies information such as the train number or flight number, the departure or takeoff time, and the waiting room or boarding gate. This information can, to a certain extent, reflect the traveler's gate passage (gate position, passage time, etc.), baggage check-in (window position and approximate time range), or ticket-check time and position.
Optionally, the behavior object and/or the behavior position may be highlighted in the track segment where the first behavior occurs, and the behavior time may be marked on the time axis of the track. The manner of highlighting the behavior object and/or behavior position is not limited: for example, the behavior object and/or behavior position may be framed in the track; an indication icon (such as an arrow or a small hand) may be added to the track, pointing at the behavior object and/or behavior position; or image or video processing techniques may be employed to highlight them; and so on. The manner of marking the behavior time on the time axis is likewise not limited: for example, the behavior time of the first behavior may be framed on the time axis; the time-axis position corresponding to the behavior time may be thickened, highlighted, or shown in a different color; or an indication icon (such as an arrow or a small hand) may be added, pointing at the time-axis position corresponding to the behavior time; and so on.
In the embodiment of the present application, the display terminal 13 at least needs to execute an information acquisition operation and an associated display operation. The information acquisition operation is short for "responding to the query operation and obtaining the target track and at least one data object associated with it from the server device 12"; the associated display operation is short for "displaying at least one data object in association during the target track display process". In the embodiment of the present application, the detailed implementations of the information acquisition operation and the associated display operation are not limited, and any implementation that can achieve the above objectives is applicable. Detailed embodiments of each are exemplified below.
Exemplary description of the embodiments regarding "information acquisition operations":
in the alternative d1, the display terminal 13 has an electronic screen, and the electronic screen can present a first query interface to the video processing person, where the first query interface is an interactive interface between the video processing person and the display terminal 13, and through the first query interface, the video processing person can initiate a first query operation to the display terminal 13.
Optionally, the first query interface includes a number of information items that can lock onto the target movable object or the target track. For example, such information items include, but are not limited to: an information item describing a trajectory, an information item describing a movable object, an information item describing behavior detail data, or an information item describing the time of entering and exiting the target space. Taking video investigation in an offline retail scene as an example, fig. 3a is a schematic diagram of a first query interface containing such information items, with a "query" control to the right of the information items through which the first query operation can be issued. It should be noted that, in the interface shown in fig. 3a, these information items are obtained by performing an order query through an order query interface. Further, fig. 3b shows another implementation form of the first query interface. In fig. 3b, the first query interface includes information items such as access time, order number, and user image; after the user fills in one or more of these items, clicking the "query" control on the interface issues the first query operation.
The display terminal 13 may send a first data request to the server device 12 in response to the first query operation, so as to request the target track. The first data request carries first identification information pointing to the target track, so that the server device 12 can determine which track needs to be returned. The display terminal 13 may then receive the target track returned by the server device 12 and at least one data object associated with the target track. The first identification information may be information entered in the information items of the first query interface, for example an identification of the target movable object, an identification of the behavior detail data of the target movable object, or time data of entering and exiting the target space.
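A sketch of assembling such a first data request; the JSON field names and the idea of carrying whichever identifier was supplied are assumptions for illustration, not a protocol defined by the patent:

```python
import json

def build_first_data_request(object_id=None, behavior_detail_id=None,
                             entry_exit_time=None):
    """Carry whichever first identification information the operator
    supplied; the server device uses it to determine which track to return."""
    identification = {k: v for k, v in {
        "movable_object_id": object_id,            # e.g. a user id
        "behavior_detail_id": behavior_detail_id,  # e.g. an order number
        "entry_exit_time": entry_exit_time,        # time data of entering/exiting
    }.items() if v is not None}
    return json.dumps({"type": "first_data_request",
                       "identification": identification})
```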
In alternative d2, the display terminal 13 has an audio component that supports voice recognition. Based on this, the video processing person may use the audio component (e.g., a microphone) of the display terminal 13 to initiate the first query operation by voice. The display terminal 13 may send a first data request to the server device 12 in response to the first query operation, so as to request the target track. The first data request carries first identification information pointing to the target track, so that the server device 12 can determine which track needs to be returned; the display terminal 13 may then receive the target track returned by the server device 12 and at least one data object associated with the target track. The first identification information may be provided by the video processing person through voice, and may be, for example, an identification of the target movable object, an identification of the behavior detail data of the target movable object, or time data of entering and exiting the target space.
Exemplary description of embodiments related to "associated display operations":
depending on the application scenario, the target space and the movable object, the behavior of the movable object in the target space may differ, and accordingly, the at least one data object that is capable of reflecting the behavior of the movable object in the target space may also differ. Different data objects are different, and the implementation of the associated display of at least one data object in the target track display process is also different. In the following embodiments, at least one data object comprises: the embodiment of the related display operation is exemplified by time data of the entrance and exit of the target movable object into and out of the target space, behavior detail data of the target movable object in the target space, and/or map data of the target space.
In the alternative e1, during the display of the target trajectory, time data of the target movable object entering and exiting the target space is displayed in association.
In the alternative e2, during the display of the target trajectory, behavior detail data of the target movable object in the target space is displayed in association.
In the alternative e3, during the display of the target trajectory, the map data of the target space is displayed in association.
In the alternative e4, during the display of the target trajectory, time data of the movement of the target movable object into and out of the target space and behavior detail data of the target movable object within the target space are displayed in association.
In the alternative e5, in the display of the target trajectory, time data of entering or exiting the target space of the target movable object and map data of the target space are displayed in association.
In the alternative e6, during the display of the target trajectory, the behavior detail data of the target movable object within the target space and the map data of the target space are displayed in association.
In the alternative e7, during the display of the target trajectory, time data of the entrance and exit of the target movable object into and out of the target space, behavior detail data of the target movable object within the target space, and map data of the target space are displayed in association.
In the embodiment of the present application, the style of the display interface of the display terminal 13 is not limited; any interface style capable of displaying at least one data object in association during the target track display process is applicable. As shown in figs. 4a to 4d, one style of display interface includes a track display area and an information display area; the track display area is used for displaying the target track, and the information display area is used for displaying some or all of the data objects associated with the target track. In figs. 4a to 4d, the display interface style is illustrated by taking video review of an offline retail scene as an example, but the interface style is not limited thereto.
In any of the above alternatives, one optional embodiment of displaying, in association during the display of the target track, time data of the target movable object entering and exiting the target space includes: displaying, on a time axis of the target track, the time data of the target movable object entering and exiting the target space. As shown in figs. 4a-4d, the start time of the time axis is the store-in time, and the end time of the time axis is the store-out time.
In any of the above alternatives, one optional embodiment of displaying the map data of the target space in association during the display of the target track includes: displaying the map data of the target space during the display of the target track; and displaying a dynamic icon in the map data, the dynamic icon being linked with the target movable object in the target track. As shown in figs. 4a-4d, the dynamic icon is the avatar of the user in the target track, and the position of the avatar on the map represents the position of the user in the offline store.
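By way of illustration only, the linkage between the dynamic icon and the target movable object can be realized by interpolating a time-stamped position sequence recorded when the track was merged. The following is a minimal Python sketch of this idea; TrackPoint, icon_position_at and the sample trail are hypothetical names invented for the example and are not part of the embodiments of the present application.

    from bisect import bisect_right
    from dataclasses import dataclass

    @dataclass
    class TrackPoint:
        t: float        # playback time in seconds, relative to the store-in time
        x: float        # map x coordinate of the target movable object
        y: float        # map y coordinate of the target movable object

    def icon_position_at(points, t):
        """Linearly interpolate the dynamic icon's map position at playback time t."""
        times = [p.t for p in points]
        i = bisect_right(times, t)
        if i == 0:
            return points[0].x, points[0].y
        if i == len(points):
            return points[-1].x, points[-1].y
        a, b = points[i - 1], points[i]
        r = (t - a.t) / (b.t - a.t)
        return a.x + r * (b.x - a.x), a.y + r * (b.y - a.y)

    # Example: the icon moves along an aisle as the track video plays.
    trail = [TrackPoint(0, 1.0, 1.0), TrackPoint(10, 4.0, 1.0), TrackPoint(20, 4.0, 6.0)]
    print(icon_position_at(trail, 15))   # -> (4.0, 3.5)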
Further, the map data of the target space may be displayed in an information display area other than the track display area, as shown in figs. 4a-4c. Alternatively, the map data of the target space may be displayed in a floating layer above the track display area, as shown in fig. 4d.
Further, the behavior detail data of the target movable object in the target space refers to detail data of a first behavior of the target movable object in the target space, and may reflect or embody information such as a behavior object, a behavior position, and/or behavior time involved in the first behavior of the target movable object in the target space. The first behavior may also be defined differently according to the application scenario, the target space, and the target movable object, which is not limited herein.
In the case where the behavior detail data of the target movable object in the target space comprises a behavior object, a behavior position and/or a behavior time of a first behavior of the target movable object in the target space, displaying the behavior detail data in association during the display of the target track includes at least one of the following operations:
operation A1: on a time axis of the target track, a behavior time at which the target movable object has a first behavior within the target space is displayed.
Operation A2: displaying a behavior object in which a first behavior of the target movable object occurs in the target space in an information display area outside the target trajectory display area;
operation A3: a behavior position where a first behavior of the target movable object occurs within the target space is marked in the map data.
With respect to operation A1: marking the behavior time of the first behavior on the time axis of the target track, namely marking the position corresponding to the behavior time on the time axis. As shown in figs. 4a-4d, the position of the icon "x" on the time axis is the time at which the first behavior occurred. Further optionally, the marked positions on the time axis are interactive: by initiating a trigger operation on a marked position, the relevant processing personnel can quickly locate the track segment in which the target movable object performed the first behavior. Trigger operations here include, but are not limited to: click, double click, touch, mouse hover, long press, etc.
With respect to operation A2: displaying, in an information display area outside the track display area, the behavior object of the first behavior performed by the target movable object in the target space. As shown in figs. 4a-4d, the items in the order details on the right are the behavior objects involved when the first behavior is a shopping behavior. Further optionally, the behavior objects related to the first behavior may be presented in the form of an object list. Furthermore, the behavior objects are interactive: by initiating a trigger operation on a behavior object displayed in the information display area, the relevant processing personnel can quickly locate the track segment in which that behavior object appears. Trigger operations here include, but are not limited to: click, double click, touch, mouse hover, long press, etc.
With respect to operation A3: as the target track is displayed, when a track segment containing a behavior object is displayed, the behavior position at which the first behavior in that segment occurred is marked in the map data. As shown in figs. 4a-4d, the position of the user avatar in the map is the position of the user when purchasing an item, that is, the position at which the purchasing behavior occurred, and this position changes dynamically as the target track plays.
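To make operations A1 to A3 concrete, the following minimal Python sketch shows one way a behavior record carrying a behavior object, behavior time and behavior position can be resolved to the track segment that a triggered time-axis marker should jump to. BehaviorEvent, Segment and seek_for_marker are hypothetical names invented for the example; the embodiments of the present application do not prescribe this data model.

    from dataclasses import dataclass

    @dataclass
    class BehaviorEvent:
        item: str      # behavior object, e.g. the picked item
        t: float       # behavior time on the track's time axis, in seconds
        pos: tuple     # behavior position (x, y) in the map data

    @dataclass
    class Segment:
        start: float       # segment start time on the time axis
        end: float         # segment end time on the time axis
        camera_id: str     # camera that shot this segment

    def seek_for_marker(event, segments):
        """Resolve a triggered time-axis marker (operation A1) to the segment to jump to."""
        for seg in segments:
            if seg.start <= event.t < seg.end:
                return seg
        return None

    events = [BehaviorEvent("apple", 12.0, (4.0, 3.5)), BehaviorEvent("tomato", 31.0, (7.0, 2.0))]
    segments = [Segment(0.0, 20.0, "cam-03"), Segment(20.0, 45.0, "cam-07")]
    print(seek_for_marker(events[1], segments).camera_id)   # -> cam-07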
Furthermore, during the process of displaying at least one data object in an associated manner, at least one of the following operations may be included:
operation B1: when the track segment containing the behavior object is displayed, highlighting the behavior object in the information display area. As shown in fig. 4b, when the video clip (i.e., track segment) of the user picking the apple is played, the apple information in the right information display area is highlighted. In operation B1, the manner of highlighting is not limited; the behavior object may be highlighted by frame selection, by highlight, by animation, or the like.
Operation B2: when the track segment containing the behavior object is displayed, highlighting, on the time axis, the behavior time at which the first behavior occurred for that behavior object. As shown in fig. 4b, when the video clip (i.e., track segment) of the user picking the apple is played, the corresponding time position on the time axis is indicated by a small hand icon. In operation B2, the manner of highlighting is not limited; the time position on the time axis may be highlighted by frame selection, by highlight, or by an icon (e.g., an arrow or a small hand icon).
Operation B3: jumping from the current display position to the track segment containing the behavior object, in response to a trigger operation on the behavior object in the information display area. As shown in fig. 4c, in response to a click operation on the tomato information displayed in the right information display area, the display jumps from the currently playing video segment (i.e., track segment) of the user picking the apple to the video segment of the user picking the tomato; fig. 4c shows the video segment of the user picking the tomato.
Operation B4: and responding to the triggering operation of any behavior time on the time axis, and jumping to the track segment corresponding to the triggered behavior time from the current display position.
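Operations B1 to B4 amount to a two-way linkage between the playback position and the behavior objects shown in the information display area. The minimal Python sketch below illustrates both directions; the Event model, the two-second highlight window and all names are assumptions made for the example only.

    from collections import namedtuple

    # One first-behavior record: behavior object and behavior time (hypothetical model).
    Event = namedtuple("Event", "item t")

    def active_event(events, t, window=2.0):
        """Operations B1/B2: find the event to highlight while playback is near its time.
        'window' is an invented tolerance around the behavior time, in seconds."""
        for e in events:
            if abs(e.t - t) <= window:
                return e
        return None

    def jump_to_item(events, item):
        """Operation B3: map a click on an item in the information area to a seek time."""
        for e in events:
            if e.item == item:
                return e.t
        return None

    events = [Event("apple", 12.0), Event("tomato", 31.0)]
    print(active_event(events, 12.5).item)   # -> apple (highlight the apple entry)
    print(jump_to_item(events, "tomato"))    # -> 31.0 (seek the player to this time)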
In an alternative embodiment, the map data further includes icons for the plurality of cameras in the target space, as shown in figs. 4a-4d. Based on this, during the display of the target track, the camera shooting the target movable object can be dynamically marked in the map data as the target movable object moves. The camera encircled by the dashed circle in fig. 4c and fig. 4d is the camera that captured the target movable object in the current video segment.
In order to facilitate a clearer understanding of the technical solutions provided in the embodiments of the present application, in the following embodiments, video investigation in an offline retail scene is taken as an example, and the technical solutions of the present application are described in detail.
As shown in fig. 2, in the offline retail scene, the target space is an offline store 20. A plurality of cameras 21 are disposed in the offline store 20; the offline store 20 contains items 24 to be sold, the items 24 are placed on shelves 23, passageways are formed between the shelves 23, and a user can move along the passageways to browse, select and purchase the items 24 on the shelves 23. Further, as shown in fig. 2, a POS machine 27 is provided near the exit of the offline store 20 to provide a self-service payment service to the user.
The cameras 21 capture video data in the offline store, and the video data is uploaded to the server device 25 corresponding to the offline store 20, as shown in fig. 2. The server device 25 may be a conventional server, a server array, or a cloud server. In fig. 2, the server device 25 is illustrated by taking a cloud server as an example, but the invention is not limited thereto. The server device 25 stores video data captured by the plurality of cameras 21.
In the offline retail scene, problems such as missed payment and unpaid payment caused by self-service payment are common. For such inventory loss problems, loss investigation can be carried out by backtracking the videos shot by the plurality of cameras 21. However, the amount of video data shot by the plurality of cameras 21 is large and the video duration is long, so backtracking all of the video data is costly and inefficient.
In this embodiment, on one hand, the server device 25 merges the video data captured by the plurality of cameras 21, taking each user entering the offline store 20 as a unit, to obtain a track video of each user; on the other hand, the track video of each user is associated with the times at which the user entered and exited the offline store 20, the order data generated in the offline store 20, and the map of the offline store 20, so as to provide a data basis for subsequent video investigation. With this data basis, when video investigation is needed, the investigation personnel can screen out suspicious orders, suspicious users or suspicious time periods, and then screen out the track videos to be investigated according to that suspicious information, so that the scope of video investigation is narrowed, efficiency is improved, and cost is saved.
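As an illustration of this data basis, each user's merged track video can be stored together with its associated data objects, so that investigation becomes a simple filter over such records. The sketch below is a hypothetical Python model; UserTrack, tracks_to_investigate and the sample URIs are invented for the example.

    from dataclasses import dataclass, field

    @dataclass
    class UserTrack:
        """One user's merged track video and the data objects associated with it."""
        user_id: str
        video_uri: str                                   # merged track video of this user
        store_in: float                                  # store-in timestamp
        store_out: float                                 # store-out timestamp
        order_ids: list = field(default_factory=list)    # orders generated in the offline store

    def tracks_to_investigate(tracks, suspicious_orders=None, period=None):
        """Narrow the scope to tracks touching a suspicious order or time period."""
        hits = []
        for tr in tracks:
            if suspicious_orders and set(tr.order_ids) & set(suspicious_orders):
                hits.append(tr)
            elif period and not (tr.store_out < period[0] or tr.store_in > period[1]):
                hits.append(tr)
        return hits

    tracks = [UserTrack("u1", "s3://tracks/u1.mp4", 100, 400, ["o-17"]),
              UserTrack("u2", "s3://tracks/u2.mp4", 500, 900, ["o-18"])]
    print([t.user_id for t in tracks_to_investigate(tracks, suspicious_orders=["o-18"])])  # -> ['u2']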
Optionally, the investigation personnel may set the conditions that the track video to be investigated needs to meet through the first query interface shown in fig. 3a or fig. 3b, and report the conditions to the server device 25. Taking the first query interface shown in fig. 3a as an example: first, in the order query interface, a suspicious user or a suspicious order can be quickly searched and located through examination conditions such as a face/body photo, an order number or an account number, so as to reach the first query interface shown in fig. 3a. On this first query interface, the information items related to the track video to be examined, such as a face/whole-body image, store-in time, store-out time and order information, are displayed; these information items can uniquely identify one track video, and a query request can be initiated to the server device 25 by clicking the "query" control next to the corresponding track video. The server device 25 screens out the track video of the suspicious user or suspicious order according to the above information items, and returns the track video, the times at which the suspicious user entered and exited the offline store 20, the order data of the suspicious user, and the map of the offline store 20 to the terminal device 26 of the investigation personnel.
The terminal device 26 receives the track video of the suspicious user or suspicious order returned by the server device 25, the times at which the suspicious user entered and exited the offline store 20, the order data of the suspicious user, and the map of the offline store 20; then, as shown in figs. 4a-4d, the track video is played in the track display area, that is, the video pictures of the suspicious user (or the user corresponding to the suspicious order) in the offline store 20 are displayed, so that the investigation personnel can check whether the suspicious user in the track video exhibits behaviors such as missed payment or unpaid orders.
In the present embodiment, it is assumed that the suspicious user performs at least one shopping behavior in the offline store 20, and each act of picking items is referred to simply as one pick. Further, as shown in figs. 4a-4d, the times at which the suspicious user entered and exited the offline store 20 are marked on the time axis of the track video, as are the time points at which the suspicious user picked goods in the offline store 20 (referred to as pick points for short); the positions corresponding to the icon "+" on the time axis in figs. 4a-4d are the pick points. By clicking, touching or double-clicking a pick point on the time axis, the investigation personnel can quickly position the video picture to the video clip corresponding to that pick point, in which the picked item is marked; this makes it convenient to check a particular picking behavior quickly and in a targeted manner, and can improve investigation efficiency. Correspondingly, when the video clip of a certain pick is played, the corresponding pick point can be concomitantly highlighted.
Further, as shown in figs. 4a-4d, the order information of the suspicious user in the offline store 20 may also be displayed in the information display area on the right; this mainly refers to the information on all the items the suspicious user purchased in the offline store 20. By clicking, touching or long-pressing the corresponding item information, the investigation personnel can quickly position the video picture to the video clip in which the suspicious user picked that item, in which the pick is marked; this makes it convenient to check a particular picking behavior quickly and in a targeted manner, and can improve investigation efficiency. Correspondingly, when the video clip of a certain pick is played, the corresponding item information on the right can be concomitantly highlighted.
Further, as shown in figs. 4a-4c, a map of the offline store 20 may be displayed in the right information display area; when the video clip of a certain pick is played, the pick position of the suspicious user can be concomitantly displayed on the map, and the corresponding pick point on the time axis can be highlighted. The investigation personnel can trigger a position or area on the map to quickly position the video picture to the video clip shot at that position or area, and can likewise trigger a camera on the map to quickly position the video picture to the video clip shot by that camera, so that video clips associated with a given position, area or camera can be checked quickly and in a targeted manner, improving investigation efficiency.
Further, as shown in fig. 4d, the map of the offline store 20 can be expanded and displayed as a semi-transparent floating layer above the video picture. Displaying the map expanded makes the information on it clearer and more convenient to view and operate. Of course, when the map does not need to be displayed expanded, it can be shrunk back to the information display area on the right through a trigger operation, for example by clicking the close control or the collapse control in the upper right corner.
In this embodiment, while the track video of the suspicious user in the offline store 20 is playing, the pick points are displayed concomitantly on the time axis, the pick position corresponding to each pick point is highlighted concomitantly on the map of the offline store 20, and the item information corresponding to the pick point is highlighted concomitantly in the pick list. In this way, the three dimensions of a pick, namely its time, its place and its item, can be displayed comprehensively and stereoscopically, and inventory loss can be checked rapidly and accurately.
Video completion:
in the above embodiments of the present application, the video data captured by the plurality of cameras is merged into tracks, taking the movable object as a unit, which can greatly simplify video processing operations, improve video processing efficiency and reduce cost. However, parts of a track may be missing due to factors such as the field environment in the target space or the video merging process. In order to ensure the completeness of the track and the effect of backtracking or investigation based on the track, in some embodiments of the application, before the target track is displayed, it can be judged whether a missing part exists in the target track; if a missing part exists, completion processing is performed on the target track. Completion here refers to a technique of filling in the missing part of the target track.
In the embodiment of the application, a video completion scheme is provided that can perform completion processing on the target track. Specifically, the display terminal 13 may, on the one hand, display a target track in which a missing part exists; on the other hand, it may determine, according to the trend of the target movable object before the missing part, candidate cameras in the target space that may have captured the missing part, and perform completion processing on the target track according to the video content shot by the candidate cameras that corresponds to the missing time period, where the missing time period is the time period corresponding to the missing part.
Further, on the basis of the video completion scheme, the embodiment of the application also provides a human-computer interaction video completion scheme for performing completion processing on the target track. Specifically, the display terminal 13 may, on the one hand, display a target track in which a missing part exists; on the other hand, in response to a completion operation initiated by a user for the missing part, it may determine, according to the trend of the target movable object before the missing part, candidate cameras in the target space that may have captured the missing part; play the video content shot by the candidate cameras that corresponds to the missing time period, so that the user can check whether it includes the target movable object; and, in response to a completion confirmation operation initiated by the user, complete the target track with the track segment containing the target movable object. The user initiating the completion operation on the target track may be a video processing person, or another person specially responsible for video completion operations, which is not limited herein.
In the embodiment of the present application, a manner in which the user initiates the completion operation is not limited. For example, the user may send a completion instruction to the display terminal in a voice manner to initiate a completion operation. For another example, a completion control may be set on an interface displaying the target trajectory, and the user may trigger the completion control, thereby initiating a completion operation on the target trajectory. For another example, before the target trajectory is displayed, the missing part in the target trajectory and the corresponding missing time period may be determined according to the time when the target movable object enters or exits the target space and the time when the target movable object appears in the target trajectory; in displaying the target track, the missing time period is marked on the time axis of the target track. The missing time periods on the time axis have an interactive function, and a user can trigger the missing time periods on the time axis so as to initiate completion operation. The display terminal 13 may determine that the user initiates a completion operation for the missing portion in the target trajectory in response to a user trigger operation for the missing time period on the time axis. The trigger operations in this paragraph include, but are not limited to: click, double click, touch, mouse hover, or long press.
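The missing time periods themselves can be derived by comparing the interval between the store-in and store-out times with the intervals in which the target movable object actually appears in the track. A minimal illustrative Python sketch follows; missing_periods and the min_gap threshold are hypothetical, not prescribed by the embodiments.

    def missing_periods(store_in, store_out, covered, min_gap=1.0):
        """Find gaps in the target track between the store-in and store-out times.
        'covered' lists (start, end) intervals in which the target movable object
        appears in the merged track; gaps shorter than min_gap seconds are ignored."""
        covered = sorted(covered)
        gaps, cursor = [], store_in
        for start, end in covered:
            if start - cursor >= min_gap:
                gaps.append((cursor, start))
            cursor = max(cursor, end)
        if store_out - cursor >= min_gap:
            gaps.append((cursor, store_out))
        return gaps

    # The user was in the store from t=0 to t=600, but the track covers only two intervals:
    print(missing_periods(0, 600, [(0, 250), (320, 600)]))   # -> [(250, 320)]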
In some optional embodiments, during the display of the target track, a map of the target space may be displayed in association; cameras in the target space and a dynamic icon are displayed on the map, and the dynamic icon is linked with the target movable object in the target track. In these embodiments, candidate cameras may be determined, in conjunction with the map, from the cameras displayed on it. One embodiment of determining candidate cameras includes: in response to a completion operation initiated by the user for the missing part, calculating the final trend of the target movable object before the missing part; adjusting, on the map, the orientation of the camera that last captured the target movable object before the missing part to be consistent with that final trend; and, in response to a selection operation initiated by the user on cameras within the orientation coverage, determining the cameras selected by the user as candidate cameras. The orientation of the camera that last captured the target movable object before the missing part is shown in fig. 5.
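By way of illustration, the final trend can be estimated from the last recorded positions before the missing part, and the cameras on the map can then be filtered by whether they lie within the orientation coverage. The Python sketch below assumes planar map coordinates; final_heading, cameras_in_heading and the coverage half-angle are invented for the example.

    import math

    def final_heading(last_points):
        """Heading (radians) of the target movable object just before the missing part,
        estimated from its last two recorded positions."""
        (x0, y0), (x1, y1) = last_points[-2], last_points[-1]
        return math.atan2(y1 - y0, x1 - x0)

    def cameras_in_heading(cameras, origin, heading, half_fov=math.pi / 4):
        """Cameras whose direction from the last known position falls within the coverage.
        'cameras' maps camera id -> (x, y); half_fov is a hypothetical half-angle."""
        ox, oy = origin
        hits = []
        for cam_id, (cx, cy) in cameras.items():
            angle = math.atan2(cy - oy, cx - ox)
            diff = abs((angle - heading + math.pi) % (2 * math.pi) - math.pi)
            if diff <= half_fov:
                hits.append(cam_id)
        return hits

    trail = [(4.0, 1.0), (4.0, 3.0)]                     # moving "up" on the map
    cams = {"cam-02": (4.0, 8.0), "cam-05": (9.0, 3.0)}
    h = final_heading(trail)
    print(cameras_in_heading(cams, trail[-1], h))        # -> ['cam-02']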
Further, a candidate camera is, with high probability, one of the cameras that did not capture the target movable object (according to the merging result). Based on this, before responding to the selection operation initiated by the user on cameras within the orientation coverage, the display terminal can mark on the map the cameras that did not capture the target movable object, so that the user can select among the cameras that are within the orientation coverage and did not capture the target movable object. Further alternatively, the display terminal may instead mark the cameras that did capture the target movable object, or mark the cameras that did and those that did not capture the target movable object separately at the same time. Whichever marking mode is adopted, it assists the user in identifying the cameras that did not capture the target movable object and makes it convenient for the user to select candidate cameras.
In the embodiment of the present application, the implementation of performing completion processing on the target track according to the video content shot by a candidate camera that corresponds to the missing time period is not limited. For example, the display terminal may directly match the features of the target movable object against the video data captured by the candidate camera; if a track segment containing the target movable object is found and its shooting time corresponds to the missing time period, that segment can be used to complete the target track. Conversely, if no track segment containing the target movable object is found, or the shooting time of such a segment does not correspond to the missing time period, the completion operation can be ended, or prompt information can be output so that the user can reselect a candidate camera for the next completion attempt.
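A minimal sketch of this completion step, with segments described by start/end times and the appearance matching abstracted behind a caller-supplied predicate; complete_track and matches_target are hypothetical names, and the real matching logic is left unspecified by design.

    def complete_track(track_segments, candidate_segments, gap, matches_target):
        """Splice a candidate camera's segment into the track if it overlaps the missing
        time period and contains the target movable object. 'matches_target' stands in
        for whatever appearance matching the implementation actually uses."""
        gap_start, gap_end = gap
        for seg in candidate_segments:
            overlaps = seg["start"] < gap_end and seg["end"] > gap_start
            if overlaps and matches_target(seg):
                return sorted(track_segments + [seg], key=lambda s: s["start"])
        return None   # no usable segment: end completion or let the user pick another camera

    track = [{"start": 0, "end": 250, "cam": "cam-03"}, {"start": 320, "end": 600, "cam": "cam-07"}]
    candidates = [{"start": 240, "end": 330, "cam": "cam-02"}]
    print(complete_track(track, candidates, (250, 320), lambda s: True))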
In addition to the above embodiments, in another optional embodiment, the display terminal may display the video content shot by the candidate camera that corresponds to a time period to be displayed, so that the user can check whether it includes the target movable object; the time period to be displayed at least covers the missing time period. If the user finds that the target movable object is included, a completion confirmation operation may be initiated; otherwise, a candidate camera can be reselected without sending the completion confirmation operation. If the user initiates a completion confirmation operation, the display terminal can, in response, complete the target track with the track segment containing the target movable object.
Further, when playing the video content shot by the candidate camera that corresponds to the time period to be displayed, specified video content shot by the candidate camera can be played first, with the missing time period marked on its time axis; in response to the user selecting time points forward and backward from the missing time period, the start time point and end time point of the time period to be displayed are determined; then, the video content shot by the candidate camera is played from the start time point to the end time point. Optionally, the specified video content may be all video content shot by the candidate camera, or the video content shot on the day of the missing time period.
In the embodiment of the application, completing the track allows the track of the target movable object in the target space to be displayed comprehensively and completely, provides a complete data basis for subsequent track-based investigation or other data analysis, and improves the accuracy of that investigation or analysis.
It should be noted that the above video completion process can also be implemented independently and does not necessarily depend on the content of the foregoing embodiments. In other words, in any application scenario where a track taking a movable object as a unit needs to be completed, the video completion scheme provided in the embodiment of the present application, or a variant of it, may be adopted; no limitation is placed on the manner in which such a track is obtained, nor on whether at least one data object needs to be further displayed in association after the video completion.
Video processing for the second behavior:
in some embodiments of the present application, after obtaining the track of each movable object, the server device 12 may further analyze the track of each movable object to determine whether a second behavior exists or may exist for that movable object; where a movable object is found to have, or possibly have, a second behavior, marking information related to the second behavior can be added to the track of that movable object, providing conditions for subsequent video processing based on the second behavior. The operation of associating the track of a movable object with at least one data object and the operation of adding marking information related to the second behavior to the track may be combined with each other or implemented independently of each other.
The second behavior is different from the first behavior, and differs according to the application scenario, the target space and the movable object. For example, taking the offline retail scenario, the first behavior is the behavior of a user purchasing goods in an offline store, and accordingly, the second behavior includes, but is not limited to: at least one of a missed-payment behavior, an unpaid behavior, an over-payment behavior, a small-amount order behavior, and a behavior of picking a specified item. The missed-payment behavior refers to a user taking multiple items in the offline store but paying for only some (not all) of them. The unpaid behavior refers to a user taking an item from the offline store but not paying at all. The small-amount order behavior refers to the payment amount being smaller than a set threshold after the user shops in the offline store, for example smaller than 0.4 yuan, or payment being initiated only for a specific small item (for example, a shopping bag). The behavior of picking a specified item refers to purchasing a specified item in the offline store; the specified items can be set flexibly according to application requirements, and may be, for example, fresh fish or fruit, or grain and oil such as rice and flour. The over-payment behavior refers to the payment amount being larger than the actual amount of the purchased goods after the user shops in the offline store.
The process by which the server device 12 analyzes, from the track of each movable object, whether the second behavior exists or may exist is the same or similar across movable objects. In the following embodiments, the first movable object is taken as an example; the first movable object may be any one of the movable objects entering and exiting the target space. That is, after obtaining the track of the first movable object, the server device 12 may further analyze, from that track, whether the first movable object has or may have a second behavior; if so, marking information related to the second behavior is added to the track of the first movable object.
Optionally, the server device 12 may input the track of the first movable object into a pre-trained recognition model to obtain a recognition result of whether the first movable object has or may have the second behavior. Alternatively, behavior features related to the second behavior may be provided to the server device 12 in advance; the server device 12 may compare the track of the first movable object against these features to obtain the recognition result. Alternatively, features or identifications (such as names) of movable objects in which the second behavior may occur may be provided to the server device 12 in advance; the server device 12 may compare the first movable object in the track against these features or identifications to obtain the recognition result.
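As one concrete instance of the comparison-based alternative, a missed-payment check can compare the items picked in the track (as recognized by upstream analysis) with the items in the order. The sketch below is a deliberately simplified rule, not the recognition model of the embodiments; detect_missed_payment is a hypothetical name, and counting items by name is a simplification.

    from collections import Counter

    def detect_missed_payment(picked_items, order_items):
        """Flag items picked in the track but absent from the order, which suggests
        a missed-payment or unpaid behavior. An empty result raises no suspicion."""
        missing = Counter(picked_items) - Counter(order_items)
        return dict(missing)

    picks = ["apple", "apple", "tomato"]   # items the track analysis saw the user pick
    order = ["apple", "tomato"]            # items actually paid for
    print(detect_missed_payment(picks, order))   # -> {'apple': 1}: one apple unpaid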
Further optionally, the second behavior also relates to information such as behavior object, behavior time, and behavior location. Based on this, adding mark information related to the second behavior in the track of the first movable object comprises at least one of the following operations:
operation C1: adding a highlight mark for the behavior object of the second behavior in the track of the first movable object;
operation C2: adding at least one of detail information, on-line sales information, off-line inventory and replenishment suggestion information of the behavior object of the second behavior in the track of the first movable object;
operation C3: adding a highlight mark for the action time of the second action in the track of the first movable object;
operation C4: adding a highlight mark for the behavior position of the second behavior in the track of the first movable object;
operation C5: adding a highlight mark to the first movable object in the track of the first movable object;
operation C6: in the trajectory of the first movable object, detail information of the first movable object is added.
Taking the offline retail scenario shown in fig. 2 as an example, as shown in fig. 6a, the user in the track is the first movable object, and details of the user, such as the store-in time, the store-out time, the user's online account number, the user's order number and the user's avatar, may be added to the track. Further, as shown in fig. 6a, detailed information of the picked item, which is the behavior object involved in the second behavior, and other information may also be added to the user's track. The detailed information of the picked item includes, but is not limited to: item name, item number, item status, etc.; the other information for the item includes, but is not limited to: on-line sales information (e.g., 1000 pieces), off-line sales information (e.g., 100 pieces), off-line inventory (e.g., 100 pieces), replenishment suggestion information (e.g., replenish 100 pieces), etc. Further, as shown in fig. 6a, a highly suspicious pick point, namely the behavior time at which the second behavior (e.g., picking a specified item, or a pick suspected of missed payment) occurred, is marked on the time axis of the track with a solid-circle icon.
Based on a track to which marking information related to the second behavior has been added, the video processing person can process the track segments of movable objects in which the second behavior exists. The display terminal 13 may, according to a second query operation, send a second data request to the server device 12 to request a target track segment marked with a target behavior; the second data request includes second identification information pointing to the target behavior, the target behavior belonging to the second behavior. The server device 12 may receive the second data request sent by the display terminal 13; acquire, according to the second identification information, the target track segment containing the target behavior from the track of the at least one movable object; and provide the target track segment, together with the marking information related to the target behavior in it, to the display terminal 13. The display terminal 13 receives the target track segment returned by the server device 12 and the marking information related to the target behavior in it, and displays that marking information in association during the display of the target track segment.
It should be noted that the target track segments containing the target behavior may be one or more, and in the case of multiple target track segments, the target track segments may be from tracks of different movable objects.
It should be noted that the video processing person may process only the track segment of a movable object in which the second behavior exists, or may also process the complete track of that movable object. In an alternative embodiment, the video processing person may first process the target track segment containing the target behavior, and then decide, according to the processing result, whether the complete track needs to be viewed and processed. Based on this, one implementation of sending the first data request to the server device in response to the first query operation includes: displaying, during the display of the target track segment, a control for viewing the complete track; and sending the first data request to the server device in response to a trigger operation on that control; the complete track to which the target track segment belongs is the target track of the foregoing embodiments.
In the embodiment of the present application, the implementation by which the display terminal 13 sends the second data request to the server device 12 is not limited; examples follow:
in the alternative f1, the display terminal has an electronic screen that can present a second query interface to the video processing person. The second query interface is an interactive interface between the video processing person and the display terminal, through which the video processing person can initiate the second query operation.
Optionally, the query interface includes a plurality of query-condition information items that can lock onto the target track segment, for example an identification information item of a track, an identification information item of a movable object, an identification information item of behavior detail data, or a time information item of entering and exiting the target space. Fig. 6b is a schematic diagram of a second query interface for video review in an offline retail scenario; the second query interface shown in fig. 6b includes information items such as face/whole-body images, item numbers, shelf numbers and cameras.
In fig. 6b, the inspector may manually input corresponding information in the information items on the second query interface, then click the query control on the second query interface, and send out a second query operation to the display terminal 13.
Further, as shown in fig. 6b, in the offline retail scenario, the second query interface includes map data of the offline store in addition to the query-condition information items. Based on this, the investigation personnel can directly initiate trigger operations on the shelf information, in-store positions and/or cameras in the map data; the display terminal can respond to these trigger operations by filling in the triggered shelf information, in-store positions and/or cameras as query conditions, and then send the second data request to the server device according to those query conditions, so that the server device returns the target track segment marked with the target behavior accordingly.
In the alternative f2, the display terminal 13 has an audio component supporting a voice recognition function. Based on this, the video processing person may call an audio component (e.g., a microphone) of the display terminal 13 to initiate the second query operation by voice. The display terminal 13 may send the second data request to the server device 12 in response to the second query operation, so as to request the target track segment marked with the target behavior.
In the embodiment of the present application, the implementation by which the display terminal 13 displays, in association, the marking information related to the target behavior in the target track segment is not limited. For example, during the display of the target track segment, the target movable object and the behavior object, behavior position and behavior time associated with the second behavior may be highlighted. The interface state with these elements highlighted during the display of the target track segment is shown in fig. 6a.
In addition to the video processing system, the embodiment of the application also provides a video processing method, a video display method, a video completion method and the like for various application scenes with multiple cameras. These methods are described in detail below by means of different examples.
Fig. 7a is a schematic flowchart of a video processing method according to an exemplary embodiment of the present application. As shown in fig. 7a, the method comprises:
71a, acquiring video data shot by a plurality of cameras in a target space, wherein the target space comprises at least one movable object.
And 72a, taking the movable object as a unit, merging the video data shot by the plurality of cameras to obtain the track of at least one movable object.
73a, associating the trajectory of the at least one movable object with at least one data object, respectively, the at least one data object being related to the behavior of the movable object within the target space.
In the present embodiment, the target space refers to a physical space with an accommodating property that allows movable objects to come in and go out. A movable object broadly refers to any object that can move, including objects that move autonomously, such as a user, a robot or an unmanned vehicle, and objects that do not move autonomously, such as human-driven vehicles. The specific implementation of the target space varies with the application scenario, and accordingly, so do the movable objects entering and exiting it.
In some application scenarios, the target space is an offline store such as a mall or supermarket, and accordingly, the movable object is a user who enters and exits the offline store, where the user includes a consumer and/or store staff, and may also include an autonomous shopping cart capable of moving freely within the offline store, a robot, or the like.
A plurality of cameras are mounted in the target space. The cameras are arranged at different positions in the target space, have different view field ranges and can shoot video data in the respective view field ranges. The plurality of cameras can upload the shot video data to an execution main body of the method, such as a server device. The server-side equipment merges the video data shot by the cameras by taking the movable object as a unit to obtain the track of at least one movable object, and further can associate the track of at least one movable object with at least one data object respectively, so that conditions are provided for video processing (such as investigation, clipping or beautification) by taking the movable object as a unit, and the operation of the video processing (such as investigation, clipping or beautification) is facilitated to be simplified.
In an optional embodiment, associating the trajectory of the at least one movable object with the at least one data object, respectively, comprises: for a first movable object, associating the track of the first movable object with time data of the first movable object entering and exiting the target space, behavior detail data of the first movable object in the target space and/or map data of the target space; the first movable object is any one of the at least one movable object.
Optionally, the associating the trajectory of the first movable object with the time data of the first movable object entering or exiting the target space includes: time axis information of a trajectory of the first movable object is generated based on time data of the first movable object entering and exiting the target space.
Optionally, the associating the trajectory of the first movable object with the map data of the target space includes: and adding a dynamic icon in the map data of the target space according to the track of the first movable object, wherein the dynamic icon is linked with the first movable object. For example, the dynamic icon may be an avatar of the first movable object registered online.
Optionally, the associating the trajectory of the first movable object with the behavior detail data of the first movable object in the target space includes: determining a behavior object, a behavior position and/or a behavior time of the first movable object when the first movable object performs the first behavior according to the behavior detail data of the first movable object in the target space; and establishing a corresponding relation between the behavior object, the behavior position and/or the behavior time of the first behavior and the track segment of the first behavior in the track of the first movable object.
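A minimal illustrative sketch of this correspondence step, with segments and behavior records reduced to plain tuples; index_behaviors and the tuple shapes are assumptions made for the example.

    def index_behaviors(segments, behaviors):
        """Map each first-behavior record to the track segment it falls in.
        'segments' are (start, end, segment_id) triples; 'behaviors' are
        (behavior_object, behavior_time, behavior_position) triples."""
        mapping = {}
        for obj, t, pos in behaviors:
            for start, end, seg_id in segments:
                if start <= t < end:
                    mapping[(obj, t)] = seg_id
                    break
        return mapping

    segs = [(0, 20, "seg-1"), (20, 45, "seg-2")]
    acts = [("apple", 12.0, (4.0, 3.5)), ("tomato", 31.0, (7.0, 2.0))]
    print(index_behaviors(segs, acts))   # -> {('apple', 12.0): 'seg-1', ('tomato', 31.0): 'seg-2'}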
In an optional embodiment, the method of this embodiment further includes: analyzing, according to the track of the first movable object, whether the first movable object has or may have the second behavior; if so, adding marking information related to the second behavior to the track of the first movable object.
Optionally, the adding, in the trajectory of the first movable object, the mark information related to the second behavior includes at least one of:
adding a highlight mark for the behavior object of the second behavior in the track of the first movable object;
adding at least one of detail information, on-line sales information, off-line inventory and replenishment suggestion information of the behavior object of the second behavior in the track of the first movable object;
adding a highlight mark for the action time of the second action in the track of the first movable object;
adding a highlight mark for the behavior position of the second behavior in the track of the first movable object;
adding a highlight mark to the first movable object in the track of the first movable object;
in the trajectory of the first movable object, detail information of the first movable object is added.
In an alternative embodiment, the target space is an offline store containing items for sale; the behavior detail data is order information formed by the first movable object purchasing items in the offline store; the first behavior is the behavior of purchasing items; accordingly, the second behavior comprises: at least one of a missed-payment behavior, an unpaid behavior, an over-payment behavior, a small-amount order behavior, and a behavior of picking a specified item.
In an optional embodiment, the method of this embodiment further includes: receiving a second data request sent by the display terminal, wherein the second data request comprises second identification information pointing to a target behavior, and the target behavior belongs to the second behavior; acquiring a target track segment containing a target behavior from the track of at least one movable object according to the second identification information; and providing the target track segment and the mark information related to the target behavior in the target track segment to a display terminal, so that the display terminal can perform associated display on the target track segment and the mark information related to the target behavior contained in the target track segment.
In an optional embodiment, the method of this embodiment further includes: receiving a first data request sent by a display terminal, wherein the first data request comprises first identification information pointing to a target track; acquiring a target track from the track of at least one movable object according to the first identification information; and providing the target track and the at least one data object associated with the target track to a display terminal, so that the display terminal can perform associated display on the target track and the at least one data object associated with the target track.
Optionally, the first identification information is an identification of the target movable object, or an identification of behavior detail data of the target movable object; the target movable object is a movable object corresponding to the target trajectory.
In an optional embodiment, the merging, taking the movable object as a unit, of the video data captured by the plurality of cameras to obtain the track of the at least one movable object includes: performing target tracking on the video data shot by each of the plurality of cameras to obtain the movable objects shot by each camera; and performing REID (re-identification) processing on the movable objects shot by the cameras to obtain the track of the at least one movable object.
Optionally, the performing target tracking on the video data captured by each of the plurality of cameras to obtain the movable object captured by each of the plurality of cameras includes: performing target tracking operation once every N frames of video data for each camera, and judging whether a movable object tracked by the last target tracking operation is tracked or not by combining the N frames of video data in each target tracking operation; wherein N is an integer of 2 or more.
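The every-N-frames tracking loop can be sketched as follows; detect and match_previous stand in for the detector and the matcher, which the embodiments do not prescribe, and the toy run is purely illustrative.

    def track_every_n(frames, detect, match_previous, n=5):
        """Run one target-tracking operation every n frames; each operation looks at
        the n buffered frames and checks whether the objects tracked by the previous
        operation are still being tracked."""
        tracked, buffer = [], []
        for frame in frames:
            buffer.append(frame)
            if len(buffer) == n:
                detections = [d for f in buffer for d in detect(f)]
                tracked = match_previous(tracked, detections)
                buffer = []
        return tracked

    # Toy run: the "detector" emits one object id per frame; the "matcher" keeps the union.
    ids = track_every_n(range(10),
                        detect=lambda f: [f % 3],
                        match_previous=lambda prev, dets: sorted(set(prev) | set(dets)))
    print(ids)   # -> [0, 1, 2]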
Optionally, the performing REID processing on the movable object captured by each of the plurality of cameras to obtain a track of at least one movable object includes:
and performing REID processing on the movable objects shot by the cameras, in combination with the overlapping fields of view among the cameras, the spatio-temporal information of the movable objects, and/or feature information of the movable objects acquired by other means, to obtain the track of the at least one movable object.
Further optionally, the feature information of the movable object is obtained by other means, including at least one of: acquiring a face image of a movable object shot by a POS machine in a target space; and acquiring a face image in the electronic account corresponding to the movable object.
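By way of illustration, REID processing that combines appearance with spatio-temporal information can be reduced to scoring each cross-camera candidate match and keeping the best feasible one. The sketch below uses an invented weighting and walking-speed bound; reid_score is a hypothetical name and the numbers are not from the embodiments.

    def reid_score(appearance_sim, dt, dist, max_speed=2.0, w=0.7):
        """Score a cross-camera match: combine appearance similarity with a
        spatio-temporal feasibility check (could the object cover 'dist' metres
        in 'dt' seconds at up to max_speed m/s?)."""
        dt = max(dt, 1e-6)                 # guard against zero time gaps
        if dist > max_speed * dt:          # spatio-temporally infeasible match
            return 0.0
        closeness = 1.0 - dist / (max_speed * dt)
        return w * appearance_sim + (1 - w) * closeness

    # A visually similar candidate seen 10 s later and 8 m away is plausible for a walking user:
    print(round(reid_score(0.9, dt=10.0, dist=8.0), 2))   # -> 0.81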
Alternatively, the movable object in the present embodiment may be a user, a robot, or an autonomous shopping cart within the target space, but is not limited thereto.
In this embodiment, for video data captured by a plurality of cameras in a target space, taking a movable object entering the target space as a unit, merging the video data to obtain a track of each movable object, and associating the track of each movable object with a data object related to the behavior of each movable object in the target space, so as to provide conditions for video processing in units of movable objects, which is beneficial to simplifying video processing operations and improving video processing efficiency.
Fig. 7b is a schematic flowchart of another video processing method according to an exemplary embodiment of the present application. As shown in fig. 7b, the method comprises:
and 71b, acquiring video data shot by a plurality of cameras in a target space, wherein the target space comprises at least one movable object.
And 72b, merging the video data shot by the cameras by taking the movable object as a unit to obtain the track of at least one movable object.
73b, analyzing, according to the track of each movable object, whether the second behavior exists or may exist for that movable object.
74b, adding marker information associated with the second behavior in the trajectory of the movable object in which the second behavior is or may be present.
For the description of steps 71b and 72b, reference may be made to the foregoing embodiments, which are not repeated herein.
In this embodiment, the manner of analyzing whether the second behavior exists or may exist is the same or similar for different movable objects. For convenience of description, the process is described in detail by taking the first movable object as an example, where the first movable object is any one of the at least one movable object.
Specifically, it is possible to analyze, from the track of the first movable object, whether the first movable object has or may have a second behavior; if so, marking information related to the second behavior is added to the track of the first movable object.
In an alternative embodiment, adding marker information associated with the second behavior to the trajectory of the first movable object includes at least one of:
adding a highlight mark for the behavior object of the second behavior in the track of the first movable object;
adding at least one of detail information, on-line sales information, off-line inventory and replenishment suggestion information of the behavior object of the second behavior in the track of the first movable object;
adding a highlight mark for the action time of the second action in the track of the first movable object;
adding a highlight mark for the behavior position of the second behavior in the track of the first movable object;
adding a highlight mark to the first movable object in the track of the first movable object;
in the trajectory of the first movable object, detail information of the first movable object is added.
In an alternative embodiment, the target space is an offline store containing items for sale; the second behavior comprises: at least one of a missed-payment behavior, an unpaid behavior, an over-payment behavior, a small-amount order behavior, and a behavior of picking a specified item. Accordingly, the movable object may be, but is not limited to, a user, a robot or an autonomous shopping cart within the target space.
In an optional embodiment, the method of this embodiment further includes: receiving a data request sent by a display terminal, the data request including identification information pointing to a target behavior, the target behavior belonging to the second behavior; acquiring, according to the identification information, a target track segment containing the target behavior from the track of the at least one movable object; and providing the target track segment, together with the marking information related to the target behavior in it, to the display terminal, so that the display terminal can display the target track segment and that marking information in association.
In this embodiment, for video data captured by a plurality of cameras in the target space, taking a movable object entering the target space as a unit, merging the video data to obtain a track of each movable object, identifying the movable object having or possibly having the second behavior, and adding mark information related to the second behavior to the track, thereby providing a condition for video processing initiated for the second behavior by taking the movable object as a unit, facilitating simplification of video processing operations, and improving video processing efficiency.
Fig. 8 is a flowchart illustrating a video display method according to an exemplary embodiment of the present application. As shown in fig. 8, the method includes:
81. responding to the first query operation, and sending a first data request to the server-side equipment to request a target track; the target trajectory is the trajectory of the target movable object within the target space.
82. And receiving a target track returned by the server-side equipment and at least one data object associated with the target track.
83. And in the target track display process, at least one data object is displayed in a correlated mode, and the at least one data object is related to the behavior of the target movable object in the target space.
In an optional embodiment, the first data request carries first identification information pointing to the target track, where the first identification information is an identification of a target movable object associated with the target track, an identification of behavior detail data of the target movable object, or time data of entering or exiting the target space.
In an alternative embodiment, the at least one data object comprises: time data of the target movable object going in and out of the target space, behavior detail data of the target movable object in the target space, and/or map data of the target space.
Optionally, in the target trajectory display process, the associating and displaying time data of the target movable object entering or exiting the target space includes: on a time axis of the target track, time data of the target movable object coming in and out of the target space is displayed.
Optionally, the behavior detail data includes: a behavior object, a behavior position and/or a behavior time of a first behavior of the target movable object within the target space. Based on this, displaying the behavior detail data of the target movable object in the target space in association during the display of the target track includes at least one of the following operations:
displaying behavior time of a first behavior of a target movable object in a target space on a time axis of a target track;
displaying a behavior object in which a first behavior of the target movable object occurs in the target space in an information display area outside the target trajectory display area;
a behavior position where a first behavior of the target movable object occurs within the target space is marked in the map data.
Further optionally, marking a behavior location in the map data where the first behavior of the target movable object occurs in the target space includes: when the track segment containing the behavior object is displayed, the behavior position where the first behavior occurs in the track segment is marked in the map data.
Optionally, in the target trajectory display process, the displaying of the map data of the target space in association includes: displaying map data of a target space in a target track display process; and displaying dynamic icons in the map data, the dynamic icons being in linkage with the target movable objects in the target trajectory.
Further optionally, in the target trajectory display process, displaying map data of the target space includes: displaying map data of the target space in an information display area outside the target trajectory display area; alternatively, the map data of the target space is displayed in a floating layer form above the target trajectory display area.
In an optional embodiment, the method of this embodiment further includes at least one of the following operations (see the sketch after this list):
highlighting the behavior object in the information display area when the track segment containing the behavior object is displayed;
when the track clip containing the behavior object is displayed, highlighting the behavior time of the first behavior aiming at the behavior object on a time axis;
responding to the triggering operation of the behavior object in the information display area, and jumping to a track segment containing the behavior object from the current display position;
and responding to the triggering operation of any behavior time on the time axis, and jumping to the track segment corresponding to the triggered behavior time from the current display position.
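By way of illustration, a minimal Python sketch of the jump operation in this list: given track segments laid out on the time axis, a triggered behavior time is resolved to the segment to which playback should jump. The tuple layout and function name are assumptions.

```python
def segment_for_time(segments, t_s):
    """Return the first track segment whose [start, end) span covers t_s;
    used to jump playback when a behavior time on the axis is triggered."""
    for start_s, end_s, clip_id in segments:
        if start_s <= t_s < end_s:
            return clip_id
    return None

segments = [(0.0, 120.0, "clip-a"), (120.0, 300.0, "clip-b")]
print(segment_for_time(segments, 150.0))  # "clip-b": the player seeks to this clip
```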
In an optional embodiment, the map data includes icons of a plurality of cameras in the target space. Based on this, the method of this embodiment further includes: in the process of displaying the target track, a camera shooting the target movable object is dynamically marked in the map data along with the movement of the target movable object.
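A minimal sketch, assuming cameras are represented only by map positions, of how the camera currently capturing the target movable object could be chosen for dynamic marking; a real implementation would also use camera orientation and field of view, and all names here are hypothetical.

```python
import math

def nearest_covering_camera(cameras, obj_xy, max_range=8.0):
    """Pick the camera icon to highlight on the map: the closest camera
    whose (assumed) range covers the object's current position."""
    best, best_d = None, max_range
    for cam_id, (cx, cy) in cameras.items():
        d = math.dist((cx, cy), obj_xy)
        if d < best_d:
            best, best_d = cam_id, d
    return best

cameras = {"cam-1": (0.0, 0.0), "cam-2": (10.0, 0.0)}
print(nearest_covering_camera(cameras, (8.5, 0.0)))  # "cam-2" gets marked
```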
In an optional embodiment, the method of this embodiment further includes: responding to the second query operation, and sending a second data request to the server-side equipment to request the target track segment marked with the target behavior; receiving a target track fragment returned by the server equipment and mark information related to a target behavior in the target track fragment; and in the display process of the target track segment, the mark information related to the target behavior in the target track segment is displayed in an associated manner.
On the basis of the optional embodiment, an implementation manner of sending the first data request to the server device in response to the first query operation includes: in the display process of the target track segment, displaying a control for viewing the complete track; responding to the triggering operation of the control, and sending a first data request to the server-side equipment; wherein, the complete track to which the target track segment belongs is regarded as the target track.
Optionally, in the offline retail scene, the target space is an offline store, the offline store includes shelves, and the shelves hold items for sale; the behavior detail data of the target movable object is the order information generated when the target movable object purchases items in the offline store; the first behavior is the behavior of purchasing an item; accordingly, the target behavior includes: at least one of a missed payment behavior, an unpaid payment behavior, a multi-payment behavior, a micropayment order behavior, and a purchase selection behavior of a specified item. Accordingly, the target movable object is a user entering and exiting the offline store, where the user includes a consumer and/or a supermarket staff member, or may also include an autonomous shopping cart capable of moving freely within the offline store, or a robot, etc.
In an optional embodiment, the sending a second data request to the server device in response to the second query operation to request a target track segment marked with a target behavior includes (a sketch of assembling the query follows these steps):
displaying a second query interface, wherein the second query interface comprises query condition information items and map data of offline stores;
responding to triggering operation of shelf information, positions in stores and/or cameras in the map data, and filling the triggered shelf information, positions in stores and/or cameras as query conditions;
and sending a second data request to the server-side equipment according to the query condition so that the server-side equipment returns the target track segment marked with the target behavior according to the query condition.
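For illustration only, the following Python sketch assembles the triggered shelf, in-store position, and/or camera into the query conditions of a second data request; the field names and request layout are hypothetical, not a defined wire format.

```python
def build_second_data_request(shelf=None, position=None, camera=None,
                              behavior="missed_payment"):
    """Assemble the query conditions triggered on the map into one request
    body; only fields the user actually triggered are included."""
    conditions = {k: v for k, v in
                  {"shelf": shelf, "position": position, "camera": camera}.items()
                  if v is not None}
    return {"behavior": behavior, "conditions": conditions}

# Tapping shelf "A3" and camera "cam-7" on the map fills the query form:
print(build_second_data_request(shelf="A3", camera="cam-7"))
```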
Further, before displaying the target trajectory, the method of this embodiment further includes: judging whether a target track has a missing part or not; and if the missing part exists, completing the target track.
Optionally, an embodiment of completing the target trajectory includes:
responding to a completion operation initiated by a user for the missing part, and determining candidate cameras in the target space that may have captured the missing part;
playing video contents shot by the candidate cameras so that a user can confirm whether the video contents contain a target movable object or not; and
and responding to a completion confirmation operation initiated by a user, and completing the target track by utilizing the track segment containing the target movable object in the video content.
In the embodiment, the track of the movable object is associated with the data object capable of reflecting the behavior of the movable object in the target space, and the video processing can be performed on the basis of the movable object by taking the track as a data base, so that not only can the track of the movable object be displayed, but also the data object related to the behavior of the movable object in the target space can be associated and displayed, the related information of the movable object in the target space can be stereoscopically displayed in multiple directions, the video processing efficiency can be improved, and the video processing cost can be reduced.
Fig. 9 is a flowchart illustrating a video completion method according to an exemplary embodiment of the present application. As shown in fig. 9, the method includes:
91. and displaying a target track, wherein the target track is the track of the target movable object in the target space, and the target track has a missing part.
92. And determining, according to the direction of travel of the target movable object before the missing part, candidate cameras in the target space that may have captured the missing part.
93. Performing completion processing on the target track according to the video content, captured by the candidate cameras, corresponding to the missing time period; the missing time period is the period corresponding to the missing part.
In the present embodiment, the target space refers to a physical space having an accommodating property that allows movable objects to come in and go out. A movable object broadly refers to any object that can move, including objects that can move autonomously, such as a user, a robot, or an unmanned vehicle, and objects that cannot move autonomously, such as vehicles driven by humans. The specific implementation of the target space varies with the application scenario, and accordingly, the movable objects entering and exiting the target space vary as well.
In some application scenarios, the target space is an offline store such as a mall or a supermarket, and accordingly, the movable object is a user who enters or exits the offline store, where the user includes a consumer and/or a supermarket staff member, or may also include an autonomous shopping cart capable of moving freely within the offline store, or a robot, etc.
In the present embodiment, the target movable object is any one of the movable objects entering and exiting the target space, and the trajectory of the target movable object within the target space is obtained in advance. This embodiment does not limit the manner of acquiring the trajectory of the target movable object; for example, but not limited to, the manner in the embodiment shown in fig. 7a or fig. 7b may be adopted.
In an optional embodiment, the method of this embodiment further includes: in the process of displaying the target track, a map of a target space is displayed in a correlated manner; a camera in the target space and a dynamic icon are displayed on the map, and the dynamic icon is linked with a target movable object in the target track.
Based on the foregoing, one embodiment of step 92 includes: responding to a completion operation initiated by a user for the missing part, and calculating the final direction of travel of the target movable object before the missing part; adjusting, on the map, the orientation of the camera that last captured the target movable object before the missing part to be consistent with that final direction; and responding to a selection operation initiated by the user among the cameras whose fields of view cover that direction, and determining the camera selected by the user as a candidate camera.
Further optionally, before responding to the selection operation initiated by the user among the cameras whose fields of view cover that direction, the method further includes: marking, on the map, the cameras that have not captured the target movable object, so that the user can select from the cameras that cover that direction but have not yet captured the target movable object.
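By way of illustration, a minimal Python sketch of this candidate-camera determination: the final direction of travel is computed from the last two track points, and cameras whose orientation roughly faces that direction are kept. The tolerance value and data layouts are assumptions.

```python
import math

def final_heading(points):
    """Direction of travel just before the gap, from the last two
    track points (x, y); returns an angle in degrees."""
    (x0, y0), (x1, y1) = points[-2], points[-1]
    return math.degrees(math.atan2(y1 - y0, x1 - x0))

def candidate_cameras(cameras, heading_deg, tolerance_deg=45.0):
    """Keep cameras whose orientation roughly faces the heading."""
    def diff(a, b):
        return abs((a - b + 180.0) % 360.0 - 180.0)  # smallest angular difference
    return [cid for cid, orient in cameras.items()
            if diff(orient, heading_deg) <= tolerance_deg]

track_tail = [(0.0, 0.0), (1.0, 1.0)]          # object was moving north-east
cams = {"cam-1": 45.0, "cam-2": 200.0}         # camera orientations in degrees
print(candidate_cameras(cams, final_heading(track_tail)))  # ["cam-1"]
```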
In an optional embodiment, before displaying the target trajectory, the method further includes: determining a missing part in the target track and a missing time period corresponding to the missing part according to the time of the target movable object entering and exiting the target space and the time of the target movable object appearing in the target track; and marking the missing time period on a time axis of the target track in the process of displaying the target track. Accordingly, responding to a completion operation initiated by a user for the missing part includes: and responding to the trigger operation of the user on the missing time period on the time axis, and determining that the user initiates a completion operation aiming at the missing part.
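A minimal sketch of how the missing time periods could be derived, assuming the stay interval and the observed track spans are given in seconds; the minimum-gap threshold and function name are assumptions.

```python
def missing_periods(enter_s, exit_s, observed, min_gap_s=1.0):
    """Subtract the observed [start, end] spans from the stay interval
    [enter_s, exit_s]; what remains are the missing time periods
    to be marked on the track's time axis."""
    gaps, cursor = [], enter_s
    for start, end in sorted(observed):
        if start - cursor >= min_gap_s:
            gaps.append((cursor, start))
        cursor = max(cursor, end)
    if exit_s - cursor >= min_gap_s:
        gaps.append((cursor, exit_s))
    return gaps

print(missing_periods(0.0, 600.0, [(0.0, 200.0), (260.0, 600.0)]))
# [(200.0, 260.0)] -> one gap between the two observed spans
```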
In an alternative embodiment, one implementation of step 93 includes: playing video content which is shot by the candidate camera and corresponds to a time period to be displayed so that a user can check whether a target movable object is included; responding to a completion confirmation operation initiated by a user, and completing the target track by using a track segment containing the target movable object; wherein the time period to be displayed at least comprises a missing time period.
Further optionally, playing the video content corresponding to the time period to be displayed and captured by the candidate camera includes: playing designated video content captured by the candidate camera, and marking the missing time period on the time axis of that video content; responding to the user's operation of selecting time points forwards and backwards from the missing time period, and determining the start time point and end time point of the time period to be displayed; and playing the video content captured by the candidate camera from the start time point until the end time point.
In this embodiment, for a target track with a missing part, candidate cameras that may have captured the missing part can be determined by combining the direction of travel of the movable object before the missing part; the target track is then completed based on the video content captured by the candidate cameras, which reduces the amount of video content to be screened and improves completion efficiency.
It should be noted that the execution subjects of the steps of the methods provided in the above embodiments may be the same device, or different devices may be used as the execution subjects of the methods. For example, the execution subjects of steps 91 to 93 may be device a; for another example, the execution subjects of steps 91 and 92 may be device a, and the execution subject of step 93 may be device B; and so on.
In addition, in some of the flows described in the above embodiments and the drawings, a plurality of operations are included in a specific order, but it should be clearly understood that the operations may be executed out of the order presented herein or in parallel; the sequence numbers of the operations, such as 91, 92, etc., are merely used to distinguish the operations and do not themselves represent any execution order. Additionally, the flows may include more or fewer operations, and the operations may be performed sequentially or in parallel. It should be noted that the descriptions of "first", "second", etc. in this document are used to distinguish different messages, devices, modules, etc., and represent neither a sequential order nor a requirement that the "first" and "second" items be of different types.
Fig. 10a is a schematic structural diagram of a server device according to an exemplary embodiment of the present application. As shown in fig. 10a, the server device includes: memory 103a, processor 101a, and communication component 102 a.
The memory 103a is used for storing computer programs and may be configured to store other various data to support operations on the server device. Examples of such data include instructions for any application or method operating on the server device, video data, pictures, messages, and so forth.
A processor 101a, coupled to the memory 103a, for executing the computer program in the memory 103a to: acquiring video data shot by a plurality of cameras in a target space, wherein the target space comprises at least one movable object; merging the video data shot by the cameras by taking the movable object as a unit to obtain the track of at least one movable object; the trajectory of the at least one movable object is associated with at least one data object, respectively, the at least one data object being related to the behavior of the movable object within the target space.
In an optional embodiment, when associating the trajectory of the at least one movable object with the at least one data object, the processor 101a is specifically configured to: for a first movable object, associating the track of the first movable object with time data of the first movable object entering and exiting the target space, behavior detail data of the first movable object in the target space and/or map data of the target space; the first movable object is any one of the at least one movable object.
Optionally, when the processor 101a associates the trajectory of the first movable object with the time data of the first movable object entering or exiting the target space, it is specifically configured to: time axis information of a trajectory of the first movable object is generated based on time data of the first movable object entering and exiting the target space.
Optionally, when associating the trajectory of the first movable object with the map data of the target space, the processor 101a is specifically configured to: and adding a dynamic icon in the map data of the target space according to the track of the first movable object, wherein the dynamic icon is linked with the first movable object. For example, the dynamic icon may be an avatar of the first movable object registered online.
Optionally, when the processor 101a associates the trajectory of the first movable object with the behavior detail data of the first movable object in the target space, it is specifically configured to: determining a behavior object, a behavior position and/or a behavior time of the first movable object when the first movable object performs the first behavior according to the behavior detail data of the first movable object in the target space; and establishing a corresponding relation between the behavior object, the behavior position and/or the behavior time of the first behavior and the track segment of the first behavior in the track of the first movable object.
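For illustration only, the following Python sketch establishes such a correspondence by time containment, linking each order line (behavior object plus behavior time) to the track segment whose span covers that time; the tuple layouts and function name are hypothetical.

```python
def link_order_lines_to_segments(order_lines, segments):
    """Attach each order line (item, behavior time in seconds) to the track
    segment whose span covers that time, so displaying the segment can
    surface the purchased item alongside the video."""
    links = []
    for item, t_s in order_lines:
        for start_s, end_s, seg_id in segments:
            if start_s <= t_s < end_s:
                links.append((item, seg_id))
                break
    return links

order_lines = [("milk", 140.0)]
segments = [(0.0, 120.0, "seg-1"), (120.0, 300.0, "seg-2")]
print(link_order_lines_to_segments(order_lines, segments))  # [("milk", "seg-2")]
```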
In an alternative embodiment, the processor 101a is further configured to: analyzing whether the first movable object has or is possible to have the second behavior according to the track of the first movable object; if so, adding mark information related to the second behavior in the track of the first movable object.
Optionally, the processor 101a is specifically configured to perform at least one of the following operations when adding the marking information related to the second behavior to the track of the first movable object:
adding a highlight mark for the behavior object of the second behavior in the track of the first movable object;
adding at least one of detail information, on-line sales information, off-line inventory and replenishment suggestion information of the behavior object of the second behavior in the track of the first movable object;
adding a highlight mark for the action time of the second action in the track of the first movable object;
adding a highlight mark for the behavior position of the second behavior in the track of the first movable object;
adding a highlight mark to the first movable object in the track of the first movable object;
in the trajectory of the first movable object, detail information of the first movable object is added.
In an alternative embodiment, the target space is an offline store, which contains items for sale; the behavior detail data is the order information generated when the first movable object purchases items in the offline store; the first behavior is the behavior of purchasing an item; accordingly, the second behavior comprises: at least one of a missed payment behavior, an unpaid payment behavior, a multi-payment behavior, a micropayment order behavior, and a purchase selection behavior of a specified item.
In an alternative embodiment, the processor 101a is further configured to: receiving a second data request sent by the display terminal through the communication component 102a, wherein the second data request comprises second identification information pointing to a target behavior, and the target behavior belongs to the second behavior; acquiring a target track segment containing a target behavior from the track of at least one movable object according to the second identification information; and providing the target track segment and the mark information related to the target behavior in the target track segment to a display terminal, so that the display terminal can perform associated display on the target track segment and the mark information related to the target behavior contained in the target track segment.
In an alternative embodiment, the processor 101a is further configured to: receiving a first data request sent by a display terminal through a communication component 102a, wherein the first data request comprises first identification information pointing to a target track; acquiring a target track from the track of at least one movable object according to the first identification information; and providing the target track and the at least one data object associated with the target track to a display terminal, so that the display terminal can perform associated display on the target track and the at least one data object associated with the target track.
Optionally, the first identification information is an identification of the target movable object, or an identification of behavior detail data of the target movable object; the target movable object is a movable object corresponding to the target trajectory.
In an optional embodiment, when obtaining the trajectory of the at least one movable object, the processor 101a is specifically configured to: performing target tracking on video data shot by each of the plurality of cameras to obtain movable objects shot by each of the plurality of cameras; and carrying out pedestrian re-identification REID processing on the movable objects shot by the cameras respectively to obtain the track of at least one movable object.
Optionally, when obtaining the movable object captured by each of the plurality of cameras, the processor 101a is specifically configured to: perform a target tracking operation once every N frames of video data for each camera, and in each target tracking operation, determine, based on the N frames of video data, whether the movable object tracked in the previous target tracking operation is still being tracked; wherein N is an integer of 2 or more.
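By way of illustration, a minimal Python sketch of the once-every-N-frames tracking schedule described above; the detector is a caller-supplied stand-in, not a real tracking algorithm.

```python
def track_every_n_frames(frames, n, detect):
    """Run the (caller-supplied) detector only on every n-th frame;
    intermediate frames reuse the last tracking result."""
    last, results = None, []
    for i, frame in enumerate(frames):
        if i % n == 0:
            last = detect(frame)   # full target-tracking operation
        results.append(last)       # frames in between inherit the last fix
    return results

print(track_every_n_frames(range(7), 3, lambda f: f"det@{f}"))
# ['det@0', 'det@0', 'det@0', 'det@3', 'det@3', 'det@3', 'det@6']
```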
Optionally, when the processor 101a performs REID processing on the movable objects captured by each of the plurality of cameras to obtain the track of at least one movable object, it is specifically configured to: perform REID processing on the movable objects captured by the cameras by combining the overlapping fields of view among the cameras, the spatio-temporal information of the movable objects, and/or feature information of the movable objects acquired by other means, to obtain the track of the at least one movable object.
Further optionally, the processor 101a is specifically configured to perform at least one of the following operations when obtaining the feature information of the movable object by other means: acquiring a face image of a movable object shot by a POS machine in a target space; and acquiring a face image in the electronic account corresponding to the movable object.
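As a minimal sketch of the REID merging described above, the following Python snippet gates a cross-camera tracklet merge on spatio-temporal plausibility before checking appearance similarity (e.g., similarity of face or body features); the speed and similarity thresholds and all names are made-up assumptions.

```python
import math

def can_merge(tracklet_a, tracklet_b, appearance_sim,
              max_speed_mps=2.5, sim_threshold=0.7):
    """Spatio-temporal gate plus appearance check for merging two tracklets
    from different cameras into one cross-camera track. tracklet_a is
    (end_time_s, end_xy); tracklet_b is (start_time_s, start_xy)."""
    t_a, xy_a = tracklet_a
    t_b, xy_b = tracklet_b
    dt = t_b - t_a
    if dt <= 0:                               # b must start after a ends
        return False
    if math.dist(xy_a, xy_b) / dt > max_speed_mps:
        return False                          # physically implausible move
    return appearance_sim >= sim_threshold    # e.g. face-feature similarity

print(can_merge((100.0, (0.0, 0.0)), (104.0, (6.0, 0.0)), appearance_sim=0.8))
# True: 6 m in 4 s = 1.5 m/s, and the features match well enough
```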
Alternatively, the movable object in the present embodiment may be a user, a robot, or an autonomous shopping cart within the target space, but is not limited thereto.
Further, as shown in fig. 10a, the server device further includes: power supply component 104a, and the like. Only some of the components are schematically shown in fig. 10a, and the server device is not meant to include only the components shown in fig. 10 a.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program, where the computer program can implement the steps in the method embodiment of fig. 7a or fig. 7b when executed.
Fig. 10b is a schematic structural diagram of a display terminal according to an exemplary embodiment of the present application. As shown in fig. 10b, the display terminal includes: memory 103b, processor 101b, communication component 102b, and display 104 b.
The memory 103b is used to store a computer program and may be configured to store other various data to support operations on the display terminal. Examples of such data include instructions for any application or method operating on the display terminal, video data, pictures, messages, contact information, and the like.
A processor 101b, coupled to the memory 103b, for executing the computer program in the memory 103b to: sending a first data request to the server device through the communication component 102b in response to the first query operation to request a target track; the target trajectory is a trajectory of a target movable object within a target space; receiving a target track returned by the server equipment and at least one data object associated with the target track; during the target trajectory display process, at least one data object is displayed in association with the display 104b, the at least one data object being related to the behavior of the target movable object in the target space.
In an optional embodiment, the first data request carries first identification information pointing to the target track, where the first identification information is an identification of a target movable object associated with the target track, an identification of behavior detail data of the target movable object, or time data of entering or exiting the target space.
In an alternative embodiment, the at least one data object comprises: time data of the target movable object going in and out of the target space, behavior detail data of the target movable object in the target space, and/or map data of the target space.
Optionally, when the processor 101b displays, in association via the display 104b, the time data of the target movable object entering or exiting the target space, it is specifically configured to: display, on the time axis of the target track, the time data of the target movable object entering and exiting the target space.
Optionally, the behavior detail data includes: the behavior object, behavior location, and/or behavior time at which the target movable object performs the first behavior within the target space. Based on this, the processor 101b is specifically configured to perform at least one of the following operations when displaying the behavior detail data of the target movable object in the target space in an associated manner:
displaying behavior time of a first behavior of a target movable object in a target space on a time axis of a target track;
displaying a behavior object in which a first behavior of the target movable object occurs in the target space in an information display area outside the target trajectory display area;
a behavior position where a first behavior of the target movable object occurs within the target space is marked in the map data.
Further optionally, when the action position of the target movable object in the target space where the first action occurs is marked in the map data, the processor 101b is specifically configured to: when the track segment containing the behavior object is displayed, the behavior position where the first behavior occurs in the track segment is marked in the map data.
Optionally, when the processor 101b displays the map data of the target space in association with the display 104b, the processor is specifically configured to: displaying map data of a target space in a target track display process; and displaying dynamic icons in the map data, the dynamic icons being in linkage with the target movable objects in the target trajectory.
Further optionally, when the processor 101b displays the map data of the target space, it is specifically configured to: displaying map data of the target space in an information display area outside the target trajectory display area; alternatively, the map data of the target space is displayed in a floating layer form above the target trajectory display area.
In an alternative embodiment, the processor 101b is further configured to perform at least one of the following operations:
highlighting the behavioral objects within the information display area when the track segment containing the behavioral objects is displayed through the display 104 b;
highlighting, on a time axis, a behavior time at which a first behavior occurs with respect to a behavior object when a track segment containing the behavior object is displayed through the display 104 b;
responding to the triggering operation of the behavior object in the information display area, and jumping to a track segment containing the behavior object from the current display position;
and responding to the triggering operation of any behavior time on the time axis, and jumping to the track segment corresponding to the triggered behavior time from the current display position.
In an optional embodiment, the map data includes icons of a plurality of cameras in the target space. Based on this, the processor 101b is further configured to: in the process of displaying the target track, a camera shooting the target movable object is dynamically marked in the map data along with the movement of the target movable object.
In an alternative embodiment, the processor 101b is further configured to: in response to the second query operation, sending a second data request to the server device through the communication component 102b to request a target track segment marked with a target behavior; receiving a target track fragment returned by the server equipment and mark information related to a target behavior in the target track fragment; during the display process of the target track segment, the mark information related to the target behavior in the target track segment is displayed in association with the display 104 b.
On the basis of the above optional embodiment, when the processor 101b sends the first data request to the server device through the communication component 102b, the processor is specifically configured to: in the display process of the target track segment, a control for viewing the complete track is displayed through the display 104 b; responding to the triggering operation of the control, and sending a first data request to the server-side equipment through the communication component 102 b; wherein, the complete track to which the target track segment belongs is regarded as the target track.
Optionally, in the offline retail scene, the target space is an offline store, the offline store includes shelves, and the shelves hold items for sale; the behavior detail data is the order information generated when the target movable object purchases items in the offline store; the first behavior is the behavior of purchasing an item; accordingly, the target behavior includes: at least one of a missed payment behavior, an unpaid payment behavior, a multi-payment behavior, a micropayment order behavior, and a purchase selection behavior of a specified item.
In an optional embodiment, when the processor 101b sends the second data request to the server device through the communication component 102b, the processor is specifically configured to: displaying a second query interface through the display 104b, the second query interface including query condition information items and map data of offline stores; responding to triggering operation of shelf information, positions in stores and/or cameras in the map data, and filling the triggered shelf information, positions in stores and/or cameras as query conditions; and sending a second data request to the server-side equipment according to the query condition so that the server-side equipment returns the target track segment marked with the target behavior according to the query condition.
Further, the processor 101b is further configured to: before displaying the target track, judging whether the target track has a missing part; and if the missing part exists, completing the target track.
Optionally, when completing the target track, the processor 101b is specifically configured to: respond to a completion operation initiated by a user for the missing part, and determine candidate cameras in the target space that may have captured the missing part; play the video content captured by the candidate cameras so that the user can confirm whether the video content contains the target movable object; and respond to a completion confirmation operation initiated by the user, and complete the target track using the track segment, in the video content, that contains the target movable object.
Further, as shown in fig. 10b, the display terminal further includes: power component 105b, audio component 106b, and the like. Only some of the components are schematically shown in fig. 10b, and it is not meant that the display terminal includes only the components shown in fig. 10 b.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program, where the computer program can implement the steps in the method embodiment of fig. 8 described above when executed.
Fig. 10c is a schematic structural diagram of another display terminal according to an exemplary embodiment of the present application. As shown in fig. 10c, the display terminal includes: memory 103c, processor 101c, communication component 102c, and display 104 c.
The memory 103c is used to store a computer program and may be configured to store other various data to support operations on the display terminal. Examples of such data include instructions for any application or method operating on the display terminal, video data, pictures, messages, contact information, and the like.
A processor 101c, coupled to the memory 103c, for executing the computer program in the memory 103c to: display a target track through the display 104c, the target track being the track of a target movable object within a target space, the target track having a missing part; determine, according to the direction of travel of the target movable object before the missing part, candidate cameras in the target space that may have captured the missing part; and perform completion processing on the target track according to the video content, captured by the candidate cameras, corresponding to the missing time period; the missing time period is the period corresponding to the missing part.
In an alternative embodiment, the processor 101c is further configured to: in displaying the target trajectory through the display 104c, a map of the target space is displayed in association; a camera in the target space and a dynamic icon are displayed on the map, and the dynamic icon is linked with a target movable object in the target track.
Optionally, when determining the candidate cameras in the target space that may have captured the missing part, the processor 101c is specifically configured to: respond to a completion operation initiated by a user for the missing part, and calculate the final direction of travel of the target movable object before the missing part; adjust, on the map, the orientation of the camera that last captured the target movable object before the missing part to be consistent with that final direction; and respond to a selection operation initiated by the user among the cameras whose fields of view cover that direction, and determine the camera selected by the user as a candidate camera.
Further optionally, the processor 101c is further configured to: mark, on the map, the cameras that have not captured the target movable object before responding to the selection operation initiated by the user, so that the user can select from the cameras that cover that direction but have not yet captured the target movable object.
In an alternative embodiment, the processor 101c is further configured to: determining a missing portion in the target trajectory and a corresponding missing time period thereof according to the time when the target movable object enters or exits the target space and the time when the target movable object appears in the target trajectory before the target trajectory is displayed through the display 104 c; and marking the missing time period on a time axis of the target track in the process of displaying the target track. Accordingly, when responding to the completion operation initiated by the user for the missing part, the processor 101c is specifically configured to: and responding to the trigger operation of the user on the missing time period on the time axis, and determining that the user initiates a completion operation aiming at the missing part.
In an optional embodiment, when performing the completion processing on the target trajectory, the processor 101c is specifically configured to: playing video content corresponding to a time period to be displayed, which is shot by the candidate camera, through the display 104c so that a user can check whether a target movable object is included; responding to a completion confirmation operation initiated by a user, and completing the target track by using a track segment containing the target movable object; wherein the time period to be displayed at least comprises a missing time period.
Further optionally, when the video content corresponding to the time period to be displayed and captured by the candidate camera is played through the display 104c, the processor 101c is specifically configured to: playing the designated video content shot by the candidate camera through the display 104c, and marking a missing time period on a time axis of the designated video content; responding to the operation of selecting time points forwards and backwards from the missing time period by a user, and determining a starting time point and an ending time point of the time period to be displayed; and playing the video contents shot by the candidate camera from the starting time point until the ending time point.
Further, as shown in fig. 10c, the display terminal further includes: power component 105c, audio component 106c, and the like. Only some of the components are schematically shown in fig. 10c, and it is not meant that the display terminal includes only the components shown in fig. 10 c.
Accordingly, the present application further provides a computer-readable storage medium storing a computer program, where the computer program can implement the steps in the method embodiment of fig. 9 when executed.
The memory in the above embodiments may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The communication component in the above embodiments is configured to facilitate communication between the device in which the communication component is located and other devices in a wired or wireless manner. The device in which the communication component is located may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component receives a broadcast signal or broadcast related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
The display in the above embodiments includes a screen, which may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation.
The power supply components in the embodiments described above provide power to the various components of the device in which the power supply components are located. The power components may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device in which the power component is located.
The audio component in the above embodiments may be configured to output and/or input an audio signal. For example, the audio component includes a Microphone (MIC) configured to receive an external audio signal when the device in which the audio component is located is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in a memory or transmitted via a communication component. In some embodiments, the audio assembly further comprises a speaker for outputting audio signals.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM), in a computer readable medium. Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, a computer readable medium does not include a transitory computer readable medium such as a modulated data signal and a carrier wave.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (48)

1. A video processing system, comprising: the system comprises a plurality of cameras, server equipment and a display terminal, wherein the cameras, the server equipment and the display terminal are deployed in a target space; the target space allows movable objects to come in and go out;
the cameras are used for shooting video data in respective view field ranges and uploading the shot video data to the server side equipment;
the server-side equipment is used for merging the video data of the cameras by taking the movable objects as units to obtain the track of each movable object entering the target space, and associating the track of each movable object with at least one data object respectively, wherein the at least one data object is related to the behavior of the movable object in the target space;
the display terminal is used for acquiring a target track and at least one data object associated with the target track from the server-side equipment according to the query operation, and displaying the at least one data object associated with the target track in an associated manner in the process of displaying the target track; the target trajectory is a trajectory of a target movable object within a target space.
2. The system of claim 1, wherein the target space is an off-line store and the movable object is a user entering and exiting the off-line store.
3. The system of claim 2, wherein the at least one data object comprises: the time when the user visits the off-line store, the details of the user's order within the off-line store, and/or a map of the off-line store.
4. A video processing method, comprising:
acquiring video data shot by a plurality of cameras in a target space, wherein the target space comprises at least one movable object;
merging the video data shot by the cameras by taking the movable object as a unit to obtain the track of the at least one movable object;
and respectively associating the track of the at least one movable object with at least one data object, wherein the at least one data object is related to the behavior of the movable object in the target space.
5. The method of claim 4, wherein associating the trajectory of the at least one movable object with at least one data object, respectively, comprises:
for a first movable object, correlating a trajectory of the first movable object with time data of the first movable object entering and exiting the target space, behavior detail data of the first movable object within the target space, and/or map data of the target space;
wherein the first movable object is any one of the at least one movable object.
6. The method of claim 5, wherein correlating the trajectory of the first movable object with temporal data of the first movable object in and out of the target space comprises:
and generating time axis information of the track of the first movable object according to the time data of the first movable object entering and exiting the target space.
7. The method of claim 5, wherein associating the trajectory of the first movable object with map data of the target space comprises:
and adding a dynamic icon in the map data of the target space according to the track of the first movable object, wherein the dynamic icon is linked with the first movable object.
8. The method of claim 7, wherein the dynamic icon is an avatar of the first movable object registered online.
9. The method of claim 5, wherein associating the trajectory of the first movable object with behavioral detail data of the first movable object within the target space comprises:
determining a behavior object, a behavior position and/or a behavior time of the first movable object when the first movable object performs a first behavior according to the behavior detail data of the first movable object in the target space;
and establishing a corresponding relation between the behavior object, the behavior position and/or the behavior time of the first behavior and the track segment of the first behavior in the track of the first movable object.
10. The method of claim 5, further comprising:
analyzing whether a second behavior exists or possibly exists in the first movable object according to the track of the first movable object;
and if so, adding mark information related to the second behavior in the track of the first movable object.
11. The method of claim 10, wherein adding marker information associated with the second behavior in the trajectory of the first movable object comprises at least one of:
adding a highlight mark for the behavior object of the second behavior in the track of the first movable object;
adding at least one of detail information, on-line sales information, off-line inventory amount, and replenishment suggestion information of the behavior object of the second behavior in the trajectory of the first movable object;
adding a highlight mark for the behavior time of the second behavior in the trajectory of the first movable object;
adding a highlight mark for the behavior position of the second behavior in the track of the first movable object;
adding a highlight mark to the first movable object in the track of the first movable object;
adding detail information of the first movable object in a trajectory of the first movable object.
12. The method of claim 10, wherein the target space is an off-line store containing items for sale;
the behavior detail data is order information formed by purchasing articles by the first movable object in the off-line store; the first behavior is a behavior of purchasing an item;
the second behavior comprises: at least one of a missed payment behavior, an unpaid payment behavior, a multi-payment behavior, a micropayment order behavior, and a purchase selection behavior of the specified item.
13. The method of claim 10, further comprising:
receiving a second data request sent by a display terminal, wherein the second data request comprises second identification information pointing to a target behavior, and the target behavior belongs to the second behavior;
acquiring a target track segment containing a target behavior from the track of the at least one movable object according to the second identification information;
and providing the target track segment and the mark information related to the target behavior in the target track segment to a display terminal, so that the display terminal can perform associated display on the target track segment and the mark information related to the target behavior contained in the target track segment.
14. The method according to any one of claims 4-13, further comprising:
receiving a first data request sent by a display terminal, wherein the first data request comprises first identification information pointing to a target track;
acquiring the target track from the track of the at least one movable object according to the first identification information;
and providing the target track and at least one data object associated with the target track to the display terminal so that the display terminal can perform associated display on the target track and the at least one data object associated with the target track.
15. The method according to claim 14, wherein the first identification information is an identification of the target movable object or an identification of behavior detail data of the target movable object; the target movable object is a movable object corresponding to the target trajectory.
16. The method according to any one of claims 4 to 14, wherein merging the video data captured by the plurality of cameras into a track of the at least one movable object in units of movable objects comprises:
performing target tracking on video data shot by each of the plurality of cameras to obtain movable objects shot by each of the plurality of cameras;
and carrying out pedestrian re-identification REID processing on the movable objects shot by the cameras respectively to obtain the track of the at least one movable object.
17. The method of claim 16, wherein performing target tracking on the video data captured by each of the plurality of cameras to obtain the movable object captured by each of the plurality of cameras comprises:
performing target tracking operation once every N frames of video data for each camera, and judging whether a movable object tracked by the last target tracking operation is tracked or not by combining the N frames of video data in each target tracking operation; wherein N is an integer of 2 or more.
18. The method of claim 17, wherein performing pedestrian re-identification REID processing on the movable objects captured by each of the plurality of cameras to obtain the trajectory of the at least one movable object comprises:
and performing REID processing on the movable objects captured by the cameras by combining the overlapping fields of view among the cameras, the spatio-temporal information of the movable objects, and/or feature information of the movable objects acquired by other means, to obtain the track of the at least one movable object.
19. The method of claim 18, wherein obtaining feature information of the movable object by other means comprises at least one of:
acquiring a face image of a movable object shot by a POS machine in the target space;
and acquiring a face image in the electronic account corresponding to the movable object.
20. The method of any of claims 4-14, wherein the movable object is a user, a robot, or an autonomous shopping cart within the target space.
21. A server-side device, comprising: a memory and a processor;
the memory for storing a computer program;
the processor, coupled with the memory, to execute the computer program to:
acquiring video data shot by a plurality of cameras in a target space, wherein the target space comprises at least one movable object;
merging the video data shot by the cameras by taking the movable object as a unit to obtain the track of the at least one movable object;
and respectively associating the track of the at least one movable object with at least one data object, wherein the at least one data object is related to the behavior of the movable object in the target space.
22. A trajectory display method, comprising:
responding to the first query operation, and sending a first data request to the server-side equipment to request a target track; the target trajectory is a trajectory of a target movable object within a target space;
receiving a target track returned by the server-side equipment and at least one data object associated with the target track;
and in the target track display process, the at least one data object is displayed in an associated mode, and the at least one data object is related to the behavior of the target movable object in the target space.
23. The method of claim 22, wherein the first data request carries first identification information pointing to the target track, and the first identification information is an identification of a target movable object associated with the target track, an identification of behavior detail data of the target movable object, or time data of entering or exiting the target space.
24. The method of claim 22, wherein the at least one data object comprises: time data of the target movable object entering or exiting the target space, behavior detail data of the target movable object in the target space, and/or map data of the target space.
25. The method of claim 24, wherein displaying, in association during the target trajectory display process, the time data of the target movable object entering and exiting the target space comprises:
displaying time data of the target movable object entering and exiting the target space on a time axis of the target track.
26. The method of claim 24, wherein the behavior detail data comprises: a behavior object, a behavior location, and/or a behavior time of a first behavior performed by the target movable object within the target space;
and displaying, in association during display of the target trajectory, the behavior detail data of the target movable object within the target space comprises at least one of:
displaying, on a time axis of the target trajectory, the behavior time at which the first behavior of the target movable object occurs within the target space;
displaying, in an information display area outside the target trajectory display area, the behavior object involved in the first behavior of the target movable object within the target space;
marking, in the map data, the behavior location within the target space at which the first behavior of the target movable object occurs.
27. The method of claim 26, wherein marking, in the map data, the behavior location at which the first behavior of the target movable object occurs within the target space comprises:
when a track segment containing the behavior object is displayed, marking, in the map data, the behavior location at which the first behavior occurs in that track segment.
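For illustration only (not part of the claims), a minimal Python sketch of turning behavior detail data into time-axis marks and map markers as described in claims 26-27; the event layout is an assumption of this sketch:

```python
def behavior_annotations(trajectory_start_time, behavior_events):
    """Turn behavior detail data into (a) marks on the trajectory's time
    axis and (b) markers in the map data; behavior_events:
    [{'time', 'object', 'location': (x, y)}]."""
    timeline_marks, map_markers = [], []
    for event in behavior_events:
        timeline_marks.append({
            "offset_s": event["time"] - trajectory_start_time,  # position on the time axis
            "label": event["object"],
        })
        map_markers.append({
            "xy": event["location"],   # behavior location in store coordinates
            "label": event["object"],
        })
    return timeline_marks, map_markers
```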
28. The method of claim 24, wherein displaying, in association during display of the target trajectory, the map data of the target space comprises:
displaying the map data of the target space during display of the target trajectory; and displaying a dynamic icon in the map data, the dynamic icon moving in linkage with the target movable object in the target trajectory.
29. The method of claim 28, wherein displaying the map data of the target space during display of the target trajectory comprises:
displaying the map data of the target space in an information display area outside the target trajectory display area; or displaying the map data of the target space as a floating layer above the target trajectory display area.
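By way of illustration only (not part of the claims), the "dynamic icon in linkage with the target movable object" of claim 28 can be driven by interpolating the object's recorded positions at the current playback time. The point layout is an assumption of this sketch:

```python
import bisect

def icon_position(points, playback_time):
    """Interpolate the dynamic icon's map position at the current video
    playback time so the icon moves in linkage with the target movable
    object; points is a time-sorted list of (t, x, y) samples."""
    times = [p[0] for p in points]
    i = bisect.bisect_right(times, playback_time)
    if i == 0:                       # before the first sample
        return points[0][1:]
    if i == len(points):             # after the last sample
        return points[-1][1:]
    (t0, x0, y0), (t1, x1, y1) = points[i - 1], points[i]
    a = (playback_time - t0) / (t1 - t0)
    return (x0 + a * (x1 - x0), y0 + a * (y1 - y0))
```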
30. The method of claim 26, further comprising at least one of:
highlighting the behavior object within the information display area when a track segment containing the behavior object is displayed;
highlighting, on the time axis, the behavior time at which the first behavior involving the behavior object occurs, when a track segment containing the behavior object is displayed;
in response to a trigger operation on the behavior object in the information display area, jumping from the current display position to the track segment containing the behavior object;
and in response to a trigger operation on any behavior time on the time axis, jumping from the current display position to the track segment corresponding to the triggered behavior time.
31. The method of claim 28, wherein the map data includes icons of the plurality of cameras within the target space, and the method further comprises:
during display of the target trajectory, dynamically marking in the map data, as the target movable object moves, the camera currently capturing the target movable object.
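For illustration only (not part of the claims), a minimal sketch of claim 31's dynamic camera marking: the camera to highlight is simply the one whose track segment covers the current playback time. The segment layout is an assumption of this sketch:

```python
def active_camera(segments, playback_time):
    """Return the id of the camera capturing the target movable object at
    the current playback time, or None during a coverage gap; segments:
    time-sorted [{'camera_id', 'start_time', 'end_time'}]."""
    for seg in segments:
        if seg["start_time"] <= playback_time <= seg["end_time"]:
            return seg["camera_id"]
    return None
```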
32. The method of claim 22, further comprising:
in response to a second query operation, sending a second data request to the server-side device to request a target track segment marked with a target behavior;
receiving the target track segment returned by the server-side device and marking information related to the target behavior in the target track segment;
and during display of the target track segment, displaying, in association, the marking information related to the target behavior in the target track segment.
33. The method of claim 32, wherein sending the first data request to the server-side device in response to the first query operation comprises:
displaying, during display of the target track segment, a control for viewing the complete trajectory;
and in response to a trigger operation on the control, sending the first data request to the server-side device, the complete trajectory to which the target track segment belongs being the target trajectory.
34. The method of claim 32, wherein the target space is an offline store comprising shelves holding items for sale;
the behavior detail data is order information generated by the target movable object purchasing items in the offline store, and the first behavior is a behavior of purchasing an item;
and the target behavior comprises at least one of: a missed-payment behavior, an unpaid behavior, an overpayment behavior, a micropayment-order behavior, and a purchase-selection behavior for a specified item.
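For illustration only (not part of the claims), several of claim 34's target behaviors could be flagged by diffing the items a vision system saw the customer take against the items in the paid order; the labels and data shapes are assumptions of this sketch:

```python
from collections import Counter

def classify_order(picked_items, paid_items):
    """Flag a candidate target behavior by comparing the items the vision
    system saw the customer take with the items actually paid for."""
    picked, paid = Counter(picked_items), Counter(paid_items)
    if picked and not paid:
        return "unpaid"
    if picked - paid:                # took more than was paid for
        return "missed_payment"
    if paid - picked:                # paid for more than was taken
        return "overpayment"
    return "ok"
```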
35. The method of claim 34, wherein sending the second data request to the server-side device in response to the second query operation to request the target track segment marked with the target behavior comprises:
displaying a second query interface, the second query interface comprising query-condition information items and the map data of the offline store;
in response to a trigger operation on shelf information, an in-store position, and/or a camera in the map data, filling in the triggered shelf information, in-store position, and/or camera information as a query condition;
and sending the second data request to the server-side device according to the query condition, so that the server-side device returns, according to the query condition, the target track segment marked with the target behavior.
36. The method of any one of claims 22-35, further comprising, prior to displaying the target trajectory:
determining whether the target trajectory has a missing portion;
and if a missing portion exists, performing completion processing on the target trajectory.
37. The method of claim 36, wherein performing completion processing on the target trajectory comprises:
in response to a completion operation initiated by a user for the missing portion, determining candidate cameras in the target space that may have captured the missing portion;
playing video content captured by the candidate cameras so that the user can confirm whether the video content contains the target movable object;
and in response to a completion-confirmation operation initiated by the user, completing the target trajectory with the track segment of the video content that contains the target movable object.
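By way of illustration only (not part of the claims), the missing portions of claims 36-37 can be found by scanning the time-sorted track segments for uncovered intervals between the entry and exit times; the segment layout and gap threshold are assumptions of this sketch:

```python
def find_gaps(segments, entry_time, exit_time, min_gap_s=3.0):
    """Detect missing portions of a trajectory: periods between entering
    and exiting the target space in which no camera observed the object;
    segments: [{'start_time', 'end_time'}]."""
    gaps, cursor = [], entry_time
    for seg in sorted(segments, key=lambda s: s["start_time"]):
        if seg["start_time"] - cursor > min_gap_s:
            gaps.append((cursor, seg["start_time"]))
        cursor = max(cursor, seg["end_time"])
    if exit_time - cursor > min_gap_s:
        gaps.append((cursor, exit_time))
    return gaps
```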
38. A display terminal, comprising: a memory, a processor, a communication component, and a display;
the memory for storing a computer program;
the processor, coupled with the memory, to execute the computer program to:
in response to a first query operation, send a first data request to a server-side device to request a target trajectory, the target trajectory being a trajectory of a target movable object within a target space;
receive the target trajectory returned by the server-side device and at least one data object associated with the target trajectory;
and during display of the target trajectory, display the at least one data object in association therewith, the at least one data object being related to a behavior of the target movable object in the target space.
39. A trajectory completion method, comprising:
displaying a target trajectory, the target trajectory being a trajectory of a target movable object in a target space and having a missing portion;
determining, according to the heading of the target movable object before the missing portion, candidate cameras in the target space that may have captured the missing portion;
and performing completion processing on the target trajectory according to video content captured by the candidate cameras that corresponds to a missing time period, the missing time period being the period corresponding to the missing portion.
40. The method of claim 39, further comprising:
displaying, in association during display of the target trajectory, a map of the target space, the map showing the cameras in the target space and a dynamic icon, the dynamic icon moving in linkage with the target movable object in the target trajectory.
41. The method of claim 40, wherein determining, according to the heading of the target movable object before the missing portion, the candidate cameras in the target space that may have captured the missing portion comprises:
in response to a completion operation initiated by a user for the missing portion, computing the last heading of the target movable object before the missing portion;
adjusting, on the map, the orientation of the camera that last captured the target movable object before the missing portion to coincide with the last heading;
and in response to a selection operation initiated by the user on cameras within the orientation coverage, determining the cameras selected by the user as the candidate cameras.
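For illustration only (not part of the claims), claim 41's heading computation and orientation-coverage test might be implemented as follows; the camera layout and sector width are assumptions of this sketch:

```python
import math

def last_heading(points):
    """Heading (radians) from the last two observed (t, x, y) positions."""
    (_, x0, y0), (_, x1, y1) = points[-2], points[-1]
    return math.atan2(y1 - y0, x1 - x0)

def candidate_cameras(cameras, last_pos, heading, sector_deg=60.0):
    """Cameras lying within the angular sector the object was walking
    toward when it disappeared; cameras: [{'id', 'xy'}]."""
    half = math.radians(sector_deg) / 2.0
    hits = []
    for cam in cameras:
        dx = cam["xy"][0] - last_pos[0]
        dy = cam["xy"][1] - last_pos[1]
        bearing = math.atan2(dy, dx)
        # signed angular difference normalised into (-pi, pi]
        diff = math.atan2(math.sin(bearing - heading),
                          math.cos(bearing - heading))
        if abs(diff) <= half:
            hits.append(cam["id"])
    return hits
```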
42. The method of claim 41, further comprising, prior to responding to the selection operation initiated by the user on cameras within the orientation coverage:
marking, on the map, the cameras that have not captured the target movable object, so that the user performs the selection operation on cameras that are within the orientation coverage and have not captured the target movable object.
43. The method of claim 41, further comprising, prior to displaying the target trajectory:
determining the missing portion in the target trajectory and the missing time period corresponding to the missing portion, according to the times at which the target movable object entered and exited the target space and the times at which the target movable object appears in the target trajectory;
and marking the missing time period on a time axis of the target trajectory during display of the target trajectory.
44. The method of claim 43, wherein responding to the completion operation initiated by the user for the missing portion comprises:
in response to a trigger operation by the user on the missing time period on the time axis, determining that the user has initiated a completion operation for the missing portion.
45. The method of any one of claims 39-44, wherein performing completion processing on the target trajectory according to the video content captured by the candidate cameras that corresponds to the missing time period comprises:
playing video content captured by the candidate cameras that corresponds to a to-be-displayed time period, so that the user can check whether it contains the target movable object, the to-be-displayed time period comprising at least the missing time period;
and in response to a completion-confirmation operation initiated by the user, completing the target trajectory with a track segment containing the target movable object.
46. The method of claim 45, wherein playing the video content captured by the candidate cameras that corresponds to the to-be-displayed time period comprises:
playing specified video content captured by a candidate camera, and marking the missing time period on a time axis of the specified video content;
in response to an operation by the user selecting time points before and after the missing time period, determining a start time point and an end time point of the to-be-displayed time period;
and playing the video content captured by the candidate camera from the start time point to the end time point.
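For illustration only (not part of the claims), claim 46's window selection amounts to padding the missing time period on both sides and clamping to the recorded footage; the parameters are assumptions of this sketch:

```python
def playback_window(gap_start, gap_end, pad_before_s, pad_after_s,
                    footage_start, footage_end):
    """Expand the missing time period into the to-be-displayed window,
    clamped to the candidate camera's recorded footage."""
    start = max(footage_start, gap_start - pad_before_s)
    end = min(footage_end, gap_end + pad_after_s)
    return start, end
```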
47. A display terminal, comprising: a memory, a processor, a communication component, and a display;
the memory for storing a computer program;
the processor, coupled with the memory, to execute the computer program to:
display a target trajectory, the target trajectory being a trajectory of a target movable object in a target space and having a missing portion;
determine, according to the heading of the target movable object before the missing portion, candidate cameras in the target space that may have captured the missing portion;
and perform completion processing on the target trajectory according to video content captured by the candidate cameras that corresponds to a missing time period, the missing time period being the period corresponding to the missing portion.
48. A computer readable storage medium having a computer program stored thereon, which, when executed by one or more processors, causes the one or more processors to perform the steps of the method of any one of claims 4-20, 22-37, and 39-46.
CN202010202455.7A 2020-03-20 2020-03-20 Video processing, display and completion method, device, system and storage medium Pending CN113495975A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010202455.7A CN113495975A (en) 2020-03-20 2020-03-20 Video processing, display and completion method, device, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010202455.7A CN113495975A (en) 2020-03-20 2020-03-20 Video processing, display and completion method, device, system and storage medium

Publications (1)

Publication Number Publication Date
CN113495975A true CN113495975A (en) 2021-10-12

Family

ID=77993706

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010202455.7A Pending CN113495975A (en) 2020-03-20 2020-03-20 Video processing, display and completion method, device, system and storage medium

Country Status (1)

Country Link
CN (1) CN113495975A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117676245A (en) * 2024-01-31 2024-03-08 深圳市积加创新技术有限公司 Context video generation method and device
CN117676245B (en) * 2024-01-31 2024-06-11 深圳市积加创新技术有限公司 Context video generation method and device

Similar Documents

Publication Publication Date Title
US11507933B2 (en) Cashier interface for linking customers to virtual data
JP7422792B2 (en) Systems and methods for computer vision driven applications in environments
CN109089082B (en) Image acquisition system based on thermal characteristic image
US9473747B2 (en) Whole store scanner
US10129507B2 (en) System and method for self-checkout using product images
KR101779094B1 (en) A integration store management system using intelligent type image analysis technology
WO2019007416A1 (en) Offline shopping guide method and device
KR101779096B1 (en) The object pursuit way in the integration store management system of the intelligent type image analysis technology-based
WO2013019245A2 (en) System and method for site abnormality recording and notification
CN110895768B (en) Data processing method, device, system and storage medium
US20200387865A1 (en) Environment tracking
US20210182921A1 (en) Customized retail environments
WO2012075589A1 (en) Method and system for virtual shopping
CN113468914B (en) Method, device and equipment for determining purity of commodity
JP7445592B2 (en) Platform for intelligent marketing and advertising
CN110689389A (en) Computer vision-based shopping list automatic maintenance method and device, storage medium and terminal
US11854068B2 (en) Frictionless inquiry processing
CN113495975A (en) Video processing, display and completion method, device, system and storage medium
US20240119553A1 (en) Dynamically controlled cameras for computer vision monitoring
CN111507792A (en) Self-service shopping method, computer readable storage medium and system
JP2023504871A (en) Fraud detection system and method
US12033434B1 (en) Inventory status determination with fleet management
JP7207230B2 (en) Server device, information processing system, program, mobile store, and method of operating information processing system
Falcão Human Object Ownership Tracking in Autonomous Retail

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination