WO2020220968A1 - A video data processing method and related apparatus - Google Patents
- Publication number: WO2020220968A1
- PCT application: PCT/CN2020/084112 (CN2020084112W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- video frame
- target
- pixel
- matrix
- video
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/44008—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T11/00—2D [Two Dimensional] image generation
- G06T11/60—Editing figures and text; Combining figures or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/80—2D [Two Dimensional] animation, e.g. using sprites
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/50—Image enhancement or restoration using two or more images, e.g. averaging or subtraction
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/238—Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
- H04N21/2387—Stream processing in response to a playback request from an end-user, e.g. for trick-play
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/431—Generation of visual interfaces for content selection or interaction; Content or additional data rendering
- H04N21/4312—Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/472—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
- H04N21/47217—End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content for controlling playback functions for recorded or on-demand content, e.g. using progress bars, mode or play-point indicators or bookmarks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4884—Data services, e.g. news ticker for displaying subtitles
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/83—Generation or processing of protective or descriptive data associated with content; Content structuring
- H04N21/845—Structuring of content, e.g. decomposing content into time segments
- H04N21/8455—Structuring of content, e.g. decomposing content into time segments involving pointers to the content, e.g. pointers to the I-frames of the video stream
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/80—Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
- H04N21/85—Assembly of content; Generation of multimedia applications
- H04N21/854—Content authoring
- H04N21/8547—Content authoring involving timestamps for synchronizing content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20092—Interactive image processing based on input by user
- G06T2207/20101—Interactive definition of point of interest, landmark or seed
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Definitions
- This application relates to the field of Internet technology, and in particular to a video data processing method and related devices.
- when a video is played, the user can see user text or user comments posted by that user or by other users on the video playback interface.
- the user text output to the video playback interface is usually output and displayed through a fixed text display track in the video playback interface.
- the embodiments of the present application provide a video data processing method and related devices.
- the embodiments of the present application provide a method for processing video data.
- the method is applied to a computer device and includes:
- in response to a trigger operation on the target video, a target pixel is determined from a key video frame of the target video, and multimedia information associated with the target pixel is acquired, wherein the key video frame is the video frame in which the trigger operation is located, and the target pixel is the pixel in the key video frame corresponding to the trigger operation;
- a trajectory acquisition request corresponding to the target pixel is determined based on the position information of the target pixel in the key video frame, and target trajectory information associated with that position information is acquired, wherein the target trajectory information includes the position information of the target pixel in the next video frame of the key video frame, obtained by tracking the target pixel;
- when the next video frame of the key video frame is played, the multimedia information is displayed based on the position information of the target pixel, in the target trajectory information, in that next video frame.
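The claimed client-side flow above — look up the target pixel's tracked positions in the target trajectory information and display the multimedia information frame by frame — can be sketched as follows. The dict-based trajectory format and all function names are illustrative assumptions, not part of the disclosed implementation:

```python
def positions_after_key_frame(trajectory, key_frame_idx):
    """Return (frame_idx, (x, y)) pairs for frames after the key video frame.

    `trajectory` maps a frame index to the (x, y) position of the target
    pixel in that frame (an assumed representation of the target trajectory
    information returned by the server).
    """
    return [(idx, pos) for idx, pos in sorted(trajectory.items())
            if idx > key_frame_idx]

# Example: the target pixel was clicked in frame 10; the server returned its
# tracked positions for frames 10..13. The multimedia information would be
# rendered at each returned position as those frames are played.
trajectory = {10: (120, 80), 11: (122, 81), 12: (125, 83), 13: (129, 86)}
overlay = positions_after_key_frame(trajectory, key_frame_idx=10)
print(overlay)  # [(11, (122, 81)), (12, (125, 83)), (13, (129, 86))]
```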
- the embodiments of the present application provide a video data processing method.
- the method is applied to a service server and includes:
- in response to a trajectory acquisition request for a target pixel in a key video frame, trajectory information associated with the target video is obtained, wherein the trajectory information is determined by the position information of the pixels in each video frame of the target video;
- target trajectory information associated with the position information of the target pixel in the key video frame is filtered from the trajectory information, and the target trajectory information is returned, wherein the target trajectory information includes target position information, and the target position information is used to trigger the display of multimedia information associated with the target pixel in the next video frame of the key video frame.
- One aspect of the embodiments of the present application provides a video data processing method, and the method includes:
- adjacent first and second video frames are acquired from the target video, the position information of the pixels in the first video frame is tracked into the second video frame, and trajectory information associated with the target video is generated, wherein the trajectory information includes target trajectory information used to track and display the multimedia information associated with the target pixel in the target video.
- One aspect of the embodiments of the present application provides a video data processing device.
- the device is applied to a computer device and includes:
- the object determination module is configured to determine a target pixel from a key video frame of the target video in response to a trigger operation on the target video, and obtain multimedia information associated with the target pixel, wherein the key video frame is the video frame where the trigger operation is located, and the target pixel is the pixel in the key video frame corresponding to the trigger operation;
- a request determination module configured to determine a trajectory acquisition request corresponding to the target pixel based on the position information of the target pixel in the key video frame;
- the trajectory acquisition module is configured to acquire, based on the trajectory acquisition request, target trajectory information associated with the position information of the target pixel in the key video frame, wherein the target trajectory information includes the position information of the target pixel in the video frame next to the key video frame, and that position information is obtained by tracking the target pixel;
- the text display module is configured to display, when the next video frame of the key video frame is played, the multimedia information based on the position information of the target pixel, in the target trajectory information, in that next video frame.
- One aspect of the embodiments of the present application provides a video data processing device, which is applied to a service server and includes:
- the request response module is configured to obtain trajectory information associated with the target video in response to a trajectory acquisition request for a target pixel in a key video frame, where the key video frame is a video frame in the target video, the target pixel is a pixel in the key video frame, and the trajectory information is determined by the position information of the pixels in each video frame of the target video;
- the trajectory screening module is configured to filter, from the trajectory information associated with the target video, the target trajectory information associated with the position information of the target pixel in the key video frame, and return the target trajectory information, wherein the target trajectory information includes target position information, and the target position information is used to trigger the display of multimedia information associated with the target pixel in the next video frame of the key video frame.
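The screening step performed by the trajectory screening module can be illustrated with a minimal sketch, assuming the server holds one precomputed trajectory per pixel and matches on the target pixel's position in the key video frame. The data layout and names are hypothetical, not the disclosed implementation:

```python
def screen_target_trajectory(trajectories, key_frame_idx, target_pos):
    """Return the trajectory whose position at the key video frame matches
    the target pixel's position, or None if no trajectory matches.

    `trajectories` is an assumed list of per-pixel trajectories, each a dict
    mapping frame index -> (x, y).
    """
    for traj in trajectories:
        if traj.get(key_frame_idx) == target_pos:
            return traj
    return None

# Example: two precomputed trajectories; the user clicked pixel (9, 9) in
# key frame 5, so the second trajectory is selected and returned.
t1 = {5: (0, 0), 6: (1, 1)}
t2 = {5: (9, 9), 6: (8, 8)}
print(screen_target_trajectory([t1, t2], 5, (9, 9)))  # {5: (9, 9), 6: (8, 8)}
```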
- One aspect of the embodiments of the present application provides a video data processing device, and the device includes:
- the first acquisition module is configured to acquire adjacent first and second video frames from the target video;
- the matrix acquisition module is configured to determine the average displacement matrix corresponding to the first video frame based on the optical flow tracking rule corresponding to the target video, the pixels in the first video frame, and the pixels in the second video frame;
- a position tracking module configured to track the position information of the pixels in the first video frame based on the average displacement matrix, and determine the position information of the pixels obtained by tracking in the second video frame;
- the trajectory generation module is configured to generate a trajectory associated with the target video based on the position information of the pixel in the first video frame and the position information of the pixel in the second video frame obtained by the tracking Information, wherein the trajectory information includes target trajectory information used to track and display multimedia information associated with the target pixel in the target video.
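As a rough illustration of the matrix acquisition and position tracking modules above: the sketch below box-averages a dense optical-flow field into an "average displacement matrix" and uses it to map a pixel from the first video frame into the second. This is a simplified stand-in for the patent's optical flow tracking rule; the window-mean averaging and all names are assumptions:

```python
import numpy as np

def average_displacement_matrix(flow, k=1):
    """Box-average a dense optical-flow field over a (2k+1)x(2k+1) window.

    `flow` has shape (H, W, 2): per-pixel (dx, dy) from the first video frame
    to the second. Each displacement is replaced by the mean of its
    neighborhood (a simple stand-in for the patent's averaging step).
    """
    h, w, _ = flow.shape
    avg = np.zeros_like(flow, dtype=float)
    for y in range(h):
        for x in range(w):
            y0, y1 = max(0, y - k), min(h, y + k + 1)
            x0, x1 = max(0, x - k), min(w, x + k + 1)
            avg[y, x] = flow[y0:y1, x0:x1].reshape(-1, 2).mean(axis=0)
    return avg

def track(position, avg_flow):
    """Map a pixel (x, y) in the first frame to its position in the second."""
    x, y = position
    dx, dy = avg_flow[y, x]
    return (float(x + dx), float(y + dy))

# Uniform flow of (+2, +1) everywhere: neighborhood averaging leaves it
# unchanged, so the pixel at (1, 2) lands at (3, 3) in the second frame.
flow = np.tile(np.array([2.0, 1.0]), (4, 4, 1))
avg = average_displacement_matrix(flow, k=1)
print(track((1, 2), avg))  # (3.0, 3.0)
```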
- One aspect of the embodiments of the present application provides a computer device, including a processor, a memory, and a network interface;
- the processor is connected to the memory and the network interface, where the network interface is used to provide data communication functions, the memory is used to store a computer program, and the processor is used to call the computer program to execute the method in the foregoing aspects of the embodiments of the present application.
- One aspect of the embodiments of the present application provides a computer storage medium that stores a computer program.
- the computer program includes program instructions that, when executed by a processor, perform the method in the foregoing aspects of the embodiments of the present application.
- FIG. 1 is a schematic structural diagram of a network architecture provided by an embodiment of the present application.
- FIG. 2 is a schematic diagram of multiple video frames in a target video provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of a scene for acquiring a target video provided by an embodiment of the present application
- FIG. 4 is a schematic flowchart of a video data processing method provided by an embodiment of the present application.
- FIG. 5 is a schematic diagram of acquiring multimedia information according to an embodiment of the present application.
- FIG. 6 is a schematic diagram of a full-image pixel tracking provided by an embodiment of the present application.
- FIG. 7 is a schematic diagram of tracking barrage data in consecutive multiple video frames according to an embodiment of the present application.
- FIG. 8 is a schematic diagram of another video data processing method provided by an embodiment of the present application.
- FIG. 9 is a schematic diagram of a method for determining effective pixels provided by an embodiment of the present application.
- FIG. 10 is a schematic diagram of displaying barrage data based on trajectory information according to an embodiment of the present application.
- FIG. 11 is a schematic structural diagram of a video data processing device provided by an embodiment of the present application.
- FIG. 12 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
- FIG. 13 is a schematic structural diagram of another video data processing device provided by an embodiment of the present application.
- FIG. 14 is a schematic structural diagram of another computer device provided by an embodiment of the present application.
- FIG. 15 is a schematic structural diagram of another video data processing device provided by an embodiment of the present application.
- FIG. 16 is a schematic structural diagram of another computer device provided by an embodiment of the present application.
- the user text displayed in the video playback interface is independent of the video content played on the video playback interface, so that the displayed user text lacks a certain correlation with the video content.
- the user terminal outputs the acquired user text through a predetermined text display track, so the user text sent by every user is output through the same text display track, making it impossible to comment on the video content in a targeted manner.
- the network architecture may include a service server 2000 (or application server 2000) and a user terminal cluster.
- the service server 2000 may be a server cluster composed of a large number of servers, for example, a cloud server, or simply called the cloud.
- the user terminal cluster may include a plurality of user terminals, as shown in FIG. 1, and specifically may include a user terminal 3000a, a user terminal 3000b, a user terminal 3000c, ..., a user terminal 3000n.
- the user terminal 3000a, the user terminal 3000b, the user terminal 3000c, ..., and the user terminal 3000n can each be connected to the service server 2000, so that each user terminal can exchange data with the service server 2000 through the network.
- each user terminal in the user terminal cluster can have the target application integrated and installed on it.
- when the target application runs in a user terminal, it can perform data interaction with the service server 2000 shown in FIG. 1 above.
- the target application may include multimedia applications, social applications, entertainment applications, and other applications with video playback functions.
- the embodiment of the present application takes one user terminal of the plurality of user terminals as the target user terminal as an example, to illustrate the specific process by which the target user terminal integrated with the target application exchanges data with the service server 2000 through the service data display platform.
- the target user terminal in the embodiment of the present application may include a personal computer, a tablet computer, a notebook computer, a smart phone, and other mobile terminals integrated with the above-mentioned target application.
- the service server 2000 may be a background server of the target application, and the service database corresponding to the background server may be used to store each piece of service data information displayed on the service data display platform, where the service data information may include Internet information such as video data.
- multiple videos can be displayed on the service data display platform; when the target user triggers one of the multiple videos through the service data display platform in the target user terminal, the video data corresponding to that video can be obtained and then played in the target user terminal, and the video data currently being played in the target user terminal can further be referred to as the target video.
- the target video is the video data returned by the service server 2000 based on the data loading instruction sent by the target user terminal.
- the target video may include multiple video frames; each video frame may be referred to as image data, and each video frame corresponds to a playback timestamp (that is, a playback time) within the playback duration of the target video, so that when the target user terminal subsequently loads and plays the target video, it can display the corresponding video frame on the playback display interface based on the playback timestamp corresponding to each video frame in the target video.
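Assuming a constant frame rate (the disclosure does not state one; this is purely illustrative), the mapping from a playback timestamp to the corresponding video frame can be sketched as:

```python
def frame_index_for_timestamp(timestamp_s, fps):
    """Map a playback timestamp (in seconds) to a video frame index,
    assuming a constant frame rate — an illustrative assumption only."""
    return int(timestamp_s * fps)

# At 25 frames per second, the frame shown at the 2-second mark is frame 50.
print(frame_index_for_timestamp(2.0, 25))  # 50
```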
- during the video preprocessing stage, the service server 2000 can split each video in the video set stored in the service database into frames, so that the multiple video frames contained in each video are split into individual pictures.
- FIG. 2 is a schematic diagram of multiple video frames in a target video provided by an embodiment of the present application.
- the target video may be video A in the aforementioned service database.
- the video A may include n video frames, and n is a positive integer greater than zero.
- the service server 2000 may split the n video frames in video A into n pictures in advance, and every two adjacent pictures among the n pictures can be called an image pair.
- for example, as shown in FIG. 2, the embodiment of the present application may refer to the video frame corresponding to the first moment and the video frame corresponding to the second moment as one image pair, the video frame corresponding to the second moment and the video frame corresponding to the third moment as another image pair, ..., and the video frame corresponding to the (n-1)th moment and the video frame corresponding to the nth moment as an image pair.
- multiple image pairs can be determined from the multiple video frames of the target video, and each image pair contains two consecutively adjacent video frames.
- the embodiment of the present application takes the first image pair of the multiple image pairs as an example.
- the embodiment of the present application may call one video frame in the first image pair (for example, the video frame corresponding to the first moment shown in FIG. 2) the first video frame, and may call the other video frame in the image pair (that is, the video frame corresponding to the second moment) the second video frame; then, based on the optical flow tracking rule, the position information of all pixels in the first video frame of the image pair is tracked to obtain the position information of each of those pixels in the second video frame.
- since each image pair contains two adjacent video frames, by analogy, the position information, in the next video frame, of the pixels in the first video frame of each image pair can be calculated, and finally the service server 2000 can obtain the position information of each pixel in every video frame of the target video.
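The pair-by-pair propagation described above can be sketched by chaining per-image-pair displacements into a full per-pixel trajectory; the displacement-list representation is an illustrative assumption, not the disclosed data format:

```python
def build_trajectory(start_pos, pair_displacements):
    """Chain per-image-pair displacements into one pixel trajectory.

    `pair_displacements[i]` is the assumed (dx, dy) the pixel moves between
    video frame i and frame i+1; the result lists the pixel's position in
    every frame — the kind of per-pixel trajectory the server precomputes.
    """
    positions = [start_pos]
    x, y = start_pos
    for dx, dy in pair_displacements:
        x, y = x + dx, y + dy
        positions.append((x, y))
    return positions

# A pixel starting at (10, 20) tracked across three image pairs.
print(build_trajectory((10, 20), [(1, 0), (2, 1), (0, 3)]))
# [(10, 20), (11, 20), (13, 21), (13, 24)]
```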
- since the service server 2000 can pre-calculate the trajectory information of all pixels in video A, when the target user plays video A in the target user terminal, the currently played video A can be called the target video.
- a trigger operation can be performed in the target user terminal on the object that needs to be tracked (i.e., the target object).
- the pixel corresponding to the trigger operation is called the target pixel; that is, the target pixel is determined by the trigger operation performed by the target user on the target object in the currently played video frame, and the trigger operation can be used to select, in the currently played video frame, the target object to be tracked.
- the embodiment of the application may call the video frame corresponding to the trigger operation, that is, the video frame currently containing the target pixel, the key video frame in the target video.
- the key video frame may be the video frame corresponding to the first moment in the embodiment corresponding to FIG. 2 above; optionally, the key video frame may also be the video frame corresponding to the second moment in the embodiment corresponding to FIG. 2 above, and the other cases will not be listed here.
- the embodiment of the present application may send the key video frame, the target pixel in the key video frame, and the position information of the target pixel to the service server 2000 in the embodiment corresponding to FIG. 1, so that, based on the position information of the target pixel in the key video frame, the service server 2000 may filter the trajectory information that matches that position information from the pre-calculated trajectory information of all the pixels in the target video, as the target trajectory information.
- the target track information may include the position coordinates of the target pixel in the video frame after the key video frame.
- the service server 2000 may return the target trajectory information to the target user terminal, so that when playing the next video frame of the key video frame, the target user terminal can further determine, according to the target trajectory information, the position information of the target pixel in that next video frame, that is, the target position information of the target pixel, and then display the multimedia information corresponding to the target object based on the target position information.
- FIG. 3 is a schematic diagram of a scene for acquiring a target video provided by an embodiment of the present application.
- the target user terminal shown in FIG. 3 may be the user terminal 3000a in the embodiment corresponding to FIG. 1 above.
- after entering the target application, the target user can see the service data display platform of the target application displayed in the target user terminal (for example, a smart phone), and the platform can display the videos shown in FIG. 3: video 10a, video 20a, video 30a, and video 40a.
- when the target user needs to play the video 30a shown in FIG. 3 in the target user terminal (the video 30a may be the video A in the foregoing embodiment), the target user terminal can send a data loading instruction to an application server that has a network connection relationship with it.
- the application server of the network connection relationship may be the service server 2000 in the embodiment corresponding to FIG. 1 above; it is understandable that when the application server obtains the data loading instruction, it can search the service database for the video data corresponding to the target identification information, collectively refer to the found video data as target data, and return the target data to the target user terminal shown in FIG. 3, so that the target user terminal can play the video data in the video playback interface shown in FIG. 3.
- the target user terminal can refer to the video 30a selected and played by the target user as the target video; that is, at this time, the target user terminal can play each video frame in video A according to the playback timestamps shown in FIG. 3 above.
- for the specific process by which the above-mentioned service server 2000 obtains the position information of the pixels in the second video frame, and for the specific process of screening the target trajectory information corresponding to the target pixel, refer to the implementations provided in the embodiments corresponding to FIG. 8 to FIG. 10 below.
- FIG. 4 is a schematic flowchart of a video data processing method provided by an embodiment of the present application. As shown in FIG. 4, the method can be applied to the target user terminal in the embodiment corresponding to FIG. 1, and the method may include:
- Step S101: in response to a trigger operation on the target video, determine a target pixel from a key video frame of the target video, and obtain multimedia information associated with the target pixel, wherein the key video frame is the video frame where the trigger operation is located, and the target pixel is the pixel in the key video frame corresponding to the trigger operation.
- the target user terminal may display, on the display interface of the target application, a service data display platform for carrying multiple pieces of service data information; for example, each piece of service data information on the platform can be a video.
- the service data information displayed on the service data display platform may be determined by an application server that has a network connection relationship with the target user terminal, through filtering based on the user portrait data of the target user (for example, the historical behavior data of the target user).
- the video data corresponding to the video can be loaded from the service database corresponding to the application server, and the loaded video data can then be played on the video playback interface of the target user terminal.
- the target user terminal may obtain the trigger operation performed by the target user on the target object (that is, the object to be tracked) in the video playback interface during the process of playing the video data.
- the trigger operation is, for example, clicking with a mouse, or touching, a certain point of the target object in the video frame displayed on the screen of the target user terminal.
- the video frame corresponding to the trigger operation may be called a key video frame, and the pixel point corresponding to the trigger operation in the key video frame may be called a target pixel point.
- Pixels are points in an image (for example, a video frame). If the image has a resolution of 640×480, then 640×480 pixels are distributed on it. Generally, a pixel in an image has spatial position and color (or grayscale) attributes.
- the target user terminal can also create a text box in a sub-window independent of the video playback interface, so that the target user can input multimedia information associated with the target object in the text box. After the target user enters multimedia information in the text box, the target user terminal can obtain the multimedia information associated with the target object; that is, the multimedia information associated with the target object may be the user text, or user comments, input by the target user.
- the target user terminal may be a terminal device with a video data playback function
- the target user terminal may be the user terminal 3000a in the embodiment corresponding to FIG. 1
- the target user terminal may be understood as a mobile terminal.
- the application server may be the business server 2000 in the embodiment corresponding to FIG. 1 above.
- FIG. 5 is a schematic diagram of acquiring multimedia information according to an embodiment of the present application.
- when the target user terminal is playing the video 30a in the embodiment corresponding to FIG. 3, the target user terminal may use the currently played video 30a as the target video. It is understandable that the target user may perform a trigger operation on any one of the multiple video frames included in the video 30a at any time while the video 30a is being played.
- the target user terminal may use the video frame corresponding to the trigger operation as the key video frame.
- the target user may select object A as the target object in the video playback interface 100a shown in FIG.
- the target user terminal may call the video frame currently played by the video playback interface 100a as the key video frame.
- the target user terminal may use the video frame corresponding to the selection operation (ie, trigger operation) as the key video frame, and may use the pixel corresponding to the selection operation in the key video frame as the target pixel.
- the target pixel is a pixel in a key video frame in the target video acquired by the target user terminal.
- the text box shown in FIG. 5 may pop up in the video playback interface 200a shown in FIG.
- the text box can also be called a dialog box.
- the text box shown in FIG. 5 can be understood as a floating window independent of the video playback interface 200a, and the text box shown in FIG. 5 may have an association relationship with the object A shown in FIG. 5 (for example, its display position may maintain a relative positional relationship with the target pixel in the object A, so as to construct the association between the target pixel of the target object in the video 30a and the multimedia information associated with the target object).
- the implementation of the floating window may be similar to or the same as that of the video playback interface.
- the multimedia information entered in the dialog box in the embodiment of the present application may include user text, user pictures, user expressions, and other data; the user text (i.e., text information), user pictures (i.e., picture information), and user expressions (i.e., expression information) entered by the target user in the dialog box may be collectively referred to as barrage data.
- the display of the barrage data can be similar to subtitles.
- the input text information A can be displayed in the video playback interface 300a shown in FIG.
- the input text information A may be the text information shown in FIG. 5, displayed at a certain distance from the target pixel in the object A.
- the text information A displayed in the video playing interface 300a can be referred to as barrage data associated with the target object.
- Step S102: Determine a trajectory acquisition request corresponding to the target pixel based on the position information of the target pixel in the key video frame.
- the target user terminal may determine the position information of the target pixel in the key video frame, and may generate a trajectory acquisition request corresponding to the target pixel based on the frame number of the key video frame in the target video and the position information of the target pixel in the key video frame, so that step S103 can be further executed.
- the trajectory acquisition request may be used to instruct the application server to filter the trajectory information that matches the target pixel from the pre-calculated trajectory information corresponding to all pixels in the target video.
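As a minimal illustration, the trajectory acquisition request described above might be packaged as follows; the field names and values here are assumptions for illustration, not taken from the embodiment:

```python
# Hypothetical sketch of building a trajectory acquisition request from the
# key video frame's number and the target pixel's position in that frame.
def build_trajectory_request(video_id, key_frame_number, target_pixel_xy):
    """Package the information the server needs to filter the matching
    pre-computed trajectory from all pixels' trajectory information."""
    x, y = target_pixel_xy
    return {
        "video_id": video_id,
        "frame_number": key_frame_number,    # frame number of the key video frame
        "pixel_position": {"x": x, "y": y},  # target pixel in the key frame
    }

req = build_trajectory_request("video_30a", 12, (215, 96))
```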
- Step S103: Based on the trajectory acquisition request, acquire target trajectory information associated with the position information of the target pixel in the key video frame.
- the target trajectory information includes the position information of the target pixel in the next video frame of the key video frame, and the position information of the target pixel in the next video frame of the key video frame is obtained by tracking the target pixel.
- based on the motion trajectories, pre-calculated by the application server, of all pixels of the target video in all video frames (the motion trajectory of each pixel may be referred to as a piece of trajectory information), the target user terminal may select, from the motion trajectories corresponding to these pixels, the motion trajectory of the pixel matching the target pixel as the target trajectory information associated with the position information of the target pixel in the key video frame.
- after the target user terminal obtains the target trajectory information, it can quickly determine the position information of the target pixel, contained in the target trajectory information, in the next video frame of the key video frame, and can further determine, based on the position separation distance between the target pixel and the multimedia information, the position information at which the multimedia information appears in the next video frame of the key video frame.
- the position separation distance can be understood as the relative separation distance, in display position, between the target pixel in the key video frame and the corresponding barrage data. That is, the position separation distance may include a relative separation distance in the horizontal (i.e., transverse) direction and/or a relative separation distance in the vertical (i.e., longitudinal) direction, so as to ensure that once the target user terminal obtains the position information of the target pixel in the next video frame of the key video frame, the position information of the text information A in the next video frame of the key video frame can be quickly calculated based on the relative separation distance. That is, at this time, the position information of the text information A displayed in the video playback interface 300a in the embodiment corresponding to FIG.
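Under this scheme, once the target pixel's position in a later frame is known, the barrage position follows by adding the fixed separation distance. A tiny sketch, with coordinates and offset chosen purely for illustration:

```python
def barrage_position(pixel_pos, offset):
    """The barrage keeps a fixed horizontal/vertical separation distance
    from the target pixel it is attached to."""
    return (pixel_pos[0] + offset[0], pixel_pos[1] + offset[1])

# hypothetical per-frame positions of the target pixel (its trajectory)
trajectory = [(100, 50), (104, 52), (109, 55)]
offset = (10, -20)  # relative separation fixed when the user clicked
barrage_track = [barrage_position(p, offset) for p in trajectory]
```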
- the target user terminal can obtain, from an application server that has a network connection relationship with the target user terminal, the trajectory information that matches the position information of the target pixel in the key video frame as the target trajectory information. In this way, when the target user terminal obtains the target trajectory information of the target pixel pre-calculated by the application server, it can, based further on the position information at which the target pixel in the target trajectory information appears in the next video frame of the key video frame, quickly and accurately track the barrage data within the effective duration. The calculation amount of the target user terminal can thereby be effectively reduced, ensuring that the barrage data can still be tracked quickly even when the calculation performance of the target user terminal is relatively ordinary.
- the effective duration may be the display duration corresponding to the barrage data, that is, the target user terminal can track the barrage data associated with the target object within the display duration.
- the motion track of each pixel in the target video (that is, the track information of each pixel) is determined by the position information of each pixel in each video frame of the target video.
- the embodiment of the present application may determine any two adjacent video frames among the multiple video frames as an image pair. It should be understood that, of the two video frames contained in each image pair determined from the multiple video frames, one may be called the first video frame and the other the second video frame. For image pair 1, formed by the video frame corresponding to the first moment and the video frame corresponding to the second moment in the embodiment corresponding to FIG. 2, the video frame corresponding to the first moment may be referred to as the first video frame and the video frame corresponding to the second moment as the second video frame; similarly, for image pair 2, the video frame corresponding to the second moment may be called the first video frame and the video frame corresponding to the third moment the second video frame. Based on the pre-calculated average displacement matrix between the two video frames in each image pair, all the pixels in the first video frame can be tracked to determine the position information at which all the pixels in the first video frame appear in the second video frame.
- the embodiment of the present application can obtain the average displacement matrix corresponding to each image pair, and the average displacement matrix corresponding to each image pair can be called the average displacement matrix corresponding to the first video frame in each image pair.
- the average displacement matrix corresponding to the first video frame can be used to map all the pixels in the first video frame to the second video frame, so as to accurately obtain the position information of the mapped pixels in the second video frame.
- the average displacement matrix in the embodiment of the present application may include a longitudinal average displacement matrix and a horizontal average displacement matrix.
- the longitudinal average displacement matrix can be used to transform the first longitudinal coordinate value (for example, the y value) of each pixel in the first video frame to obtain the second longitudinal coordinate to which the corresponding pixel is mapped in the second video frame; in the same way, the horizontal average displacement matrix can be used to transform the first horizontal coordinate value (for example, the x value) of each pixel in the first video frame to obtain the second horizontal coordinate to which the corresponding pixel is mapped in the second video frame.
- the first horizontal coordinate and the first vertical coordinate value of each pixel in the first video frame may be referred to as the first position information of each pixel in the first video frame.
- the second horizontal coordinate and the second longitudinal coordinate value of each mapped pixel of the first video frame may be referred to as the second position information of that pixel obtained by mapping in the second video frame. Since each image pair corresponds to an average displacement matrix, the corresponding second position information can be calculated based on the first position information of the pixels in the first video frame, and the second position information of the mapped pixels calculated for each second video frame can be retained. The position information of the same pixel in each video frame can then be integrated to obtain the motion trajectories of all pixels in the video frames, thereby realizing the tracking of all pixels in all video frames of the target video.
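As a minimal NumPy sketch of this mapping (the matrix values are made up for illustration), the horizontal and longitudinal average displacement matrices shift every pixel's first position information to its second position information:

```python
import numpy as np

def map_pixels(avg_dx, avg_dy):
    """Map every pixel (x, y) of the first video frame to the position
    (x + avg_dx[y, x], y + avg_dy[y, x]) in the second video frame."""
    h, w = avg_dx.shape
    ys, xs = np.mgrid[0:h, 0:w]      # first position information of all pixels
    return xs + avg_dx, ys + avg_dy  # second position information

avg_dx = np.full((4, 4), 2.0)  # horizontal average displacement matrix
avg_dy = np.full((4, 4), 1.0)  # longitudinal average displacement matrix
x2, y2 = map_pixels(avg_dx, avg_dy)
```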
- the multiple video frames in the target video shown in the embodiment corresponding to FIG. 2 may be multiple consecutive image frames. Therefore, after splitting the target video into frames, corresponding video frame numbers can be set for the image frames (i.e., video frames) obtained by the splitting according to the playback order.
- the video frame number of the video frame obtained at the first moment may be 1, and video frame number 1 may be used to indicate that the video frame obtained at the first moment is the first frame in the target video; similarly, the video frame number of the video frame obtained at the second moment may be 2, and video frame number 2 may be used to indicate that the video frame obtained at the second moment is the second frame in the target video.
- by analogy, the video frame number of the video frame obtained at the (n-1)-th moment may be n-1, and video frame number n-1 may be used to indicate that the video frame obtained at the (n-1)-th moment is the (n-1)-th frame in the target video; the video frame number of the video frame obtained at the n-th moment may be n, and video frame number n may be used to indicate that the video frame obtained at the n-th moment is the n-th frame, i.e., the last frame, in the target video.
- the embodiment of the present application may refer to the image pair formed by the first frame and the second frame of the multiple video frames shown in FIG. 2 as the first image pair, so as to illustrate the specific process of translating the pixels in the first frame into the second frame by using the average displacement matrix, thereby realizing pixel tracking.
- the first frame in the first image pair is the video frame corresponding to the first moment in the embodiment corresponding to FIG. 2
- the second frame in the first image pair is the video frame corresponding to the second moment in the embodiment corresponding to FIG. 2.
- FIG. 6 is a schematic diagram of full-image pixel tracking provided by an embodiment of the present application.
- the image pair (1, 2) shown in FIG. 6 may be the first image pair described above.
- the first video frame in the first image pair may be the video frame corresponding to the aforementioned first moment (ie, the first frame), and the second video frame in the first image pair may be the aforementioned video frame corresponding to the second moment ( That is the second frame).
- the value 1 in the image pair (1, 2) is the video frame number of the first frame
- the value 2 is the video frame number of the second frame. Therefore, the video frame number of each video frame in the target video can be used to characterize any two consecutively adjacent video frames in the target video.
- the pixel point display area 600a shown in FIG. 6 may include all the pixels extracted from the first video frame of the image pair.
- each pixel in the pixel display area 600a may correspond to an area identifier.
- the pixel display area 600a in FIG. 6 is only an example, and the pixel display area 600a may also be referred to as a pixel area or the like. It should be understood that the embodiment of the present application takes only 20 pixels obtained from the first video frame as an example; in actual situations, the number of pixels obtained from the first video frame will be far more than the 20 listed in the examples of this application. It should be understood that, since multiple video frames in the same video are obtained by the same terminal through image collection, the number of pixels in each video frame included in the same video is the same.
- the average displacement matrix shown in FIG. 6 can be used to track all pixels in the pixel display area 600a, and the position information of the mapped pixels can be determined in the pixel display area 700a corresponding to the second video frame.
- the position information of pixel A in the pixel display area 600a shown in FIG. 6 may be the coordinate position information of area identifier 5, and the average displacement matrix can be used to map pixel A to the pixel display area 700a shown in FIG. 6; the position information of pixel A in the pixel display area 700a shown in FIG. 6 may be the coordinate position information of area identifier 10.
- after the position information of pixel A in the second video frame is obtained by calculation, it may be stored. Since each image pair in the target video can correspond to an average displacement matrix, the position information of each pixel of the first video frame mapped into the second video frame can be calculated. By integrating the position information of the same pixel in the consecutive video frames of each image pair, the position information of pixel A in each video frame of the target video can be obtained, and the motion trajectory of pixel A can then be obtained based on the position information of pixel A in each video frame of the target video.
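The chaining of per-pair mappings into a single trajectory might be sketched as follows; this is a simplification that looks the displacement up at the rounded current position, and the matrix sizes and values are hypothetical:

```python
import numpy as np

def build_trajectory(start_xy, dx_list, dy_list):
    """Chain the per-image-pair average displacement matrices to obtain
    one pixel's position in every video frame of the video."""
    x, y = start_xy
    track = [(x, y)]
    for dx, dy in zip(dx_list, dy_list):
        xi, yi = int(round(x)), int(round(y))  # nearest-pixel lookup index
        x, y = x + dx[yi, xi], y + dy[yi, xi]  # position in the next frame
        track.append((x, y))
    return track

# two image pairs, each with a constant motion of (+1, +2) pixels per frame
dx_list = [np.ones((8, 8)), np.ones((8, 8))]
dy_list = [np.full((8, 8), 2.0), np.full((8, 8), 2.0)]
track = build_trajectory((3, 3), dx_list, dy_list)
```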
- in this way, pixel tracking can be performed based on the average displacement matrix corresponding to each image pair (i.e., the average displacement matrix corresponding to the first video frame in each image pair).
- the trajectory information corresponding to all pixels in the target video may be collectively referred to as trajectory information corresponding to the pixels in the embodiment of the present application.
- the motion trajectories of all pixels in the target video can be pre-calculated by an application server that has a network connection relationship with the target user terminal. Therefore, when the target user terminal actually plays the target video, the application server receives the position information of the target pixel in the key video frame sent by the target user terminal, filters out, from the pre-calculated trajectory information corresponding to the pixels, the trajectory information that matches the target pixel as the target trajectory information, and further returns the target trajectory information to the target user terminal, so that the target user terminal can execute step S104 based on the acquired target trajectory information.
- the target pixel is a pixel in the key video frame selected by the target user.
- optionally, the motion trajectories of all pixels in the target video can also be pre-calculated in the target user terminal, so that when the target video is actually played on the target user terminal, the trajectory information matching the target pixel can be selected from the trajectory information corresponding to these pixels as the target trajectory information, and step S104 can then be further executed.
- Step S104: When the next video frame of the key video frame is played, display the multimedia information based on the position information, in the target trajectory information, of the target pixel in the next video frame of the key video frame.
- FIG. 7 is a schematic diagram of tracking barrage data in consecutive multiple video frames provided by an embodiment of the present application.
- the multiple consecutive video frames used for barrage tracking in the embodiment of the present application may include the key video frame currently being played and the video frame located after the key video frame in the target video that has not yet been played.
- with the video frame 10 shown in FIG. 7 as a key video frame, barrage tracking is performed, in each video frame following the key video frame (for example, video frame 20, video frame 30, and so on), on the barrage data appearing in video frame 10.
- the video frame 10 shown in FIG. 7 may be the video frame displayed in the video playback interface 300a in the embodiment corresponding to FIG.
- the video frame 10 shown in FIG. 7 may be currently being played in the target user terminal.
- the key video frame in the embodiment of the present application can be understood as the video frame corresponding to the trigger operation performed when the target user selects the target object.
- the target objects in the embodiments of the present application may include objects such as characters, animals, plants, etc., selected by the target user through a click operation in the video frame being played.
- the target user terminal may refer to the object selected by the target user as the target object, may use the pixel corresponding to the trigger operation in the target object in the key video frame as the target pixel, and may then obtain, from the trajectory information pre-calculated by the application server, the target trajectory information corresponding to the target pixel.
- the target trajectory information may include the position information of the target pixel in the key video frame, and may also include the position information of the target pixel in each video frame after the key video frame (for example, the next video frame of the key video frame). It should be understood that, based on the position information of the target pixel in each video frame after the key video frame, the position information, in each video frame after the key video frame, of the multimedia information associated with the target object (and also associated with the target pixel, that is, the barrage data shown in FIG. 5 above) can be quickly calculated, so as to achieve fast tracking of the barrage data associated with the target object. In this way, when the target user terminal plays the next video frame of the key video frame, it can display the barrage data in that video frame in real time based on the calculated position information of the barrage data.
- the display of the barrage data can be similar to the display of subtitles.
- the barrage data can follow the target object like a shadow; that is, the barrage input by the user can effectively track and move with the target object to be tracked. For example, in the target video, if the target object exists in multiple consecutive video frames after the key video frame, the barrage data associated with the target object (that is, the aforementioned text information A) can be displayed at the position information of the target pixel in the target object in these consecutive video frames.
- the target user terminal may also transmit the barrage data (multimedia information) input by the user and the calculated position information of the barrage data in each video frame of the target video to the server.
- the server may receive, from the target user terminal, the frame number in the target video of the key video frame clicked by the user, the coordinates of the target pixel, and the input barrage data (multimedia information); calculate the target trajectory information, i.e., the position of the target pixel in each video frame of the target video; calculate, according to the target trajectory information, the position information of the barrage data in each video frame of the target video; and save the position information of the barrage data.
- the server may also receive information such as the identifier of the target user terminal and/or the user identifier of the user logging in to the target application on the target user terminal. Then, when other user terminals play the target video, the server can send the barrage data, its position information in each video frame of the target video, and the user identifier to the other user terminals, and the other user terminals display the barrage data in each video frame of the target video according to the position information of the barrage data.
- the target user can select the object that the target user thinks needs to be tracked from the currently played video frame when the current time is T1.
- the selected object can be referred to as a target object.
- the target user terminal can filter out the trajectory information associated with the target pixel in the target object based on the pre-calculated trajectory information corresponding to all the pixels in the video, and quickly obtain the target trajectory information corresponding to the target pixel in the target object. It should be understood that the pre-calculated trajectory information corresponding to any pixel in the video can be used to describe the position information of that pixel in each video frame of the video.
- when the target user terminal uses the video frame played at time T1 as the key video frame, the target pixel in the target object can be obtained in the key video frame, and the target trajectory information corresponding to the target pixel can then be obtained, so as to quickly obtain the position information of the target pixel in each video frame after the key video frame, and the multimedia information associated with the target object can be displayed based on the target trajectory information. It can be understood that, if the trajectory formed by the target pixel in each video frame is a circle, the multimedia information associated with the target object can synchronously follow the trajectory in a circle.
- the trajectory information corresponding to each pixel can be obtained in advance, so that when the target video is played in the target user terminal, it can be based on the trigger executed by the target user Operation, the pixel in the target object corresponding to the trigger operation is used as the target pixel to obtain the trajectory information associated with the target pixel as the target trajectory information, which can then be quickly realized based on the acquired target trajectory information Accurate tracking of the multimedia information associated with the target object.
- the corresponding motion trajectories of the target pixels in different objects can be obtained, so that the barrage data associated with different target objects can move along different trajectories, making the association between the barrage data and the object it targets stronger, thereby enriching the visual display effect of the barrage data and improving the flexibility of the barrage data display mode.
- when the trigger operation of the target user on the target video is acquired, the embodiment of the present application may use the video frame corresponding to the trigger operation in the target video as the key video frame, so that the target pixel can be determined from the key video frame, and the multimedia information associated with the target pixel and with the target object where the target pixel is located can be obtained (for example, the multimedia information may be barrage data such as user text, pictures, and expressions in the target video). Further, the trajectory acquisition request corresponding to the target pixel is determined based on the position information of the target pixel in the key video frame; then, based on the trajectory acquisition request, the target trajectory information associated with the position information of the target pixel in the key video frame can be acquired, so that when the next video frame of the key video frame is played, the barrage data associated with the target pixel and the target object can be displayed based on the target trajectory information.
- the embodiment of the present application can further filter out the trajectory information of the target pixel from the trajectory information of all pixels in the key video frame, and use the trajectory information of the selected target pixel as the target trajectory information, so that the display effect of the barrage data can be enriched based on the obtained target trajectory information. For example, for target pixels in different target objects, the obtained target trajectory information may be different, which in turn makes the display effects of the barrage data different.
- the position information of the barrage data in each video frame after the key video frame can be quickly determined.
- the barrage data will be displayed in the target video always following the changes of the target object, which can enrich the visual display effect of the user text in the video and make the association between the barrage data and the target object, or the commented object in the video, closer.
- FIG. 8 is a schematic diagram of another video data processing method according to an embodiment of the present application. This method is mainly used to illustrate the data interaction process between the target user terminal and the application server. The method may include the following steps:
- Step S201: Acquire adjacent first and second video frames from the target video.
- the application server may determine multiple image pairs from the multiple video frames contained in the target video, and each of the multiple image pairs is composed of two adjacent video frames in the target video.
- when the application server performs video preprocessing on the target video, the target video can first be divided into frames, so that the multiple video frames in the target video are split into pictures according to the playback time sequence; that is, the multiple video frames arranged by playback time sequence shown in FIG. 2 can be obtained.
- a picture corresponding to each video frame can be obtained, that is, one image can be regarded as one image frame.
- the application server can use the forward-backward optical flow method to track the pixels in the two video frames in each image pair. For example, for a target video containing n video frames, the application server may determine two video frames with adjacent frame numbers as an image pair according to the video frame number of each video frame in the target video.
- the application server may determine a video frame with a video frame number of 1 and a video frame with a video frame number of 2 as an image pair. Similarly, the application server may determine the video frame with the video frame number of 2 and the video frame with the video frame number of 3 as an image pair. By analogy, the application server may determine a video frame with a video frame number of n-1 and a video frame with a video frame number of n as an image pair.
- n-1 image pairs can be obtained.
- n-1 image pairs can be expressed as: (1, 2), (2, 3), (3, 4),...,(n-1,n).
- the video frame with the video frame number of 1 in the image pair can be called the first frame of the target video
- the video frame with the video frame number of 2 can be called the second frame of the target video, and so on.
- the video frame with the video frame number of n-1 in the image pair can be called the n-1th frame of the target video, and the video frame with the video frame number of n can be called the nth frame of the target video.
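The pairing of adjacent frame numbers described above can be sketched in a few lines; the function name is chosen here for illustration:

```python
def make_image_pairs(n):
    """An n-frame video yields the n-1 image pairs
    (1, 2), (2, 3), ..., (n-1, n) of adjacent frame numbers."""
    return [(k, k + 1) for k in range(1, n)]

pairs = make_image_pairs(5)  # a 5-frame video gives 4 image pairs
```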
- the application server can track the pixels in each image pair of the target video through the forward-backward optical flow method; the forward-backward optical flow method may be referred to simply as the optical flow method, and the optical flow method can be used to calculate the pixel displacement between the two video frames in each image pair.
- since each image pair is composed of two adjacent video frames, one video frame in each image pair can be called the first video frame, and the other video frame in each image pair can be called the second video frame. That is, the embodiment of the present application may collectively refer to the two video frames in each image pair acquired from the target video as the first video frame and the second video frame; in other words, the application server may acquire adjacent first and second video frames from the target video.
- Step S202: Determine an average displacement matrix corresponding to the first video frame based on the optical flow tracking rule corresponding to the target video, the pixels in the first video frame, and the pixels in the second video frame.
- the application server may extract all pixels of the first video frame. All the extracted pixels can be collectively referred to as pixels.
- the optical flow tracking rule corresponding to the target video may include the aforementioned cloud forward and backward optical flow method, and may also include the cloud displacement integration method and the cloud displacement difference method. It should be understood that through this optical flow tracking rule, optical flow calculations can be performed on the pixels in the first video frame and the pixels in the second video frame of each image pair to obtain the optical flow tracking result corresponding to each image pair, so that the target state matrix and target displacement matrix corresponding to each image pair can be determined based on the optical flow tracking result.
- the application server may select, for each pixel in the first video frame, a block around the pixel (including the pixel and the pixels around it), and calculate the average displacement of all pixels in the block as the displacement of the pixel.
- the computational complexity of this processing method may be relatively large.
- the optical flow tracking rule can further perform displacement integration operations on the target state matrix and the target displacement matrix corresponding to each image pair to obtain the state integral matrix and displacement integral matrix corresponding to each image pair. Further, through the optical flow tracking rule, a displacement difference operation can be performed on the state integral matrix and the displacement integral matrix corresponding to each image pair to obtain the average displacement matrix corresponding to each image pair. In other words, the optical flow tracking rule can accurately obtain an average displacement matrix that can be used to accurately track the position information of the pixels in the first video frame of each image pair.
- the application server can calculate the average displacement of the pixels in the first video frame and the pixels in the second video frame in batches, thereby increasing the speed of calculation and improving the processing efficiency for pixels and video frames.
- the cloud forward and backward optical flow method can be used to synchronously perform forward and reverse optical flow calculations on the first video frame and the second video frame in each image pair, to obtain the optical flow tracking result corresponding to each image pair.
- the optical flow tracking result obtained by the application server may include the forward displacement matrix corresponding to the first video frame in each image pair, and may also include the reverse displacement matrix corresponding to the second video frame in each image pair.
- each matrix element in the forward displacement matrix and the reverse displacement matrix may include a displacement in two dimensions (for example, (Δx, Δy)).
- the displacement in these two dimensions can be understood as the displacement of the same pixel in the horizontal direction (ie, Δx) and the displacement in the vertical direction (ie, Δy). It should be understood that for each image pair in the target video, after calculation by the optical flow method, a forward horizontal displacement matrix, a forward vertical displacement matrix, a reverse horizontal displacement matrix, and a reverse vertical displacement matrix can be obtained, and the four matrices obtained can be called the optical flow result. Further, the application server can set an initial state matrix for the first video frame in each image pair, and then can determine, based on the forward displacement matrix and the reverse displacement matrix obtained above, whether the pixels in the first video frame of each image pair meet the target screening condition.
- the application server can determine the pixels that meet the target screening condition as effective pixels, and can then modify the initial state matrix and the forward displacement matrix corresponding to the first video frame according to the determined effective pixels, to obtain the target state matrix and the target displacement matrix corresponding to the first video frame in each image pair.
- the application server can determine and obtain the average displacement matrix corresponding to the first video frame in each image pair through the cloud displacement integration method and the cloud displacement difference method, as well as the obtained target state matrix and target displacement matrix.
- the forward horizontal displacement matrix and the forward vertical displacement matrix may be collectively referred to as the forward displacement matrix
- the reverse horizontal displacement matrix and the reverse vertical displacement matrix may be collectively referred to as the reverse displacement matrix.
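As a minimal illustration of the four matrices named above, consider a hypothetical image pair in which every pixel moves by a uniform (Δx, Δy) between the two frames; the forward and reverse displacement fields are then constant matrices of the same size as the frame. The frame size and motion values are illustrative assumptions:

```python
import numpy as np

h, w = 4, 6          # hypothetical frame size (rows x columns)
dx, dy = 2.0, 1.0    # assumed uniform motion from the first frame to the second frame

# Forward displacement matrices: horizontal and vertical components,
# one matrix element per pixel of the first video frame.
Q12_x = np.full((h, w), dx)
Q12_y = np.full((h, w), dy)

# Reverse displacement matrices: motion of second-frame pixels back toward the first frame.
Q21_x = np.full((h, w), -dx)
Q21_y = np.full((h, w), -dy)

# Each matrix has the same size as the video frame, so one element maps to one pixel.
print(Q12_x.shape == Q21_y.shape == (h, w))  # True
```

In practice the four fields would come from a dense optical flow computation run once forward and once backward; which concrete algorithm the "cloud forward and backward optical flow method" denotes is not specified in the text.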
- the embodiment of the present application takes an image pair of multiple image pairs as an example to illustrate the process of obtaining the average displacement matrix corresponding to the image pair through the first video frame and the second video frame in the image pair.
- the first video frame in the image pair may be a video frame with a video frame number of 1
- the second video frame may be a video frame with a video frame number of 2. Therefore, an image pair composed of a video frame with a video frame number of 1 and a video frame with a video frame number of 2 is called image pair 1, and the image pair 1 can be expressed as (1,2).
- the forward displacement matrix corresponding to the image pair 1 obtained after calculation by the optical flow method may include a forward horizontal displacement matrix (for example, the forward horizontal displacement matrix may be a matrix Q 1,2,x ) and a forward vertical displacement matrix (for example, the forward vertical displacement matrix may be a matrix Q 1,2,y ).
- each matrix element in the matrix Q 1, 2, x can be understood as the horizontal displacement of the pixel in the first video frame in the second video frame. That is, each matrix element in the forward horizontal displacement matrix can be referred to as the first lateral displacement corresponding to the pixel in the first video frame.
- each matrix element in the matrix Q 1, 2, y can be understood as the vertical displacement of the pixel in the first video frame in the second video frame.
- each matrix element in the forward vertical displacement matrix can be referred to as the first longitudinal displacement corresponding to the pixel in the first video frame.
- the matrix size of the two matrices (ie matrix Q 1,2,x and matrix Q 1,2,y ) obtained by the optical flow calculation method is the same as the size of the first video frame, that is, one matrix element can correspond to a pixel in the first video frame.
- the reverse displacement matrix corresponding to the image pair 1 obtained by the optical flow method can include a reverse horizontal displacement matrix (that is, the reverse horizontal displacement matrix can be the matrix Q 2,1,x ) and a reverse vertical displacement matrix (that is, the reverse vertical displacement matrix may be a matrix Q 2,1,y ).
- each matrix element in the matrix Q 2, 1, x can be understood as the horizontal displacement of the pixel in the second video frame in the first video frame. That is, each matrix element in the reverse horizontal displacement matrix can be referred to as the second lateral displacement corresponding to the pixel in the second video frame.
- each matrix element in the matrix Q 2, 1, y can be understood as the vertical displacement of the pixel in the second video frame in the first video frame. That is, each matrix element in the reverse vertical displacement matrix can be referred to as the second longitudinal displacement corresponding to the pixel in the second video frame.
- the matrix size of the two matrices (ie matrix Q 2,1,x and matrix Q 2,1,y ) obtained by the optical flow calculation method is the same as the size of the second video frame, that is, one matrix element can correspond to a pixel in the second video frame.
- the matrix sizes are the same. For example, if the number of pixels in each video frame is m × n, the matrix size of each of the four obtained matrices can be m × n. It can be seen that each matrix element in the forward horizontal displacement matrix and the forward vertical displacement matrix can correspond to the corresponding pixel in the first video frame.
- each matrix element in the forward displacement matrix corresponding to the image pair 1 can represent the displacement of the pixel in the first video frame in two dimensions in the second video frame.
- the forward displacement matrix corresponding to the image pair 1 may be collectively referred to as the forward displacement matrix corresponding to the first video frame.
- each matrix element in the reverse displacement matrix corresponding to image pair 1 may represent the displacement of the pixel in the second video frame in two dimensions in the first video frame.
- the inverse displacement matrix corresponding to the image pair 1 may be collectively referred to as the inverse displacement matrix corresponding to the second video frame.
- the application server may forward-map the pixels in the first video frame to the second video frame based on the first position information of the pixels in the first video frame and the optical flow tracking rule, determine in the second video frame the second position information of the first mapping points obtained by the mapping, and further determine the forward displacement matrix corresponding to the first video frame based on the first position information of the pixels and the second position information of the first mapping points.
- the application server may reversely map the pixels in the second video frame to the first video frame based on the second position information of the pixels in the second video frame and the optical flow tracking rule, determine in the first video frame the third position information of the second mapping points obtained by the mapping, and further determine the reverse displacement matrix corresponding to the second video frame based on the second position information of the first mapping points and the third position information of the second mapping points.
- the first mapping point and the second mapping point are both pixel points obtained by mapping a pixel point in one video frame of the image pair to another video frame by an optical flow method.
- the application server may, based on the first position information of the pixels in the first video frame, the forward displacement matrix, and the reverse displacement matrix, determine the pixels that meet the target screening condition among the pixels as effective pixels.
- the specific process of determining effective pixels by the application server can be described as:
- the application server may obtain a first pixel from the pixels in the first video frame, determine the first position information of the first pixel in the first video frame, and determine the first lateral displacement and the first longitudinal displacement corresponding to the first pixel from the forward displacement matrix; further, the application server may forward-map the first pixel to the second video frame based on the first position information of the first pixel and the first lateral displacement and first longitudinal displacement corresponding to the first pixel, and determine the second position information of the mapped second pixel in the second video frame;
- the application server may determine the second lateral displacement and the second longitudinal displacement corresponding to the second pixel from the reverse displacement matrix, reversely map the second pixel back to the first video frame based on the second position information of the second pixel and the second lateral displacement and second longitudinal displacement corresponding to the second pixel, and determine the third position information of the mapped third pixel in the first video frame;
- further, the application server may determine the error distance between the first pixel and the third pixel based on the first position information of the first pixel and the third position information of the third pixel, and determine the correlation coefficient between the image block containing the first pixel and the image block containing the second pixel according to the first position information of the first pixel and the second position information of the second pixel; further, the application server may determine the pixels whose error distance is less than the error distance threshold and whose correlation coefficient is greater than the correlation coefficient threshold as effective pixels.
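The two screening criteria just described — the forward-backward error distance and the patch correlation coefficient — can be sketched as follows. The function names, the nearest-neighbour rounding used when reading the reverse flow, the block half-size, and the thresholds are illustrative assumptions, not details fixed by the text:

```python
import numpy as np

def correlation_coefficient(patch1, patch2):
    # Normalized cross-correlation between two equally sized image blocks:
    # each block is centered by its mean pixel value (the role of E(patch) above).
    a = patch1 - patch1.mean()
    b = patch2 - patch2.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return (a * b).sum() / denom if denom > 0 else 0.0

def is_effective_pixel(x, y, fwd, bwd, frame1, frame2, k=4,
                       dist_thresh=1.0, corr_thresh=0.8):
    """Screen one pixel of the first video frame.

    fwd[y, x] = (dx, dy): forward displacement of pixel (x, y) in frame 1.
    bwd[y, x] = (dx, dy): reverse displacement of pixel (x, y) in frame 2.
    k controls the half-size of the (2k) x (2k) comparison block.
    """
    h, w = frame1.shape
    # Forward-map the first pixel into the second frame.
    x2 = x + fwd[y, x, 0]
    y2 = y + fwd[y, x, 1]
    xi, yi = int(round(x2)), int(round(y2))
    if not (0 <= xi < w and 0 <= yi < h):
        return False
    # Reverse-map the second pixel back into the first frame.
    x1b = x2 + bwd[yi, xi, 0]
    y1b = y2 + bwd[yi, xi, 1]
    # Criterion 1: error distance between the original and round-tripped position.
    err = np.hypot(x1b - x, y1b - y)
    # Criterion 2: correlation between the blocks around the two positions.
    if y - k < 0 or y + k > h or x - k < 0 or x + k > w \
            or yi - k < 0 or yi + k > h or xi - k < 0 or xi + k > w:
        return False  # block would leave the frame; treated as invalid here
    p1 = frame1[y - k:y + k, x - k:x + k]
    p2 = frame2[yi - k:yi + k, xi - k:xi + k]
    corr = correlation_coefficient(p1, p2)
    return err < dist_thresh and corr > corr_thresh
```

For a synthetic pair where the second frame is the first shifted by a known offset, every interior pixel passes both tests, while a corrupted forward displacement fails the error-distance test.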
- the embodiment of the present application may screen the matrix elements in the four displacement matrices; that is, through the changes of the matrix elements at the positions of the corresponding pixels in the constructed initial state matrix, the matrix elements with large displacement errors at the positions of the corresponding pixels can be removed from these four matrices, so that the effective pixels can be determined from the pixels of the first video frame.
- FIG. 9 is a schematic diagram of a method for determining effective pixels provided by an embodiment of the present application.
- the application server may first initialize a state matrix S 1 with the same size as the first video frame before screening the matrix elements in the four matrices.
- the application server may call the state matrix S 1 an initial state matrix.
- the value of the matrix element corresponding to each pixel point may be referred to as the first value.
- the first values in the initial state matrix are all zero.
- the change of the value of the matrix element in the initial state matrix can be used to indicate whether the pixel in the first video frame meets the target filter condition, so that the pixel that meets the target filter condition can be used as the effective tracking pixel (that is, the effective pixel point).
- the first video frame shown in FIG. 9 may be the video frame whose video frame number is 1 in the aforementioned image pair 1.
- the pixels in the first video frame may include the first pixel p1 as shown in FIG. 9, that is, the first pixel p1 may be one of all the pixels in the first video frame, and the position information of the first pixel p1 in the first video frame may be called the first position information.
- the application server can find the first lateral displacement corresponding to the first pixel p1 from the forward horizontal displacement matrix in the forward displacement matrix, and find the first longitudinal displacement corresponding to the first pixel p1 from the forward vertical displacement matrix in the forward displacement matrix.
- further, based on the first position information of the first pixel p1 and the first lateral displacement and first longitudinal displacement corresponding to the first pixel p1, the first pixel p1 can be forward-mapped to the second video frame shown in FIG. 9, and the second position information of the mapped second pixel p2 can be determined in the second video frame. It can be understood that, at this time, the second pixel p2 is a pixel obtained by matrix transformation of the first pixel p1.
- further, the application server may determine the second lateral displacement and the second longitudinal displacement corresponding to the second pixel p2 from the above-mentioned reverse displacement matrix, reversely map the second pixel p2 back to the first video frame shown in FIG. 9 based on the second position information of the second pixel p2 and the second lateral displacement and second longitudinal displacement corresponding to the second pixel p2, and determine the third position information of the mapped third pixel p1' in the first video frame. It can be understood that, at this time, the third pixel p1' is a pixel obtained by matrix transformation of the second pixel p2, which was in turn obtained by mapping the first pixel p1.
- the application server may determine, in the first video frame, the error distance t 1,1' between the first position information of the first pixel p1 and the third position information of the third pixel p1' obtained after matrix transformation. Further, the application server may select an image block 10 with a size of k*k pixels (for example, 8*8 pixels) in the first video frame shown in FIG. 9, centered on the first position information of the first pixel p1. In addition, as shown in FIG. 9, the application server can also select an image block 20 with a size of k*k pixels in the second video frame shown in FIG. 9, centered on the second position information of the second pixel p2, and then the correlation coefficient between the two image blocks can be calculated (the correlation coefficient may be N 1,2 ).
- patch 1 (a, b) in formula (1) can represent the pixel value of the pixel at the position of the ath row and bth column of the image block 10 shown in FIG. 9.
- the pixel value may be the gray value of the pixel, which is between 0 and 255.
- E(patch 1 ) represents the average pixel value of the image block 10 shown in FIG. 9.
- patch 2 (a, b) represents the pixel value of the pixel at the position of the ath row and bth column of the image block 20 shown in FIG. 9.
- E(patch 2 ) represents the average pixel value of the image block 20 shown in FIG. 9.
- if the error distance is less than the error distance threshold and the correlation coefficient is greater than the correlation coefficient threshold, the application server may set the matrix element at the position corresponding to the first pixel p1 in the initial state matrix S 1 to a second value.
- that is, the value of the matrix element corresponding to the first pixel p1 in the initial state matrix S 1 is switched from 0 to 1, to indicate that the first pixel p1 in the first video frame is an effective pixel.
- otherwise, the application server can determine that the first pixel p1 shown in FIG. 9 is an invalid tracking pixel.
- that is, the application server may set the value of the matrix element at the position corresponding to the first pixel p1 in the above-mentioned forward displacement matrix (ie, the above-mentioned matrix Q 1,2,x and matrix Q 1,2,y ) to 0, so that the forward displacement matrix whose invalid entries contain the first value can be determined as the target displacement matrix (that is, the target horizontal displacement matrix Q x,1 and the target vertical displacement matrix Q y,1 ). In other words, the target displacement matrix can be understood as the matrix determined after screening the above-mentioned forward displacement matrix to remove the mistracked displacements with larger errors.
- further, other pixels can be selected in turn from the first video frame shown in FIG. 9 as the first pixel, in order to repeat the above steps of determining effective pixels, until all pixels in the first video frame have been taken as the first pixel; at that point, all effective pixels in the first video frame have been determined. Thereby, the matrix elements in the initial state matrix can be updated based on the position information of the effective pixels, the initial state matrix containing the second values can be determined as the target state matrix S 1 corresponding to the first video frame, and the target displacement matrix corresponding to the first video frame (that is, the target horizontal displacement matrix Q x,1 and the target vertical displacement matrix Q y,1 ) can be obtained.
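Once every pixel has been screened, the update described above — switching state entries from the first value 0 to the second value 1 for effective pixels, and zeroing the displacement entries of invalid ones — amounts to a masking step. A minimal sketch, with `valid` standing for the screening result and `target_matrices` a hypothetical helper name:

```python
import numpy as np

def target_matrices(valid, Q12_x, Q12_y):
    """Build the target state matrix S1 and target displacement matrices Qx,1 / Qy,1.

    valid: boolean array, True where a pixel of the first frame passed screening.
    """
    S1 = valid.astype(np.int64)        # initial zeros switched to 1 at effective pixels
    Qx1 = np.where(valid, Q12_x, 0.0)  # mistracked displacements are zeroed out
    Qy1 = np.where(valid, Q12_y, 0.0)
    return S1, Qx1, Qy1

valid = np.array([[True, False], [True, True]])
S1, Qx1, Qy1 = target_matrices(valid, np.full((2, 2), 2.0), np.full((2, 2), 1.0))
print(S1)  # [[1 0]
           #  [1 1]]
```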
- the composed image pairs can be expressed as (1, 2) , (2, 3), (3, 4), ..., (n-1, n).
- the target state matrix S 1 corresponding to the image pair (1, 2) and the target displacement matrix Q 1 corresponding to the image pair (1, 2) (that is, the aforementioned target horizontal displacement matrix Q x,1 and the target vertical displacement matrix Q y,1 ) can be finally obtained through the above-mentioned effective pixel judgment method.
- by analogy, the target state matrix S n-1 corresponding to the image pair (n-1, n) and the target displacement matrix Q n-1 corresponding to the image pair (n-1, n) (that is, the aforementioned target horizontal displacement matrix Q x,n-1 and the target vertical displacement matrix Q y,n-1 ) can be obtained.
- further, the application server can perform a displacement integration operation on the target state matrix S 1 and the target displacement matrix Q 1 corresponding to the image pair through the cloud displacement integration method, to obtain the state integral matrix S in (x, y) and the displacement integral matrix Q in (x, y) corresponding to the pixels of the first video frame.
- the displacement integral matrix Q in (x, y) may include a lateral displacement integral matrix Q x,in (x,y) and a longitudinal displacement integral matrix Q y,in (x,y).
- the state integral matrix S in (x, y), the lateral displacement integral matrix Q x, in (x, y) and the longitudinal displacement integral matrix Q y, in (x, y) can be obtained by the following matrix integral formula:
- the x and y in formula (2), formula (3), and formula (4) can be used to represent the coordinates of all matrix elements in the state integral matrix and displacement integral matrix corresponding to the first video frame; for example, S in (x, y) can represent the value of the matrix element in the xth row and yth column of the state integral matrix.
- x' and y' in formula (2), formula (3), and formula (4) can represent the coordinates of matrix elements in the target state matrix and the target displacement matrix; for example, S(x', y') represents the value of the matrix element in the x'th row and y'th column of the target state matrix.
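Formulas (2)-(4) are not reproduced in this text, but the description — each entry of an integral matrix sums all target-matrix entries at coordinates x' ≤ x and y' ≤ y — is the standard summed-area (integral image) construction, which a cumulative sum along both axes implements. A sketch under that reading:

```python
import numpy as np

def integral_matrix(M):
    """S_in(x, y) = sum of M(x', y') over all x' <= x and y' <= y (a summed-area table)."""
    return M.cumsum(axis=0).cumsum(axis=1)

S = np.array([[1, 0], [1, 1]])   # example target state matrix
S_in = integral_matrix(S)
print(S_in)  # [[1 1]
             #  [2 3]]
```

The same operation applied to the target horizontal and vertical displacement matrices yields the lateral and longitudinal displacement integral matrices Q x,in and Q y,in.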
- further, the application server can select a target frame with a height of M and a width of N in the first video frame through the cloud displacement difference method, and then can perform a displacement difference operation, within the target frame, on the three integral matrices obtained by formula (2), formula (3), and formula (4), to obtain the state difference matrix S dif (x, y) and the displacement difference matrix Q dif (x, y), respectively.
- the target frame is used to select all the pixels in a certain area around each pixel in order to calculate the average displacement; for example, the size of the area may be 80 × 80 pixels.
- the displacement difference matrix Q dif (x, y) may include a lateral displacement difference matrix Q x,dif (x, y) and a longitudinal displacement difference matrix Q y,dif (x, y).
- the state difference matrix S dif (x, y), the lateral displacement difference matrix Q x,dif (x, y) and the longitudinal displacement difference matrix Q y,dif (x, y) can be obtained by the following matrix difference formula (5).
- the area where the target frame is located in the first video frame may be referred to as a differential area, and the average displacement matrix corresponding to the first video frame can be determined based on the size information of the differential area, the state integral matrix, the lateral displacement integral matrix, and the longitudinal displacement integral matrix.
- M and N in the displacement difference calculation formula are the height and width values of the differential area.
- x and y in the displacement difference calculation formula are respectively the position information of each pixel in the first video frame.
- that is, the state difference matrix S dif (x, y) corresponding to the state integral matrix S in (x, y) can be obtained through the matrix difference formula.
- similarly, the lateral displacement difference matrix Q x,dif (x, y) and the longitudinal displacement difference matrix Q y,dif (x, y) can be obtained from the lateral displacement integral matrix Q x,in (x, y) and the longitudinal displacement integral matrix Q y,in (x, y).
- further, the application server may determine the ratio between the lateral displacement difference matrix Q x,dif (x, y) and the state difference matrix S dif (x, y) as the horizontal average displacement matrix Q x,F (x, y), and determine the ratio between the longitudinal displacement difference matrix Q y,dif (x, y) and the state difference matrix S dif (x, y) as the longitudinal average displacement matrix Q y,F (x, y).
- the e in formula (6) and formula (7) is used to represent a manually set, relatively small number, such as 0.001; that is, the e in formula (6) and formula (7) is used to avoid direct division by 0 when the values of all matrix elements in the state difference matrix S dif (x, y) are 0. Step S203 can then be further executed, so that the position information of the pixels of the first video frame in the second video frame can be pre-calculated for the target user terminal.
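Formulas (5)-(7) are likewise not reproduced, but the description matches the usual box-filter-via-integral-image pattern: the sum of a matrix over an M×N window around each pixel is obtained from four integral-matrix lookups, and the average displacement is the windowed displacement sum divided by the windowed count of effective pixels, with a small e guarding against division by zero. The border handling (clipping windows to the frame) is an illustrative assumption:

```python
import numpy as np

def box_sum(M_int, half_h, half_w):
    """Sum of the source matrix over a window around each entry, from its integral matrix.
    Windows at the border are clipped to the frame (an assumed convention)."""
    h, w = M_int.shape
    P = np.zeros((h + 1, w + 1))
    P[1:, 1:] = M_int                          # integral image with a zero border
    r0 = np.clip(np.arange(h) - half_h, 0, h)  # top edge of each window
    r1 = np.clip(np.arange(h) + half_h + 1, 0, h)
    c0 = np.clip(np.arange(w) - half_w, 0, w)
    c1 = np.clip(np.arange(w) + half_w + 1, 0, w)
    # Four-corner lookup of the summed-area table.
    return P[r1][:, c1] - P[r0][:, c1] - P[r1][:, c0] + P[r0][:, c0]

def average_displacement(S, Qx, Qy, half_h=1, half_w=1, e=0.001):
    """Q_x,F and Q_y,F: windowed displacement sums divided by the windowed state sum."""
    S_dif = box_sum(S.cumsum(0).cumsum(1), half_h, half_w)
    Qx_dif = box_sum(Qx.cumsum(0).cumsum(1), half_h, half_w)
    Qy_dif = box_sum(Qy.cumsum(0).cumsum(1), half_h, half_w)
    return Qx_dif / (S_dif + e), Qy_dif / (S_dif + e)
```

With every pixel marked effective and a uniform displacement of (2, 1), the recovered average displacement is (approximately) 2 and 1 everywhere, the small deviation coming from the e term.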
- Step S203 Track the position information of the pixels in the first video frame based on the average displacement matrix, and determine the position information of the tracked pixels in the second video frame.
- the application server may, based on the average displacement matrix obtained in step S202 (the average displacement matrix may include the horizontal average displacement matrix Q x,F (x, y) and the longitudinal average displacement matrix Q y,F (x, y)), quickly and accurately track the position information at which the pixels in the first video frame appear in the next video frame (that is, the second video frame in the above image pair 1); that is, by performing displacement transformation, the position information of the tracked pixels is determined in the second video frame.
- x in formula (8) is the horizontal position coordinate of the pixel in the first video frame
- Q x, F (x, y) is the horizontal average displacement matrix corresponding to the first video frame
- through formula (8), the horizontal position coordinates of the pixels in the first video frame can be coordinate-transformed to obtain the horizontal position coordinates of these pixels in the next video frame.
- y in formula (9) is the longitudinal position coordinate of the pixel in the first video frame
- Q y,F (x, y) in formula (9) is the longitudinal average displacement matrix corresponding to the first video frame.
- through formula (9), the longitudinal position coordinates of the pixels in the first video frame can be coordinate-transformed to obtain the longitudinal position coordinates of these pixels in the next video frame.
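Formulas (8) and (9) as described above reduce to adding each pixel's average displacement to its coordinates; under that reading a vectorized version is:

```python
import numpy as np

def track_positions(QxF, QyF):
    """For every pixel (x, y) of the first frame, return its predicted coordinates
    in the second frame: x' = x + Qx,F(x, y) and y' = y + Qy,F(x, y)."""
    h, w = QxF.shape
    ys, xs = np.mgrid[0:h, 0:w]   # per-pixel coordinate grids
    return xs + QxF, ys + QyF

QxF = np.full((2, 3), 2.0)
QyF = np.full((2, 3), 1.0)
x2, y2 = track_positions(QxF, QyF)
print(x2[0])  # [2. 3. 4.] -- every x coordinate shifted by 2
```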
- the pixels in the first video frame in the corresponding image pair can be quickly tracked.
- the position coordinates of the tracked pixel can be determined in the second video frame of the corresponding image pair, that is, the position information of the tracked pixel can be determined in the second video frame of each image pair.
- the application server may further store the position information of the pixel points tracked in each image pair, so that step S204 may be further executed.
- Step S204 Generate track information associated with the target video based on the position information of the pixels in the first video frame and the position information of the pixels obtained by the tracking in the second video frame.
- the track information includes target track information used to track and display multimedia information associated with the target object in the target video.
- Step S205 In response to a trigger operation on the target video, a target pixel is determined from a key video frame of the target video, and multimedia information associated with the target pixel is acquired.
- Step S206 Determine a track acquisition request corresponding to the target pixel based on the position information of the target pixel in the key video frame.
- For the specific implementation of step S205 and step S206, reference may be made to the description of the target user terminal in the embodiment corresponding to FIG. 4, which will not be repeated here.
- Step S207 In response to the request for acquiring the track of the target pixel in the key video frame, acquiring track information associated with the target video.
- the application server may receive the trajectory acquisition request sent by the target user terminal based on the target pixel in the key video frame, and may further acquire the trajectory information, calculated in advance by the application server, associated with all the pixels in the target video, in order to further execute step S208.
- Step S208 Filter the target trajectory information associated with the position information of the target pixel in the key video frame from the trajectory information associated with the target video, and return the target trajectory information.
- the application server can obtain, from the trajectory acquisition request, the video frame number of the key video frame in the target video and the position information of the target pixel in the key video frame, so that it can further screen, from the trajectory information associated with the target video that the application server obtained in advance, the trajectory information associated with the position information of the target pixel; the trajectory information obtained by the screening can be called target trajectory information, which can then be returned to the target user terminal, so that the target user terminal can, based on the frame number of the key video frame, quickly find from the received target trajectory information the position information at which the target pixel appears in the video frame next to the key video frame, and so on, until the position information at which the target pixel appears in each video frame after the key video frame is obtained.
- based on the target trajectory information, the target user terminal can track and display the target pixel in each video frame after the key video frame.
- it should be understood that when the application server obtains the frame number of the key video frame, it can quickly find the position information of the target pixel in the video frame next to the key video frame from the screened trajectory information, until the position information of the target pixel in each video frame after the key video frame is obtained; at this time, the application server calls the new trajectory information, formed by the position information of the target pixel in each video frame after the key video frame, the target trajectory information.
- optionally, the target user terminal may send the trajectory acquisition request to the application server when generating the trajectory acquisition request corresponding to the target pixel, so that the application server can obtain, based on the trajectory acquisition request, the target trajectory information associated with the position information of the target pixel in the key video frame, and can return the obtained target trajectory information to the target user terminal;
- optionally, the target user terminal may itself execute the above steps S201 to S204, so as to perform, in advance, full-image pixel tracking of all pixels in the target video in the target user terminal, and obtain in advance the position information of all pixels of the target video in each video frame.
- the position information of each pixel of the target video in each video frame can then be integrated to obtain the trajectory information corresponding to each pixel of the target video.
- in this way, when the trajectory acquisition request corresponding to the target pixel is generated, the target user terminal can directly obtain, in the target user terminal, the target trajectory information associated with the position information of the target pixel (in the target object) in the key video frame, so that step S209 can be further executed.
- the target trajectory information includes the position information of the target pixel in the video frame next to the key video frame; the position information in the video frame next to the key video frame is obtained by tracking the target object.
- In other words, the application server can pre-process each video frame of the target video: based on the above optical flow tracking rules, it can determine the average displacement matrix corresponding to each image pair composed of every two adjacent video frames in the target video (also called the average displacement matrix corresponding to the first video frame of each image pair), track all pixels of the first video frame with that matrix to obtain their position information in the second video frame, and thereby obtain the position information of all pixels of the target video in each video frame (that is, in the above-mentioned video frame a, video frame b, video frame c, video frame d, video frame e, and video frame f). From the position information of all pixels of the target video in each video frame, the track information corresponding to all pixels of the target video can be obtained; this track information is called the track information associated with the target video.
- For example, the application server may pre-calculate the track information corresponding to a pixel point A in the target video (the pixel point A may be any one of all the pixels in the target video). If the track information corresponding to the pixel point A includes the position information of the pixel point A in each video frame of the target video (that is, in the above-mentioned video frame a, video frame b, video frame c, video frame d, video frame e, and video frame f), then, when the key video frame corresponding to the target pixel in the target user terminal is video frame c of the target video, the pixel point A of the target object in video frame c can be used as the target pixel. In this case, the track information of the pixel point A can be filtered out in the application server, and based on the filtered track information of pixel point A, the position information of the pixel point A in each video frame after the key video frame (i.e., video frame d, video frame e, and video frame f) can be obtained.
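The lookup just described can be sketched as follows (an illustrative example, not the patent's implementation; trajectories are modeled as per-pixel dicts mapping frame index to position):

```python
def filter_target_trajectory(all_trajectories, key_frame_idx, target_pos):
    """From pre-computed trajectories (one dict per pixel, mapping frame
    index to a (y, x) position), pick the trajectory whose position in the
    key video frame matches the clicked target pixel, then keep only the
    positions in the frames after the key video frame."""
    for track in all_trajectories:
        if track.get(key_frame_idx) == target_pos:
            return {f: p for f, p in track.items() if f > key_frame_idx}
    return None  # no pixel of the key video frame matches the click
```

For pixel A tracked over frames a..f (indices 0..5) with key video frame c (index 2), the result contains only the positions in frames d, e, and f.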
- In other words, the target trajectory information obtained by the target user terminal may be the pre-calculated trajectory information itself; in this case, the target trajectory information includes the position information of the target pixel in each of the aforementioned video frame a to video frame f. Alternatively, the target trajectory information obtained by the target user terminal may be composed of partial position information determined from the pre-calculated trajectory information; in this case, the target trajectory information includes the position information of the target pixel in video frame d, video frame e, and video frame f, and the position information of the target pixel in video frame d, video frame e, and video frame f can be called partial position information.
- Optionally, the target user terminal may also collectively refer to the trajectory information found by the application server that contains the position information of the target pixel in the key video frame (that is, the trajectory information corresponding to the pixel point A) as target trajectory information. That is, the target trajectory information can be regarded as the trajectory information, found from all the pixels of the target video, that corresponds to the pixel point A matching the target pixel. Since this trajectory information contains the position information of the pixel point A in each video frame of the target video, the position information of the target pixel in each video frame after the key video frame can naturally also be quickly obtained from it.
- Step S209: when the next video frame of the key video frame is played, display the multimedia information based on the position information, carried in the target track information, of the target pixel in the next video frame of the key video frame.
- It can be seen that, when the target pixel in the target object selected by the target user is obtained, the embodiment of the present application can filter out, from the pre-calculated trajectory information corresponding to all pixels, the trajectory information associated with the position information of the target pixel in the key video frame, and the filtered trajectory information can be called target trajectory information. Because the embodiment of the present application can perform pixel tracking on the pixels of each video frame of the video in advance, once the average displacement matrix corresponding to the first video frame of each image pair is obtained, the position information of each pixel of the video in the corresponding video frame can be quickly obtained. The pre-calculated position information of each pixel in the corresponding video frame can be used to characterize the position information, in the corresponding video frame, of each pixel of the video played in the current video playback interface. Therefore, when the target user terminal obtains the target pixel in the target object and the multimedia information associated with the target object, the trajectory information corresponding to the target pixel can be quickly filtered out from the trajectory information corresponding to all pixels and called target trajectory information; the target trajectory information can then be returned to the target user terminal, so that the target user terminal can track and display the multimedia information (for example, barrage data) based on the position information, carried in the target trajectory information, of the target pixel in each video frame after the key video frame. That is, the barrage data can be displayed in the target user terminal along this trajectory.
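As an illustrative sketch (names hypothetical, not the patent's implementation), a player consuming the target track information would simply look up, for each frame after the key video frame, where to draw the barrage:

```python
def barrage_positions(target_trajectory, key_frame_idx, num_frames):
    """For each frame after the key video frame, return where the barrage
    should be drawn; frames with no tracked position show nothing."""
    placements = {}
    for frame_idx in range(key_frame_idx + 1, num_frames):
        pos = target_trajectory.get(frame_idx)
        if pos is not None:
            placements[frame_idx] = pos  # draw the barrage text anchored at (y, x)
    return placements
```

Because the anchor comes from the trajectory rather than a fixed screen location, the barrage moves with the target object rather than scrolling independently of it.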
- FIG. 10 is a schematic diagram of displaying barrage data based on trajectory information according to an embodiment of the present application.
- The video frame 100 shown in FIG. 10 may contain multiple objects, for example, object 1, object 2, and object 3 shown in FIG. 10. If the target user selects the object 1 shown in FIG. 10 as the target object in the target user terminal, the video frame 100 can be called a key video frame, and the pixel corresponding to the target user's trigger operation in the target object can be called the target pixel. If the target user terminal has strong computing performance, the position information of each pixel of the target video in each video frame can be pre-calculated in the target user terminal, so that the target user terminal can obtain the track information associated with the target video.
- Among this track information, the track information 1 shown in FIG. 10 can be obtained by pre-calculation, that is, the position information in the track information 1 is determined by the position information of a pixel of the target video in each video frame of the target video. Therefore, based on the position information of the target pixel in the object 1, the target user terminal can quickly take the track information 1 shown in FIG. 10 as the target track information, so that the multimedia information associated with the target object (i.e., object 1) (i.e., the barrage data 1 shown in FIG. 10, reading "BBBBB") can be quickly tracked and displayed based on the position information of object 1 in each video frame after the key video frame (i.e., the video frame 200 and the video frame 300 shown in FIG. 10) in the track information 1. That is, the barrage data displayed in the video frame 200 and the video frame 300 shown in FIG. 10 are both determined by the position information in the track information 1 shown in FIG. 10.
- Optionally, the trajectory information associated with the target video shown in FIG. 10 may also be pre-calculated by the application server, so that when the application server receives the trajectory acquisition request for the target pixel in the object 1, it can quickly obtain, from the trajectory information associated with the target video shown in FIG. 10, the trajectory information associated with the position information of the target pixel in the key video frame. That is, by executing the full-image pixel tracking of all pixels of the target video in the application server, the amount of calculation of the target user terminal can be effectively reduced, ensuring that, when the target user terminal obtains the trajectory information 1 shown in FIG. 10, the barrage data 1 shown in FIG. 10 can be quickly tracked and displayed, thereby improving the flexibility of barrage data display.
- In the embodiment of the present application, when the trigger operation of the target user on the target video is acquired, the video frame corresponding to the trigger operation in the target video may be called a key video frame, so that the target pixel can be determined from the key video frame, and the multimedia information associated with the target pixel and the target object where the target pixel is located can be obtained (for example, the multimedia information may be barrage data such as user text, pictures, and expressions in the target video). Then, the trajectory acquisition request corresponding to the target pixel is determined based on the position information of the target pixel in the key video frame, and the target trajectory information associated with the position information of the target pixel in the key video frame can be acquired based on the trajectory acquisition request, so that when the next video frame of the key video frame is played, the barrage data associated with the target pixel and the target object where the target pixel is located can be displayed based on the target track information. It can be seen that, when the key video frame is determined, the embodiment of the present application can further filter out the trajectory information of the target pixel from the trajectory information of all pixels in the key video frame, and take the trajectory information of the selected target pixel as the target trajectory information, so that the display effect of barrage data can be enriched based on the obtained target trajectory information. For different target pixels, the obtained target trajectory information may be different, and the display effect of the barrage data will therefore be different. Moreover, based on the target trajectory information, the position information of the barrage data in each video frame after the key video frame can be quickly determined. In other words, the barrage data displayed in the target video will always change with the target object, which can enrich the visual display effect of the user text in the video and make the barrage data more closely related to the target object or to the commented object in the video.
- FIG. 11 is a schematic structural diagram of a video data processing apparatus provided by an embodiment of the present application.
- the video data processing apparatus 1 can be applied to the target user terminal in the embodiment corresponding to FIG. 1 above.
- the video data processing device 1 may include: an object determination module 1101, a request determination module 1102, a trajectory acquisition module 1103, and a text display module 1104;
- the object determining module 1101 is configured to determine a target pixel from a key video frame of the target video in response to a trigger operation on the target video, and obtain multimedia information associated with the target pixel, wherein the key A video frame is a video frame where the trigger operation is located, and the target pixel is a pixel in the key video frame corresponding to the trigger operation;
- a request determination module 1102 configured to determine a trajectory acquisition request corresponding to the target pixel based on the position information of the target pixel in the key video frame;
- the trajectory acquisition module 1103 is configured to acquire target trajectory information associated with the position information of the target pixel in the key video frame based on the trajectory acquisition request, wherein the target trajectory information includes the target pixel Position information of a point in a video frame next to the key video frame, and the position information of the target pixel in a video frame next to the key video frame is obtained by tracking the target pixel;
- the text display module 1104 is configured to, when the next video frame of the key video frame is played, based on the position information of the target pixel in the target track information in the next video frame of the key video frame, Display the multimedia information.
- For the specific implementation of the object determination module 1101, the request determination module 1102, the trajectory acquisition module 1103, and the text display module 1104, refer to the description of step S101 to step S104 in the embodiment corresponding to FIG. 4, which will not be repeated here.
- the embodiment of the present application may use the video frame corresponding to the trigger operation in the target video as the key video frame when the trigger operation of the target user on the target video is acquired, so that the target pixel can be determined from the key video frame and obtained Multimedia information associated with the target pixel and the target object where the target pixel is located (for example, the multimedia information may be barrage data such as user text, pictures, and expressions in the target video).
- Then, the trajectory acquisition request corresponding to the target pixel is determined based on the position information of the target pixel in the key video frame, and the target trajectory information associated with the position information of the target pixel in the key video frame can be acquired based on the trajectory acquisition request, so that when the next video frame of the key video frame is played, the barrage data associated with the target pixel and the target object where the target pixel is located can be displayed based on the target track information. It can be seen that, when the key video frame is determined, the embodiment of the present application can further filter out the trajectory information of the target pixel from the trajectory information of all pixels in the key video frame, and use the trajectory information of the selected target pixel as the target trajectory information, so that the display effect of the barrage data can be enriched based on the obtained target trajectory information. For different target pixels, the obtained target trajectory information may be different, which in turn makes the display effect of the barrage data different. Based on the target trajectory information, the position information of the barrage data in each video frame after the key video frame can be quickly determined. In other words, the barrage data displayed in the target video will always change with the target object, which can enrich the visual display effect of the user text in the video and make the barrage data more closely related to the target object or to the commented object in the video.
- FIG. 12 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
- the computer device 1000 may be the target user terminal in the embodiment corresponding to FIG. 1 above.
- the foregoing computer device 1000 may include: a processor 1001, a network interface 1004, and a memory 1005.
- the aforementioned computer device 1000 may further include: a user interface 1003 and at least one communication bus 1002.
- the communication bus 1002 is used to implement connection and communication between these components.
- the user interface 1003 may include a display (Display) and a keyboard (Keyboard).
- the optional user interface 1003 may also include a standard wired interface and a wireless interface.
- the network interface 1004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
- The memory 1005 may be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory.
- the memory 1005 may also be at least one storage device located far away from the foregoing processor 1001. As shown in FIG. 12, the memory 1005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a device control application program.
- In the computer device 1000 shown in FIG. 12, the network interface 1004 can provide network communication functions; the user interface 1003 is mainly used to provide an input interface for the user; and the processor 1001 can be used to call the device control application program stored in the memory 1005 to implement the following:
- in response to a trigger operation on the target video, determine a target pixel from a key video frame of the target video, and acquire multimedia information associated with the target pixel, wherein the key video frame is the video frame where the trigger operation is located, and the target pixel is the pixel in the key video frame corresponding to the trigger operation;
- determine a trajectory acquisition request corresponding to the target pixel based on the position information of the target pixel in the key video frame;
- acquire, based on the trajectory acquisition request, target trajectory information associated with the position information of the target pixel in the key video frame, wherein the target trajectory information includes the position information of the target pixel in the video frame next to the key video frame, and the position information of the target pixel in the video frame next to the key video frame is obtained by tracking the target pixel;
- when the next video frame of the key video frame is played, display the multimedia information based on the position information, in the target track information, of the target pixel in the next video frame of the key video frame.
- It should be understood that the computer device 1000 described in the embodiment of the present application can perform the description of the foregoing video data processing method in the embodiment corresponding to FIG. 4, and can also perform the description of the foregoing video data processing apparatus 1 in the embodiment corresponding to FIG. 11, which will not be repeated here. In addition, the description of the beneficial effects of using the same method will not be repeated.
- Furthermore, the embodiments of the present application also provide a computer storage medium, where the computer storage medium stores the aforementioned computer program executed by the video data processing apparatus 1, and the computer program includes program instructions. When the processor executes the program instructions, it can perform the description of the video data processing method in the embodiment corresponding to FIG. 4, which will not be repeated here. In addition, the description of the beneficial effects of using the same method will not be repeated. For the embodiments of the computer storage media involved in this application, please refer to the description of the method embodiments of this application.
- FIG. 13 is a schematic structural diagram of another video data processing apparatus provided by an embodiment of the present application.
- the video data processing apparatus 2 may be applied to the application server in the embodiment corresponding to FIG. 8, and the application server may be the business server 2000 in the embodiment corresponding to FIG. 1.
- the video data processing device 2 may include: a request response module 1301 and a track screening module 1302;
- the request response module 1301 is configured to obtain trajectory information associated with the target video in response to a trajectory acquisition request for a target pixel in a key video frame, where the key video frame is a video frame in the target video, The target pixel is a pixel in the key video frame, and the track information is determined by pixel position information in each video frame in the target video;
- the trajectory filtering module 1302 is configured to filter the target trajectory information associated with the position information of the target pixel in the key video frame from the trajectory information associated with the target video, and return the target trajectory information
- the target track information includes target location information; the target location information is used to trigger the display of multimedia information associated with the target pixel in the next video frame of the key video frame.
- For the specific implementation of the request response module 1301 and the track screening module 1302, refer to the description of step S207 and step S208 in the embodiment corresponding to FIG. 8, which will not be repeated here.
- FIG. 14 is a schematic structural diagram of another computer device provided by an embodiment of the present application.
- The computer device 2000 may be the service server 2000 in the embodiment corresponding to FIG. 1 above.
- the aforementioned computer device 2000 may include: a processor 2001, a network interface 2004, and a memory 2005.
- the aforementioned computer device 2000 may further include: a user interface 2003 and at least one communication bus 2002.
- the communication bus 2002 is used to realize the connection and communication between these components.
- the user interface 2003 may include a display (Display) and a keyboard (Keyboard), and the optional user interface 2003 may also include a standard wired interface and a wireless interface.
- the network interface 2004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
- The memory 2005 may be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory.
- the memory 2005 may also be at least one storage device located far away from the aforementioned processor 2001.
- the memory 2005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a device control application program.
- the network interface 2004 can provide network communication functions;
- the user interface 2003 is mainly used to provide an input interface for the user; and
- the processor 2001 can be used to call the device control application program stored in the memory 2005 to implement the following:
- in response to a trajectory acquisition request for a target pixel in a key video frame, acquire trajectory information associated with the target video, where the key video frame is a video frame in the target video, the target pixel is a pixel in the key video frame, and the track information is determined by the pixel position information in each video frame of the target video;
- filter out, from the trajectory information associated with the target video, the target trajectory information associated with the position information of the target pixel in the key video frame, and return the target trajectory information, where the target track information includes target location information, and the target location information is used to trigger the display of multimedia information associated with the target pixel in the next video frame of the key video frame.
- It should be understood that the computer device 2000 described in the embodiment of the present application can perform the description of the foregoing video data processing method in the embodiment corresponding to FIG. 8, and can also perform the description of the foregoing video data processing apparatus 2 in the embodiment corresponding to FIG. 13, which will not be repeated here. In addition, the description of the beneficial effects of using the same method will not be repeated.
- the embodiments of the present application also provide a computer storage medium, and the computer storage medium stores the aforementioned computer program executed by the video data processing device 2, and the aforementioned computer program includes a program Instructions, when the above-mentioned processor executes the above-mentioned program instructions, it can execute the description of the above-mentioned video data processing method in the embodiment corresponding to FIG. 8 above, and therefore, it will not be repeated here. In addition, the description of the beneficial effects of using the same method will not be repeated.
- FIG. 15 is a schematic structural diagram of another video data processing apparatus provided by an embodiment of the present application.
- The video data processing device 3 can be applied to the service server 2000 in the embodiment corresponding to FIG. 1, and can also be applied to the target user terminal in the embodiment corresponding to FIG. 1.
- the video data processing device 3 may include: a first acquisition module 310, a matrix acquisition module 410, a position tracking module 510, and a trajectory generation module 610;
- the first obtaining module 310 is configured to obtain adjacent first video frames and second video frames from the target video;
- The matrix acquisition module 410 is configured to determine the average displacement matrix based on the optical flow tracking rule corresponding to the target video, the pixels in the first video frame, and the pixels in the second video frame;
- the matrix obtaining module 410 includes: a first determining unit 4001, a matrix determining unit 4002, a pixel point screening unit 4003, a matrix correcting unit 4004, and a second determining unit 4005;
- the first determining unit 4001 is configured to obtain the optical flow tracking rule corresponding to the target video, and determine the position information of the pixel in the first video frame as the first position information, and determine the second video frame The position information of the pixel in is determined as the second position information;
- The matrix determining unit 4002 is configured to determine, based on the optical flow tracking rule, the first position information of the pixels in the first video frame, and the second position information of the pixels in the second video frame, the forward displacement matrix corresponding to the first video frame and the reverse displacement matrix corresponding to the second video frame;
- the matrix determining unit 4002 includes: a first tracking subunit 4021 and a second tracking subunit 4022;
- The first tracking subunit 4021 is configured to forwardly map the pixels in the first video frame to the second video frame based on the first position information of the pixels in the first video frame and the optical flow tracking rule, determine, in the second video frame, the second position information of the first mapping point obtained by mapping, and determine the forward displacement matrix corresponding to the first video frame based on the first position information of the pixels and the second position information of the first mapping point;
- The second tracking subunit 4022 is configured to, based on the second position information of the pixels in the second video frame and the optical flow tracking rule, reversely map the first mapping point in the second video frame to the first video frame, determine, in the first video frame, the third position information of the second mapping point obtained by mapping, and determine the reverse displacement matrix corresponding to the second video frame based on the second position information of the first mapping point and the third position information of the second mapping point.
- For the specific implementation of the first tracking subunit 4021 and the second tracking subunit 4022, refer to the description of the forward-backward optical flow method in the embodiment corresponding to FIG. 8, and the details will not be repeated here.
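The forward and reverse mappings can be sketched as follows. This is a minimal illustration, assuming dense forward flow (frame 1 to frame 2) and backward flow (frame 2 to frame 1) are already available from some optical flow method; it only performs the mapping step, not the flow estimation itself:

```python
import numpy as np

def forward_backward_map(fwd_flow, bwd_flow):
    """Given dense forward flow (frame 1 -> frame 2) and backward flow
    (frame 2 -> frame 1), each of shape (H, W, 2) holding (dy, dx):
    forward-map every pixel of the first video frame to its first mapping
    point in the second frame, then reverse-map that point back to the
    second mapping point in the first frame."""
    h, w = fwd_flow.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float32)
    # first mapping points (positions in the second video frame)
    y2 = ys + fwd_flow[..., 0]
    x2 = xs + fwd_flow[..., 1]
    # sample the backward flow where each pixel landed (nearest neighbor)
    iy = np.clip(np.round(y2).astype(int), 0, h - 1)
    ix = np.clip(np.round(x2).astype(int), 0, w - 1)
    rev = bwd_flow[iy, ix]
    # second mapping points (positions back in the first video frame)
    y1b = y2 + rev[..., 0]
    x1b = x2 + rev[..., 1]
    return (y2, x2), (y1b, x1b)
```

For consistent flows, the second mapping points land back on the original pixel coordinates; the gap between them is exactly the error distance used for screening below.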
- The pixel point screening unit 4003 is configured to determine, based on the first position information of the pixels in the first video frame, the forward displacement matrix, and the reverse displacement matrix, the pixels that meet the target screening condition as effective pixels;
- the pixel point screening unit 4003 includes: a first position determining subunit 4031, a second position determining subunit 4032, a third position determining subunit 4033, an error determining subunit 4034, and an effective screening subunit 4035;
- The first position determining subunit 4031 is configured to obtain a first pixel from the pixels in the first video frame, determine the first position information of the first pixel in the first video frame, and determine the first lateral displacement and the first longitudinal displacement corresponding to the first pixel from the forward displacement matrix;
- The second position determining subunit 4032 is configured to forwardly map the first pixel to the second video frame based on the first position information of the first pixel and the first lateral displacement and first longitudinal displacement corresponding to the first pixel, and determine, in the second video frame, the second position information of the second pixel obtained by mapping;
- The third position determining subunit 4033 is configured to determine the second lateral displacement and the second longitudinal displacement corresponding to the second pixel from the reverse displacement matrix, reversely map the second pixel to the first video frame based on the second position information of the second pixel and the second lateral displacement and second longitudinal displacement corresponding to the second pixel, and determine, in the first video frame, the third position information of the third pixel obtained by mapping;
- The error determination subunit 4034 is configured to determine the error distance between the first pixel and the third pixel based on the first position information of the first pixel and the third position information of the third pixel, and determine, according to the first position information of the first pixel and the second position information of the second pixel, the correlation coefficient between the image block containing the first pixel and the image block containing the second pixel;
- the effective screening sub-unit 4035 is configured to determine the pixels whose error distance is less than the error distance threshold and the correlation coefficient is greater than or equal to the correlation coefficient threshold among the pixels as effective pixels.
- For the specific implementation of the first position determining subunit 4031, the second position determining subunit 4032, the third position determining subunit 4033, the error determination subunit 4034, and the effective screening subunit 4035, refer to the embodiment corresponding to FIG. 8 above; the details will not be repeated here.
- the matrix correction unit 4004 is configured to correct the initial state matrix and the forward displacement matrix corresponding to the first video frame based on the effective pixel points to obtain the target state matrix and target displacement corresponding to the first video frame matrix;
- the matrix correction unit 4004 includes: an initial acquisition subunit 4041, a value switching subunit 4042, a displacement setting subunit 4043;
- The initial acquisition subunit 4041 is configured to acquire the initial state matrix corresponding to the first video frame, where the state value of each matrix element in the initial state matrix is the first value, and one matrix element corresponds to one of the pixels;
- The value switching subunit 4042 is configured to switch, in the initial state matrix, the state value of the matrix element corresponding to the effective pixel from the first value to the second value, and determine the initial state matrix containing the second value as the target state matrix corresponding to the first video frame;
- The displacement setting subunit 4043 is configured to set, in the forward displacement matrix, the displacement of the matrix elements corresponding to the remaining pixels to the first value, and determine the forward displacement matrix containing the first value as the target displacement matrix, where the remaining pixels are the pixels other than the effective pixels among the pixels.
- the displacement setting sub-unit 4043 is specifically configured to, if the forward displacement matrix includes an initial lateral displacement matrix and an initial longitudinal displacement matrix, set, in the initial lateral displacement matrix, the first lateral displacement of the matrix elements corresponding to the remaining pixels to the first value, and determine the initial lateral displacement matrix containing the first value as the lateral displacement matrix corresponding to the first video frame;
- the displacement setting subunit 4043 is also specifically configured to set, in the initial longitudinal displacement matrix, the first longitudinal displacement of the matrix elements corresponding to the remaining pixels to the first value, and determine the initial longitudinal displacement matrix containing the first value as the longitudinal displacement matrix corresponding to the first video frame;
- the displacement setting subunit 4043 is further specifically configured to determine the lateral displacement matrix corresponding to the first video frame and the longitudinal displacement matrix corresponding to the first video frame as the target displacement matrix.
- for the specific implementation of the initial acquisition subunit 4041, the value switching subunit 4042, and the displacement setting subunit 4043, please refer to the description of correcting the initial state matrix and the forward displacement matrix in the embodiment corresponding to FIG. 8, which will not be repeated here.
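As a minimal sketch of the correction step described above (assuming the first value is 0 and the second value is 1, and using an invented helper name `correct_matrices`; the patent does not prescribe this code):

```python
import numpy as np

def correct_matrices(valid_mask, fwd_dx, fwd_dy, first_value=0, second_value=1):
    """Target state matrix: second_value at effective pixels, first_value
    elsewhere.  Target displacement matrices: forward displacements kept at
    effective pixels, reset to first_value at the remaining pixels."""
    state = np.where(valid_mask, second_value, first_value)
    dx = np.where(valid_mask, fwd_dx, first_value)
    dy = np.where(valid_mask, fwd_dy, first_value)
    return state, dx, dy
```

Resetting the displacements of the remaining pixels ensures that only displacements confirmed by the forward-backward check contribute to the later averaging.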
- the second determining unit 4005 is configured to determine an average displacement matrix corresponding to the first video frame based on the target state matrix and the target displacement matrix.
- the second determining unit 4005 includes: a first integration subunit 4051, a second integration subunit 4052, a third integration subunit 4053, and a difference operation subunit 4054;
- the first integration subunit 4051 is configured to perform a displacement integration operation on the target state matrix in the first video frame to obtain the state integral matrix corresponding to the pixels in the first video frame;
- the second integration subunit 4052 is configured to perform a displacement integration operation on the lateral displacement matrix in the target state matrix in the first video frame to obtain the lateral displacement integral matrix corresponding to the pixels in the first video frame;
- the third integration subunit 4053 is configured to perform a displacement integration operation on the longitudinal displacement matrix in the target state matrix in the first video frame to obtain the longitudinal displacement integral matrix corresponding to the pixels in the first video frame;
- the difference operation subunit 4054 is configured to determine, from the first video frame, the difference region corresponding to the displacement difference operation, and determine the average displacement matrix corresponding to the first video frame based on the size information of the difference region, the state integral matrix, the lateral displacement integral matrix, and the longitudinal displacement integral matrix.
- the difference operation subunit 4054 includes: a first difference subunit 4055, a second difference subunit 4056, a third difference subunit 4057, and an average determination subunit 4058;
- the first difference subunit 4055 is configured to perform a displacement difference operation on the state integral matrix based on the length information and width information corresponding to the difference region, to obtain the state difference matrix corresponding to the first image frame;
- the second difference subunit 4056 is configured to perform displacement difference operations on the lateral displacement integral matrix and the longitudinal displacement integral matrix, respectively, based on the length information and width information corresponding to the difference region, to obtain the lateral displacement difference matrix and the longitudinal displacement difference matrix corresponding to the first image frame;
- the third difference subunit 4057 is configured to determine the ratio between the lateral displacement difference matrix and the state difference matrix as the lateral average displacement matrix, and determine the ratio between the longitudinal displacement difference matrix and the state difference matrix as the longitudinal average displacement matrix;
- the average determination subunit 4058 is configured to determine the lateral average displacement matrix and the longitudinal average displacement matrix as the average displacement matrix corresponding to the first video frame.
- for the specific implementation of the first integration sub-unit 4051, the second integration sub-unit 4052, the third integration sub-unit 4053, and the difference operation sub-unit 4054, please refer to the description of the cloud displacement integration method and the cloud displacement difference method in the embodiment corresponding to FIG. 8, which will not be repeated here.
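The integration-then-difference scheme above is the classic integral-image (summed-area table) trick: per-pixel box sums of state and displacement are obtained from cumulative sums, and their ratio gives the average displacement over the difference region. A minimal numpy sketch, assuming a square (2r+1)x(2r+1) difference region and invented helper names (`box_sum`, `average_displacement`); this is not the patent's code:

```python
import numpy as np

def box_sum(m, r):
    """Sum of m over a (2r+1)x(2r+1) window around each pixel, computed from
    a zero-padded integral image so that window bounds clip at the borders."""
    h, w = m.shape
    ii = np.zeros((h + 1, w + 1))
    ii[1:, 1:] = m.cumsum(0).cumsum(1)          # integral image
    y0 = np.clip(np.arange(h) - r, 0, h); y1 = np.clip(np.arange(h) + r + 1, 0, h)
    x0 = np.clip(np.arange(w) - r, 0, w); x1 = np.clip(np.arange(w) + r + 1, 0, w)
    # standard four-corner difference of the summed-area table
    return ii[y1][:, x1] - ii[y0][:, x1] - ii[y1][:, x0] + ii[y0][:, x0]

def average_displacement(state, dx, dy, r=1):
    """Average displacement matrices: box-summed displacements divided by the
    box-summed state (i.e. the count of effective pixels in the region)."""
    cnt = np.maximum(box_sum(state.astype(float), r), 1e-9)  # avoid /0
    return box_sum(dx, r) / cnt, box_sum(dy, r) / cnt
```

The design point is that after the one-time cumulative sums, every window sum costs four lookups regardless of the region size, which is what makes the per-pixel averaging cheap.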
- for the specific implementation of the first determining unit 4001, the matrix determining unit 4002, the pixel point screening unit 4003, the matrix correcting unit 4004, and the second determining unit 4005, refer to the description of step S202 in the embodiment corresponding to FIG. 8, which will not be repeated here.
- the position tracking module 510 is configured to track the position information of the pixels in the first video frame based on the average displacement matrix, and determine the position information of the pixels obtained by tracking in the second video frame;
- the trajectory generating module 610 is configured to generate trajectory information associated with the target video based on the position information of the pixels in the first video frame and the position information of the tracked pixels in the second video frame;
- the trajectory information includes target trajectory information used to track and display the multimedia information associated with the target pixel in the target video.
- for the specific implementation of the first acquisition module 310, the matrix acquisition module 410, the position tracking module 510, and the trajectory generation module 610, refer to the description of steps S201 to S204 in the embodiment corresponding to FIG. 8, which will not be repeated here.
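The tracking and trajectory-generation modules above can be sketched as follows: each pixel position is advanced from one frame to the next by looking up the average displacement at its (rounded) location, and the per-frame positions are concatenated into trajectory information. The helper names (`track_positions`, `build_trajectories`) are assumptions for illustration, not the patent's API:

```python
import numpy as np

def track_positions(positions, avg_dx, avg_dy):
    """Advance (x, y) positions from frame k to frame k+1 using the average
    displacement matrices of frame k."""
    h, w = avg_dx.shape
    out = []
    for x, y in positions:
        xi = int(np.clip(round(x), 0, w - 1))
        yi = int(np.clip(round(y), 0, h - 1))
        out.append((x + avg_dx[yi, xi], y + avg_dy[yi, xi]))
    return out

def build_trajectories(start_points, flow_fields):
    """flow_fields: list of (avg_dx, avg_dy) pairs, one per consecutive
    frame pair.  Returns one position list per frame: the trajectory
    information from which a target pixel's trajectory can be screened."""
    traj = [list(start_points)]
    for avg_dx, avg_dy in flow_fields:
        traj.append(track_positions(traj[-1], avg_dx, avg_dy))
    return traj
```

A client can then pick, from this trajectory information, the trajectory whose first-frame position matches the target pixel, and display the associated multimedia information at the tracked positions frame by frame.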
- FIG. 16 is a schematic structural diagram of another computer device provided by an embodiment of the present application.
- the foregoing computer device 3000 may be applied to the service server 2000 in the foregoing embodiment corresponding to FIG. 1.
- the above-mentioned computer device 3000 may include: a processor 3001, a network interface 3004, and a memory 3005.
- the above-mentioned computer device 3000 may also include: a user interface 3003 and at least one communication bus 3002, where the communication bus 3002 is used to implement connection and communication between these components.
- the user interface 3003 may include a display screen (Display) and a keyboard (Keyboard), and the optional user interface 3003 may also include a standard wired interface and a wireless interface.
- the network interface 3004 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
- the memory 3005 may be a high-speed RAM memory, or a non-volatile memory, such as at least one disk memory.
- the memory 3005 may also be at least one storage device located far away from the foregoing processor 3001. As shown in FIG. 16, the memory 3005 as a computer storage medium may include an operating system, a network communication module, a user interface module, and a device control application program.
- the network interface 3004 can provide network communication functions; the user interface 3003 is mainly used to provide an input interface for the user; and the processor 3001 can be used to invoke the device control application program stored in the memory 3005 to implement:
- generating the trajectory information associated with the target video, where the trajectory information contains target trajectory information used to track and display the multimedia information associated with the target pixel in the target video.
- the computer device 3000 described in this embodiment of the present application can perform the description of the foregoing video data processing method in the foregoing embodiment corresponding to FIG. 8, and can also perform the description of the foregoing video data processing apparatus 3 in the foregoing embodiment corresponding to FIG. 15, which will not be repeated here.
- the description of the beneficial effects of using the same method will likewise not be repeated.
- the embodiments of the present application also provide a computer storage medium, which stores the aforementioned computer program executed by the video data processing apparatus 3; the computer program includes program instructions, and when the processor executes the program instructions, it can perform the description of the video data processing method in the embodiment corresponding to FIG. 8, which will therefore not be repeated here. In addition, the description of the beneficial effects of using the same method will not be repeated.
- for the embodiments of the computer storage medium involved in this application, please refer to the description of the method embodiments of this application.
- the above-mentioned program can be stored in a computer-readable storage medium, and when executed, it may include the processes of the above-mentioned method embodiments.
- the aforementioned storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), or a random access memory (Random Access Memory, RAM), etc.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Databases & Information Systems (AREA)
- Human Computer Interaction (AREA)
- Computer Security & Cryptography (AREA)
- Image Analysis (AREA)
- User Interface Of Digital Computer (AREA)
- Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)
Claims (15)
- A video data processing method, applied to a computer device, comprising: in response to a trigger operation on a target video, determining a target pixel from a key video frame of the target video, and acquiring multimedia information associated with the target pixel, wherein the key video frame is the video frame in which the trigger operation occurs, and the target pixel is the pixel in the key video frame corresponding to the trigger operation; determining, based on position information of the target pixel in the key video frame, a trajectory acquisition request corresponding to the target pixel; acquiring, based on the trajectory acquisition request, target trajectory information associated with the position information of the target pixel in the key video frame, wherein the target trajectory information contains position information of the target pixel in the next video frame following the key video frame, the position information of the target pixel in the next video frame being obtained by tracking the target pixel; and when the next video frame following the key video frame is played, displaying the multimedia information based on the position information, in the target trajectory information, of the target pixel in the next video frame following the key video frame.
- A video data processing method, applied to a service server, comprising: in response to a trajectory acquisition request for a target pixel in a key video frame, acquiring trajectory information associated with a target video, wherein the key video frame is a video frame of the target video, the target pixel is a pixel in the key video frame, and the trajectory information is determined from the position information of the pixels in each video frame of the target video; and screening, from the trajectory information associated with the target video, target trajectory information associated with the position information of the target pixel in the key video frame, and returning the target trajectory information, wherein the target trajectory information contains target position information, the target position information being used to trigger display, in the next video frame following the key video frame, of the multimedia information associated with the target pixel.
- A video data processing method, comprising: acquiring adjacent first and second video frames from a target video; determining an average displacement matrix corresponding to the first video frame based on an optical flow tracking rule corresponding to the target video, the pixels in the first video frame, and the pixels in the second video frame; tracking the position information of the pixels in the first video frame based on the average displacement matrix, and determining the position information of the tracked pixels in the second video frame; and generating trajectory information associated with the target video based on the position information of the pixels in the first video frame and the position information of the tracked pixels in the second video frame, wherein the trajectory information contains target trajectory information used for tracking and displaying multimedia information associated with a target pixel in the target video.
- The method according to claim 3, wherein determining the average displacement matrix corresponding to the first video frame based on the optical flow tracking rule corresponding to the target video, the pixels in the first video frame, and the pixels in the second video frame comprises: acquiring the optical flow tracking rule corresponding to the target video, determining the position information of the pixels in the first video frame as first position information, and determining the position information of the pixels in the second video frame as second position information; acquiring a forward displacement matrix corresponding to the first video frame and a backward displacement matrix corresponding to the second video frame based on the optical flow tracking rule, the first position information of the pixels in the first video frame, and the second position information of the pixels in the second video frame; determining, based on the first position information of the pixels in the first video frame, the forward displacement matrix, and the backward displacement matrix, the pixels that satisfy a target screening condition as effective pixels; correcting an initial state matrix and the forward displacement matrix corresponding to the first video frame based on the effective pixels, to obtain a target state matrix and a target displacement matrix corresponding to the first video frame; and determining the average displacement matrix corresponding to the first video frame based on the target state matrix and the target displacement matrix.
- The method according to claim 4, wherein acquiring the forward displacement matrix corresponding to the first video frame and the backward displacement matrix corresponding to the second video frame based on the optical flow tracking rule, the first position information of the pixels in the first video frame, and the second position information of the pixels in the second video frame comprises: forward-mapping the pixels in the first video frame to the second video frame based on the first position information of the pixels in the first video frame and the optical flow tracking rule, determining second position information of the mapped first mapping points in the second video frame, and determining the forward displacement matrix corresponding to the first video frame based on the first position information of the pixels and the second position information of the first mapping points; and backward-mapping the pixels in the second video frame to the first video frame based on the second position information of the pixels in the second video frame and the optical flow tracking rule, determining third position information of the mapped second mapping points in the first video frame, and determining the backward displacement matrix corresponding to the second video frame based on the second position information of the first mapping points and the third position information of the second mapping points.
- The method according to claim 4, wherein determining the pixels that satisfy the target screening condition as effective pixels based on the first position information of the pixels in the first video frame, the forward displacement matrix, and the backward displacement matrix comprises: acquiring a first pixel from the pixels in the first video frame, determining first position information of the first pixel in the first video frame, and determining a first lateral displacement and a first longitudinal displacement corresponding to the first pixel from the forward displacement matrix; forward-mapping the first pixel to the second video frame based on the first position information of the first pixel and the first lateral displacement and first longitudinal displacement corresponding to the first pixel, and determining second position information of the mapped second pixel in the second video frame; determining a second lateral displacement and a second longitudinal displacement corresponding to the second pixel from the backward displacement matrix, backward-mapping the second pixel to the first video frame based on the second position information of the second pixel and the second lateral displacement and second longitudinal displacement corresponding to the second pixel, and determining third position information of the mapped third pixel in the first video frame; determining an error distance between the first pixel and the third pixel based on the first position information of the first pixel and the third position information of the third pixel, and determining a correlation coefficient between an image block containing the first pixel and an image block containing the second pixel according to the first position information of the first pixel and the second position information of the second pixel; and determining, among the pixels, the pixels whose error distance is less than an error distance threshold and whose correlation coefficient is greater than or equal to a correlation coefficient threshold as effective pixels.
- The method according to claim 4, wherein correcting the initial state matrix and the forward displacement matrix corresponding to the first video frame based on the effective pixels, to obtain the target state matrix and the target displacement matrix corresponding to the first video frame, comprises: acquiring the initial state matrix corresponding to the first video frame, wherein the state value of each matrix element in the initial state matrix is a first value, and each matrix element corresponds to one of the pixels; switching, in the initial state matrix, the state value of the matrix elements corresponding to the effective pixels from the first value to a second value, and determining the initial state matrix containing the second value as the target state matrix corresponding to the first video frame; and setting, in the forward displacement matrix, the displacement of the matrix elements corresponding to the remaining pixels to the first value, and determining the forward displacement matrix containing the first value as the target displacement matrix, wherein the remaining pixels are the pixels other than the effective pixels.
- The method according to claim 7, wherein setting, in the forward displacement matrix, the displacement of the matrix elements corresponding to the remaining pixels to the first value, and determining the forward displacement matrix containing the first value as the target displacement matrix comprises: if the forward displacement matrix contains an initial lateral displacement matrix and an initial longitudinal displacement matrix, setting, in the initial lateral displacement matrix, the first lateral displacement of the matrix elements corresponding to the remaining pixels to the first value, and determining the initial lateral displacement matrix containing the first value as the lateral displacement matrix corresponding to the first video frame; setting, in the initial longitudinal displacement matrix, the first longitudinal displacement of the matrix elements corresponding to the remaining pixels to the first value, and determining the initial longitudinal displacement matrix containing the first value as the longitudinal displacement matrix corresponding to the first video frame; and determining the lateral displacement matrix corresponding to the first video frame and the longitudinal displacement matrix corresponding to the first video frame as the target displacement matrix.
- The method according to claim 4, wherein determining the average displacement matrix corresponding to the first video frame based on the target state matrix and the target displacement matrix comprises: performing a displacement integration operation on the target state matrix in the first video frame to obtain a state integral matrix corresponding to the pixels in the first video frame; performing a displacement integration operation on the lateral displacement matrix in the target state matrix in the first video frame to obtain a lateral displacement integral matrix corresponding to the pixels in the first video frame; performing a displacement integration operation on the longitudinal displacement matrix in the target state matrix in the first video frame to obtain a longitudinal displacement integral matrix corresponding to the pixels in the first video frame; and determining, from the first video frame, a difference region corresponding to a displacement difference operation, and determining the average displacement matrix corresponding to the first video frame based on the size information of the difference region, the state integral matrix, the lateral displacement integral matrix, and the longitudinal displacement integral matrix.
- The method according to claim 9, wherein determining the average displacement matrix corresponding to the first video frame based on the size information of the difference region, the state integral matrix, the lateral displacement integral matrix, and the longitudinal displacement integral matrix comprises: performing a displacement difference operation on the state integral matrix based on the length information and width information corresponding to the difference region, to obtain a state difference matrix corresponding to the first image frame; performing displacement difference operations on the lateral displacement integral matrix and the longitudinal displacement integral matrix, respectively, based on the length information and width information corresponding to the difference region, to obtain a lateral displacement difference matrix and a longitudinal displacement difference matrix corresponding to the first image frame; determining the ratio between the lateral displacement difference matrix and the state difference matrix as a lateral average displacement matrix, and determining the ratio between the longitudinal displacement difference matrix and the state difference matrix as a longitudinal average displacement matrix; and determining the longitudinal displacement difference matrix and the longitudinal average displacement matrix as the average displacement matrix corresponding to the first video frame.
- A video data processing apparatus, applied to a computer device, comprising: an object determination module, configured to determine a target pixel from a key video frame of a target video in response to a trigger operation on the target video, and acquire multimedia information associated with the target pixel, wherein the key video frame is the video frame in which the trigger operation occurs, and the target pixel is the pixel in the key video frame corresponding to the trigger operation; a request determination module, configured to determine a trajectory acquisition request corresponding to the target pixel based on the position information of the target pixel in the key video frame; a trajectory acquisition module, configured to acquire, based on the trajectory acquisition request, target trajectory information associated with the position information of the target pixel in the key video frame, wherein the target trajectory information contains position information of the target pixel in the next video frame following the key video frame, the position information of the target pixel in the next video frame being obtained by tracking the target pixel; and a text display module, configured to display the multimedia information, when the next video frame following the key video frame is played, based on the position information, in the target trajectory information, of the target pixel in the next video frame following the key video frame.
- A video data processing apparatus, applied to a service server, comprising: a request response module, configured to acquire trajectory information associated with a target video in response to a trajectory acquisition request for a target pixel in a key video frame, wherein the key video frame is a video frame of the target video, the target pixel is a pixel in the key video frame, and the trajectory information is determined from the position information of the pixels in each video frame of the target video; and a trajectory screening module, configured to screen, from the trajectory information associated with the target video, target trajectory information associated with the position information of the target pixel in the key video frame, and return the target trajectory information, wherein the target trajectory information contains target position information, the target position information being used to trigger display, in the next video frame following the key video frame, of the multimedia information associated with the target pixel.
- A video data processing apparatus, comprising: a first acquisition module, configured to acquire adjacent first and second video frames from a target video; a matrix acquisition module, configured to determine an average displacement matrix corresponding to the first video frame based on an optical flow tracking rule corresponding to the target video, the pixels in the first video frame, and the pixels in the second video frame; a position tracking module, configured to track the position information of the pixels in the first video frame based on the average displacement matrix, and determine the position information of the tracked pixels in the second video frame; and a trajectory generation module, configured to generate trajectory information associated with the target video based on the position information of the pixels in the first video frame and the position information of the tracked pixels in the second video frame, wherein the trajectory information contains target trajectory information used for tracking and displaying multimedia information associated with a target pixel in the target video.
- A computer device, comprising: a processor, a memory, and a network interface; the processor is connected to the memory and the network interface, wherein the network interface is configured to provide a data communication function, the memory is configured to store a computer program, and the processor is configured to invoke the computer program to perform the method according to any one of claims 1 to 10.
- A computer-readable storage medium, storing a computer program, the computer program comprising program instructions that, when executed by a processor, perform the method according to any one of claims 1 to 10.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020217022717A KR102562208B1 (ko) | 2019-04-30 | 2020-04-10 | 비디오 데이터 프로세싱 방법 및 관련 디바이스 |
SG11202105410RA SG11202105410RA (en) | 2019-04-30 | 2020-04-10 | Video data processing method and related device |
EP20799151.4A EP3965431A4 (en) | 2019-04-30 | 2020-04-10 | VIDEO DATA PROCESSING METHOD AND RELATED DEVICE |
JP2021531593A JP7258400B6 (ja) | 2019-04-30 | 2020-04-10 | ビデオデータ処理方法、ビデオデータ処理装置、コンピュータ機器、及びコンピュータプログラム |
US17/334,678 US11900614B2 (en) | 2019-04-30 | 2021-05-28 | Video data processing method and related apparatus |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910358569.8 | 2019-04-30 | ||
CN201910358569.8A CN110062272B (zh) | 2019-04-30 | 2019-04-30 | 一种视频数据处理方法和相关装置 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US17/334,678 Continuation US11900614B2 (en) | 2019-04-30 | 2021-05-28 | Video data processing method and related apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020220968A1 true WO2020220968A1 (zh) | 2020-11-05 |
Family
ID=67321748
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/084112 WO2020220968A1 (zh) | 2019-04-30 | 2020-04-10 | 一种视频数据处理方法和相关装置 |
Country Status (7)
Country | Link |
---|---|
US (1) | US11900614B2 (zh) |
EP (1) | EP3965431A4 (zh) |
JP (1) | JP7258400B6 (zh) |
KR (1) | KR102562208B1 (zh) |
CN (1) | CN110062272B (zh) |
SG (1) | SG11202105410RA (zh) |
WO (1) | WO2020220968A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117812392A (zh) * | 2024-01-09 | 2024-04-02 | 广州巨隆科技有限公司 | 可视化屏幕的分辨率自适应调节方法、***、介质及设备 |
Families Citing this family (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110062272B (zh) | 2019-04-30 | 2021-09-28 | 腾讯科技(深圳)有限公司 | 一种视频数据处理方法和相关装置 |
CN111161309B (zh) * | 2019-11-19 | 2023-09-12 | 北航航空航天产业研究院丹阳有限公司 | 一种车载视频动态目标的搜索与定位方法 |
CN111193938B (zh) * | 2020-01-14 | 2021-07-13 | 腾讯科技(深圳)有限公司 | 视频数据处理方法、装置和计算机可读存储介质 |
CN112258551B (zh) * | 2020-03-18 | 2023-09-05 | 北京京东振世信息技术有限公司 | 一种物品掉落检测方法、装置、设备及存储介质 |
CN111753679B (zh) * | 2020-06-10 | 2023-11-24 | 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) | 微运动监测方法、装置、设备及计算机可读存储介质 |
CN111901662A (zh) * | 2020-08-05 | 2020-11-06 | 腾讯科技(深圳)有限公司 | 视频的扩展信息处理方法、设备和存储介质 |
CN114449326A (zh) * | 2020-11-06 | 2022-05-06 | 上海哔哩哔哩科技有限公司 | 视频标注方法、客户端、服务器及*** |
CN114584824A (zh) * | 2020-12-01 | 2022-06-03 | 阿里巴巴集团控股有限公司 | 数据处理方法、***、电子设备、服务端及客户端设备 |
CN112884830B (zh) * | 2021-01-21 | 2024-03-29 | 浙江大华技术股份有限公司 | 一种目标边框确定方法及装置 |
CN113034458B (zh) * | 2021-03-18 | 2023-06-23 | 广州市索图智能电子有限公司 | 室内人员轨迹分析方法、装置及存储介质 |
US12020279B2 (en) * | 2021-05-03 | 2024-06-25 | Refercloud Llc | System and methods to predict winning TV ads, online videos, and other audiovisual content before production |
CN114281447B (zh) * | 2021-12-02 | 2024-03-19 | 武汉华工激光工程有限责任公司 | 一种载板激光加工软件界面处理方法、***及存储介质 |
CN114827754B (zh) * | 2022-02-23 | 2023-09-12 | 阿里巴巴(中国)有限公司 | 视频首帧时间检测方法及装置 |
CN117270982A (zh) * | 2022-06-13 | 2023-12-22 | 中兴通讯股份有限公司 | 数据处理方法、控制装置、电子设备、计算机可读介质 |
CN115297355B (zh) * | 2022-08-02 | 2024-01-23 | 北京奇艺世纪科技有限公司 | 弹幕显示方法、生成方法、装置、电子设备及存储介质 |
CN116152301B (zh) * | 2023-04-24 | 2023-07-14 | 知行汽车科技(苏州)股份有限公司 | 一种目标的速度估计方法、装置、设备及介质 |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101930779A (zh) * | 2010-07-29 | 2010-12-29 | 华为终端有限公司 | 一种视频批注方法及视频播放器 |
US20140241573A1 (en) * | 2013-02-27 | 2014-08-28 | Blendagram, Inc. | System for and method of tracking target area in a video clip |
CN104881640A (zh) * | 2015-05-15 | 2015-09-02 | 华为技术有限公司 | 一种获取向量的方法及装置 |
CN105872442A (zh) * | 2016-03-30 | 2016-08-17 | 宁波三博电子科技有限公司 | 一种基于人脸识别的即时弹幕礼物赠送方法及*** |
CN108242062A (zh) * | 2017-12-27 | 2018-07-03 | 北京纵目安驰智能科技有限公司 | 基于深度特征流的目标跟踪方法、***、终端及介质 |
CN109087335A (zh) * | 2018-07-16 | 2018-12-25 | 腾讯科技(深圳)有限公司 | 一种人脸跟踪方法、装置和存储介质 |
CN109558505A (zh) * | 2018-11-21 | 2019-04-02 | 百度在线网络技术(北京)有限公司 | 视觉搜索方法、装置、计算机设备及存储介质 |
CN110062272A (zh) * | 2019-04-30 | 2019-07-26 | 腾讯科技(深圳)有限公司 | 一种视频数据处理方法和相关装置 |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8363109B2 (en) * | 2009-12-10 | 2013-01-29 | Harris Corporation | Video processing system providing enhanced tracking features for moving objects outside of a viewable window and related methods |
JP5659307B2 (ja) * | 2012-07-17 | 2015-01-28 | パナソニックIpマネジメント株式会社 | コメント情報生成装置およびコメント情報生成方法 |
US20190096439A1 (en) * | 2016-05-23 | 2019-03-28 | Robert Brouwer | Video tagging and annotation |
US20190253747A1 (en) | 2016-07-22 | 2019-08-15 | Vid Scale, Inc. | Systems and methods for integrating and delivering objects of interest in video |
US20180082428A1 (en) * | 2016-09-16 | 2018-03-22 | Qualcomm Incorporated | Use of motion information in video data to track fast moving objects |
WO2018105290A1 (ja) * | 2016-12-07 | 2018-06-14 | ソニーセミコンダクタソリューションズ株式会社 | 画像センサ |
US10592786B2 (en) * | 2017-08-14 | 2020-03-17 | Huawei Technologies Co., Ltd. | Generating labeled data for deep object tracking |
CN109559330B (zh) * | 2017-09-25 | 2021-09-10 | 北京金山云网络技术有限公司 | 运动目标的视觉跟踪方法、装置、电子设备及存储介质 |
CN108389217A (zh) * | 2018-01-31 | 2018-08-10 | 华东理工大学 | 一种基于梯度域混合的视频合成方法 |
US20190392591A1 (en) * | 2018-06-25 | 2019-12-26 | Electronics And Telecommunications Research Institute | Apparatus and method for detecting moving object using optical flow prediction |
US10956747B2 (en) * | 2018-12-31 | 2021-03-23 | International Business Machines Corporation | Creating sparsely labeled video annotations |
-
2019
- 2019-04-30 CN CN201910358569.8A patent/CN110062272B/zh active Active
-
2020
- 2020-04-10 SG SG11202105410RA patent/SG11202105410RA/en unknown
- 2020-04-10 EP EP20799151.4A patent/EP3965431A4/en active Pending
- 2020-04-10 KR KR1020217022717A patent/KR102562208B1/ko active IP Right Grant
- 2020-04-10 JP JP2021531593A patent/JP7258400B6/ja active Active
- 2020-04-10 WO PCT/CN2020/084112 patent/WO2020220968A1/zh unknown
-
2021
- 2021-05-28 US US17/334,678 patent/US11900614B2/en active Active
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101930779A (zh) * | 2010-07-29 | 2010-12-29 | 华为终端有限公司 | 一种视频批注方法及视频播放器 |
US20140241573A1 (en) * | 2013-02-27 | 2014-08-28 | Blendagram, Inc. | System for and method of tracking target area in a video clip |
CN104881640A (zh) * | 2015-05-15 | 2015-09-02 | 华为技术有限公司 | 一种获取向量的方法及装置 |
CN105872442A (zh) * | 2016-03-30 | 2016-08-17 | 宁波三博电子科技有限公司 | 一种基于人脸识别的即时弹幕礼物赠送方法及*** |
CN108242062A (zh) * | 2017-12-27 | 2018-07-03 | 北京纵目安驰智能科技有限公司 | 基于深度特征流的目标跟踪方法、***、终端及介质 |
CN109087335A (zh) * | 2018-07-16 | 2018-12-25 | 腾讯科技(深圳)有限公司 | 一种人脸跟踪方法、装置和存储介质 |
CN109558505A (zh) * | 2018-11-21 | 2019-04-02 | 百度在线网络技术(北京)有限公司 | 视觉搜索方法、装置、计算机设备及存储介质 |
CN110062272A (zh) * | 2019-04-30 | 2019-07-26 | 腾讯科技(深圳)有限公司 | 一种视频数据处理方法和相关装置 |
Non-Patent Citations (1)
Title |
---|
See also references of EP3965431A4 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117812392A (zh) * | 2024-01-09 | 2024-04-02 | 广州巨隆科技有限公司 | 可视化屏幕的分辨率自适应调节方法、***、介质及设备 |
CN117812392B (zh) * | 2024-01-09 | 2024-05-31 | 广州巨隆科技有限公司 | 可视化屏幕的分辨率自适应调节方法、***、介质及设备 |
Also Published As
Publication number | Publication date |
---|---|
JP7258400B6 (ja) | 2024-02-19 |
CN110062272A (zh) | 2019-07-26 |
KR102562208B1 (ko) | 2023-07-31 |
US11900614B2 (en) | 2024-02-13 |
JP7258400B2 (ja) | 2023-04-17 |
CN110062272B (zh) | 2021-09-28 |
JP2022511828A (ja) | 2022-02-01 |
SG11202105410RA (en) | 2021-06-29 |
KR20210095953A (ko) | 2021-08-03 |
EP3965431A1 (en) | 2022-03-09 |
US20210287379A1 (en) | 2021-09-16 |
EP3965431A4 (en) | 2022-10-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020220968A1 (zh) | 一种视频数据处理方法和相关装置 | |
US10586350B2 (en) | Optimizations for dynamic object instance detection, segmentation, and structure mapping | |
JP6179889B2 (ja) | コメント情報生成装置およびコメント表示装置 | |
CN108604379A (zh) | 用于确定图像中的区域的***及方法 | |
EP3493105A1 (en) | Optimizations for dynamic object instance detection, segmentation, and structure mapping | |
US20190287306A1 (en) | Multi-endpoint mixed-reality meetings | |
CN111464834B (zh) | 一种视频帧处理方法、装置、计算设备及存储介质 | |
JP2023509572A (ja) | 車両を検出するための方法、装置、電子機器、記憶媒体およびコンピュータプログラム | |
BRPI1011189B1 (pt) | Sistema baseado em computador para selecionar pontos de visualização ótimos e meio de armazenamento de sinal legível por máquina não transitória | |
JP7273129B2 (ja) | 車線検出方法、装置、電子機器、記憶媒体及び車両 | |
US11561675B2 (en) | Method and apparatus for visualization of public welfare activities | |
EP3493104A1 (en) | Optimizations for dynamic object instance detection, segmentation, and structure mapping | |
US11921983B2 (en) | Method and apparatus for visualization of public welfare activities | |
CN112752158A (zh) | 一种视频展示的方法、装置、电子设备及存储介质 | |
CN112702643B (zh) | 弹幕信息显示方法、装置、移动终端 | |
CN117152660A (zh) | 图像显示方法及其装置 | |
JP2021089711A (ja) | 動画ブレの検出方法及び装置 | |
DE102023105068A1 (de) | Bewegungsvektoroptimierung für mehrfach refraktive und reflektierende Schnittstellen | |
CN114565777A (zh) | 数据处理方法和装置 | |
CN114140488A (zh) | 视频目标分割方法及装置、视频目标分割模型的训练方法 | |
JP6892557B2 (ja) | 学習装置、画像生成装置、学習方法、画像生成方法及びプログラム | |
CN116506680B (zh) | 一种虚拟空间的评论数据处理方法、装置及电子设备 | |
CN113949926B (zh) | 一种视频插帧方法、存储介质及终端设备 | |
CN115993892A (zh) | 信息输入方法、装置及电子设备 | |
TW202405754A (zh) | 深度識別模型訓練方法、圖像深度識別方法及相關設備 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20799151 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2021531593 Country of ref document: JP Kind code of ref document: A |
|
ENP | Entry into the national phase |
Ref document number: 20217022717 Country of ref document: KR Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 2020799151 Country of ref document: EP Effective date: 20211130 |