CN113542774B - Video synchronization method, device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN113542774B
CN113542774B (application CN202110625325.9A)
Authority
CN
China
Prior art keywords
target
video
competition
game
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110625325.9A
Other languages
Chinese (zh)
Other versions
CN113542774A (en)
Inventor
赵勇
夏鹏飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Gelingshentong Information Technology Co ltd
Original Assignee
Beijing Gelingshentong Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Gelingshentong Information Technology Co ltd
Priority to CN202110625325.9A
Publication of CN113542774A
Application granted
Publication of CN113542774B
Legal status: Active
Anticipated expiration


Classifications

    • H ELECTRICITY > H04 ELECTRIC COMMUNICATION TECHNIQUE > H04N PICTORIAL COMMUNICATION, e.g. TELEVISION > H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/21805 Source of audio or video content, e.g. local disk arrays, enabling multiple viewpoints, e.g. using a plurality of cameras
    • H04N21/23418 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics
    • H04N21/242 Synchronization processes, e.g. processing of PCR [Program Clock References]
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • H04N21/44008 Processing of video elementary streams involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Controls And Circuits For Display Device (AREA)

Abstract

Embodiments of the present application provide a video synchronization method, apparatus, electronic device, and storage medium. The method comprises: acquiring at least two game videos of the same game captured from different positions; identifying, in each game video, the target segment corresponding to a target action, wherein the target segment consists of images captured at consecutive times and the target action is the marking action that starts the game; determining, within each target segment, the image corresponding to a target gesture, and taking that image as the synchronization frame of the corresponding game video; and synchronizing the at least two game videos according to the synchronization frames. Because the target segments are identified directly from the game videos and the synchronization frames are determined from those segments, video synchronization is achieved without relying on any auxiliary synchronization equipment, making the method simple to operate and easy to apply at large scale.

Description

Video synchronization method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer vision, and in particular, to a video synchronization method, apparatus, electronic device, and storage medium.
Background
With the rapid development of technology, the demand for three-dimensional models and related three-dimensional applications keeps growing. When a sports game is modeled in three dimensions, multiple image acquisition devices capture game videos from different angles: the devices are distributed at different positions around the playing field, so each captured video has a different viewing angle. When these game videos are used for three-dimensional modeling, it is difficult to determine the same moment across two or more videos, that is, to synchronize them.
At present, videos from multiple image acquisition devices are synchronized by relying on dedicated synchronization equipment: the equipment sends a synchronization signal to every device, and the signal triggers each device's shutter, so that all devices shoot simultaneously and their videos are synchronized. However, this approach depends on extra synchronization equipment, is complex and cumbersome to operate, and is difficult to apply at large scale.
Disclosure of Invention
Embodiments of the present application provide a video synchronization method, apparatus, electronic device, and storage medium, which address the problems that existing video synchronization is complex and tedious to operate and difficult to apply at large scale.
According to a first aspect of the embodiments of the present application, a video synchronization method is provided, comprising: acquiring at least two game videos of the same game captured from different positions; identifying, in each game video, the target segment corresponding to a target action, wherein the target segment consists of images captured at consecutive times and the target action is the marking action that starts the game; determining, within each target segment, the image corresponding to a target gesture, and taking that image as the synchronization frame of the corresponding game video; and synchronizing the at least two game videos according to the synchronization frame of each game video.
According to a second aspect of the embodiments of the present application, a video synchronization apparatus is provided, comprising: an acquisition module configured to acquire at least two game videos of the same game captured from different positions; an identification module configured to identify, in each game video, the target segment corresponding to a target action, the target segment consisting of images captured at consecutive times and the target action being the marking action that starts the game; a synchronization-frame determination module configured to determine, within each target segment, the image corresponding to a target gesture, and to take that image as the synchronization frame of the corresponding game video; and a synchronization module configured to synchronize the at least two game videos according to the synchronization frame of each game video.
According to a third aspect of the embodiments of the present application, an electronic device is provided, comprising one or more processors, a memory, and one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors to perform the method described above.
According to a fourth aspect of the embodiments of the present application, a computer-readable storage medium is provided, which stores program code that, when run, performs the method described above.
With the video synchronization method provided by the embodiments of the present application, at least two game videos of the same game captured from different positions are acquired; the target segment corresponding to the target action is identified in each game video, the target segment consisting of images captured at consecutive times and the target action being the marking action that starts the game; the image corresponding to the target gesture is determined within each target segment and taken as the synchronization frame of that game video; and the at least two game videos are synchronized according to the synchronization frames. Because every video records the same marking action, the target gesture occurs at the same real-world instant regardless of how each video was captured, so the synchronization frames of the different videos correspond to the same actual acquisition time.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
fig. 1 is a schematic view of an application environment of a video synchronization method according to an embodiment of the present application;
FIG. 2 is a flow chart of a video synchronization method according to an embodiment of the present application;
FIG. 3 is a flowchart of a video synchronization method according to another embodiment of the present application;
FIG. 4 is a flowchart of a video synchronization method according to still another embodiment of the present application;
FIG. 5 is a functional block diagram of a video synchronization apparatus according to an embodiment of the present application;
fig. 6 is a block diagram of an electronic device for performing a video synchronization method according to an embodiment of the present application.
Detailed Description
With the rapid development of technology, the demand for three-dimensional models and related three-dimensional applications keeps growing. When a sports game is modeled in three dimensions, multiple image acquisition devices capture game videos from different angles: the devices are distributed at different positions around the playing field, so each captured video has a different viewing angle. When these game videos are used for three-dimensional modeling, it is difficult to determine the same moment across two or more videos, that is, to synchronize them.
At present, videos from multiple image acquisition devices are synchronized by relying on dedicated synchronization equipment: the equipment sends a synchronization signal to every device, and the signal triggers each device's shutter, so that all devices shoot simultaneously and their videos are synchronized. However, this approach depends on extra synchronization equipment, is complex and cumbersome to operate, and is difficult to apply at large scale.
The inventors have found that, with the continuous upgrading of the image acquisition devices in mobile smart devices and the rapid development of 5G and cloud computing, mature hardware conditions for video synchronization are already in place. When several image acquisition devices shoot the same sports game from different angles, synchronizing mobile smart terminals through signals sent by external synchronization equipment is impractical, and in practice no such external equipment is available. In a sports game, however, there is always a marking action at the start of the game; for example, the marking action at the beginning of a basketball game is the midcourt serve. Therefore, by analyzing the game videos captured by the multiple image acquisition devices, the marking action at the start of the game can be identified, and the frame in which the marking action occurs is the synchronization frame, so that the videos of all devices shooting the same sports game can be synchronized.
Accordingly, embodiments of the present application provide a video synchronization method: acquire at least two game videos of the same game captured from different positions; identify, in each game video, the target segment corresponding to the target action, the target segment consisting of images captured at consecutive times and the target action being the marking action that starts the game; determine, within each target segment, the image corresponding to the target gesture and take it as the synchronization frame of that game video; and synchronize the at least two game videos according to the synchronization frames. Because every video records the same marking action, the target gesture occurs at the same real-world instant regardless of how each video was captured, so the synchronization frames of the different videos correspond to the same actual acquisition time.
The solutions in the embodiments of the present application may be implemented in various computer languages, for example Java, an object-oriented programming language; JavaScript, an interpreted scripting language; Python; and so on.
To make the technical solutions and advantages of the embodiments of the present application clearer, exemplary embodiments are described in detail below with reference to the accompanying drawings. Evidently, the described embodiments are only some, not all, of the embodiments of the present application. It should be noted that, where no conflict arises, the embodiments of the present application and the features therein may be combined with each other.
Referring to fig. 1, an application environment 10 of the video synchronization method provided by the present application is shown. The application environment 10 includes an electronic device 20, at least two image capturing apparatuses 30, and a playing field 40. Each image capturing apparatus 30 is communicatively connected to the electronic device 20; the game video captured by an image capturing apparatus 30 is sent to the electronic device 20, which can then process it.
The image capturing apparatus 30 may be a mobile device with an image capturing function, such as a smartphone or a tablet computer. The image capturing apparatuses 30 are arranged in the playing field 40 to capture what happens in it, and the shooting field of view of each apparatus 30 covers the entire playing field 40.
In some embodiments, an application program for synchronization may be installed in the image capturing apparatus 30, and the application program may send the game video captured by the image capturing apparatus 30 to the electronic device 20 through a network. The network may be a 5G network, a 4G network, a Wi-Fi network, or the like. The electronic device 20 may be a server, an intelligent terminal, a computer, or the like.
Thus, the electronic device 20 may acquire at least two game videos of the same game; identify, in each game video, the target segment corresponding to the target action, the segment consisting of images captured at consecutive times and the action being the marking action that starts the game; determine, within each segment, the image corresponding to the target gesture as that video's synchronization frame; and synchronize the at least two game videos according to the synchronization frames.
Referring to fig. 2, an embodiment of the present application provides a video synchronization method, which may be applied to the electronic device in the application environment 10, where the electronic device may be a smart phone, a computer, a server, or the like.
Step 110, acquiring at least two game videos of the same game acquired from different locations.
The electronic device may obtain at least two game videos of the same game from different image capturing devices. At least two image capturing devices may be arranged in the venue, each at a different location, and the field of view of each device may cover the entire venue.
After shooting, each image acquisition device can send its game video to the electronic device, so that the electronic device obtains at least two game videos of the same game captured from different positions.
Step 120, identifying a target segment corresponding to a target action in each game video, wherein the target segment is composed of images captured at consecutive times, and the target action is the marking action that starts the game.
After the at least two game videos are acquired, since each image acquisition device shoots from a different position and angle, the target segment corresponding to the target action is identified separately in each game video. The target segment is one segment of the game video and consists of images captured at consecutive times. Specifically, the target action is the marking action that starts the game: for example, the midcourt serve in basketball and football, and the first serve in volleyball, badminton, table tennis, and tennis.
A neural network model may be used to identify the target segment corresponding to the target action in the game video. Specifically, a neural network may be trained in advance to obtain a recognition model capable of recognizing the target action in game videos.
When identifying the target segment corresponding to the target action in a game video, the game video may first be cut into a plurality of video clips; the video clips are then input into a pre-trained prediction model to obtain candidate video clips; and the target segment is determined from the candidate video clips according to their number.
Before identifying the target segment in each game video, the game category of the video may be determined as the target category; the target action and target gesture corresponding to the target category are then found by querying an information table, which records the correspondence between game categories and their actions and gestures. In this way, the target action for the game video is determined, and the video segment corresponding to it can then be identified.
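The category-to-action lookup described above can be sketched as a simple table. The category, action, and gesture names below are hypothetical placeholders, since the patent does not specify the table's contents:

```python
# Hypothetical sketch of the information table: it maps a game category to
# its (target action, target gesture) pair. All names are illustrative
# placeholders, not values taken from the patent.
GAME_INFO_TABLE = {
    "basketball":   ("midcourt_serve", "ball_leaves_hand"),
    "football":     ("midcourt_serve", "first_kick_contact"),
    "table_tennis": ("first_serve", "toss_apex"),
}

def lookup_target(category):
    """Query the table for the target action and target gesture of a category."""
    if category not in GAME_INFO_TABLE:
        raise ValueError("unknown game category: %s" % category)
    return GAME_INFO_TABLE[category]
```

A real system would populate this table once per supported sport and consult it before running the segment identification.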
Step 130, determining the image corresponding to the target gesture in each target segment, and taking that image as the synchronization frame of the corresponding game video.
Since the target segment consists of temporally consecutive images, a specific moment must be chosen for video synchronization, and this moment can be anchored to a particular image in the segment. That is, the image corresponding to the target gesture may be determined in the target segment and used as the synchronization frame of the game video.
Specifically, the human-skeleton keypoints and object keypoints corresponding to each image in the target segment are obtained, and the image corresponding to the target gesture is determined from these keypoints.
Step 140, synchronizing the at least two game videos according to the synchronization frame of each game video.
Through the above steps, a synchronization frame is determined in each target segment, so each game video has one synchronization frame, and the moments these frames depict are the same moment of the game. For example, if the 100th frame of video A and the 200th frame of video B are synchronization frames, both show the moment the game starts. The game videos can then be synchronized according to these frames, providing data support for three-dimensional modeling from the game videos.
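As a minimal sketch of this alignment step, assuming each video is available as a list of frames (a real implementation would operate on encoded video streams):

```python
def align_videos(videos, sync_frames):
    """Trim each video so that its synchronization frame becomes frame 0.

    `videos` maps a video id to its list of frames; `sync_frames` maps the
    same id to the index of that video's synchronization frame. After
    trimming, frame i of every returned video shows the same game moment.
    """
    return {vid: frames[sync_frames[vid]:] for vid, frames in videos.items()}

# Example matching the text: frame 100 of video A and frame 200 of video B
# both show the start of the game.
aligned = align_videos(
    {"A": list(range(300)), "B": list(range(50, 350))},
    {"A": 100, "B": 200},
)
```

After trimming, corresponding indices in the two videos refer to the same real-world instant, which is exactly what multi-view three-dimensional modeling requires.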
The video synchronization method provided by this embodiment acquires at least two game videos of the same game captured from different positions; identifies in each game video the target segment corresponding to the target action, the segment consisting of images captured at consecutive times and the action being the marking action that starts the game; determines within each segment the image corresponding to the target gesture as that video's synchronization frame; and synchronizes the at least two game videos according to the synchronization frames. Since the synchronization frames are determined from the videos themselves, no auxiliary synchronization equipment is needed, and the method is simple to operate and easy to apply at large scale.
Referring to fig. 3, another embodiment of the present application provides a video synchronization method that elaborates on the previous embodiment; the process of identifying the target segment and determining the synchronization frame may include the following steps.
Step 210, acquiring at least two game videos of the same game acquired from different locations.
Step 210 may refer to the corresponding parts of the foregoing embodiments, and will not be described herein.
Step 220, cutting the game video into a plurality of video clips.
After a game video is obtained, it may be cut into a plurality of video clips. The length of each clip may be preset, and the cutting scheme may be chosen according to actual needs; it is not specifically limited here. For example, the game video may be cut into fixed-length clips using a fixed-length sliding window that advances one frame at a time.
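The sliding-window cutting can be sketched as follows; the window length and step are parameters a deployment would choose, and frames are represented here as list elements for illustration:

```python
def cut_into_clips(frames, window_len, step=1):
    """Cut a video into fixed-length, overlapping clips by sliding a window
    of `window_len` frames forward `step` frames at a time, as in the
    1-frame-step scheme described above."""
    return [frames[i:i + window_len]
            for i in range(0, len(frames) - window_len + 1, step)]
```

With `step=1`, consecutive clips overlap in all but one frame, so the marking action cannot fall between clips.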
Step 230, inputting the video clips into a pre-trained prediction model to obtain candidate video clips.
After each game video is cut, a plurality of video clips is obtained. The clips of one game video are input into the prediction model, which outputs the candidate video clips, i.e. the clips that contain the target action.
The prediction model is trained in advance on sample clips and their labels. Before training the neural network model, each sample clip is labelled: for example, a clip that contains the target action is labelled 1, and a clip that does not is labelled 0.
The sample clips and their labels are fed into the neural network model, which outputs the clips it predicts to contain the target action. If the model outputs a clip whose label is 0, indicating that the clip does not actually contain the target action, the parameters of the neural network model are adjusted, and training continues until the clips output by the model all contain the target action.
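The patent does not specify the network architecture. As a stand-in, the 0/1 labelling scheme can be illustrated with a minimal logistic-regression classifier over per-clip feature vectors; the features, learning rate, and epoch count here are illustrative assumptions:

```python
import numpy as np

def train_clip_classifier(features, labels, lr=0.1, epochs=500):
    """Train a tiny logistic-regression stand-in for the prediction model:
    each clip is summarized by a feature vector and labelled 1 (contains
    the target action) or 0 (does not)."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels, dtype=float)
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability per clip
        grad = p - y                            # cross-entropy gradient on the logits
        w -= lr * (X.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b

def contains_target_action(w, b, x):
    """Predict whether a clip (feature vector x) contains the target action."""
    return 1.0 / (1.0 + np.exp(-(np.asarray(x, dtype=float) @ w + b))) > 0.5
```

A production model would instead be a video classifier (e.g. a 3D convolutional network) over raw clips, but the labelling and decision logic are the same.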
It will be appreciated that several clips containing the target action may appear in one game video; these clips are taken as candidate video clips, and the target segment is then determined from them according to their number.
Step 240, determining the target segment from the candidate video segments according to the number of the candidate video segments.
After the candidate video clips are obtained, their number is checked. If there is exactly one candidate clip, only one clip in the game video contains the target action, and that clip is taken as the target segment.
If there is more than one candidate clip, at least two clips of the game video contain the target action. Since the target action is the marking action at the start of the game, when several candidates exist, their temporal order in the game video is obtained and the earliest candidate is determined to be the target segment. For example, if there are two candidate clips A and B, with A occurring at 2 minutes into the game video and B at 10 minutes, A is determined to be the target segment.
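A minimal sketch of this selection rule, representing each candidate as a (start frame, clip) pair for illustration:

```python
def select_target_clip(candidates):
    """Select the target segment from candidate clips, each given as a
    (start_frame, clip) pair: with one candidate it is returned directly;
    with several, the temporally earliest one wins, since the marking
    action occurs at the start of the game."""
    if not candidates:
        raise ValueError("no clip containing the target action was found")
    return min(candidates, key=lambda c: c[0])[1]
```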
Step 250, obtaining the human-skeleton keypoints and object keypoints corresponding to each image in the target segment.
After the target segment is obtained, the human-skeleton keypoints and object keypoints in each of its images can be identified with a human-skeleton keypoint detection algorithm and an object detection algorithm.
Step 260, determining the image corresponding to the target gesture as the synchronization frame according to the human-skeleton keypoints and object keypoints.
Once the human-skeleton keypoints and object keypoints are identified, whether an image shows the target gesture is determined from them; if it does, that image is determined to be the synchronization frame.
In some embodiments, the target human-skeleton keypoints and target object keypoints corresponding to the target gesture are obtained; when an image's human-skeleton keypoints match the target human-skeleton keypoints and its object keypoints match the target object keypoints, the image is considered to show the target gesture and can be determined to be the synchronization frame.
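This matching step can be sketched as a tolerance check between detected and stored keypoints; the normalized coordinates and the tolerance value are illustrative assumptions, since the patent does not define the matching criterion:

```python
import math

def matches_target_pose(skeleton, objects, target_skeleton, target_objects,
                        tol=0.05):
    """Treat an image as showing the target gesture when every detected
    human-skeleton keypoint and object keypoint lies within `tol` of the
    corresponding stored target keypoint (coordinates normalized to [0, 1])."""
    def close(points, refs):
        return all(math.dist(p, r) <= tol for p, r in zip(points, refs))
    return close(skeleton, target_skeleton) and close(objects, target_objects)
```

Scanning the target segment and returning the first image for which this check succeeds yields the synchronization frame under this variant.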
In some embodiments, the synchronization frame may instead be determined from how the human-skeleton keypoints and object keypoints change across frames. For example, in table tennis, the player throws the ball upward with one hand during the first serve, and the image at the highest point of the throwing hand's upward motion is the synchronization frame.
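For the ball-toss example, picking the apex from a per-frame height track can be sketched as follows; the height signal is an assumed input, e.g. the hand keypoint's height above the ground in each frame:

```python
def toss_apex_frame(hand_heights):
    """Return the index of the frame at which the throwing hand reaches its
    highest point, given its height in each frame of the target segment
    (larger value = higher)."""
    return max(range(len(hand_heights)), key=hand_heights.__getitem__)
```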
It should be noted that which of the two approaches is used to determine the image corresponding to the target gesture may be chosen according to the specific target gesture.
Step 270, synchronizing the at least two match videos according to the synchronization frame corresponding to each match video.
Step 270 may refer to the corresponding parts of the foregoing embodiments, and will not be described herein.
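Once each match video has its synchronization frame, step 270 can be sketched as trimming each video so that playback starts at its own sync frame. This is a hypothetical simplification that assumes all videos share the same frame rate.

```python
# Illustrative sketch of step 270: align videos by trimming each one so
# that playback starts at its synchronization frame.
def align_videos(videos):
    """videos: list of (frames, sync_index) pairs; returns trimmed frame
    lists that all start at the synchronization frame."""
    return [frames[sync:] for frames, sync in videos]

cam1 = (["a0", "a1", "a2", "a3"], 1)        # sync frame at index 1
cam2 = (["b0", "b1", "b2", "b3", "b4"], 2)  # sync frame at index 2
print(align_videos([cam1, cam2]))  # [['a1', 'a2', 'a3'], ['b2', 'b3', 'b4']]
```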
According to the video synchronization method provided by this embodiment of the application, the target segment corresponding to the target action in each match video is identified through the prediction model; the human skeleton key points and object key points corresponding to each image in the target segment are acquired; the image corresponding to the target gesture is determined as the synchronization frame according to the human skeleton key points and the object key points; and the at least two match videos are synchronized according to the synchronization frame corresponding to each match video. The method determines a target segment in each of the at least two match videos and a synchronization frame within each target segment, and achieves video synchronization based on the synchronization frames without relying on any auxiliary synchronization equipment; it is therefore simple to operate and easy to apply at scale.
Referring to fig. 4, a further embodiment of the present application provides a video synchronization method; building on the foregoing embodiments, this embodiment describes the process of determining the target action and the target gesture in detail.
Step 310, obtaining at least two game videos of the same game acquired from different locations.
Step 310 may refer to the corresponding parts of the foregoing embodiments, and will not be described herein.
Step 320, determining a match category corresponding to the match video.
After the match video is obtained, the match category corresponding to the match video is determined to be the target category.
As an embodiment, the staff may manually confirm the game category corresponding to the game video.
As an embodiment, an image may be extracted from any position in the match video, the venue in the image may be identified by a neural network model, and the target category may be determined according to the identified venue. For example, if the identified venue is a basketball court, the target category is determined to be a basketball game.
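The venue-to-category step can be sketched as a simple lookup after the classifier runs. The venue labels and category names below are placeholders invented for illustration; the patent does not specify them.

```python
# Illustrative sketch: map a venue label predicted by a venue classifier
# to the target match category (labels are hypothetical).
VENUE_TO_CATEGORY = {
    "basketball_court": "basketball game",
    "football_pitch": "football match",
    "volleyball_court": "volleyball match",
}

def category_from_venue(venue_label):
    # Returns None for venues outside the supported categories.
    return VENUE_TO_CATEGORY.get(venue_label)

print(category_from_venue("basketball_court"))  # basketball game
```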
The game type may be one of a basketball game, a football game, a volleyball game, a badminton game, a table tennis game, a tennis game, an ice hockey game, and a curling game.
Step 330, determining a target action and a target gesture corresponding to the target category by querying an information table according to the target category, wherein the information table includes the correspondence between match categories and actions and gestures.
After the target category is determined, the target action and target gesture corresponding to the target category may be queried through an information table. The information table may be preset and stored in the electronic device, and includes the correspondence between match categories, actions, and gestures. The information table may be as shown in Table 1.
TABLE 1
Game category    Action      Gesture
Category 1       Action 1    Gesture 1
Category 2       Action 2    Gesture 2
If the target category is determined to be category 1, it can be determined from the information table that the target action is action 1 and the target gesture is gesture 1.
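The table lookup can be sketched as a preset mapping mirroring Table 1. The entries are the table's own placeholders, not real values.

```python
# Minimal sketch of the information table: match category -> (action, gesture).
INFO_TABLE = {
    "category 1": ("action 1", "gesture 1"),
    "category 2": ("action 2", "gesture 2"),
}

def query_info_table(target_category):
    """Returns the (target action, target gesture) for a match category."""
    action, gesture = INFO_TABLE[target_category]
    return action, gesture

print(query_info_table("category 1"))  # ('action 1', 'gesture 1')
```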
Actions and gestures corresponding to the game categories will be described in detail below.
When the target category is a basketball game, the target action is the mid-court serve, and the target gesture is the player's hands first contacting the basketball.
When the target category is a football match, the target action is the mid-court kick-off, and the target gesture is the player's foot first contacting the football.
When the target category is a volleyball match, the target action is the first serve, and the target gesture is the player's hand first contacting the volleyball.
When the target category is a badminton match, the target action is the first serve. In the first serve, the player first swings the racket backward and then swings it forward to hit the shuttlecock; the posture in which the arm is at its rearmost point during the backswing is the target gesture.
When the target category is a table tennis match, the target action is the first serve. In the first serve, the player tosses the ball upward with one hand; the posture in which the tossing hand reaches the highest point of its upward motion is the target gesture.
When the target category is a tennis match, the target action is the first serve. In the first serve, the player first swings the racket backward and then swings it forward to hit the tennis ball; the posture in which the arm is at its rearmost point during the backswing is the target gesture.
When the target category is an ice hockey game, the target action is the first face-off at the center face-off spot, and the target gesture is the player's first swing.
When the target category is a curling match, the target action is the player starting to deliver the curling stone, and the target gesture is the player stopping sliding.
Step 340, identifying a target segment corresponding to a target action in each competition video, wherein the target segment is composed of images with continuous shooting time, and the target action is a marked action for starting the competition.
And 350, determining images corresponding to the target gestures in the target fragments, and taking the images corresponding to the target gestures as synchronous frames corresponding to each competition video.
Step 360, synchronizing the at least two match videos according to the synchronization frame corresponding to each match video.
Steps 340 to 360 may refer to the corresponding parts of the foregoing embodiments, and are not described herein.
According to the video synchronization method provided by this embodiment of the application, after the match videos are acquired, the match category corresponding to the match videos is determined as the target category; the target action and target gesture corresponding to the target category are determined by querying the information table according to the target category; and the target action and the synchronization frame are identified in each match video, with video synchronization carried out based on the synchronization frames. The method identifies, in each of the at least two match videos, the target segment corresponding to the marked action of the match and determines the synchronization frame within it. Because the synchronization frames all correspond to the same target gesture of the marked action, their actual acquisition times are identical regardless of how each match video was captured, which makes accurate synchronization possible.
Referring to fig. 5, an embodiment of the present application provides a video synchronization apparatus 400. The video synchronization apparatus 400 includes an acquisition module 410, an identification module 420, a synchronization frame determination module 430, and a synchronization module 440. The acquisition module 410 is configured to acquire at least two match videos of the same match captured from different positions. The identification module 420 is configured to identify, in each match video, a target segment corresponding to a target action, where the target segment consists of images with consecutive shooting times and the target action is a marked action of the start of the match. The synchronization frame determination module 430 is configured to determine an image corresponding to a target gesture in the target segment and take the image corresponding to the target gesture as the synchronization frame corresponding to each match video. The synchronization module 440 is configured to synchronize the at least two match videos according to the synchronization frame corresponding to each match video.
Further, the video synchronization device 400 further includes a target determining module, where the target determining module is configured to determine a match category corresponding to the match video as a target category; and inquiring an information table according to the target category, and determining target actions and target gestures corresponding to the target category, wherein the information table comprises the corresponding relation between the competition category and the actions and gestures.
Further, the identifying module 420 is further configured to cut the game video into a plurality of video clips; inputting the video clips into a pre-trained prediction model to obtain candidate video clips; and determining the target fragments from the candidate video fragments according to the number of the candidate video fragments.
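The clip-cutting step performed by the identification module can be sketched as follows. The fixed clip length is an assumption for illustration; the patent does not specify how the video is segmented.

```python
# Hypothetical sketch: split a match video (as a list of frames) into
# fixed-length clips before feeding them to the prediction model.
def cut_into_clips(frames, clip_len=64):
    """Returns consecutive clips of clip_len frames (last may be shorter)."""
    return [frames[i:i + clip_len] for i in range(0, len(frames), clip_len)]

# 200 frames at 64 frames per clip yield 4 clips (the last one shorter).
print(len(cut_into_clips(list(range(200)), 64)))  # 4
```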
Further, the identifying module 420 is further configured to determine that the candidate video segment is the target segment if the number of candidate video segments is 1; and if the number of the candidate video clips is greater than 1, determining the candidate video clip with the earliest time sequence in the competition video as the target clip.
Further, the synchronization frame determining module 430 is further configured to obtain a human skeleton key point and an object key point corresponding to each image in the target segment; and determining an image corresponding to the target gesture as the synchronous frame according to the human skeleton key points and the object key points.
Further, when the match category is a basketball game, the target action is the mid-court serve, and the target gesture is the player's hand first contacting the basketball.
Further, when the match category is a volleyball match, the target action is the first serve, and the target gesture is the player's hand first contacting the volleyball.
The video synchronization device provided by this embodiment of the application acquires at least two match videos of the same match captured from different positions; identifies, in each match video, the target segment corresponding to the target action, where the target segment consists of images with consecutive shooting times and the target action is a marked action of the start of the match; determines the image corresponding to the target gesture in the target segment and takes it as the synchronization frame corresponding to each match video; and synchronizes the at least two match videos according to the synchronization frames. Because the synchronization frames all correspond to the same target gesture of the marked action of the match, their actual acquisition times are identical regardless of how each match video was captured, which makes accurate synchronization possible.
It should be noted that, for convenience and brevity of description, specific working processes of the apparatus described above may refer to corresponding processes in the foregoing method embodiments, which are not repeated herein.
Referring to fig. 6, an embodiment of the present application provides a block diagram of an electronic device 500. The electronic device 500 includes a processor 510, a memory 520, and one or more application programs, where the one or more application programs are stored in the memory 520 and configured to be executed by the one or more processors 510, and the one or more application programs are configured to perform the video synchronization method described above.
The electronic device 500 may be a terminal device capable of running application programs, such as a smartphone or a tablet computer, or may be a server. The electronic device 500 of the present application may include one or more of the following components: a processor 510, a memory 520, and one or more application programs, where the one or more application programs may be stored in the memory 520 and configured to be executed by the one or more processors 510, and the one or more application programs are configured to perform the method described in the foregoing method embodiments.
Processor 510 may include one or more processing cores. The processor 510 connects various parts of the electronic device 500 using various interfaces and lines, and performs the various functions of the electronic device 500 and processes data by executing instructions, programs, code sets, or instruction sets stored in the memory 520 and invoking data stored in the memory 520. Alternatively, the processor 510 may be implemented in hardware in at least one of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 510 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, the application programs, and the like; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It will be appreciated that the modem may also not be integrated into the processor 510 and may instead be implemented by a separate communication chip.
The memory 520 may include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). The memory 520 may be used to store instructions, programs, code sets, or instruction sets. The memory 520 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, or an image playing function), instructions for implementing the foregoing method embodiments, and the like. The data storage area may store data created by the electronic device 500 in use (such as a phonebook, audio and video data, or chat log data).
The electronic equipment provided by this embodiment of the application acquires at least two match videos of the same match captured from different positions; identifies, in each match video, the target segment corresponding to the target action, where the target segment consists of images with consecutive shooting times and the target action is a marked action of the start of the match; determines the image corresponding to the target gesture in the target segment and takes it as the synchronization frame corresponding to each match video; and synchronizes the at least two match videos according to the synchronization frames. Because the synchronization frames all correspond to the same target gesture of the marked action of the match, their actual acquisition times are identical regardless of how each match video was captured, which makes accurate synchronization possible.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from the spirit or scope of the application. Thus, it is intended that the present application also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (8)

1. A method of video synchronization, the method comprising:
acquiring at least two competition videos of the same competition acquired from different positions;
identifying a target segment corresponding to a target action in each competition video, wherein the target segment consists of images with continuous shooting time, and the target action is a marked action for starting competition;
determining images corresponding to target gestures in the target fragments, and taking the images corresponding to the target gestures as synchronous frames corresponding to each competition video;
synchronizing the at least two match videos according to the corresponding synchronization frame of each match video;
before identifying the target segments in each of the game videos, the method further comprises:
determining the match category corresponding to the match video as a target category;
and inquiring an information table according to the target category, and determining target actions and target gestures corresponding to the target category, wherein the information table comprises the corresponding relation between the competition category and the actions and gestures.
2. The method of claim 1, wherein identifying a target segment in each of the game videos that corresponds to a target action comprises:
cutting the game video into a plurality of video clips;
inputting the video clips into a pre-trained prediction model to obtain candidate video clips;
and determining the target fragments from the candidate video fragments according to the number of the candidate video fragments.
3. The method of claim 2, wherein the determining the target segment from the candidate video segments based on the number of candidate video segments comprises:
if the number of the candidate video clips is 1, determining that the candidate video clips are the target clips;
and if the number of the candidate video clips is greater than 1, determining the candidate video clip with the earliest time sequence in the competition video as the target clip.
4. The method of claim 1, wherein the determining the image of the target segment corresponding to the target gesture, and the image of the target gesture as the synchronization frame corresponding to each of the game videos, comprises:
acquiring human skeleton key points and object key points corresponding to each image in the target segment;
and determining an image corresponding to the target gesture as the synchronous frame according to the human skeleton key points and the object key points.
5. The method of any one of claims 1-4, wherein, when the game category is a basketball game, the target action is a mid-court serve and the target gesture is a player's hand first contacting the basketball.
6. The method of any one of claims 1-4, wherein, when the game category is a volleyball match, the target action is a first serve and the target gesture is a player's hand first contacting the volleyball.
7. An electronic device, the electronic device comprising:
one or more processors;
a memory electrically connected to the one or more processors;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of any of claims 1-6.
8. A computer readable storage medium having stored therein program code which is callable by a processor to perform the method of any one of claims 1 to 6.
CN202110625325.9A 2021-06-04 2021-06-04 Video synchronization method, device, electronic equipment and storage medium Active CN113542774B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110625325.9A CN113542774B (en) 2021-06-04 2021-06-04 Video synchronization method, device, electronic equipment and storage medium


Publications (2)

Publication Number Publication Date
CN113542774A CN113542774A (en) 2021-10-22
CN113542774B true CN113542774B (en) 2023-10-20

Family

ID=78095154

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110625325.9A Active CN113542774B (en) 2021-06-04 2021-06-04 Video synchronization method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113542774B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101646022A (en) * 2009-09-04 2010-02-10 深圳华为通信技术有限公司 Image splicing method and system thereof
CN106464848A (en) * 2014-04-21 2017-02-22 博拉斯特运动有限公司 Motion event recognition and video synchronization system and method
CN107534789A (en) * 2015-06-25 2018-01-02 松下知识产权经营株式会社 Image synchronization device and image synchronous method
CN110087116A (en) * 2019-03-14 2019-08-02 咪咕视讯科技有限公司 Multi-rate live video stream editing method and device, terminal and storage medium
CN110516572A (en) * 2019-08-16 2019-11-29 咪咕文化科技有限公司 Method for identifying sports event video clip, electronic equipment and storage medium
CN110765896A (en) * 2019-10-08 2020-02-07 维沃移动通信有限公司 Video processing method and device
CN111507219A (en) * 2020-04-08 2020-08-07 广东工业大学 Action recognition method and device, electronic equipment and storage medium
CN111680562A (en) * 2020-05-09 2020-09-18 北京中广上洋科技股份有限公司 Human body posture identification method and device based on skeleton key points, storage medium and terminal

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10824918B2 (en) * 2017-01-31 2020-11-03 Stats Llc System and method for predictive sports analytics using body-pose information
US11295527B2 (en) * 2019-08-30 2022-04-05 Sprongo, LLC Instant technique analysis for sports




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant