CN113014944A

CN113014944A - Video processing method and system and video live broadcast system

Info

Publication number: CN113014944A
Application number: CN202110233097.0A
Authority: CN
Inventors: 何亮; 鲍国敏
Original assignee: Shanghai Qiniu Information Technology Co ltd
Current assignee: Shanghai Qiniu Information Technology Co ltd
Priority date: 2021-03-03
Filing date: 2021-03-03
Publication date: 2021-06-22

Abstract

The invention discloses a video processing method, a video processing system and a live video broadcast system. The video processing method comprises the following steps: S1: acquiring a video frame and identifying a view angle object in the video frame; S2: extracting view angle information of the view angle object from the video frame and packaging the view angle information into a view angle parameter; S3: adding the view angle parameter to the video frame. The method and system analyze the transmitted picture and extract information from it while the video stream is being distributed, so that at the player decoding stage the video picture can be further processed and rendered while the video frame is decoded. This enables subsequent processing and function expansion at the stream pushing and pulling ends, improves the live viewing experience, and improves the extensibility of the system. The invention therefore has clear technical advantages and beneficial effects.

Description

Video processing method and system and video live broadcast system

Technical Field

The invention relates to the field of live broadcast services, in particular to a video processing method and system and a video live broadcast system.

Background

At present, video player manufacturers at home and abroad attach increasing importance to users' customization requirements. For the same video at ultra-high resolution, on the one hand the presentation of details is not ideal, and on the other hand the image region that each user most wants to see is different. To offer audiences a wider choice of viewing angles, some live sports broadcasts transmit multiple camera angles in separate video streams, or switch camera angles manually, so that the user can select one or more angles. These prior-art schemes have at least the following defects:

(1) the selectable viewing angles are limited, so the user's need to choose an arbitrary viewing angle cannot be met;

(2) multiple video streams are transmitted simultaneously, which increases transmission pressure and adds video transmission cost;

(3) the schemes cannot adapt to changing requirements at the player end and the stream pushing end, so extensibility is poor.

Disclosure of Invention

To overcome the above defects in the prior art, the invention provides a video processing solution. Its aim is to provide rich view angle information on top of the conventional video stream, so that users' personalized viewing requirements can be met through that information. At the same time, the solution keeps transmission pressure and transmission cost low, is highly extensible, and can adapt to changing requirements at the player end and the stream pushing end.

In order to achieve the above object, the present invention provides a video processing method, including:

S1: acquiring a video frame, and identifying a view angle object in the video frame;

S2: extracting the view angle information of the view angle object from the video frame, and packaging the view angle information into a view angle parameter;

S3: adding the view angle parameter to the video frame.

Further, the view angle objects include persons, activity items, and regions.

Further, the view angle information includes a view angle object name and coordinates of the view angle object.

Further, the coordinates of the view angle object include coordinates of a rectangular area enclosed by an upper left corner, a lower left corner, an upper right corner and a lower right corner.

Further, the coordinates of the view angle object are relative coordinates.

Further, step S3 is specifically implemented by adding the view angle parameter to the SEI field in the header of the video frame.

Further, in step S3, the view angle parameter is filled into the user data field of the SEI.

Further, the data format of the view angle parameter is JSON.
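The relative-coordinate representation described above can be sketched as follows. This is a minimal illustration rather than the disclosed implementation: the function name `to_relative_corners`, the key names, and the assumption that each corner is an (x, y) pair normalized by the frame width and height are all hypothetical.

```python
def to_relative_corners(box, frame_width, frame_height):
    """Convert a pixel box (left, top, right, bottom) into four relative corners.

    Assumption (not from the source): each corner is an [x, y] pair expressed
    as a fraction of the frame width and height.
    """
    if frame_width <= 0 or frame_height <= 0:
        raise ValueError("frame dimensions must be positive")
    left, top, right, bottom = box
    x0, x1 = left / frame_width, right / frame_width
    y0, y1 = top / frame_height, bottom / frame_height
    return {
        "upper_left": [x0, y0],
        "upper_right": [x1, y0],
        "lower_left": [x0, y1],
        "lower_right": [x1, y1],
    }


# Hypothetical detection result on a 1920x1080 frame.
corners = to_relative_corners((192, 108, 960, 540), 1920, 1080)
print(corners["upper_left"], corners["lower_right"])
```

Because the values are fractions of the frame size rather than pixel positions, a player can map them back onto whatever resolution it actually decodes.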

The invention also discloses a video processing system, comprising a view angle object identification module, a view angle information extraction module and a view angle information filling module, wherein:

the view angle object identification module is used for acquiring video frames and identifying view angle objects in the video frames;

the view angle information extraction module is used for extracting the view angle information of the view angle object from the video frame and packaging the view angle information into a view angle parameter;

the view angle information filling module is used for adding the view angle parameter to the video frame.

The invention also discloses a live video broadcast system, which comprises a live CDN distribution system, a video acquisition terminal, a video processing system, a video transcoding system and a video playing terminal. The live CDN distribution system is used for receiving and distributing live video streams, the video acquisition terminal is used for acquiring data from live video sources, the video transcoding system is used for transcoding video frames, the video playing terminal is used for playing and displaying videos, and the video processing system is the video processing system described above.

The invention also discloses an electronic device, characterized in that the device comprises a processor and a memory, wherein the memory is used for storing an executable program, and the processor is configured to execute the executable program to implement any of the video processing methods described above.

In practical applications, the modules in the method and system disclosed by the invention can be deployed on one target server, or each module can be deployed on different target servers independently, and particularly, the modules can be deployed on cluster target servers according to needs in order to provide stronger computing processing capacity.

By utilizing the method, the system and the equipment disclosed by the invention, at least the following obvious advantages are achieved in the aspect of video personalized display:

1) the video stream enriches the existing video stream information by expanding the additional visual angle information, and the personalized visual angle watching requirement of a user can be met by combining a player capable of processing the visual angle information;

2) the data volume of the additional visual angle information is small, so that the bandwidth pressure and the server load pressure caused by adding new functions are reduced to the maximum extent, and the cost is reduced;

3) the SEI fields rarely used in the current mainstream encoder are reasonably utilized, and seamless adaptation can be realized for the requirement change of a player end and a stream pushing end;

In summary, the method, system and device disclosed by the invention analyze and describe the transmitted picture while the live video is being distributed, so that at the player decoding stage the video picture can be further processed and rendered while the video frame is decoded. This enables subsequent processing and function expansion at the stream pushing and pulling ends, improves the live viewing experience, and improves the extensibility of the system, and therefore has clear technical advantages and beneficial effects.

In order that the invention may be more clearly and fully understood, specific embodiments thereof are described in detail below with reference to the accompanying drawings.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings used in the description of the embodiments will be briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the application, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.

Fig. 1 shows a flow diagram of a video processing method in an embodiment.

Fig. 2 shows a schematic diagram of a video processing system in an embodiment.

Fig. 3 shows a schematic structural diagram of a live video broadcast system implementing live broadcast according to an embodiment.

Detailed Description

Referring to fig. 1, fig. 1 is a flowchart illustrating a video processing method according to an embodiment, which specifically includes steps S11 to S13:

Step S11: acquiring a video frame, and identifying a view angle object in the video frame;

In one embodiment, after the CDN system receives the live push stream, it first identifies the view angle objects in the video frames of the live stream. The view angle objects to be identified differ between live systems; they may be preset, or they may be obtained by analyzing the video frames with an AI algorithm. Typically, the view angle objects include persons, activity items and regions. In a live ball game, for example, the view angle objects in the video frames include popular players, hot areas, the ball, and so on.

Step S12: extracting the visual angle information of the visual angle object from the video frame, and packaging the visual angle information into visual angle parameters;

Based on the identified view angle object, the view angle information of that object is then extracted from the video frame. In one embodiment, the view angle information includes the view angle object name and the coordinates of the view angle object. Once obtained, the view angle information is packaged into a view angle parameter, which is a custom structured data item.

In one embodiment, the view angle parameter is stored mainly as JSON. As an example, the key is the view angle object name and the value is the relative coordinates of the view angle object.

{"key": "value"}

For example, data for one view parameter is as follows:

In one embodiment, the coordinates are those of the rectangular region enclosed by an upper left corner, a lower left corner, an upper right corner and a lower right corner. For example, the location area of the person under the messi tag is the rectangular region with upper left corner: 0.1, upper right corner: 0.1, lower left corner: 0.23, lower right corner: 0.23. For example, the data structure of the view angle information of one video frame is as follows:
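As an illustration only, such a structure might resemble the one produced by the following minimal sketch. It assumes a hypothetical layout in which each view angle object name maps to four relative corners given as [x, y] pairs; the key names, the `ball` entry and all coordinate values are invented for the example and are not taken from the source.

```python
import json


def pack_view_parameters(view_info):
    """Serialize a {object name: corners} mapping into a compact JSON string."""
    return json.dumps(view_info, ensure_ascii=False, sort_keys=True,
                      separators=(",", ":"))


def unpack_view_parameters(text):
    """Parse the JSON string back into a Python dictionary."""
    return json.loads(text)


# Hypothetical view angle information for one video frame.
frame_view_info = {
    "messi": {
        "upper_left": [0.10, 0.10],
        "upper_right": [0.23, 0.10],
        "lower_left": [0.10, 0.23],
        "lower_right": [0.23, 0.23],
    },
    "ball": {
        "upper_left": [0.40, 0.55],
        "upper_right": [0.45, 0.55],
        "lower_left": [0.40, 0.60],
        "lower_right": [0.45, 0.60],
    },
}

packed = pack_view_parameters(frame_view_info)
print(packed)
```

Compact separators keep the per-frame payload small, which matters because the parameter travels with every frame that carries it.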

Step S13: adding the view angle parameter to the video frame.

After the two preceding steps, the view angle parameter of the video frame has been obtained, and it is added to the video frame before the video frame is distributed.

In one embodiment, the view angle parameter is added to the SEI field in the header of the video frame. If the SEI field is already in use, the implementation needs to preserve its existing content. The specific scheme is as follows: check whether the SEI field in the header is enabled; if it is, parse the SEI field and its information, and determine whether there is SEI information that still needs to be added because the stream pushing end did not complete it; if so, add that information before distribution. By default, the existing SEI information is retained.
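The retention behavior can be sketched as follows. The source does not name a codec or container, so this example assumes an H.264 Annex B byte stream, in which NAL units are separated by start codes and `nal_unit_type` 6 marks an SEI unit; the function names are hypothetical. Existing SEI units are copied through unchanged and the new SEI unit is placed before the first coded slice.

```python
START_CODE = b"\x00\x00\x00\x01"
NAL_TYPE_SEI = 6  # H.264 nal_unit_type for SEI


def split_nal_units(stream: bytes) -> list:
    """Split an Annex B byte stream into NAL units without their start codes."""
    units = []
    start = None
    i = 0
    while i + 3 <= len(stream):
        if stream[i:i + 3] == b"\x00\x00\x01":
            if start is not None:
                # Trailing zeros belong to the next (4-byte) start code.
                units.append(stream[start:i].rstrip(b"\x00"))
            start = i + 3
            i += 3
        else:
            i += 1
    if start is not None:
        units.append(stream[start:].rstrip(b"\x00"))
    return [u for u in units if u]


def add_sei_keeping_existing(access_unit: bytes, new_sei: bytes) -> bytes:
    """Insert new_sei (a NAL unit without start code) and keep all existing units."""
    out = []
    inserted = False
    for unit in split_nal_units(access_unit):
        nal_type = unit[0] & 0x1F
        if not inserted and 1 <= nal_type <= 5:  # first coded slice NAL unit
            out.append(new_sei)
            inserted = True
        out.append(unit)  # existing SEI units (type 6) are copied unchanged
    if not inserted:
        out.append(new_sei)
    return b"".join(START_CODE + unit for unit in out)
```

Placing the new unit before the first slice keeps it inside the same access unit as the picture it describes, so a player can read it before decoding that picture.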

In one embodiment, the view angle parameter is filled into the user data field of the SEI.

After these three steps, the video frames extended with the added view angle information are sent back to the CDN live broadcast network for distribution.

Referring to fig. 2, fig. 2 is a schematic structural diagram of a video processing system according to an embodiment, and as shown in the drawing, the embodiment includes a view object identification module 101, a view information extraction module 102, and a view information filling module 103, where:

view angle object identification module 101: used for acquiring video frames and identifying view angle objects in the video frames;

view angle information extraction module 102: used for extracting the view angle information of the view angle object from the video frame and packaging the view angle information into a view angle parameter;

view angle information filling module 103: used for adding the view angle parameter to the video frame, which is finally sent to the player end for playing.
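The cooperation of the three modules can be sketched as a small pipeline. This is an illustrative skeleton under stated assumptions: the class names, the `VideoFrame` type and its `metadata` dictionary are hypothetical, the identification module simply returns preset boxes in place of a real detector, and the parameter is attached to a dictionary rather than to an actual SEI field.

```python
import json
from dataclasses import dataclass, field


@dataclass
class VideoFrame:
    width: int
    height: int
    metadata: dict = field(default_factory=dict)


class ViewObjectIdentifier:
    """Module 101 stand-in: returns preset pixel boxes instead of running a detector."""

    def __init__(self, preset_boxes):
        self.preset_boxes = preset_boxes

    def identify(self, frame):
        return dict(self.preset_boxes)


class ViewInfoExtractor:
    """Module 102 stand-in: converts boxes to relative corners and packs them as JSON."""

    def extract(self, frame, boxes):
        params = {}
        for name, (left, top, right, bottom) in boxes.items():
            x0, x1 = left / frame.width, right / frame.width
            y0, y1 = top / frame.height, bottom / frame.height
            params[name] = {
                "upper_left": [x0, y0], "upper_right": [x1, y0],
                "lower_left": [x0, y1], "lower_right": [x1, y1],
            }
        return json.dumps(params)


class ViewInfoFiller:
    """Module 103 stand-in: attaches the packed parameter to the frame."""

    def fill(self, frame, view_params):
        frame.metadata["view_params"] = view_params
        return frame


def process_frame(frame, identifier, extractor, filler):
    boxes = identifier.identify(frame)
    view_params = extractor.extract(frame, boxes)
    return filler.fill(frame, view_params)


frame = process_frame(
    VideoFrame(width=1000, height=500),
    ViewObjectIdentifier({"messi": (100, 50, 500, 250)}),
    ViewInfoExtractor(),
    ViewInfoFiller(),
)
print(frame.metadata["view_params"])
```

Keeping the three stages behind separate classes mirrors the module split in the text, so each stage could be deployed or replaced independently.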

Based on the foregoing embodiment, the present application further discloses a live video broadcast system. Referring to fig. 3, fig. 3 is a schematic structural diagram of the live video broadcast system of this embodiment implementing live broadcast. In this embodiment, the live video broadcast system includes a live CDN distribution system, a video acquisition terminal, a video processing system, a video transcoding system and a video playing terminal. The live CDN distribution system is configured to receive and distribute live video streams, the video acquisition terminal is configured to acquire data from live video sources, the video transcoding system is configured to transcode video frames, the video playing terminal is configured to play and display videos, and the video processing system is the video processing system of the foregoing embodiment.

As shown in the figure, the video acquisition terminal captures the live video content. At the CDN distribution stage of the live service, the video processing system receives the live stream or video stream, processes the frame image data of the video stream to extract the view angle objects in the picture and their view angle information, packages the view angle information into structured data, and fills or supplements it into the picture data of the video frame. The newly assembled video stream data is then handed back to the live CDN distribution system for distribution. Before distribution over the CDN network, the video is transcoded by the video transcoding system; the CDN network then distributes the video stream or video data to the player terminal. After receiving the video stream, the player terminal first processes the video with the video transcoding system, then extracts and renders the transcoded view angle information according to the playback settings, thereby playing and displaying the video.

In addition, with this system the view angle information of the picture is transmitted together with the live video when it is distributed, so that at the player decoding stage the video picture can be further described while the video frames are decoded. This effectively supports subsequent processing and function expansion at the stream pushing and pulling ends, such as multi-view selective playback.
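On the player side, multi-view selective playback could use the transmitted parameter to choose a crop region. The following minimal sketch assumes the hypothetical JSON layout of relative `upper_left` and `lower_right` corners given as [x, y] pairs; the function name and layout are illustrative and not defined by the source.

```python
import json


def crop_rect_for_view(view_params_json, object_name, frame_width, frame_height):
    """Return (x, y, width, height) in pixels for the selected view angle object.

    Returns None when the frame carries no entry for that object, so the
    player can fall back to showing the full picture.
    """
    params = json.loads(view_params_json)
    entry = params.get(object_name)
    if entry is None:
        return None
    x0, y0 = entry["upper_left"]
    x1, y1 = entry["lower_right"]
    # Map relative coordinates onto the decoded resolution and clamp to the frame.
    left = min(max(int(round(x0 * frame_width)), 0), frame_width)
    top = min(max(int(round(y0 * frame_height)), 0), frame_height)
    right = min(max(int(round(x1 * frame_width)), 0), frame_width)
    bottom = min(max(int(round(y1 * frame_height)), 0), frame_height)
    return (left, top, right - left, bottom - top)


params_json = '{"messi": {"upper_left": [0.1, 0.1], "lower_right": [0.5, 0.5]}}'
print(crop_rect_for_view(params_json, "messi", 1920, 1080))
```

Only one video stream is decoded in this approach; the viewer's choice changes which region of that stream is rendered.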

An embodiment of the present application further provides an electronic device. The electronic device includes a processor and a memory, where the memory stores an executable program, and when the executable program runs on a computer, the computer executes the method described in any of the above embodiments.

It should be noted that all or part of the steps in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, which may include, but is not limited to: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

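1. A video processing method, characterized by comprising the following steps:

S1: acquiring a video frame, and identifying a view angle object in the video frame;

S2: extracting the view angle information of the view angle object from the video frame, and packaging the view angle information into a view angle parameter;

S3: adding the view angle parameter to the video frame.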

2. The video processing method of claim 1, wherein the view angle objects include persons, activity items, and regions.

3. The video processing method of claim 1, wherein the view angle information includes a view angle object name and coordinates of the view angle object.

4. The video processing method of claim 3, wherein the coordinates of the view angle object include coordinates of a rectangular area enclosed by an upper left corner, a lower left corner, an upper right corner and a lower right corner.

5. The video processing method of claim 3, wherein the coordinates of the view angle object are relative coordinates.

6. The video processing method according to claim 1, wherein step S3 is specifically implemented by adding the view angle parameter to the SEI field in the header of the video frame.

7. The video processing method according to claim 1, wherein in step S3, the view angle parameter is filled into a user data field of the SEI.

8. The video processing method of claim 1, wherein the data format of the view angle parameter is JSON.

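9. A video processing system, comprising a view angle object identification module, a view angle information extraction module and a view angle information filling module, wherein:

the view angle object identification module is used for acquiring video frames and identifying view angle objects in the video frames;

the view angle information extraction module is used for extracting the view angle information of the view angle object from the video frame and packaging the view angle information into a view angle parameter;

the view angle information filling module is used for adding the view angle parameter to the video frame.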

10. A live video broadcast system, comprising a live CDN distribution system, a video acquisition terminal, a video processing system, a video transcoding system and a video playing terminal, wherein the live CDN distribution system is used for receiving and distributing live video streams, the video acquisition terminal is used for acquiring data from live video sources, the video transcoding system is used for transcoding video frames, and the video playing terminal is used for playing and displaying videos, characterized in that: the video processing system is the video processing system of claim 9.

11. An electronic device, characterized in that the device comprises a processor and a memory,

the memory is used for storing an executable program;

the processor is configured to execute the executable program to implement the video processing method of any one of claims 1 to 8.