CN108965711B - Video processing method and device - Google Patents

Info

Publication number
CN108965711B
Authority
CN
China
Prior art keywords
frame
video
shooting
video frame
shooting direction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810841801.9A
Other languages
Chinese (zh)
Other versions
CN108965711A (en)
Inventor
吕现广
Current Assignee
Guangzhou Kugou Computer Technology Co Ltd
Original Assignee
Guangzhou Kugou Computer Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Kugou Computer Technology Co Ltd
Priority to CN201810841801.9A
Publication of CN108965711A
Application granted; publication of CN108965711B
Legal status: Active

Classifications

    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04N — PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00 — Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/80 — Camera processing pipelines; Components thereof
    • H04N23/60 — Control of cameras or camera modules
    • H04N23/69 — Control of means for changing angle of the field of view, e.g. optical zoom objectives or electronic zooming
    • H04M — TELEPHONIC COMMUNICATION
    • H04M1/00 — Substation equipment, e.g. for use by subscribers
    • H04M1/72 — Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 — User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72448 — User interfaces specially adapted for cordless or mobile telephones with means for adapting the functionality of the device according to specific conditions

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Studio Devices (AREA)

Abstract

The invention discloses a video processing method and device, belonging to the field of terminal technologies. The method comprises: during shooting, collecting video frames and detecting the shooting direction of the shooting end; for the first collected video frame, generating a first direction frame according to its shooting direction and inserting the first direction frame before the first video frame, the first direction frame carrying the shooting direction information of the first video frame; for each video frame after the first, if its shooting direction differs from that of the previous video frame, generating a second direction frame and inserting it before the video frame, the second direction frame carrying the shooting direction information of that video frame; and generating video data with the direction frames inserted. Based on the inserted direction frames, the invention can automatically adjust the played video picture, spare the user the operation of manually rotating the playing end, and improve the video playing effect.

Description

Video processing method and device
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a video processing method and apparatus.
Background
Video carries moving images and is an important medium for entertainment and communication. Video is usually processed by a shooting end and a playing end: the shooting end collects video frames during shooting to generate video data, and the playing end plays the video data.
In practice, a user at the shooting end may shoot in horizontal-screen mode, vertical-screen mode, or a mix of both according to the scene, so the shooting end can obtain video data in any of these modes. At the playing end, the user sometimes watches in horizontal-screen mode, sometimes in vertical-screen mode, and sometimes switches from one to the other while watching. If the shooting direction of a video picture differs from the screen direction of the playing end, the picture the user sees is rotated by 90 degrees. To view the picture upright, the user must rotate the playing end to match the shooting direction of the video frame. For example, as shown in fig. 1, if the video frame is in the horizontal screen direction and the playing end is in the vertical screen direction, the user needs to rotate the playing end 90 degrees to the left to see the picture upright.
However, manually rotating the playing end to adjust the video picture is tedious for the user and degrades the viewing experience.
Disclosure of Invention
The embodiments of the invention provide a video processing method and device, which solve the problem in the related art that a user must manually rotate the playing end while watching a video, a cumbersome operation. The technical scheme is as follows:
in a first aspect, a video processing method is provided, which is applied to a shooting end, and the method includes:
in the shooting process, collecting video frames and detecting the shooting direction of the shooting end, wherein the shooting direction comprises a horizontal screen direction or a vertical screen direction;
for the first collected video frame, generating a first direction frame according to the shooting direction of the first video frame, and inserting the first direction frame before the first video frame, wherein the first direction frame carries the shooting direction information of the first video frame;
for each video frame after the first collected video frame, if the shooting direction of the video frame differs from that of the previous video frame, generating a second direction frame and inserting the second direction frame before the video frame, wherein the second direction frame carries the shooting direction information of the video frame;
generating video data with the direction frames inserted, the direction frames including the first direction frame, or the first direction frame and the second direction frame.
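The capture-side steps above can be sketched in Python. This is a minimal illustration of the direction-frame insertion rule only; the `DirectionFrame` class, the orientation strings, and the frame representation are assumptions made for the sketch, not part of the patent:

```python
# Sketch of the capture-side flow: insert a direction frame before the
# first video frame, and again whenever the shooting direction changes.
# All names here (DirectionFrame, orientation strings) are illustrative.

class DirectionFrame:
    def __init__(self, direction):
        self.direction = direction  # "landscape" or "portrait"

def insert_direction_frames(frames):
    """frames: list of (frame_data, direction) tuples collected in order.
    Returns the output stream with direction frames inserted."""
    stream = []
    last_direction = None
    for data, direction in frames:
        if direction != last_direction:  # first frame, or direction changed
            stream.append(DirectionFrame(direction))
            last_direction = direction
        stream.append(data)
    return stream

stream = insert_direction_frames([
    ("f1", "landscape"), ("f2", "landscape"), ("f3", "portrait"),
])
# Direction frames precede f1 and f3; f2 follows f1 directly.
```

Frames whose direction matches the previous frame are appended without a new direction frame, matching the optional step described below.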
Optionally, the detecting the shooting direction of the shooting end includes:
calling a screen direction monitoring interface of the shooting end, and acquiring the screen direction of the shooting end through the screen direction monitoring interface;
and determining the acquired screen direction as the shooting direction of the shooting end.
Optionally, the generating a first direction frame according to the shooting direction of the first video frame includes:
generating a video pseudo frame carrying the shooting direction information of the first video frame by using an extension type of the Network Abstraction Layer (NAL) according to the shooting direction of the first video frame, and taking the video pseudo frame as the first direction frame;
alternatively,
generating an enhanced frame carrying the shooting direction information of the first video frame by using the extension mechanism provided by SEI (Supplemental Enhancement Information) according to the shooting direction of the first video frame, and taking the enhanced frame as the first direction frame.
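As one possible realization of the SEI option, a direction frame could be packed as an H.264 SEI NAL unit carrying a user_data_unregistered payload (payload type 5). The UUID and the one-byte direction code below are illustrative assumptions; the patent only specifies that an SEI extension carries the shooting direction information.

```python
# Sketch: packing shooting-direction info into an H.264 SEI NAL unit
# (nal_unit_type 6, user_data_unregistered payload type 5).
# The UUID and the direction code (0 = landscape, 1 = portrait)
# are illustrative choices, not defined by the patent.
import uuid

DIRECTION_UUID = uuid.UUID("00000000-0000-0000-0000-000000000001").bytes

def make_direction_sei(direction):
    code = b"\x00" if direction == "landscape" else b"\x01"
    payload = DIRECTION_UUID + code   # 16-byte UUID + 1-byte direction
    sei = bytearray()
    sei.append(0x06)                  # NAL header: nal_unit_type = 6 (SEI)
    sei.append(0x05)                  # payload_type = 5 (user_data_unregistered)
    sei.append(len(payload))          # payload_size (fits in one byte here)
    sei += payload
    sei.append(0x80)                  # rbsp_trailing_bits
    return bytes(sei)

frame = make_direction_sei("portrait")
```

Because SEI units are ignored by decoders that do not understand them, such a direction frame would pass through standard H.264 pipelines without affecting decoding.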
Optionally, the inserting the first directional frame before the first video frame further includes:
sequentially placing the first direction frame and a first video coding frame in a video frame queue to be sent, wherein the first video coding frame is obtained by coding the first video frame;
the generating of the video data inserted with the directional frame includes:
sequentially taking out video queue frames from the video frame queue to be sent, wherein the video queue frames comprise video coding frames and direction frames;
and packaging the taken out video queue frame to obtain a video stream.
Optionally, before the encapsulating the extracted video queue frame to obtain the video stream, the method further includes:
collecting audio frames in the shooting process;
for each collected audio frame, coding the audio frame to obtain an audio coding frame, and placing the audio coding frame in an audio frame queue to be sent;
sequentially taking out audio queue frames from the audio frame queue;
the encapsulating the extracted video queue frame to obtain a video stream includes:
and packaging the taken out video queue frame and audio queue frame to obtain a video stream.
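The dequeue-and-encapsulate step can be illustrated with a minimal sketch that interleaves the two to-send queues by timestamp. The `(pts, kind, payload)` tuple representation is an assumption for the sketch; a real implementation would write the frames into a container or streaming format rather than a Python list.

```python
# Sketch: merge the to-send video queue (encoded frames plus direction
# frames) and the to-send audio queue into one ordered stream by pts.
# heapq.merge assumes each input queue is already ordered by timestamp.
import heapq

def mux(video_queue, audio_queue):
    """Merge two pts-ordered queues of (pts, kind, payload) tuples."""
    return list(heapq.merge(video_queue, audio_queue))

stream = mux([(0, "video", b"v0"), (40, "video", b"v1")],
             [(0, "audio", b"a0"), (23, "audio", b"a1")])
```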
Optionally, the method further comprises:
and if the shooting direction of the video frame is the same as that of the previous video frame, appending the video frame after the previous video frame without inserting a direction frame.
In a second aspect, a video processing method is provided, which is applied to a playing end, and the method includes:
acquiring video data, wherein the video data comprises a plurality of video frames and at least one direction frame, and each direction frame carries the shooting direction information of the video frame immediately following it;
detecting the screen direction of the playing end in the process of playing the video data, wherein the screen direction of the playing end comprises a horizontal screen direction or a vertical screen direction;
if any direction frame is acquired from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame differs from the screen direction of the playing end, then for each target video frame in at least one target video frame, reducing the size of the target video frame according to the screen direction of the playing end and playing the reduced target video frame, wherein the at least one target video frame comprises the video frames between that direction frame and the next direction frame.
Optionally, the reducing the size of the target video frame according to the screen direction of the play end includes:
determining a reduction ratio of the target video frame, wherein the reduction ratio is a ratio of a first side length to a second side length, the first side length is the minimum value of the height and the width of a video displayable area of the playing end, and the second side length is the maximum value of the height and the width of the target video frame;
and reducing the target video frame according to the reduction ratio.
Optionally, the determining the reduction ratio of the target video frame includes:
when the shooting direction indicated by the shooting direction information carried by the direction frame is a horizontal screen direction and the screen direction of the playing end is a vertical screen direction, determining the ratio between the width of the video displayable area of the playing end and the width of the target video frame as the reduction ratio;
and when the shooting direction indicated by the shooting direction information carried by the direction frame is a vertical screen direction and the screen direction of the playing end is a horizontal screen direction, determining the ratio between the height of the video displayable area of the playing end and the height of the target video frame as the reduction ratio.
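The reduction-ratio rule above can be written as a small function: scale the frame so its long edge fits the display area's short edge. The function name and orientation strings are illustrative; the two branches correspond to the two cases just described, and the general min/max definition gives the same result.

```python
# Sketch of the playback-side reduction ratio: first side length is the
# minimum of the displayable area's height and width, second side length
# is the maximum of the frame's height and width.

def reduction_ratio(display_w, display_h, frame_w, frame_h,
                    frame_dir, screen_dir):
    if frame_dir == "landscape" and screen_dir == "portrait":
        return display_w / frame_w   # wide frame limited by narrow screen
    if frame_dir == "portrait" and screen_dir == "landscape":
        return display_h / frame_h   # tall frame limited by short screen
    return 1.0                       # directions match: no reduction

# A 1920x1080 landscape frame on a 1080x1920 portrait display area:
ratio = reduction_ratio(1080, 1920, 1920, 1080, "landscape", "portrait")
# ratio = 1080/1920 = 0.5625, so the frame scales to 1080 x 607 (approx.)
```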
Optionally, the playing the reduced target video frame includes:
according to the size of the video displayable area of the playing end, performing background filling on the reduced target video frame to obtain the filled target video frame, wherein the size of the filled target video frame is the same as that of the video displayable area of the playing end;
and displaying the filled target video frame in a video displayable area of the playing end.
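The background-filling step amounts to centring the reduced frame in the video displayable area and padding the remainder with background, so the filled frame matches the displayable area's size. A minimal geometric sketch (names are illustrative):

```python
# Sketch: compute where the reduced frame sits inside the displayable
# area; everything outside the returned rectangle is background fill.

def letterbox_offset(display_w, display_h, frame_w, frame_h):
    """Return the (x, y) offset that centres the frame in the area."""
    x = (display_w - frame_w) // 2
    y = (display_h - frame_h) // 2
    return x, y

# A reduced 1080x607 landscape frame in a 1080x1920 portrait area:
x, y = letterbox_offset(1080, 1920, 1080, 607)
# x = 0 (full width used); y = 656, with background bars above and below
```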
Optionally, the method further comprises:
and if any direction frame is acquired from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is the same as the screen direction of the playing end, playing each target video frame in the at least one target video frame.
In a third aspect, a video processing apparatus is provided, which is applied to a shooting end, the apparatus comprising:
the first acquisition module is used for acquiring video frames in the shooting process;
the detection module is used for detecting the shooting direction of the shooting end in the shooting process, wherein the shooting direction comprises a horizontal screen direction or a vertical screen direction;
the first generation module is used for generating a first direction frame according to the shooting direction of the first video frame for the collected first video frame, and inserting the first direction frame in front of the first video frame, wherein the first direction frame carries the shooting direction information of the first video frame;
a second generation module, configured to generate, for each video frame after the first collected video frame, a second direction frame if the shooting direction of the video frame differs from that of the previous video frame, and insert the second direction frame before the video frame, where the second direction frame carries the shooting direction information of the video frame;
a third generating module, configured to generate video data into which directional frames are inserted, where the directional frames include the first directional frame, or include the first directional frame and the second directional frame.
Optionally, the detection module is configured to:
calling a screen direction monitoring interface of the shooting end, and acquiring the screen direction of the shooting end through the screen direction monitoring interface;
and determining the acquired screen direction as the shooting direction of the shooting end.
Optionally, the first generating module is configured to:
generating a video pseudo frame carrying the shooting direction information of the first video frame by using an extension type of the Network Abstraction Layer (NAL) according to the shooting direction of the first video frame, and taking the video pseudo frame as the first direction frame;
alternatively,
and generating an enhanced frame carrying the shooting direction information of the first video frame by utilizing an expansion mode provided by SEI according to the shooting direction of the first video frame, and taking the enhanced frame as the first direction frame.
Optionally, the apparatus further comprises:
the first enqueuing module is used for sequentially placing the first direction frame and a first video coding frame in a video frame queue to be sent, wherein the first video coding frame is obtained by coding the first video frame;
the third generating module is configured to sequentially take out video queue frames from the video frame queue to be sent, where the video queue frames include video coding frames and direction frames; and packaging the taken out video queue frame to obtain a video stream.
Optionally, the apparatus further comprises:
the second acquisition module is used for acquiring audio frames in the shooting process;
the second enqueue module is used for coding the audio frames for each collected audio frame to obtain audio coding frames, and placing the audio coding frames in an audio frame queue to be sent;
the second dequeuing module is used for sequentially taking out the audio queue frames from the audio frame queue;
and the third generation module is used for packaging the taken out video queue frame and audio queue frame to obtain a video stream.
Optionally, the apparatus further comprises:
and the processing module is used for appending the video frame after the previous video frame if the shooting direction of the video frame is the same as that of the previous video frame.
In a fourth aspect, a video processing apparatus is provided, which is applied to a playing end, and the apparatus includes:
the device comprises an acquisition module, a display module and a display module, wherein the acquisition module is used for acquiring video data, the video data comprises a plurality of video frames and at least one direction frame, and each direction frame carries shooting direction information of a next video frame of the direction frames;
the detection module is used for detecting the screen direction of the playing end in the process of playing the video data, wherein the screen direction of the playing end comprises a horizontal screen direction or a vertical screen direction;
and the processing module is used for, if any direction frame is acquired from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame differs from the screen direction of the playing end, reducing the size of each target video frame in at least one target video frame according to the screen direction of the playing end and playing the reduced target video frame, wherein the at least one target video frame comprises the video frames between that direction frame and the next direction frame.
Optionally, the processing module includes:
a determining unit, configured to determine a reduction ratio of the target video frame, where the reduction ratio is a ratio of a first side length to a second side length, the first side length is a minimum value of a height and a width of a video displayable region of the playing end, and the second side length is a maximum value of the height and the width of the target video frame;
and a reducing unit, configured to reduce the target video frame according to the reduction ratio.
Optionally, the determining unit is configured to:
when the shooting direction indicated by the shooting direction information carried by the direction frame is a horizontal screen direction and the screen direction of the playing end is a vertical screen direction, determining the ratio between the width of the video displayable area of the playing end and the width of the target video frame as the reduction ratio;
and when the shooting direction indicated by the shooting direction information carried by the direction frame is a vertical screen direction and the screen direction of the playing end is a horizontal screen direction, determining the ratio between the height of the video displayable area of the playing end and the height of the target video frame as the reduction ratio.
Optionally, the processing module includes:
a filling unit, configured to perform background filling on the reduced target video frame according to the size of the video displayable area of the playing end to obtain a filled target video frame, where the size of the filled target video frame is the same as the size of the video displayable area of the playing end;
and the display unit is used for displaying the filled target video frame in a video displayable area of the playing end.
Optionally, the apparatus further comprises:
and the playing module is used for playing each target video frame in the at least one target video frame if any direction frame is obtained from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is the same as the screen direction of the playing end.
In a fifth aspect, there is provided a video processing apparatus, the apparatus comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the steps of any of the video processing methods described above.
In a sixth aspect, a computer-readable storage medium is provided, having instructions stored thereon, which when executed by a processor, implement the steps of any of the above-mentioned video processing methods.
The embodiment of the invention has the following beneficial effects:
In the embodiments of the invention, while the shooting end collects video frames, a first direction frame is inserted before the first collected video frame to indicate its shooting direction. For each subsequently collected video frame, if its shooting direction differs from that of the previous video frame, a second direction frame is inserted before it to indicate the shooting direction of the currently collected video frame. Video data with the direction frames inserted is then generated. As a result, while the playing end plays video data that includes direction frames, whenever a direction frame is acquired whose indicated shooting direction differs from the screen direction of the playing end, the video frames between that direction frame and the next direction frame are reduced according to the screen direction and the size of the video displayable area of the playing end, and the reduced video frames are displayed in the video displayable area. The picture is therefore never rotated, and the user always sees an upright video picture. In other words, when the shooting direction of a video frame differs from the screen direction of the playing end, the playing picture is adjusted automatically, the user is spared the operation of manually rotating the playing end, and the video playing effect is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; other drawings can be derived from them by those of ordinary skill in the art without creative effort.
FIG. 1 is a schematic illustration of a video display provided in the related art;
FIG. 2 is a schematic diagram of a video processing system according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of another video processing system provided by an embodiment of the invention;
fig. 4 is a flowchart of a video processing method according to an embodiment of the present invention;
fig. 5 is a flow chart of another video processing method according to an embodiment of the present invention;
fig. 6 is a flow chart of another video processing method according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a video processing provided by an embodiment of the invention;
FIG. 8 is a schematic diagram of another video processing provided by embodiments of the present invention;
fig. 9 is a block diagram of a video processing apparatus according to an embodiment of the present invention;
fig. 10 is a block diagram of a video processing apparatus according to an embodiment of the present invention;
fig. 11 is a block diagram of a terminal 1100 according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings.
Before the embodiments of the present invention are explained in detail, their application scenarios are described. The video processing method provided by the embodiments of the invention can be applied to live-streaming scenarios, video-call scenarios, scenarios in which a video is recorded and later played back, and the like.
Next, an implementation environment of the embodiment of the present invention will be described.
The video processing method provided by the embodiment of the present invention can be applied to a video processing system, and fig. 2 is a schematic diagram of a video processing system provided by the embodiment of the present invention, and as shown in fig. 2, the system includes a shooting end 10 and a playing end 20.
The shooting end 10 is used for shooting videos, and the playing end 20 is used for playing videos. Optionally, the shooting end 10 may upload a captured video to the network, and the playing end 20 may obtain the video from the network and play it. In practice, the shooting end 10 and the playing end 20 may each be a mobile phone, a tablet computer, a computer, or the like, which is not limited in the embodiments of the present invention.
Specifically, during shooting, the shooting end 10 can collect video frames and detect its own shooting direction, which is either the horizontal screen direction or the vertical screen direction. For the first collected video frame, it generates a first direction frame according to the shooting direction of that frame and inserts the first direction frame before it; the first direction frame carries the shooting direction information of the first video frame. For each video frame after the first, if its shooting direction differs from that of the previous video frame, the shooting end generates a second direction frame and inserts it before the video frame; the second direction frame carries the shooting direction information of that video frame. Video data with the direction frames inserted is then generated.
While playing video data that includes direction frames, the playing end 20 can detect its own screen direction as it acquires frames from the video data; the screen direction is either the horizontal screen direction or the vertical screen direction. If a direction frame is acquired whose indicated shooting direction differs from the screen direction of the playing end, then for each target video frame the playing end 20 may reduce its size according to the screen direction and the size of the video displayable area, and then play the reduced target video frame. The target video frames are the video frames between that direction frame and the next direction frame.
Fig. 3 is a schematic diagram of another video processing system according to an embodiment of the present invention. As shown in fig. 3, the system includes a shooting end 10, a playing end 20, and a server 30. In a live-streaming scenario, the shooting end 10 is the anchor (broadcaster) end, the playing end 20 is a viewer end, and the server 30 is the live-streaming server of the live-streaming platform.
That is, in the embodiments of the present invention, the shooting end inserts a direction frame before the first collected video frame to indicate the shooting direction of the subsequent video frames, and inserts another direction frame before the currently collected video frame whenever the shooting direction changes. While playing the video data, the playing end can therefore determine from the inserted direction frames the shooting direction of the following video frames, and thus whether that shooting direction matches its own screen direction. When the two differ, the playing end reduces the video frames to be played according to its screen direction, so the video frames can be displayed in its video displayable area without being rotated. This avoids the 90-degree rotation of the picture, spares the user the operation of manually rotating the playing end, and improves the video playing effect and the user's viewing experience.
Fig. 4 is a flowchart of a video processing method according to an embodiment of the present invention, applied at the shooting end. Referring to fig. 4, the method includes:
step 401: in the shooting process, video frames are collected, and the shooting direction of the shooting end is detected, wherein the shooting direction comprises a transverse screen direction or a vertical screen direction.
Step 402: and for the collected first video frame, generating a first direction frame according to the shooting direction of the first video frame, and inserting the first direction frame before the first video frame, wherein the first direction frame carries the shooting direction information of the first video frame.
Step 403: for each video frame after the first collected video frame, if the shooting direction of the video frame differs from that of the previous video frame, generating a second direction frame and inserting the second direction frame before the video frame, wherein the second direction frame carries the shooting direction information of the video frame.
Step 404: generating video data with the direction frames inserted, the direction frames including the first direction frame, or the first direction frame and the second direction frame.
In the embodiments of the invention, while the shooting end collects video frames, a first direction frame is inserted before the first collected video frame to indicate its shooting direction. For each subsequently collected video frame, if its shooting direction differs from that of the previous video frame, a second direction frame is inserted before it to indicate the shooting direction of the currently collected video frame. Video data with the direction frames inserted is then generated. While playing video data that includes direction frames, the playing end can determine from each direction frame the shooting direction of the video frames that follow it. When the shooting direction indicated by a direction frame differs from the screen direction of the playing end, the video frames between that direction frame and the next direction frame are reduced according to the screen direction and the size of the video displayable area of the playing end, and the reduced video frames are displayed in the video displayable area. The picture direction is therefore never rotated, and the user always sees an upright video picture. In other words, when the shooting direction of a video frame differs from the screen direction of the playing end, the playing picture is adjusted automatically, the user is spared the operation of manually rotating the playing end, and the video playing effect is improved.
Optionally, detecting a shooting direction of the shooting end includes:
calling a screen direction monitoring interface of a shooting end, and acquiring the screen direction of the shooting end through the screen direction monitoring interface;
and determining the acquired screen direction as the shooting direction of the shooting end.
Optionally, generating a first direction frame according to the shooting direction of the first video frame includes:
generating a video pseudo frame carrying shooting direction information of a first video frame by utilizing an extension type of a Network Abstraction Layer (NAL) according to the shooting direction of the first video frame, and taking the video pseudo frame as the first direction frame;
alternatively,
and generating an enhanced frame carrying the shooting direction information of the first video frame by utilizing an extension mode provided by supplemental enhancement information SEI according to the shooting direction of the first video frame, and taking the enhanced frame as the first direction frame.
Optionally, inserting the first directional frame before the first video frame further comprises:
sequentially placing a first direction frame and a first video coding frame in a video frame queue to be sent, wherein the first video coding frame is obtained by coding the first video frame;
generating video data into which directional frames are inserted, comprising:
sequentially taking out video queue frames from a video frame queue to be sent, wherein the video queue frames comprise video coding frames and direction frames;
and packaging the taken out video queue frame to obtain a video stream.
Optionally, before encapsulating the extracted video queue frame to obtain a video stream, the method further includes:
collecting audio frames in the shooting process;
for each collected audio frame, coding the audio frame to obtain an audio coding frame, and placing the audio coding frame in an audio frame queue to be sent;
sequentially taking out the audio queue frames from the audio frame queue;
encapsulating the extracted video queue frame to obtain a video stream, including:
and packaging the taken out video queue frame and audio queue frame to obtain a video stream.
Optionally, the method further comprises:
and if the shooting direction of the video frame is the same as that of the previous video frame, inserting the video frame after the previous video frame.
All the above optional technical solutions can be combined arbitrarily to form an optional embodiment of the present invention, which is not described in detail herein.
Fig. 5 is a flowchart of another video processing method, which is applied to a playing end, according to an embodiment of the present invention. Referring to fig. 5, the method includes:
step 501: the method comprises the steps of obtaining video data, wherein the video data comprise a plurality of video frames and at least one direction frame, and each direction frame carries shooting direction information of a video frame next to the direction frame.
Step 502: in the process of playing the video data, detecting the screen direction of the playing end, wherein the screen direction of the playing end comprises a horizontal screen direction or a vertical screen direction.
Step 503: if any direction frame is obtained from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is different from the screen direction of the playing end, for each target video frame in at least one target video frame, reducing the size of the target video frame according to the screen direction of the playing end, and playing the reduced target video frame, wherein the at least one target video frame is the video frames between the direction frame and the next direction frame.
In the embodiment of the invention, in the process of playing video data that includes direction frames, when the playing end obtains any direction frame and the shooting direction indicated by that direction frame is different from the screen direction of the playing end, the video frames between that direction frame and the next direction frame are reduced according to the screen direction of the playing end and the size of the video displayable area, and the reduced video frames are displayed in the video displayable area. This avoids rotating the picture direction of the video frames, so the user can see an upright video picture. Therefore, when the shooting direction of a video frame differs from the screen direction of the playing end, the playing picture is adjusted automatically, the user does not need to manually rotate the playing end, and the video playing effect is improved.
Optionally, the reducing the size of the target video frame according to the screen direction of the playing end includes:
determining a reduction ratio of the target video frame, wherein the reduction ratio is a ratio of a first side length to a second side length, the first side length is the minimum value of the height and the width of a video displayable area of the playing end, and the second side length is the maximum value of the height and the width of the target video frame;
and carrying out reduction processing on the target video frame according to the reduction ratio.
Optionally, the determining the reduction ratio of the target video frame includes:
when the shooting direction indicated by the shooting direction information carried by the direction frame is a horizontal screen direction and the screen direction of the playing end is a vertical screen direction, determining the proportion between the width of the video displayable area of the playing end and the width of the target video frame as the reduction proportion;
and when the shooting direction indicated by the shooting direction information carried by the direction frame is the vertical screen direction and the screen direction of the playing end is the horizontal screen direction, determining the proportion between the height of the video displayable area of the playing end and the height of the target video frame as the reduction proportion.
Optionally, the playing the reduced target video frame includes:
according to the size of the video displayable area of the playing end, performing background filling on the reduced target video frame to obtain the filled target video frame, wherein the size of the filled target video frame is the same as that of the video displayable area of the playing end;
and displaying the filled target video frame in a video displayable area of the playing end.
Optionally, the method further comprises:
and if any direction frame is acquired from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is the same as the screen direction of the playing end, playing each target video frame in the at least one target video frame.
All the above optional technical solutions can be combined arbitrarily to form an optional embodiment of the present invention, which is not described in detail herein.
Fig. 6 is a flowchart of another video processing method, which is used at the shooting end and the playing end, according to an embodiment of the present invention. Referring to fig. 6, the method includes:
step 601: the shooting end collects video frames in the shooting process, and detects the shooting direction of the shooting end, wherein the shooting direction comprises a transverse screen direction or a vertical screen direction.
In the embodiment of the invention, when the shooting end shoots the video, the shooting direction of the shooting end can be detected while the shooting end collects the shot video frame. The shooting direction of the shooting end refers to the screen direction of the shooting end.
Specifically, the shooting end can call a screen direction monitoring interface of the shooting end, obtain the screen direction of the shooting end through the screen direction monitoring interface, and then determine the obtained screen direction as the shooting direction of the shooting end. By way of example, the screen orientation monitoring interface may be a screen orientation event monitoring function.
Specifically, the shooting end can call an audio and video acquisition interface, and video frames are acquired through the audio and video acquisition interface. Illustratively, the audio/video capture interface may be an audio/video capture callback function.
In one embodiment, the photographing end may define a photographing direction variable in advance, the photographing direction variable indicating a real-time photographing direction. Before the shooting end starts shooting, the shooting direction variable can be initialized, and then the initialized shooting direction variable is modified when the change of the shooting direction of the shooting end is detected. For example, before starting shooting, the shooting direction variable is initialized to the vertical screen direction, and then when it is detected that the shooting direction is the horizontal screen direction, the value of the shooting direction variable is modified to the horizontal screen direction. For example, the shooting direction variable is initialized to a default value true, and when it is detected that the shooting direction is the landscape direction, the value of the shooting direction variable is modified to false. Wherein true is used for indicating the vertical screen direction, and false is used for indicating the horizontal screen direction.
Further, a last effective shooting direction variable indicating a last effective shooting direction of the real-time shooting direction, that is, the last detected effective shooting direction at the current time may be predefined. Before the shooting end starts to shoot, the last effective shooting direction variable can be initialized, and then the last effective shooting direction variable is changed according to the value of the shooting direction variable. For example, the value of the last effective shooting direction variable may be initialized to the initialized value of the shooting direction variable, and then the last effective shooting direction variable may be changed according to the value of the shooting direction variable when the shooting direction variable changes.
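The two variables described above can be sketched as follows. This is a minimal illustration: the boolean encoding (true for the vertical screen direction, false for the horizontal screen direction) follows the example in the text, and all names are hypothetical.

```python
PORTRAIT, LANDSCAPE = True, False  # true: vertical screen, false: horizontal screen

class DirectionTracker:
    """Tracks the real-time shooting direction and the last effective direction."""
    def __init__(self):
        self.shooting_direction = PORTRAIT            # initialized before shooting starts
        self.last_effective = self.shooting_direction  # initialized to the same value

    def on_orientation_event(self, direction):
        # Called from the screen-direction monitoring interface on each change.
        self.shooting_direction = direction

    def direction_changed(self):
        # Compared when a video frame is collected; a True result means a
        # direction frame should be generated for that frame (see step 603).
        if self.shooting_direction != self.last_effective:
            self.last_effective = self.shooting_direction
            return True
        return False

tracker = DirectionTracker()
tracker.on_orientation_event(LANDSCAPE)
changed_first = tracker.direction_changed()   # the direction just changed
changed_again = tracker.direction_changed()   # no further change since then
```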
Step 602: for the collected first video frame, the shooting end generates a first direction frame according to the shooting direction of the first video frame, and inserts the first direction frame before the first video frame, wherein the first direction frame carries the shooting direction information of the first video frame.
When the first video frame is acquired, the shooting direction of the first video frame can be determined, a first direction frame is generated according to the shooting direction of the first video frame, and then the first direction frame is inserted before the first video frame. The first direction frame is used for indicating the shooting direction when the shooting end starts shooting.
In one embodiment, when the first video frame is captured, the shooting direction of the first video frame may be determined according to the value of the shooting direction variable at the time the first video frame is captured.
Specifically, generating the first direction frame according to the shooting direction of the first video frame may include the following two implementations:
the first implementation mode comprises the following steps: and generating a video pseudo frame carrying the shooting direction information of the first video frame by utilizing the extension type of a network abstraction layer NAL according to the shooting direction of the first video frame, and taking the video pseudo frame as the first direction frame.
The frame type of the video pseudo frame is a video pseudo frame type, and the data in the video pseudo frame is the shooting direction information of the first video frame. Illustratively, the data value of the video pseudo frame is the value of the shooting direction variable at the time the first video frame is captured. For example, the extension type of the NAL may be one of the NAL unit types that H.264 leaves unspecified and available to applications.
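As a sketch of this first implementation: an H.264 NAL unit begins with a one-byte header packing forbidden_zero_bit, nal_ref_idc, and nal_unit_type, and nal_unit_type values 24 to 31 are left unspecified by the standard, so an application may use one of them for a private pseudo frame. The choice of type 24 and the one-byte payload below are illustrative assumptions, not the patent's concrete encoding.

```python
def make_direction_pseudo_frame(is_portrait: bool, nal_unit_type: int = 24) -> bytes:
    """Builds an H.264-style NAL unit whose payload is the shooting-direction flag.

    The one-byte NAL header packs forbidden_zero_bit (0), nal_ref_idc (0 here),
    and a 5-bit nal_unit_type; type 24 is an arbitrary illustrative choice
    from the range H.264 leaves unspecified.
    """
    start_code = b"\x00\x00\x00\x01"                   # Annex B start code
    header = (0 << 7) | (0 << 5) | (nal_unit_type & 0x1F)
    payload = bytes([1 if is_portrait else 0])          # shooting direction info
    return start_code + bytes([header]) + payload

frame = make_direction_pseudo_frame(is_portrait=False)  # horizontal screen direction
```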
The second implementation mode comprises the following steps: and generating an enhanced frame carrying the shooting direction information of the first video frame by utilizing an extension mode provided by supplemental enhancement information SEI according to the shooting direction of the first video frame, and taking the enhanced frame as the first direction frame.
The frame type of the enhanced frame is an enhanced frame type, and the data in the enhanced frame is the shooting direction information of the first video frame. Illustratively, the data value of the enhanced frame is the value of the shooting direction variable at the time the first video frame is captured. For example, the extension mode provided by the SEI may be the user-extensible SEI mechanism of H.264.
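A sketch of the second implementation, using the user_data_unregistered SEI message (payload type 5) that H.264 provides for application-defined data. The 16-byte UUID marker is made up for illustration, and emulation-prevention bytes are omitted for brevity.

```python
def make_direction_sei(is_portrait: bool) -> bytes:
    """Builds an H.264 SEI NAL unit (nal_unit_type 6) carrying the shooting
    direction as a user_data_unregistered message (payloadType 5).
    Emulation-prevention bytes are omitted to keep the sketch short."""
    uuid = b"DIRECTIONFRAME00"                 # 16-byte marker, illustrative only
    data = bytes([1 if is_portrait else 0])    # shooting direction info
    payload = uuid + data
    # payloadType, payloadSize, payload bytes, then the RBSP stop bit 0x80
    sei_body = bytes([5, len(payload)]) + payload + b"\x80"
    return b"\x00\x00\x00\x01\x06" + sei_body  # start code + SEI NAL header

sei_frame = make_direction_sei(is_portrait=True)
```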
In a live broadcast or video call scene, the video shot by the shooting end needs to be sent to the server in real time and forwarded by the server to the playing end. For example, in a live broadcast scene, the shooting end is the anchor end and the playing end is the user end; during live shooting, the anchor end sends the live video to a live broadcast server, and the live broadcast server forwards the live video to the user end.
In an embodiment, taking a live broadcast or video call scene as an example, when inserting the first direction frame before the first video frame, the shooting end places the first direction frame and the first video coding frame in a video frame queue to be sent; it may also sequentially take video queue frames out of the video frame queue to be sent and encapsulate the taken-out video queue frames to obtain a video stream, which is sent to a streaming media server and forwarded by the streaming media server to the playing end. The first video coding frame is obtained by encoding the first video frame, and the video queue frames include video coding frames and direction frames.
That is, in the video frame queue to be sent, the first direction frame precedes the first video coding frame. After the shooting end generates the first direction frame, it may place the first direction frame in the video frame queue to be sent first, then encode the first video frame to obtain the first video coding frame, and place the first video coding frame in the video frame queue to be sent.
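The ordering constraint can be sketched as follows; the encoder stand-in and all names are hypothetical.

```python
from queue import Queue

def encode(frame: bytes) -> bytes:
    """Stand-in for a real video encoder."""
    return b"ENC:" + frame

to_send = Queue()              # the video frame queue to be sent
direction_frame = b"\x01"      # generated from the first frame's shooting direction

# The direction frame is enqueued first, then the encoded first video frame,
# so the playing end meets the direction frame before the frame it describes.
to_send.put(("direction", direction_frame))
to_send.put(("video", encode(b"FRAME1")))

order = [to_send.get()[0], to_send.get()[0]]
```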
Furthermore, in the shooting process of the shooting end, the shooting end can collect not only video frames but also audio frames, and then video data are generated according to the collected video frames, the generated direction frames and the collected audio frames.
In an embodiment, taking a live broadcast or video call scene as an example, before the shooting end encapsulates the taken-out video queue frames to obtain a video stream, it may collect audio frames during the shooting process, encode each collected audio frame to obtain an audio coding frame, and place the audio coding frame in an audio frame queue to be sent. On the other hand, video queue frames are sequentially taken out from the video frame queue to be sent and audio queue frames are sequentially taken out from the audio frame queue to be sent, and the taken-out video queue frames and audio queue frames are then encapsulated to obtain the video stream.
That is, the shooting end can start an audio and video acquisition thread and a network transmission thread at the same time: the acquisition thread collects audio frames and video frames, and the transmission thread takes queue frames out of the audio and video queues to be sent, encapsulates the taken-out queue frames into a video stream, and sends the video stream to the streaming media server so that it can be forwarded to the playing end.
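A minimal sketch of the two cooperating threads, with sentinel values standing in for the end of capture; the queue names and the list standing in for the packaged stream are illustrative.

```python
import threading
from queue import Queue

video_q, audio_q = Queue(), Queue()
stream = []                            # stands in for the encapsulated video stream

def capture():                         # audio and video acquisition thread
    for i in range(3):
        video_q.put(f"V{i}")           # collected (encoded) video queue frame
        audio_q.put(f"A{i}")           # collected (encoded) audio queue frame
    video_q.put(None)                  # end-of-capture sentinels
    audio_q.put(None)

def send():                            # network transmission thread
    while True:
        v, a = video_q.get(), audio_q.get()
        if v is None:
            break
        stream.append((v, a))          # encapsulate the pair into the stream

t1 = threading.Thread(target=capture)
t2 = threading.Thread(target=send)
t1.start(); t2.start(); t1.join(); t2.join()
```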
Step 603: for each video frame collected after the first video frame, if the shooting direction of the video frame is different from the shooting direction of the previous video frame, the shooting end generates a second direction frame and inserts the second direction frame before the video frame, wherein the second direction frame carries the shooting direction information of the video frame.
That is, in the process of subsequently capturing video frames, when the shooting direction of the shooting end changes, the direction frame is continuously generated and inserted before the captured video frame, where the direction frame is used to indicate that the shooting direction of the shooting end changes.
Optionally, if the shooting direction variable and the last effective shooting direction variable are preset at the shooting end, then when a video frame is collected, the value of the shooting direction variable and the value of the last effective shooting direction variable may be obtained. When the two values differ, it is determined that the shooting direction of the video frame is different from that of the previous video frame, and a second direction frame is generated.
Specifically, the manner of generating the second directional frame is the same as the manner of generating the first directional frame, and may include the following two implementations:
the first implementation mode comprises the following steps: and generating a video pseudo frame carrying shooting direction information of the video frame by using the extension type of the NAL, and taking the video pseudo frame as a second direction frame.
The second implementation mode comprises the following steps: generating an enhanced frame carrying the shooting direction information of the video frame by utilizing the extension mode provided by the SEI, and taking the enhanced frame as the second direction frame.
In addition, if the shooting direction of the video frame is the same as the shooting direction of the previous video frame, no direction frame needs to be generated, and video frames continue to be collected in the normal manner. That is, the video frame is inserted after the previous video frame, and the next video frame continues to be collected.
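Steps 602 and 603 together amount to emitting a direction frame before the first frame and before every frame whose direction differs from its predecessor. A compact sketch, with hypothetical names and tuples standing in for real frames:

```python
def insert_direction_frames(frames):
    """frames: list of (image, direction) pairs in capture order.
    Returns the output sequence with a direction frame before the first
    video frame and before every frame whose direction changed."""
    out = []
    last_direction = None
    for image, direction in frames:
        if direction != last_direction:
            out.append(("dir", direction))   # first or second direction frame
            last_direction = direction
        out.append(("frame", image))
    return out

seq = insert_direction_frames(
    [("f1", "portrait"), ("f2", "portrait"), ("f3", "landscape"), ("f4", "landscape")]
)
```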
Step 604: the shooting end generates video data inserted with directional frames, wherein the directional frames comprise the first directional frames or comprise the first directional frames and the second directional frames.
Specifically, the shooting end generates video data of the insertion direction frame in the following two ways:
the first implementation mode comprises the following steps: the video frame inserted with the directional frame is generated into a video, and the generated video can be stored in a preset storage space of a shooting end, uploaded to a network, or sent to other terminals.
It should be noted that the first implementation manner may be applied to a scene such as video recording.
The second implementation mode comprises the following steps: and converting the video frames into which the directional frames are inserted into video streams, and sending the video streams to a streaming media server so as to forward the video streams to a playing end through the streaming media server.
It should be noted that the second implementation manner can be applied to scenes such as live broadcast or video call.
Step 605: the method comprises the steps that a playing end obtains video data, the video data comprise a plurality of video frames and at least one direction frame, and each direction frame carries shooting direction information of a next video frame of the direction frame.
The video data may be video data pre-stored by a playing terminal or video data downloaded from a network, or may be video data sent by a shooting terminal.
Taking a live broadcast or video call scene as an example, the playing end may receive a video stream from the streaming media server, where the video stream includes video coding frames and direction frames. When the playing end receives the video coding frame, the video coding frame can be decoded to obtain the video frame. For example, the playing terminal may start a network receiving and decoding thread, receive the video stream from the streaming media server through the network receiving and decoding thread, and decode a video coding frame in the video stream to obtain a video frame.
Further, the video data may also include audio frames. Taking a live broadcast or video call scene as an example, the playing end may receive a video stream from the streaming media server, where the video stream includes video encoding frames, direction frames, and audio encoding frames. The playing end can decode the video coding frame to obtain the video frame when receiving the video coding frame, and can decode the audio coding frame to obtain the audio frame when receiving the audio coding frame. For example, the playing end may start a network receiving and decoding thread, receive the video stream from the streaming media server through the network receiving and decoding thread, and decode the video encoded frame and the audio encoded frame in the video stream to obtain the video frame and the audio frame.
Step 606: and the playing end detects the screen direction of the playing end in the process of playing the video data, wherein the screen direction of the playing end comprises a transverse screen direction or a vertical screen direction.
That is, the playing end can play the video data and detect the screen direction of the playing end at the same time. Specifically, the playing end may call a screen direction monitoring interface, and obtain the screen direction of the playing end through the screen direction monitoring interface. By way of example, the screen orientation monitoring interface may be a screen orientation event monitoring function.
In one embodiment, the playing end may pre-define a screen direction variable, which is used to indicate the real-time screen direction. Before the playing end starts playing, the screen direction variable can be initialized, and the initialized screen direction variable is then modified whenever a change of the screen direction of the playing end is detected. For example, before playing starts, the screen direction variable is initialized to the vertical screen direction, and when the screen direction is detected to be the horizontal screen direction, the value of the screen direction variable is modified to the horizontal screen direction. For example, the screen direction variable is initialized to the default value true, and when the screen direction is detected to be the horizontal screen direction, the value of the screen direction variable is modified to false, where true indicates the vertical screen direction and false indicates the horizontal screen direction.
Further, a last effective screen direction variable may be predefined, indicating the last effective screen direction of the real-time screen direction, that is, the most recently detected effective screen direction at the current time. Before the playing end starts playing, the last effective screen direction variable can be initialized, and it is then changed according to the value of the screen direction variable. For example, the value of the last effective screen direction variable may be initialized to the initialized value of the screen direction variable, and then changed according to the value of the screen direction variable whenever the screen direction variable changes.
Step 607: if any direction frame is obtained from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is different from the screen direction of the playing end, for each target video frame in at least one target video frame, the playing end reduces the size of the target video frame according to the screen direction of the playing end and the size of the video displayable area, and plays the reduced target video frame, wherein the at least one target video frame is the video frames between the direction frame and the next direction frame.
In the process of playing video data, if any direction frame is obtained from the video data, the shooting direction information carried by the direction frame and the screen direction of the playing end can be obtained. If the shooting direction indicated by the shooting direction information differs from the screen direction of the playing end, then for any target video frame after the direction frame, the size of the target video frame can be reduced according to the screen direction of the playing end and the size of the video displayable area, and the reduced target video frame is displayed in the video displayable area.
When the target video frame is reduced and displayed in the video displayable area, the picture direction of the target video frame is not rotated, so the user sees an upright picture simply because the frame is reduced. For example, assume the shooting direction indicated by the direction frame is the horizontal screen direction, that is, the at least one target video frame between this direction frame and the next direction frame was shot in the horizontal screen direction, while the screen direction of the playing end is the vertical screen direction. In the related art, displaying such a target video frame on a vertical screen requires rotating it by 90 degrees and then filling the vertical display screen with it; in this way the picture direction of the target video frame also rotates, for example an originally upright portrait would end up lying horizontally, and a user who wants to see an upright picture would then have to rotate the playing end, which affects the viewing experience. In the embodiment of the invention, the target video frame in the horizontal screen direction can instead be reduced and directly displayed in the video displayable area of the display screen, for example displayed in the center of the video displayable area. The picture direction of the target video frame is therefore not rotated, the user does not need to manually rotate the playing end, and the viewing experience is improved.
Specifically, reducing the size of the target video frame according to the screen direction of the playing end and the size of the video displayable area includes: determining the reduction ratio of the target video frame, and reducing the target video frame according to the reduction ratio. The reduction ratio is the ratio of a first side length to a second side length, wherein the first side length is the minimum of the height and the width of the video displayable area of the playing end, and the second side length is the maximum of the height and the width of the target video frame.
Specifically, determining the reduction ratio of the target video frame includes the following two cases:
in the first case: and when the shooting direction indicated by the shooting direction information carried by the direction frame is the horizontal screen direction and the screen direction of the playing end is the vertical screen direction, determining the proportion between the width of the video displayable area of the playing end and the width of the target video frame as the reduction proportion.
For example, as shown in fig. 7, if the shooting direction of the target video frame A is the horizontal screen direction, the screen direction of the playing end is the vertical screen direction, the width of the target video frame A is W, and the width of the video displayable area of the playing end is w, then the reduction ratio is S = w/W. The target video frame is reduced according to this ratio to obtain a reduced target video frame B, which can be directly displayed in the video displayable area of the playing end.
In the second case: and when the shooting direction indicated by the shooting direction information carried by the direction frame is the vertical screen direction and the screen direction of the playing end is the horizontal screen direction, determining the proportion between the height of the video displayable area of the playing end and the height of the target video frame as the reduction proportion.
For example, as shown in fig. 8, if the shooting direction of the target video frame A is the vertical screen direction, the screen direction of the playing end is the horizontal screen direction, the height of the target video frame A is H, and the height of the video displayable area of the playing end is h, then the reduction ratio is S = h/H. The target video frame is reduced according to this ratio to obtain a reduced target video frame B, which can be directly displayed in the video displayable area of the playing end.
Specifically, performing the reduction processing on the target video frame according to the reduction ratio means performing the equal-scale reduction on the height and the width of the target video frame according to the reduction ratio.
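The two cases and the equal-proportion scaling can be sketched numerically; the 1080x1920 portrait display and 1920x1080 landscape frame sizes below are illustrative.

```python
def reduction_ratio(shoot_dir, screen_dir, display_w, display_h, frame_w, frame_h):
    """Reduction ratio per the two cases above: display width over frame width
    when a landscape frame is shown on a portrait screen, display height over
    frame height in the opposite case; None when the directions already match."""
    if shoot_dir == "landscape" and screen_dir == "portrait":
        return display_w / frame_w
    if shoot_dir == "portrait" and screen_dir == "landscape":
        return display_h / frame_h
    return None

# A 1920x1080 landscape frame on a 1080x1920 portrait displayable area.
s = reduction_ratio("landscape", "portrait", 1080, 1920, 1920, 1080)
scaled = (1920 * s, 1080 * s)   # width and height reduced in equal proportion
```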
Further, playing the scaled down target video frame may further include: and according to the size of the video displayable area of the playing end, performing background filling on the reduced target video frame to obtain a filled target video frame, wherein the size of the filled target video frame is the same as that of the video displayable area of the playing end, and then displaying the filled target video frame in the video displayable area of the playing end.
For example, as shown in fig. 7, when the shooting direction of the target video frame is a horizontal screen direction and the screen direction of the playing end is a vertical screen direction, background filling may be performed on the upper and lower sides of the reduced target video frame, respectively, and then the target video frame with the filled background is rendered into the video displayable area of the playing end.
For another example, as shown in fig. 8, when the shooting direction of the target video frame is a vertical screen direction and the screen direction of the playing end is a horizontal screen direction, background filling may be performed on the left and right of the reduced target video frame, and then the target video frame with the filled background is rendered into the video displayable area of the playing end.
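Centering the reduced frame determines the size of the background bars on each side. A sketch with illustrative numbers (a 1920x1080 frame reduced to 1080x607.5 on a 1080x1920 portrait displayable area):

```python
def background_padding(display_w, display_h, scaled_w, scaled_h):
    """Symmetric background filling that centers the reduced frame:
    top/bottom bars for a landscape frame on a portrait screen,
    left/right bars for a portrait frame on a landscape screen."""
    pad_x = (display_w - scaled_w) / 2   # left and right bars
    pad_y = (display_h - scaled_h) / 2   # top and bottom bars
    return pad_x, pad_y

pad_x, pad_y = background_padding(1080, 1920, 1080, 607.5)
```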
In addition, if any direction frame is acquired from the video data, and the shooting direction indicated by the shooting direction information carried by the direction frame is the same as the screen direction of the playing end, each target video frame in the at least one target video frame is directly played.
For example, if the shooting directions of the video frames in the video data are both the horizontal screen direction, and the screen direction of the playing end is also the horizontal screen direction, after the playing end acquires the direction frame before the first video frame in the video data, it can be determined that the shooting direction of the video frame after the direction frame is the same as the screen direction of the playing end, and the video frame of the video data is directly played in the horizontal screen mode.
Further, when the video data further includes an audio frame, the video frame and the audio frame can be played simultaneously.
In the embodiment of the invention, while the shooting end collects video frames, a first direction frame is inserted before the first collected video frame to indicate its shooting direction. For each subsequently collected video frame, if its shooting direction differs from that of the previous video frame, a second direction frame is inserted before it to indicate the new shooting direction, and video data containing the inserted direction frames is then generated. When the playing end plays this video data and obtains a direction frame whose indicated shooting direction differs from the screen direction of the playing end, the video frames between that direction frame and the next direction frame are reduced according to the screen direction of the playing end and the size of the video displayable area, and the reduced video frames are displayed in that area. This avoids rotating the video picture, so the user always sees an upright picture. In this way, when the shooting direction of a video frame differs from the screen direction of the playing end, the playing picture is adjusted automatically, the user does not need to rotate the playing end manually, and the video playing effect is improved.
In order to make the technical solution provided by the embodiment of the present invention easy to follow, the following schematically describes the technical processes of the video processing method as executed by the anchor end and the audience end in a live scene:
Anchor end
1.1, when the anchor starts broadcasting, the anchor end registers a screen direction event monitoring function, initializes a shooting direction variable ZhuboSDIsVertical to the default value true, initializes a last valid shooting direction variable ZhuboLastValidSDirection to the value of ZhuboSDIsVertical, registers an audio and video collection callback function, connects to the streaming media server, and starts a network sending thread. Here true indicates the vertical screen direction.
1.2, the anchor end judges whether its current shooting direction is the horizontal or the vertical screen direction using the screen direction event monitoring function, recording ZhuboSDIsVertical as false if it is the horizontal screen direction, or as true if it is the vertical screen direction.
1.3, in the audio and video collection callback function, assign a timestamp to each collected frame and judge its type; execute 1.3.1 if it is a video frame, or 1.3.2 if it is an audio frame.
1.3.1, for the first collected video frame, obtain the value of ZhuboSDIsVertical, generate a first direction frame whose data value is the value of ZhuboSDIsVertical, and place it in the video frame queue to be sent; then encode the first video frame to obtain a first video coding frame and place it in the same queue. For any video frame after the first, judge whether the current value of ZhuboLastValidSDirection equals ZhuboSDIsVertical; if not, generate a second direction frame whose data value is the value of ZhuboSDIsVertical, place it in the video frame queue to be sent, and assign the value of ZhuboSDIsVertical to ZhuboLastValidSDirection; then encode the video frame to obtain a video coding frame and place it in the video frame queue to be sent.
1.3.2, for each audio frame, encode it to obtain an audio coding frame and place it in the audio frame queue to be sent.
1.4, the network sending thread continuously takes queue frames out of the audio frame queue to be sent or the video frame queue to be sent, encapsulates them according to the streaming media protocol, and sends them to the streaming media server.
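Steps 1.3 through 1.3.1 reduce to one rule: emit a direction frame whenever the current shooting direction differs from the last valid one, which also covers the first frame. A hedged Python sketch of that queueing logic, with tagged tuples standing in for real encoded frames (the tags and helper name are illustrative, not from the embodiment):

```python
def build_send_queue(frames):
    """frames: iterable of (shooting_direction, payload) capture events.
    Returns the to-send video queue with a direction frame inserted
    before the first frame and before every direction change."""
    queue = []
    last_valid = None                     # ZhuboLastValidSDirection analogue
    for direction, payload in frames:
        if direction != last_valid:       # first frame or direction changed
            queue.append(("DIR", direction))  # direction frame carries the info
            last_valid = direction
        queue.append(("VID", payload))    # encoded video frame follows it
    return queue

# "V" = vertical screen, "H" = horizontal screen shooting direction
q = build_send_queue([("V", 1), ("V", 2), ("H", 3), ("H", 4), ("V", 5)])
```

In this run, direction frames precede frames 1, 3, and 5 only; frames 2 and 4 are enqueued directly after their predecessors, matching the optional processing described later for unchanged directions.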
Audience end
2.1, when a viewer starts watching the live broadcast, the audience end registers a screen direction event monitoring function, initializes a screen direction variable GuanzhongSDIsVertical to the default value true, initializes a last valid shooting direction variable ZhuboLastValidSDirection1 to the default value true, initializes the timestamp variable LastOutputTimestamp of the last output audio to 0, connects to the streaming media server, starts a network receiving and decoding thread, starts audio output and registers an audio callback function, and starts a video rendering thread. Here true indicates the vertical screen direction.
2.2, the audience end judges whether its current screen direction is the horizontal or the vertical screen direction using the screen direction event monitoring function, recording GuanzhongSDIsVertical as false if it is the horizontal screen direction, or as true otherwise.
2.3, in the network receiving and decoding thread, receive and decode the audio and video coding frames by the following steps:
2.3.1, receive an audio or video coding frame; if an audio coding frame is received, execute 2.3.2, otherwise execute 2.3.3.
2.3.2, decode the received audio coding frame to obtain an audio frame, place the audio frame in the audio output queue, and go to 2.3.1.
2.3.3, for other frames, judge whether the frame type is the direction frame type. If so, extract the shooting direction information V1 carried by the direction frame and generate a video record whose type is the video shooting direction type and whose data is V1; otherwise, decode the frame to obtain a video frame and generate a video record whose type is the renderable video frame type and whose data is the video data V2. Finally, place the video record in the video output queue and go to 2.3.1.
2.4, in the audio callback function, obtain an audio frame from the audio output queue, record the value of LastOutputTimestamp as the timestamp of that audio frame, and output the audio frame through the audio playing device.
2.5, in the video rendering thread, render and output the video by the following steps:
2.5.1, get a video record from the video output queue and extract its type T1 and data D1; execute 2.5.2 if T1 is the video shooting direction type, or 2.5.3 if T1 is the renderable video frame type.
2.5.2, record the value of ZhuboLastValidSDirection1 as D1 and go to 2.5.1.
2.5.3, obtain the timestamp S1 of the video frame from D1; if S1 is greater than LastOutputTimestamp, sleep for a preset duration and perform 2.5.1, otherwise perform 2.5.4.
2.5.4, judge whether the values of ZhuboLastValidSDirection1 and GuanzhongSDIsVertical are equal; if so, obtain the video image data from D1, render it to the video display area, and execute 2.5.1; otherwise execute 2.5.5.
2.5.5, if ZhuboLastValidSDirection1 is false (the anchor end is shooting in the horizontal screen direction) and GuanzhongSDIsVertical is true (the audience end is in the vertical screen direction), calculate the reduction ratio as the width of the video displayable area divided by the width of the video frame, reduce the video frame in both the width and height directions by that ratio, fill the background color above and below, generate new video image data, and render it to the video display area. Otherwise, if ZhuboLastValidSDirection1 is true (the anchor end is shooting in the vertical screen direction) and GuanzhongSDIsVertical is false (the audience end is in the horizontal screen direction), calculate the reduction ratio as the height of the video displayable area divided by the height of the video frame, reduce the video frame in both the width and height directions by that ratio, fill the background color on the left and right, generate new video image data, and render it to the video display area. Then execute 2.5.1.
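The rendering-thread decisions in steps 2.5.3 through 2.5.5 come down to two small checks: an audio-clock timing gate and an orientation comparison that selects the reduction ratio and the padded sides. A Python sketch under the text's flag convention (true means vertical screen); the function and parameter names are illustrative:

```python
def should_render(video_ts, last_audio_ts):
    """Step 2.5.3: the audio clock drives playback. A video frame is
    held back (the render thread sleeps) while its timestamp is still
    ahead of the timestamp of the last output audio frame."""
    return video_ts <= last_audio_ts

def render_decision(anchor_is_vertical, viewer_is_vertical,
                    frame_w, frame_h, disp_w, disp_h):
    """Steps 2.5.4-2.5.5: return None when the directions match
    (render as-is); otherwise return the reduction ratio and which
    sides receive the background fill."""
    if anchor_is_vertical == viewer_is_vertical:
        return None
    if not anchor_is_vertical:             # landscape frame, portrait screen
        return disp_w / frame_w, "top/bottom"
    return disp_h / frame_h, "left/right"  # portrait frame, landscape screen
```

For a 1920x1080 landscape frame on a 1080x1920 portrait display this yields a ratio of 0.5625 with top/bottom fill, matching the width-ratio branch of step 2.5.5.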
Fig. 9 is a block diagram of a video processing apparatus according to an embodiment of the present invention, and as shown in fig. 9, the apparatus includes:
a first collecting module 901, configured to collect a video frame in a shooting process;
a detecting module 902, configured to detect, in a shooting process, a shooting direction of the shooting end, where the shooting direction includes a horizontal screen direction or a vertical screen direction;
a first generating module 903, configured to generate, for a first captured video frame, a first direction frame according to a shooting direction of the first video frame, and insert the first direction frame before the first video frame, where the first direction frame carries shooting direction information of the first video frame;
a second generating module 904, configured to generate, for each video frame after the acquired first video frame, a second direction frame if a shooting direction of the video frame is different from a shooting direction of a previous video frame of the video frame, and insert the second direction frame before the video frame, where the second direction frame carries shooting direction information of the video frame;
a third generating module 905, configured to generate video data into which directional frames are inserted, where the directional frames include the first directional frame or include the first directional frame and the second directional frame.
Optionally, the detecting module 902 is configured to:
calling a screen direction monitoring interface of the shooting end, and acquiring the screen direction of the shooting end through the screen direction monitoring interface;
and determining the acquired screen direction as the shooting direction of the shooting end.
Optionally, the first generating module 903 is configured to:
generating a video pseudo frame carrying shooting direction information of the first video frame by utilizing an extension type of a Network Abstraction Layer (NAL) according to the shooting direction of the first video frame, and taking the video pseudo frame as the first direction frame;
alternatively,
and generating an enhanced frame carrying the shooting direction information of the first video frame by utilizing an extension mode provided by supplemental enhancement information SEI according to the shooting direction of the first video frame, and taking the enhanced frame as the first direction frame.
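One concrete way to realize the SEI-based variant is an H.264 SEI NAL unit of payload type 5 (user_data_unregistered) carrying a one-byte direction flag. The sketch below is an assumption-laden illustration, not the embodiment's actual wire format: the 16-byte UUID and the flag byte are invented for the example, and Annex B start codes and emulation-prevention bytes are omitted for brevity.

```python
def direction_sei_nal(is_vertical: bool) -> bytes:
    """Build a minimal H.264 SEI NAL unit (nal_unit_type 6) whose
    single SEI message is user_data_unregistered (payload type 5):
    a 16-byte UUID followed by one byte of shooting-direction info.
    UUID and flag encoding are hypothetical application choices."""
    uuid = b"\x00" * 15 + b"\x01"            # hypothetical app-specific UUID
    payload = uuid + (b"\x01" if is_vertical else b"\x00")
    nal = bytes([0x06])                      # forbidden_zero=0, ref_idc=0, type=6 (SEI)
    nal += bytes([0x05])                     # sei payload_type 5: user_data_unregistered
    nal += bytes([len(payload)])             # sei payload_size (fits in one byte here)
    nal += payload
    nal += bytes([0x80])                     # rbsp_stop_one_bit + alignment
    return nal
```

Because a decoder that does not recognize the UUID must skip the message, such a pseudo frame rides inside the video elementary stream without disturbing decoding, which is the point of using an SEI extension for the direction frame.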
Optionally, the apparatus further comprises:
the first enqueuing module is used for sequentially placing the first direction frame and a first video coding frame in a video frame queue to be sent, wherein the first video coding frame is obtained by coding the first video frame;
the third generating module 905 is configured to sequentially take out video queue frames from the video frame queue to be sent, where the video queue frames include video coding frames and direction frames; and packaging the taken out video queue frame to obtain a video stream.
Optionally, the apparatus further comprises:
the second acquisition module is used for acquiring audio frames in the shooting process;
the second enqueue module is used for coding the audio frames for each collected audio frame to obtain audio coding frames, and placing the audio coding frames in an audio frame queue to be sent;
the second dequeuing module is used for sequentially taking out the audio queue frames from the audio frame queue;
the third generating module 905 is configured to perform encapsulation processing on the extracted video queue frame and audio queue frame to obtain a video stream.
Optionally, the apparatus further comprises:
and the processing module is used for, if the shooting direction of a video frame is the same as that of the previous video frame, placing the video frame directly after the previous video frame without inserting a direction frame.
In the embodiment of the invention, while the shooting end collects video frames, a first direction frame is inserted before the first collected video frame to indicate its shooting direction. For each subsequently collected video frame, if its shooting direction differs from that of the previous video frame, a second direction frame is inserted before it to indicate the new shooting direction, and video data containing the inserted direction frames is then generated. When the playing end plays this video data, the shooting direction of the video frames following each direction frame can be determined from that direction frame; when the shooting direction indicated by any direction frame differs from the screen direction of the playing end, the video frames between that direction frame and the next direction frame are reduced according to the screen direction of the playing end and the size of the video displayable area, and the reduced video frames are displayed in that area. This avoids rotating the video picture, so the user always sees an upright picture. In this way, when the shooting direction of a video frame differs from the screen direction of the playing end, the playing picture is adjusted automatically, the user does not need to rotate the playing end manually, and the video playing effect is improved.
Fig. 10 is a block diagram of a video processing apparatus according to an embodiment of the present invention, and as shown in fig. 10, the apparatus includes:
an obtaining module 1001, configured to obtain video data, where the video data includes a plurality of video frames and at least one directional frame, and each directional frame carries shooting direction information of a video frame next to the directional frame;
the detecting module 1002 is configured to detect a screen direction of the playing end in a process of playing the video data, where the screen direction of the playing end includes a horizontal screen direction or a vertical screen direction;
a processing module 1003, configured to, if any direction frame is obtained from the video data and the shooting direction indicated by the shooting direction information carried in that direction frame is different from the screen direction of the playing end, reduce the size of each target video frame in at least one target video frame according to the screen direction of the playing end and the size of the video displayable area, and play the reduced target video frame, where the at least one target video frame comprises the video frames between the direction frame and the next direction frame.
Optionally, the processing module 1003 includes:
a determining unit, configured to determine a reduction ratio of the target video frame, where the reduction ratio is a ratio of a first side length to a second side length, the first side length is a minimum value of a height and a width of a video displayable region of the playing end, and the second side length is a maximum value of the height and the width of the target video frame;
and the reducing module is used for reducing the target video frame according to the reducing proportion.
Optionally, the determining unit is configured to:
when the shooting direction indicated by the shooting direction information carried by the direction frame is a horizontal screen direction and the screen direction of the playing end is a vertical screen direction, determining the ratio between the width of the video displayable area of the playing end and the width of the target video frame as the reduction ratio;
and when the shooting direction indicated by the shooting direction information carried by the direction frame is a vertical screen direction and the screen direction of the playing end is a horizontal screen direction, determining the ratio between the height of the video displayable area of the playing end and the height of the target video frame as the reduction ratio.
Optionally, the processing module 1003 includes:
a filling unit, configured to perform background filling on the reduced target video frame according to the size of the video displayable area of the playing end to obtain a filled target video frame, where the size of the filled target video frame is the same as the size of the video displayable area of the playing end;
and the display unit is used for displaying the filled target video frame in a video displayable area of the playing end.
Optionally, the apparatus further comprises:
and the playing module is used for playing each target video frame in the at least one target video frame if any direction frame is obtained from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is the same as the screen direction of the playing end.
In the embodiment of the invention, when the playing end, while playing video data that includes direction frames, obtains a direction frame whose indicated shooting direction differs from the screen direction of the playing end, the video frames between that direction frame and the next direction frame are reduced according to the screen direction of the playing end and the size of the video displayable area, and the reduced video frames are displayed in that area. This avoids rotating the video picture, so the user always sees an upright picture. In this way, when the shooting direction of a video frame differs from the screen direction of the playing end, the playing picture is adjusted automatically, the user does not need to rotate the playing end manually, and the video playing effect is improved.
It should be noted that: in the video processing apparatus provided in the above embodiment, when performing video processing, only the division of the above functional modules is taken as an example, and in practical applications, the above functions may be distributed by different functional modules according to needs, that is, the internal structure of the apparatus is divided into different functional modules to complete all or part of the above described functions. In addition, the video processing apparatus and the video processing method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
Fig. 11 is a block diagram of a terminal 1100 according to an embodiment of the present invention. The terminal 1100 may be: a smart phone, a tablet computer, an MP3 player (Moving Picture Experts Group Audio Layer III), an MP4 player (Moving Picture Experts Group Audio Layer IV), a notebook computer, or a desktop computer. Terminal 1100 may also be referred to by other names such as user equipment, portable terminal, laptop terminal, desktop terminal, and so forth.
In general, terminal 1100 includes: a processor 1101 and a memory 1102.
Processor 1101 may include one or more processing cores, such as a 4-core processor, an 8-core processor, or the like. The processor 1101 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 1101 may also include a main processor and a coprocessor, where the main processor is a processor for Processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low power processor for processing data in a standby state. In some embodiments, the processor 1101 may be integrated with a GPU (Graphics Processing Unit) that is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1101 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 1102 may include one or more computer-readable storage media, which may be non-transitory. Memory 1102 can also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 1102 is used to store at least one instruction for execution by processor 1101 to implement the video processing method provided by the method embodiments herein.
In some embodiments, the terminal 1100 may further include: a peripheral interface 1103 and at least one peripheral. The processor 1101, memory 1102 and peripheral interface 1103 may be connected by a bus or signal lines. Various peripheral devices may be connected to the peripheral interface 1103 by buses, signal lines, or circuit boards. Specifically, the peripheral device includes: at least one of radio frequency circuitry 1104, touch display screen 1105, camera 1106, audio circuitry 1107, positioning component 1108, and power supply 1109.
The peripheral interface 1103 may be used to connect at least one peripheral associated with I/O (Input/Output) to the processor 1101 and the memory 1102. In some embodiments, the processor 1101, memory 1102, and peripheral interface 1103 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1101, the memory 1102 and the peripheral device interface 1103 may be implemented on separate chips or circuit boards, which is not limited by this embodiment.
The Radio Frequency circuit 1104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1104 communicates with communication networks and other communication devices via electromagnetic signals. The radio frequency circuit 1104 converts an electric signal into an electromagnetic signal to transmit, or converts a received electromagnetic signal into an electric signal. Optionally, the radio frequency circuit 1104 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 1104 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocols include, but are not limited to: metropolitan area networks, various generation mobile communication networks (2G, 3G, 4G, and 5G), Wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the rf circuit 1104 may further include NFC (Near Field Communication) related circuits, which are not limited in this application.
The display screen 1105 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 1105 is a touch display screen, the display screen 1105 also has the ability to capture touch signals on or over the surface of the display screen 1105. The touch signal may be input to the processor 1101 as a control signal for processing. At this point, the display screen 1105 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, display 1105 may be one, providing the front panel of terminal 1100; in other embodiments, the display screens 1105 can be at least two, respectively disposed on different surfaces of the terminal 1100 or in a folded design; in still other embodiments, display 1105 can be a flexible display disposed on a curved surface or on a folded surface of terminal 1100. Even further, the display screen 1105 may be arranged in a non-rectangular irregular pattern, i.e., a shaped screen. The Display screen 1105 may be made of LCD (Liquid Crystal Display), OLED (Organic Light-Emitting Diode), and the like.
Camera assembly 1106 is used to capture images or video. Optionally, camera assembly 1106 includes a front camera and a rear camera. Generally, a front camera is disposed at a front panel of the terminal, and a rear camera is disposed at a rear surface of the terminal. In some embodiments, the number of the rear cameras is at least two, and each rear camera is any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions or other fusion shooting functions. In some embodiments, camera assembly 1106 may also include a flash. The flash lamp can be a monochrome temperature flash lamp or a bicolor temperature flash lamp. The double-color-temperature flash lamp is a combination of a warm-light flash lamp and a cold-light flash lamp, and can be used for light compensation at different color temperatures.
The audio circuitry 1107 may include a microphone and a speaker. The microphone is used for collecting sound waves of a user and the environment, converting the sound waves into electric signals, and inputting the electric signals to the processor 1101 for processing or inputting the electric signals to the radio frequency circuit 1104 to achieve voice communication. For stereo capture or noise reduction purposes, multiple microphones may be provided, each at a different location of terminal 1100. The microphone may also be an array microphone or an omni-directional pick-up microphone. The speaker is used to convert electrical signals from the processor 1101 or the radio frequency circuit 1104 into sound waves. The loudspeaker can be a traditional film loudspeaker or a piezoelectric ceramic loudspeaker. When the speaker is a piezoelectric ceramic speaker, the speaker can be used for purposes such as converting an electric signal into a sound wave audible to a human being, or converting an electric signal into a sound wave inaudible to a human being to measure a distance. In some embodiments, the audio circuitry 1107 may also include a headphone jack.
The positioning component 1108 is used to locate the current geographic position of the terminal 1100 for navigation or LBS (Location Based Service). The positioning component 1108 may be a positioning component based on the United States GPS (Global Positioning System), the Chinese Beidou system, the Russian GLONASS system, or the European Union Galileo system.
Power supply 1109 is configured to provide power to various components within terminal 1100. The power supply 1109 may be alternating current, direct current, disposable or rechargeable. When the power supply 1109 includes a rechargeable battery, the rechargeable battery may support wired or wireless charging. The rechargeable battery may also be used to support fast charge technology.
In some embodiments, terminal 1100 can also include one or more sensors 1110. The one or more sensors 1110 include, but are not limited to: acceleration sensor 1111, gyro sensor 1112, pressure sensor 1113, fingerprint sensor 1114, optical sensor 1115, and proximity sensor 1116.
Acceleration sensor 1111 may detect acceleration levels in three coordinate axes of a coordinate system established with terminal 1100. For example, the acceleration sensor 1111 may be configured to detect components of the gravitational acceleration in three coordinate axes. The processor 1101 may control the touch display screen 1105 to display a user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1111. The acceleration sensor 1111 may also be used for acquisition of motion data of a game or a user.
The gyro sensor 1112 may detect a body direction and a rotation angle of the terminal 1100, and the gyro sensor 1112 may cooperate with the acceleration sensor 1111 to acquire a 3D motion of the user with respect to the terminal 1100. From the data collected by gyroscope sensor 1112, processor 1101 may implement the following functions: motion sensing (such as changing the UI according to a user's tilting operation), image stabilization at the time of photographing, game control, and inertial navigation.
Pressure sensor 1113 may be disposed on a side bezel of terminal 1100 and/or on an underlying layer of touch display screen 1105. When the pressure sensor 1113 is disposed on the side frame of the terminal 1100, the holding signal of the terminal 1100 from the user can be detected, and the processor 1101 performs left-right hand recognition or shortcut operation according to the holding signal collected by the pressure sensor 1113. When the pressure sensor 1113 is disposed at the lower layer of the touch display screen 1105, the processor 1101 controls the operability control on the UI interface according to the pressure operation of the user on the touch display screen 1105. The operability control comprises at least one of a button control, a scroll bar control, an icon control and a menu control.
The fingerprint sensor 1114 is configured to collect a fingerprint of the user, and the processor 1101 identifies the user according to the fingerprint collected by the fingerprint sensor 1114, or the fingerprint sensor 1114 identifies the user according to the collected fingerprint. Upon recognizing that the user's identity is a trusted identity, the user is authorized by the processor 1101 to perform relevant sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying for and changing settings, etc. Fingerprint sensor 1114 may be disposed on the front, back, or side of terminal 1100. When a physical button or vendor Logo is provided on the terminal 1100, the fingerprint sensor 1114 may be integrated with the physical button or vendor Logo.
Optical sensor 1115 is used to collect ambient light intensity. In one embodiment, the processor 1101 may control the display brightness of the touch display screen 1105 based on the ambient light intensity collected by the optical sensor 1115. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1105 is increased; when the ambient light intensity is low, the display brightness of the touch display screen 1105 is turned down. In another embodiment, processor 1101 may also dynamically adjust the shooting parameters of camera assembly 1106 based on the ambient light intensity collected by optical sensor 1115.
Proximity sensor 1116, also referred to as a distance sensor, is typically disposed on a front panel of terminal 1100. Proximity sensor 1116 is used to capture the distance between the user and the front face of terminal 1100. In one embodiment, the touch display screen 1105 is controlled by the processor 1101 to switch from a bright-screen state to a dark-screen state when the proximity sensor 1116 detects that the distance between the user and the front face of the terminal 1100 is gradually decreasing; when the proximity sensor 1116 detects that the distance between the user and the front face of the terminal 1100 is gradually increasing, the touch display screen 1105 is controlled by the processor 1101 to switch from the dark-screen state back to the bright-screen state.
Those skilled in the art will appreciate that the configuration shown in fig. 11 does not constitute a limitation of terminal 1100, and may include more or fewer components than those shown, or may combine certain components, or may employ a different arrangement of components.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description covers only preferred embodiments of the present invention and is not intended to limit the invention; any modifications, equivalent replacements, improvements, and the like made within the spirit and principle of the present invention are intended to fall within its scope of protection.

Claims (24)

1. A video processing method, applied to a shooting end, the method comprising: in the shooting process, collecting video frames and detecting the shooting direction of the shooting end, wherein the shooting direction comprises a transverse screen (landscape) direction or a vertical screen (portrait) direction;
for a first collected video frame, generating a first direction frame according to the shooting direction of the first video frame, and inserting the first direction frame in front of the first video frame, wherein the first direction frame carries the shooting direction information of the first video frame;
for each video frame after the first video frame, acquiring a value of a shooting direction variable and a value of a last effective shooting direction variable, wherein the shooting direction variable and the last effective shooting direction variable are preset, the shooting direction variable is used for indicating a real-time shooting direction, and the last effective shooting direction variable is used for indicating a last effective shooting direction of the real-time shooting direction; when the value of the shooting direction variable is different from the value of the last effective shooting direction variable, determining that the shooting direction of the video frame is different from the shooting direction of the last video frame of the video frame, and generating a second direction frame; inserting the second direction frame before the video frame, wherein the second direction frame carries shooting direction information of the video frame;
generating video data inserted into directional frames, the directional frames including the first directional frame, or including the first directional frame and the second directional frame.
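The frame-insertion logic of claim 1 can be sketched as follows: a minimal Python sketch assuming frames arrive with a per-frame orientation reading (the names `insert_direction_frames`, `DIR`, and `VID` are illustrative, not from the patent).

```python
def insert_direction_frames(frames, directions):
    """Given captured frames and the shooting direction sampled at each frame,
    emit a stream with a direction frame before the first video frame and
    before every frame whose direction differs from the previous one."""
    out = []
    last_effective = None                  # the "last effective shooting direction" variable
    for frame, direction in zip(frames, directions):
        if direction != last_effective:    # first frame, or orientation changed
            out.append(("DIR", direction)) # direction frame carries the new direction
            last_effective = direction
        out.append(("VID", frame))
    return out
```

The `last_effective` variable plays the role of the "last effective shooting direction variable": a direction frame is emitted exactly when the real-time direction differs from it, which covers both the first video frame and every subsequent orientation change.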
2. The method of claim 1, wherein the detecting of the shooting direction of the shooting end comprises:
calling a screen direction monitoring interface of the shooting end, and acquiring the screen direction of the shooting end through the screen direction monitoring interface;
and determining the acquired screen direction as the shooting direction of the shooting end.
3. The method of claim 1, wherein said generating a first direction frame based on a shooting direction of said first video frame comprises:
generating a video pseudo frame carrying shooting direction information of the first video frame by utilizing an extension type of a Network Abstraction Layer (NAL) according to the shooting direction of the first video frame, and taking the video pseudo frame as the first direction frame;
alternatively,
and generating an enhanced frame carrying the shooting direction information of the first video frame by utilizing an extension mode provided by supplemental enhancement information SEI according to the shooting direction of the first video frame, and taking the enhanced frame as the first direction frame.
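The SEI alternative in claim 3 maps naturally onto an H.264 user_data_unregistered SEI message (payload type 5). The sketch below builds such a NAL unit carrying one orientation byte; it is a simplified illustration that omits the emulation-prevention bytes a conforming Annex B encoder must insert, and the UUID and orientation encoding are assumptions, not part of the patent.

```python
def make_direction_sei(direction_byte, uuid16):
    """Build a simplified H.264 SEI NAL unit (Annex B framing) carrying one
    byte of shooting-direction info as user_data_unregistered (type 5).

    NOTE: emulation-prevention bytes are omitted for brevity; a conforming
    encoder must escape 0x000000/0x000001/0x000002 sequences in the RBSP.
    """
    assert len(uuid16) == 16
    payload = uuid16 + bytes([direction_byte])     # uuid_iso_iec_11578 + orientation flag
    sei_body = bytes([5, len(payload)]) + payload  # payload_type=5, payload_size
    rbsp = sei_body + b"\x80"                      # rbsp_trailing_bits
    nal = bytes([0x06]) + rbsp                     # nal_unit_type=6 (SEI)
    return b"\x00\x00\x00\x01" + nal               # Annex B start code
```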
4. The method of claim 1, wherein said inserting said first directional frame before said first video frame further comprises:
sequentially placing the first direction frame and a first video coding frame in a video frame queue to be sent, wherein the first video coding frame is obtained by coding the first video frame;
the generating of the video data inserted with the directional frame includes:
sequentially taking out video queue frames from the video frame queue to be sent, wherein the video queue frames comprise video coding frames and direction frames;
and packaging the taken out video queue frame to obtain a video stream.
5. The method of claim 4, wherein before encapsulating the fetched video queue frames to obtain the video stream, the method further comprises:
collecting audio frames in the shooting process;
for each collected audio frame, coding the audio frame to obtain an audio coding frame, and placing the audio coding frame in an audio frame queue to be sent;
sequentially taking out audio queue frames from the audio frame queue to be sent, wherein the audio queue frames comprise the audio coding frames;
the encapsulating the extracted video queue frame to obtain a video stream includes:
and packaging the taken out video queue frame and the audio queue frame to obtain the video stream.
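Claims 4-5 describe dequeuing from separate to-be-sent video and audio queues and encapsulating the result into one stream. A minimal interleaving sketch follows; the earliest-timestamp-first muxing policy shown is an assumption, since the claims do not prescribe one.

```python
from collections import deque

def mux(video_queue, audio_queue):
    """Interleave queued video frames (direction frames + encoded frames) and
    encoded audio frames by timestamp into one packet list (the video stream).

    Each queue entry is a (timestamp, packet) pair; queues are deques.
    """
    stream = []
    while video_queue or audio_queue:
        # pop whichever queue's head has the earlier timestamp (video wins ties)
        if not audio_queue or (video_queue and video_queue[0][0] <= audio_queue[0][0]):
            ts, pkt = video_queue.popleft()
            stream.append(("V", ts, pkt))
        else:
            ts, pkt = audio_queue.popleft()
            stream.append(("A", ts, pkt))
    return stream
```

Because a direction frame is enqueued immediately before the video frame it describes and shares its timestamp, this ordering keeps each direction frame adjacent to its video frame in the muxed stream.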
6. The method of claim 1, wherein the method further comprises:
and if the shooting direction of the video frame is the same as that of the last video frame of the video frame, inserting the video frame behind the last video frame.
7. A video processing method is applied to a playing end, and the method comprises the following steps:
acquiring video data generated by a shooting end; the video data includes a plurality of video frames and direction frames; each direction frame carries shooting direction information of the next video frame after that direction frame; the direction frames include a first direction frame, or include the first direction frame and a second direction frame; the first direction frame is generated by the shooting end according to a shooting direction of a first video frame; the second direction frame is generated by the shooting end according to a shooting direction of a video frame when the shooting end determines, according to an obtained value of a shooting direction variable and a value of a last effective shooting direction variable, that the shooting direction of the video frame is different from the shooting direction of the video frame preceding it; the shooting direction variable and the last effective shooting direction variable are preset; the shooting direction variable is used for indicating a real-time shooting direction, and the last effective shooting direction variable is used for indicating a last effective shooting direction of the real-time shooting direction;
detecting the screen direction of the playing end in the process of playing the video data, wherein the screen direction of the playing end comprises a transverse screen direction or a vertical screen direction; if any direction frame is obtained from the video data, and the shooting direction indicated by the shooting direction information carried by the direction frame is different from the screen direction of the playing end, for each target video frame in at least one target video frame, reducing the size of the target video frame according to the screen direction of the playing end and the size of a video displayable area, and playing the reduced target video frame, wherein the at least one target video frame is a video frame between the direction frame and a next video frame of the direction frame.
8. The method of claim 7, wherein the reducing the size of the target video frame according to the screen orientation of the playing end and the size of the video displayable area comprises:
determining a reduction ratio of the target video frame, wherein the reduction ratio is a ratio of a first side length to a second side length, the first side length is the minimum value of the height and the width of a video displayable area of the playing end, and the second side length is the maximum value of the height and the width of the target video frame;
and carrying out reduction processing on the target video frame according to the reduction proportion.
9. The method of claim 8, wherein said determining a downscaling of the target video frame comprises:
when the shooting direction indicated by the shooting direction information carried by the direction frame is a horizontal screen direction and the screen direction of the playing end is a vertical screen direction, determining the ratio between the width of the video displayable area of the playing end and the width of the target video frame as the reduction ratio;
and when the shooting direction indicated by the shooting direction information carried by the direction frame is a vertical screen direction and the screen direction of the playing end is a horizontal screen direction, determining the ratio between the height of the video displayable area of the playing end and the height of the target video frame as the reduction ratio.
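Claims 8-9 define the reduction ratio. Because mismatching orientations always pair the display's shorter side with the frame's longer side, the two cases of claim 9 collapse into one min/max formula, sketched here with illustrative names and assumed pixel sizes.

```python
def reduction_ratio(display_w, display_h, frame_w, frame_h):
    """Ratio that shrinks a frame whose orientation mismatches the screen:
    (min of display sides) / (max of frame sides), per claims 8-9."""
    first_side = min(display_w, display_h)   # e.g. displayable-area width in portrait
    second_side = max(frame_w, frame_h)      # e.g. frame width for a landscape frame
    return first_side / second_side

def scaled_size(display_w, display_h, frame_w, frame_h):
    """Size of the frame after applying the reduction ratio."""
    r = reduction_ratio(display_w, display_h, frame_w, frame_h)
    return round(frame_w * r), round(frame_h * r)
```

For a 1920x1080 landscape frame shown on a 1080x1920 portrait display, the ratio is 1080/1920 = 0.5625, shrinking the frame to 1080x608 so its width exactly fills the portrait screen, matching the width-to-width case of claim 9.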
10. The method of claim 7, wherein said playing out said scaled down target video frame comprises:
according to the size of the video displayable area of the playing end, performing background filling on the reduced target video frame to obtain the filled target video frame, wherein the size of the filled target video frame is the same as that of the video displayable area of the playing end;
and displaying the filled target video frame in a video displayable area of the playing end.
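Claim 10's background filling (letterboxing/pillarboxing) amounts to centering the reduced frame on a display-sized canvas. The helper below computes the fill margins; it is a sketch with assumed integer pixel coordinates, not the embodiment's implementation.

```python
def letterbox(display_w, display_h, frame_w, frame_h):
    """Center a reduced frame on a display-sized canvas and report the
    background margins to fill on each side (claim 10)."""
    x = (display_w - frame_w) // 2
    y = (display_h - frame_h) // 2
    return {"offset": (x, y),
            "margins": {"left": x, "right": display_w - frame_w - x,
                        "top": y, "bottom": display_h - frame_h - y}}
```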
11. The method of claim 7, wherein the method further comprises:
and if any direction frame is acquired from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is the same as the screen direction of the playing end, playing each target video frame in the at least one target video frame.
12. A video processing apparatus, applied to a shooting side, the apparatus comprising:
the first acquisition module is used for acquiring video frames in the shooting process;
the detection module is used for detecting the shooting direction of the shooting end in the shooting process, wherein the shooting direction comprises a transverse screen direction or a vertical screen direction;
the first generation module is used for generating a first direction frame according to the shooting direction of the first video frame for the collected first video frame, and inserting the first direction frame in front of the first video frame, wherein the first direction frame carries the shooting direction information of the first video frame;
a second generation module, configured to obtain, for each video frame subsequent to the first video frame, a value of a shooting direction variable and a value of a last effective shooting direction variable, where the shooting direction variable and the last effective shooting direction variable are preset, the shooting direction variable is used to indicate a real-time shooting direction, and the last effective shooting direction variable is used to indicate a last effective shooting direction of the real-time shooting direction; when the value of the shooting direction variable is different from the value of the last effective shooting direction variable, determining that the shooting direction of the video frame is different from the shooting direction of the last video frame of the video frame, and generating a second direction frame; inserting the second direction frame before the video frame, wherein the second direction frame carries shooting direction information of the video frame;
a third generating module, configured to generate video data into which directional frames are inserted, where the directional frames include the first directional frame, or include the first directional frame and the second directional frame.
13. The apparatus of claim 12, wherein the detection module is to:
calling a screen direction monitoring interface of the shooting end, and acquiring the screen direction of the shooting end through the screen direction monitoring interface;
and determining the acquired screen direction as the shooting direction of the shooting end.
14. The apparatus of claim 12, wherein the first generating module is to:
generating a video pseudo frame carrying shooting direction information of the first video frame by utilizing an extension type of a Network Abstraction Layer (NAL) according to the shooting direction of the first video frame, and taking the video pseudo frame as the first direction frame;
alternatively,
and generating an enhanced frame carrying the shooting direction information of the first video frame by utilizing an extension mode provided by supplemental enhancement information (SEI) according to the shooting direction of the first video frame, and taking the enhanced frame as the first direction frame.
15. The apparatus of claim 12, wherein the apparatus further comprises:
the first enqueuing module is used for sequentially placing the first direction frame and a first video coding frame in a video frame queue to be sent, wherein the first video coding frame is obtained by coding the first video frame;
the third generating module is configured to sequentially take out video queue frames from the video frame queue to be sent, where the video queue frames include video coding frames and direction frames; and packaging the taken out video queue frame to obtain a video stream.
16. The apparatus of claim 15, wherein the apparatus further comprises:
the second acquisition module is used for acquiring audio frames in the shooting process;
the second enqueue module is used for coding the audio frames for each collected audio frame to obtain audio coding frames, and placing the audio coding frames in an audio frame queue to be sent;
the second dequeuing module is used for sequentially taking out the audio queue frames from the audio frame queue;
and the third generating module is configured to perform encapsulation processing on the extracted video queue frame and the extracted audio queue frame to obtain the video stream.
17. The apparatus of claim 12, wherein the apparatus further comprises:
and the processing module is used for inserting the video frame behind the last video frame if the shooting direction of the video frame is the same as that of the last video frame of the video frame.
18. A video processing apparatus, applied to a playing side, the apparatus comprising:
the acquisition module is used for acquiring video data generated by a shooting end; the video data includes a plurality of video frames and a direction frame, the direction frame carries shooting direction information of a next video frame of the direction frame, the direction frame includes a first direction frame, or includes the first direction frame and a second direction frame, the first direction frame is generated by the shooting end according to a shooting direction of a first video frame, the second direction frame is generated by the shooting end according to a shooting direction of a last video frame of the video frames when the shooting end determines that the shooting direction of the video frame is different from the shooting direction of the last video frame of the video frames according to the obtained value of a shooting direction variable and the value of a last effective shooting direction variable, the shooting direction variable and the last effective shooting direction variable are preset, the shooting direction variable is used for indicating a real-time shooting direction, the last effective shooting direction variable is used for indicating the last effective shooting direction of the real-time shooting direction;
the detection module is used for detecting the screen direction of the playing end in the process of playing the video data, wherein the screen direction of the playing end comprises a transverse screen direction or a vertical screen direction;
and the processing module is used for reducing the size of each target video frame in at least one target video frame according to the screen direction of the playing end and the size of a video displayable area and playing the reduced target video frame if any direction frame is acquired from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is different from the screen direction of the playing end, wherein the at least one target video frame is a video frame between the direction frame and the next video frame of the direction frame.
19. The apparatus of claim 18, wherein the processing module comprises:
a determining unit, configured to determine a reduction ratio of the target video frame, where the reduction ratio is a ratio of a first side length to a second side length, the first side length is a minimum value of a height and a width of a video displayable region of the playing end, and the second side length is a maximum value of the height and the width of the target video frame;
and the reducing module is used for reducing the target video frame according to the reducing proportion.
20. The apparatus of claim 19, wherein the determination unit is to:
when the shooting direction indicated by the shooting direction information carried by the direction frame is a horizontal screen direction and the screen direction of the playing end is a vertical screen direction, determining the ratio between the width of the video displayable area of the playing end and the width of the target video frame as the reduction ratio;
and when the shooting direction indicated by the shooting direction information carried by the direction frame is a vertical screen direction and the screen direction of the playing end is a horizontal screen direction, determining the ratio between the height of the video displayable area of the playing end and the height of the target video frame as the reduction ratio.
21. The apparatus of claim 18, wherein the processing module comprises:
a filling unit, configured to perform background filling on the reduced target video frame according to the size of the video displayable area of the playing end to obtain a filled target video frame, where the size of the filled target video frame is the same as the size of the video displayable area of the playing end;
and the display unit is used for displaying the filled target video frame in a video displayable area of the playing end.
22. The apparatus of claim 18, wherein the apparatus further comprises:
and the playing module is used for playing each target video frame in the at least one target video frame if any direction frame is obtained from the video data and the shooting direction indicated by the shooting direction information carried by the direction frame is the same as the screen direction of the playing end.
23. A video processing apparatus, characterized in that the apparatus comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the steps of any of the methods of claims 1-6 or claims 7-11.
24. A computer readable storage medium having instructions stored thereon, wherein the instructions, when executed by a processor, implement the steps of any of the methods of claims 1-6 or claims 7-11.
CN201810841801.9A 2018-07-27 2018-07-27 Video processing method and device Active CN108965711B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810841801.9A CN108965711B (en) 2018-07-27 2018-07-27 Video processing method and device

Publications (2)

Publication Number Publication Date
CN108965711A CN108965711A (en) 2018-12-07
CN108965711B true CN108965711B (en) 2020-12-11

Family

ID=64465620

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810841801.9A Active CN108965711B (en) 2018-07-27 2018-07-27 Video processing method and device

Country Status (1)

Country Link
CN (1) CN108965711B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110611843A (en) * 2019-10-18 2019-12-24 成都中航信虹科技股份有限公司 Video processing method and device, video access gateway and storage medium
CN112711390B (en) * 2020-12-31 2022-07-26 联想(北京)有限公司 Continuous multi-frame image display output control method and electronic equipment
CN114257762A (en) * 2021-12-20 2022-03-29 咪咕音乐有限公司 Video conversion method, device, equipment and storage medium
CN115914718A (en) * 2022-11-08 2023-04-04 天津萨图芯科技有限公司 Virtual film production video remapping method and system for intercepting engine rendering content

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105100822A (en) * 2011-01-28 2015-11-25 华为技术有限公司 Auxiliary video supplemental information bearing method, processing method, apparatus and system
CN105227963A (en) * 2015-08-31 2016-01-06 北京暴风科技股份有限公司 Streaming Media collection is carried out to terminal and automatically identifies direction and the method and system of adjustment
CN105933792A (en) * 2016-06-03 2016-09-07 青岛海信移动通信技术股份有限公司 Video playing method and mobile terminal
CN106484349A (en) * 2016-09-26 2017-03-08 腾讯科技(深圳)有限公司 The treating method and apparatus of live information
CN106973331A (en) * 2017-03-17 2017-07-21 福建中金在线信息科技有限公司 A kind of video broadcasting method and device
CN107547930A (en) * 2017-06-02 2018-01-05 北京小嘿科技有限责任公司 A kind of video play controller and control method based on mobile device rotation detection
CN107707954A (en) * 2017-10-27 2018-02-16 北京小米移动软件有限公司 Video broadcasting method and device
CN107943443A (en) * 2017-12-13 2018-04-20 广东欧珀移动通信有限公司 Display control method, device, storage medium and the electronic equipment of photo

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8358700B2 (en) * 2008-06-03 2013-01-22 Omnivision Technologies, Inc. Video coding apparatus and method for supporting arbitrary-sized regions-of-interest

Also Published As

Publication number Publication date
CN108965711A (en) 2018-12-07

Similar Documents

Publication Publication Date Title
CN110572722B (en) Video clipping method, device, equipment and readable storage medium
CN108401124B (en) Video recording method and device
CN108093268B (en) Live broadcast method and device
CN109191549B (en) Method and device for displaying animation
CN109348247B (en) Method and device for determining audio and video playing time stamp and storage medium
CN111147878B (en) Stream pushing method and device in live broadcast and computer storage medium
CN111372126B (en) Video playing method, device and storage medium
CN108966008B (en) Live video playback method and device
CN108965711B (en) Video processing method and device
CN109874043B (en) Video stream sending method, video stream playing method and video stream playing device
CN110213608B (en) Method, device, equipment and readable storage medium for displaying virtual gift
CN111083507B (en) Method and system for connecting to wheat, first main broadcasting terminal, audience terminal and computer storage medium
CN111586431B (en) Method, device and equipment for live broadcast processing and storage medium
CN108769738B (en) Video processing method, video processing device, computer equipment and storage medium
CN110533585B (en) Image face changing method, device, system, equipment and storage medium
US20220256099A1 (en) Method for processing video, terminal, and storage medium
CN110996117B (en) Video transcoding method and device, electronic equipment and storage medium
CN111093108A (en) Sound and picture synchronization judgment method and device, terminal and computer readable storage medium
CN111586444B (en) Video processing method and device, electronic equipment and storage medium
CN108845777B (en) Method and device for playing frame animation
CN111586413B (en) Video adjusting method and device, computer equipment and storage medium
CN111010588B (en) Live broadcast processing method and device, storage medium and equipment
CN108965769B (en) Video display method and device
CN107888975B (en) Video playing method, device and storage medium
CN111478915B (en) Live broadcast data stream pushing method and device, terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant