CN118175346A - Video playing method and device, electronic equipment and readable storage medium - Google Patents

Video playing method and device, electronic equipment and readable storage medium

Info

Publication number
CN118175346A
Authority
CN
China
Prior art keywords
video
live
input
user
interface
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410287562.2A
Other languages
Chinese (zh)
Inventor
徐杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Vivo Mobile Communication Co Ltd
Original Assignee
Vivo Mobile Communication Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Vivo Mobile Communication Co Ltd filed Critical Vivo Mobile Communication Co Ltd
Priority to CN202410287562.2A priority Critical patent/CN118175346A/en
Publication of CN118175346A publication Critical patent/CN118175346A/en
Pending legal-status Critical Current

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The application discloses a video playing method and device, an electronic device and a readable storage medium, belonging to the technical field of live streaming media. The method includes: receiving a first input while a first video is being played, the first input corresponding to a first keyword; and playing a second video in response to the first input, where the second video is determined according to the live video stream corresponding to the first keyword in the first video.

Description

Video playing method and device, electronic equipment and readable storage medium
Technical Field
The application belongs to the technical field of live streaming media, and in particular relates to a video playing method and device, an electronic device and a readable storage medium.
Background
In the related art, live video is one means of conveying information, and its efficiency in conveying information depends on the host. When a user needs certain information, the content currently being explained in the live broadcast may be unrelated to the content the user is interested in. In that case the user can only put questions to the host through bullet comments, ordinary comments and the like, and the host cannot always see and respond to them immediately, so the efficiency with which the user obtains useful information through the live broadcast is low.
Disclosure of Invention
The embodiments of the present application aim to provide a video playing method and device, an electronic device and a readable storage medium, which can solve the problem of low efficiency in obtaining useful information through a live broadcast.
In a first aspect, an embodiment of the present application provides a video playing method, where the method includes:
receiving a first input while a first video is being played, the first input corresponding to a first keyword; and
playing a second video in response to the first input, where the second video is determined according to the live video stream corresponding to the first keyword in the first video.
In a second aspect, an embodiment of the present application provides a video playing device, where the playing device includes:
a receiving module, configured to receive a first input while the first video is being played, the first input corresponding to a first keyword; and
a playing module, configured to play the second video in response to the first input, where the second video is determined according to the live video stream corresponding to the first keyword in the first video.
In a third aspect, embodiments of the present application provide an electronic device comprising a processor and a memory storing a program or instructions executable on the processor, the program or instructions implementing the steps of the method as in the first aspect when executed by the processor.
In a fourth aspect, embodiments of the present application provide a readable storage medium having stored thereon a program or instructions which when executed by a processor perform the steps of the method as in the first aspect.
In a fifth aspect, embodiments of the present application provide a chip comprising a processor and a communication interface, the communication interface being coupled to the processor, and the processor being configured to run a program or instructions to implement the steps of the method as in the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product stored in a storage medium, the program product being executable by at least one processor to implement a method as in the first aspect.
In the embodiments of the application, a user watching a live broadcast can input a search keyword. Based on that keyword, the historical live record of the current live room, or the live content yet to be played, is searched for live video stream content associated with the keyword, and a condensed media segment, namely the second video, is generated from the retrieved live video stream for the user's client to play. The user can thus quickly obtain the required information even when the current live content is unrelated to the content of interest, which improves the efficiency with which the user obtains useful information through the live broadcast.
Drawings
Fig. 1 is a flowchart of a video playing method according to some embodiments of the present application;
Fig. 2 is a schematic diagram of an interface of a related-art streaming media live room;
Fig. 3 is a schematic diagram of an audio and video recording process of live streaming according to some embodiments of the present application;
Fig. 4 is a schematic diagram of an audio and video decoding process of live streaming according to some embodiments of the present application;
Fig. 5 is a schematic diagram of audio and video playback synchronization according to some embodiments of the present application;
Fig. 6 is a schematic diagram of raw audio and video data according to some embodiments of the present application;
Fig. 7 is a schematic diagram of the audio range and video image frames corresponding to a search keyword in some implementations of the present application;
Fig. 8 is a schematic diagram of an original I-frame sequence according to some embodiments of the present application;
Fig. 9 is a schematic diagram of a key I-frame sequence after removal of duplicate information according to some embodiments of the present application;
Fig. 10 is a schematic diagram of a process of retrieving media segments by keyword according to some embodiments of the present application;
Fig. 11 is a schematic diagram of a first media segment according to some embodiments of the present application;
Fig. 12 is a structural block diagram of a video playing apparatus according to some embodiments of the present application;
Fig. 13 is a structural block diagram of an electronic device according to an embodiment of the present application;
Fig. 14 is a schematic diagram of the hardware structure of an electronic device implementing an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly described below with reference to the drawings. The described embodiments are some, but not all, of the embodiments of the present application; all other embodiments obtained by a person skilled in the art based on the embodiments of the present application fall within the protection scope of the present application.
The terms "first", "second" and the like in the description and claims are used to distinguish similar objects and do not necessarily describe a particular order or sequence. It should be understood that terms so used are interchangeable under appropriate circumstances, so that the embodiments of the application can operate in orders other than those illustrated or described herein, and that the objects identified by "first", "second", etc. are generally of one type whose number is not limited; for example, there may be one or more first objects. Furthermore, in the description and claims, "and/or" denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the associated objects.
The video playing method and apparatus, the electronic device and the readable storage medium provided by the embodiments of the present application are described in detail below with reference to the accompanying drawings by means of specific embodiments and application scenarios thereof.
In some embodiments of the present application, a video playing method is provided, fig. 1 shows a flowchart of the video playing method according to some embodiments of the present application, and as shown in fig. 1, the method includes:
Step 102: receiving a first input while a first video is being played; the first input corresponds to a first keyword.
In the embodiments of the application, the first video is the live video the user is watching. For example, the first video may be a video played in a first live room, where the first live room is the streaming media live room the user is currently watching. Illustratively, the first video includes a video portion and an audio portion.
In the related art, live streaming is an important way of providing information, and the amount and density of the information provided are greatly affected by the host. Taking a live sales scenario as an example, fig. 2 shows a schematic interface of a related-art streaming media live room. As shown in fig. 2, the streaming live interface 200 includes a variety of information, such as a live picture 202, a bullet comment message 204, a commodity link 206, and other introduction information 208.
In a live broadcast, the host may introduce a number of different commodities, not all of which necessarily interest a given viewer. If the user wishes to obtain particular information, the user has to communicate with the host through bullet comments, or wait for the host to talk about the commodity of interest; if the host has already covered that commodity, the explanation may not be repeated, so the user wastes time without obtaining the required information.
In view of the above, a search entry is provided during the live streaming, through which the user can enter a search keyword, namely the above first keyword, via the first input. Illustratively, the first keyword may be the information the user wishes to obtain, the name of a commodity the user is interested in, or other information of interest, for example: "what is the price", "what is the score line", or "when will the new product be released".
Step 104: in response to the first input, playing a second video; the second video is determined according to the live video stream corresponding to the first keyword in the first video.
In the embodiments of the application, the user inputs a first keyword while watching the live first video, and a second video is generated and played based on that keyword. The second video may be obtained by condensing the first video of the current live broadcast, so that it contains only the audio and video content related to the first keyword that the user is interested in, with the uninteresting portions removed. Alternatively, the second video may be obtained by monitoring the content not yet broadcast in the first live room: once a video object associated with the first keyword appears in the first video played in the first live room, the first video is clipped or otherwise processed at that point, so that the user can skip the live content that does not interest them.
In some embodiments, the client sends the first keyword input by the user to the live broadcast server, and the server retrieves the relevant historical live video from the historical live record of the current live room according to the keyword and generates the second video from it, where the second video includes the first object indicated by the user's first keyword.
Illustratively, in other embodiments, the live client locally retrieves a relevant historical live video based on the first keyword entered by the user and generates the second video from the retrieved historical live video associated with the first keyword.
In other embodiments, the live client records the first keyword input by the user, determines from it a first object the user is interested in, and at the same time monitors the video content of the first video to determine whether it includes the first object. It can be appreciated that the first video "including" the first object may mean that the first object appears in a video picture of the first video, or that the first object is mentioned in the audio portion of the first video. It can also be appreciated that, after the user inputs the first keyword, the first video may be playing in the foreground, playing in the background, or stopped.
According to the embodiments of the application, a user watching a live broadcast can input a search keyword. Based on that keyword, the historical live record of the current live room, or the live content yet to be played, is searched for live video stream content associated with the keyword, and a condensed media segment, namely the second video, is generated for the user's client to play. The user can thus quickly obtain the required information even when the current live content is unrelated to the content of interest, which improves the efficiency with which the user obtains useful information through the live broadcast.
In some embodiments of the application, the method further comprises:
editing a live video stream corresponding to a first keyword in a first video;
and determining a second video according to the edited live video stream.
In the embodiments of the application, the first video comprises a live video stream, which may be pushed as streaming media by the live broadcast server. The live broadcast server pushes the live video stream to the live client; after receiving it, the client reassembles the video portion and the audio portion of the stream to obtain the corresponding first video and plays it.
After the user inputs the first keyword, taking as an example a first object indicated by the first keyword, the current live video stream is edited based on the first object, so as to obtain a second video containing only the video content related to the first object the user is interested in.
The second video is related to the first object indicated by the first keyword: its video content includes the first object, and its audio content includes audio related to the first object, such as the name of the first object or speech introducing it.
According to the embodiment of the application, the second video which only comprises the object of interest of the user can be obtained by editing the live video stream of the first video, so that useless information is reduced, the user is helped to obtain the required information quickly, and the efficiency of the user for obtaining the effective information through live broadcast is improved.
In some embodiments of the present application, before the step of editing the live video stream corresponding to the first keyword in the first video, the method further includes:
Receiving a second input; the second input corresponds to the first editing policy;
editing the live video stream corresponding to the first keyword in the first video, including:
and responding to the second input, and editing the live video stream corresponding to the first keyword according to the first editing strategy.
In the embodiments of the application, through the second input the user can specify the editing policy, namely the first editing policy, used when editing to obtain the second video. The first editing policy may include removing redundant content, adjusting the host's volume, viewing only commodities with high purchase volumes, viewing only the commodities corresponding to a specified time period, viewing only commodities within a specific price range, and the like.
After the user inputs the first editing policy, the live video stream corresponding to the first keyword in the first video is edited based on that policy. Specifically, fig. 3 shows a schematic diagram of an audio and video recording process of live streaming according to some embodiments of the present application. As shown in fig. 3, during the live streaming, a microphone captures the live sound to obtain audio sample frames, a camera captures the live picture to obtain image frames, and the sample frames and image frames are synchronized by a clock. The synchronized sample frames and image frames then undergo audio processing and image processing to yield a sample frame queue and an image frame queue, which are audio-encoded and video-encoded into an audio packet queue and a video packet queue, and finally multiplexed into the live streaming media file.
Fig. 4 shows a schematic diagram of the audio and video decoding process of live streaming according to some embodiments of the present application. As shown in fig. 4, the streaming media file is de-encapsulated by a demultiplexer into an audio packet queue and a video packet queue, which are audio-decoded and video-decoded respectively into a decoded sample frame queue and an image frame queue. After clock synchronization, the sample frames are audio-processed and played as the sound the user hears, and the image frames are image-processed and displayed on the screen.
Figs. 3 and 4 show the path from the host's recording to the user's viewing of the live content. To help the user quickly obtain the content to be retrieved, retrieval speed and content accuracy become critical. Fig. 5 shows a schematic diagram of audio and video playback synchronization according to some embodiments of the present application. As shown in fig. 5, the live client receives video data, specifically the streaming media data pushed by the live broadcast server; after de-protocol processing, encapsulated-format data is obtained, and after de-encapsulation, audio compressed data and video compressed data are obtained.
The audio compressed data is audio-decoded into raw audio data, and the video compressed data is video-decoded into raw video data. According to the clock information, the raw audio and raw video data are synchronized, yielding audio and video that can be played by the audio driver device (such as a sound card) and the video driver device.
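The application does not give code for this synchronization step. As an illustrative sketch, the clock-based decision for each decoded video frame (with the audio stream as the master clock, a common convention in players) might look like the following; the function name and the 40 ms tolerance are assumptions, not values from the application:

```python
def sync_action(audio_clock, video_pts, threshold=0.04):
    """Decide how to handle the next decoded video frame.

    audio_clock: current audio playback position in seconds (the
    audio stream acts as the master clock).
    video_pts: presentation timestamp of the next video frame.
    threshold: drift tolerance in seconds before correcting.
    """
    drift = video_pts - audio_clock
    if drift < -threshold:
        return "drop"     # video lags the audio: discard the frame
    if drift > threshold:
        return "wait"     # video is ahead: delay its display
    return "display"      # within tolerance: show immediately
```

A real player would apply this decision in a loop over the image frame queue while the sound card consumes the sample frame queue.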
When a user watches the live broadcast on a mobile phone, the phone acts as the audio and video driver device, and the raw audio and video data, already synchronized, are obtained in the previous step. Fig. 6 shows a schematic diagram of the raw audio and video data according to some embodiments of the present application. As shown in fig. 6, the wavy line represents the sound heard by the user, i.e. sound waves of different lengths recorded from the host through a microphone and played back on the user side. The sound generally runs through the entire live session: even when the host makes no sound, the audio remains synchronized with the video, with its amplitude close to 0.
Throughout the live broadcast the user always sees pictures, i.e. the video, which consists of successive frames that are related to one another; the abscissa of fig. 6 shows this relationship. Frames come in three types: I (key frames), B (bi-directional predicted frames) and P (predicted frames). I frames occupy the most data, while B and P frames guarantee the continuity of the video.
For content played in real time, the correspondence between audio and video is as shown in fig. 6; the same correspondence holds for audio and video that have already been played.
On the live page, a shortcut entry can be added as the input entry for the search keyword. When searching, the user inputs the first keyword to be retrieved in the live interface, such as "what is the price of two large and one small".
If, during the host's introduction, the price of "two large and one small" has been explained, the keyword can be matched against the speech content in that range to obtain the corresponding audio range. Fig. 7 shows a schematic diagram of the audio range and the video image frames corresponding to a search keyword in some implementations of the application. As shown in fig. 7, the beginning and end of the audio are cut according to the keyword and the surrounding context to form the corresponding audio segment. Since audio data is usually small, retrieving audio by keyword is relatively fast.
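The application describes matching the keyword against the speech content to obtain an audio range but gives no algorithm. A minimal sketch, assuming the audio has already been transcribed into timestamped segments (a hypothetical input format), could be:

```python
def find_audio_range(segments, keyword, pad=2.0):
    """Locate the audio range matching a search keyword.

    segments: list of (start_sec, end_sec, text) tuples produced by
    speech recognition of the recorded audio (assumed format).
    Returns (start, end) in seconds for the matching span, padded so
    the clip keeps some surrounding context, or None if no match.
    """
    hits = [(s, e) for s, e, text in segments if keyword in text]
    if not hits:
        return None
    start = max(0.0, min(s for s, _ in hits) - pad)
    end = max(e for _, e in hits) + pad
    return (start, end)
```

The `pad` margin corresponds to cutting the beginning and end of the audio "according to the keyword and the current context" as described above.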
To increase retrieval speed, a video clip containing only the valid information can be produced without losing the retrieved information. The video segment retrieved by the first keyword is the first media segment. Illustratively, non-critical information other than the retrieval information, such as the host's figure, repeated frame content and useless frame content, may be removed from the first media segment.
Specifically, for a complete live picture, invalid information can be identified by an image recognition algorithm. Illustratively, the host's figure may be identified. The host often appears while introducing commodities; over the whole live broadcast, only the body or face changes within a fixed region of the picture, while the other parts remain unchanged and carry no useful information. The host's figure can therefore be removed from the original frame images, reducing the amount of invalid information in them.
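How the identified host region is removed is not specified. One simple possibility, blanking a detected bounding box in each frame, is sketched below; the frame here is a plain list of pixel rows, and the box coordinates would come from the image recognition step (both are illustrative assumptions):

```python
def remove_region(frame, box):
    """Blank a rectangular region (e.g. the detected host figure)
    in a frame represented as a list of rows of pixel values.

    box: (top, left, bottom, right), half-open ranges.
    Returns a new frame; the input frame is left untouched.
    """
    top, left, bottom, right = box
    out = [row[:] for row in frame]          # copy each row
    for y in range(top, bottom):
        for x in range(left, right):
            out[y][x] = 0                    # blank the pixel
    return out
```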
For example, repeated frame content may be identified. During a live broadcast, content that stays on screen for long periods without changing may appear, such as advertisements and commodity information; since such content is heavily repeated, the repeated frames can be removed to reduce the data amount.
Illustratively, useless frames can be identified. Specifically, the key point of a live search result is to provide the information the user needs, and the smoothness requirement for playing the media segment is not high. Therefore, on the basis of the original live video data, the original audio can be kept unchanged while the image frames corresponding to the audio segment are processed. For example, only key I frames may be retained in the video, removing the useless B and P frames. In some embodiments the B and P frames need not be removed completely: whether to delete them entirely may be decided according to whether the key I frames change within the current audio period, and when an I frame changes substantially, some B or P frames may be kept so that the video transitions look slightly more natural. In some implementations, the person information in the I frames may be further removed.
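The step of retaining only key I frames amounts to a simple filter over the decoded frame sequence; the `(frame_type, payload)` representation below is an assumption for illustration:

```python
def keep_key_frames(frames):
    """Keep only I (key) frames from a decoded frame sequence.

    frames: list of (frame_type, payload) tuples, where frame_type
    is one of 'I', 'B', 'P'. B and P frames are dropped.
    """
    return [f for f in frames if f[0] == "I"]
```

The refinement mentioned above, keeping some B or P frames around large I-frame changes, would add a difference check before a frame is discarded.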
For the I frames, the first I frame can be used as the starting frame, and each subsequent I frame converted into a difference against it, which reduces the per-frame data amount; the difference between each subsequent I frame and the first is then compared, and if they are the same, the subsequent I frame is not retained, while if they differ, it is retained.
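A sketch of this duplicate-removal pass is shown below. It approximates "same" by an exact content hash and keeps the first frame of each run of identical frames, which yields the effect of fig. 9; a real implementation would more likely use a perceptual difference threshold between decoded images:

```python
import hashlib

def dedup_i_frames(i_frames):
    """Remove runs of duplicate I frames, keeping the first of each.

    i_frames: list of encoded I-frame payloads (bytes). Each frame
    is compared with the last retained one via a content hash;
    identical frames are dropped, differing frames are kept.
    """
    kept, last_digest = [], None
    for frame in i_frames:
        digest = hashlib.sha256(frame).digest()
        if digest != last_digest:
            kept.append(frame)
            last_digest = digest
    return kept
```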
If the original video segment contains 100 I frames, 200 B frames and 100 P frames, with each I frame picture 10 KB, each B frame 5 KB and each P frame 6 KB, the original segment is about 2.5 MB. After optimization, suppose the 200 B frames and the 100 P frames are removed and the 100 I frames are sampled, the sampling rule being that the 1st I frame serves as the starting frame, so only 1 of the 100 I frames needs to be kept. With the person information also removed, the size of the video clip can be reduced from about 2.5 MB to about 10 KB, which can be retrieved in a very short time, so the efficiency is high.
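The size estimate above can be checked with simple arithmetic, using the frame sizes from the example:

```python
def clip_size_kb(n_i, n_b, n_p, i_kb=10, b_kb=5, p_kb=6):
    """Total size in KB of a clip with the given frame counts."""
    return n_i * i_kb + n_b * b_kb + n_p * p_kb

original = clip_size_kb(100, 200, 100)  # 1000 + 1000 + 600 = 2600 KB, about 2.5 MB
optimized = clip_size_kb(1, 0, 0)       # only the 1st I frame kept: 10 KB
```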
Illustratively, fig. 8 shows a schematic diagram of an original I-frame sequence according to some embodiments of the present application. As shown in fig. 8, there are many repeated frames among the I frames; after the repeats are removed, only the three top-layer frame images remain. Fig. 9 shows the resulting key I-frame sequence after removal of the duplicate information, and the final effect is as shown in fig. 9.
The second video obtained in this way is retrieved based on the first keyword input by the user and contains the content corresponding to that keyword. Specifically, fig. 10 shows a schematic diagram of the process of retrieving media segments by keyword according to some embodiments of the present application. As shown in fig. 10, after the user inputs a keyword, the raw audio data is retrieved based on the keyword and the corresponding raw video data is matched by clock. The raw audio data is kept as the audio segment, while the raw video data is optimized by removing B frames, P frames and person pictures, yielding I frames with the invalid information removed. The audio segment and these I frames are then matched and synchronized according to the original clock, producing synchronized audio and video that are decoded and played by the audio and video driver devices on the user side.
Fig. 11 shows a schematic diagram of a first media segment according to some embodiments of the present application. As shown in fig. 11, the first media segment includes the key information and the complete audio, with the information reduced relative to the live picture shown in fig. 2.
According to the embodiment of the application, the media fragments meeting the user requirements, namely the second video, are generated according to the editing strategy appointed by the user, so that the second video watched by the user only comprises the information part interested by the user, the interference of redundant information is removed, and the information acquisition efficiency of the user is improved.
In some embodiments of the application, the editing process is used to simplify video content, simplify audio content, and/or adjust volume.
In the embodiments of the application, editing the live video stream of the first video specifically includes simplifying the video content of the first video, which refers to removing from the video picture of the first video the content unrelated to the first keyword input by the user, such as the video background, the host's portrait, advertisement banners, gift information or bullet comment information.
By simplifying the editing process of the video content, redundant information in the live video can be effectively removed, users are helped to pay attention to effective information of interest, and information acquisition efficiency is improved.
In particular, the live pictures in some live rooms are complex and filled with invalid information, such as advertisements, background images, and the gifts, bullet comments and other contributions of other viewers; these contents occupy part of the live picture and distract the user watching the broadcast.
To address this, the embodiments of the application provide a "live succinct mode" option. Specifically, the first control is the control that turns on the "live succinct mode". While the user is watching the live broadcast, when a fourth input on the first control is received, the first video of the current live room is processed and part or all of the invalid information in it is removed, making the live interface more concise.
Specifically, at least one media object in the first video may be hidden, where the media object may be advertisement information, the live background image, the host's portrait, and so on; reducing invalid information lessens the disturbance to the viewing user and helps the user concentrate. The frame rate of the first video may also be reduced: specifically, B frames and P frames in the first video may be removed, as may duplicate I frames. Removing invalid frames reduces the data amount of the video and thus saves bandwidth. The audio volume of the first video may also be reduced. Illustratively, the audio portion of the video may include multiple audio tracks, and the volume of a particular track may be lowered according to the user's selection, for example lowering the background music to highlight the host's voice, or lowering the host's voice so that the user focuses more on the live picture.
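Per-track volume adjustment, such as ducking background music to highlight the host's voice, amounts to applying a gain to that track's samples. A minimal sketch over float PCM samples follows; the function name and the clamping choice are illustrative:

```python
def scale_track(samples, gain):
    """Apply a linear gain to one audio track.

    samples: float PCM samples in [-1.0, 1.0].
    gain: multiplier, e.g. 0.3 to duck background music or 1.5 to
    boost a voice track. Results are clamped to avoid clipping.
    """
    return [max(-1.0, min(1.0, s * gain)) for s in samples]
```

For example, `scale_track(music, 0.3)` would duck a background-music track while the voice track is left at full gain.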
Editing the live video stream of the first video, and simplifying the audio content. The simplified audio content specifically refers to removing audio content, which is irrelevant to a first keyword input by a user, in an audio portion of the first video, such as voice, background music, barrage sound effects when a host player interacts with the barrage, and the like.
By simplifying the audio content through the editing process, the live audio becomes more concise, and invalid audio is prevented from interfering with the user's listening.
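The audio simplification above can be sketched as selecting transcript segments that mention the keyword. This is an illustrative sketch, assuming a timestamped transcript (e.g. from an upstream speech-to-text pass) is available; segment representation and the relevance test are assumptions, not part of the described method.

```python
def simplify_audio_segments(segments, keyword):
    """Keep only audio segments whose transcript mentions the keyword.

    `segments` is a list of dicts with "start", "end", and "text" keys,
    e.g. produced by a speech-to-text pass over the audio track. Returns
    the (start, end) time ranges to retain in the edited stream.
    """
    return [
        (seg["start"], seg["end"])
        for seg in segments
        if keyword in seg["text"]
    ]
```

The returned time ranges would then drive the actual audio edit, muting or cutting everything outside them.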
The live video stream of the first video may also be edited to adjust the volume. In some scenarios, the user may need to watch the visual content carefully, or the user may be browsing other pages and listening to the live broadcast in the background.
The user's sensitivity to audio varies with the viewing scene. For example, when the user is watching the live video attentively, an excessive audio volume can be distracting, so the volume of the live audio may be appropriately reduced. Conversely, when the user is not playing the video in the foreground but is running other programs or displaying other windows there, with the live video playing in the background, the volume of the live video may be appropriately increased.
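The scene-dependent volume adjustment above can be sketched as a simple policy function. The 0.5/1.5 scaling factors are illustrative assumptions, not values from the described method.

```python
def adjust_volume(base_volume, is_foreground):
    """Scale live-audio volume according to the viewing scene.

    Lowers the volume when the live video is watched attentively in the
    foreground, and raises it when the video plays in the background.
    The 0.5/1.5 factors are illustrative defaults; the result is
    clamped to the valid range [0.0, 1.0].
    """
    factor = 0.5 if is_foreground else 1.5
    return max(0.0, min(1.0, base_volume * factor))
```

A real player would feed the result into the platform's audio API rather than returning a bare float.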
The embodiment of the application reduces redundant information by simplifying the video content or audio of the live video, improving information acquisition efficiency, and appropriately adjusts the volume of the live video to suit different viewing scenes.
In some embodiments of the application, playing the second video in response to the first input, comprises:
Responding to the first input, and displaying a first interface and a second interface in a split screen mode under the condition that the first interface is displayed;
and playing the second video in the second interface.
In the embodiment of the application, during a live broadcast the anchor often explains different information in different time periods; the live content broadcast before the anchor reaches the information of interest is invalid content for the user. The user may therefore move the live video to background playback and display other program content in the foreground through other interfaces, such as browsing web pages or playing games.
To avoid missing the content of interest, the user specifies a first keyword through the first input, the first keyword indicating a first object, that is, an object of interest to the user. After receiving the first keyword, the background live window continuously monitors the live content, including recognizing the live pictures and the live audio, to determine whether the live video is related to the first object of interest to the user.
At this time, the first object may also be regarded as an object the user has "reserved for viewing". For example, if the live room is explaining a commodity and the user is interested in it, the user can input the first keyword "pillow" into the live room, and the live room records "pillow" as the object reserved by the user.
After the first keyword is input, if the user is browsing the first interface, for example a web page, the display interface of the electronic device is automatically split: the first interface and the second interface are displayed in a split screen, the first interface keeps its original content unchanged (for example, the original web-page browsing interface), and the second interface is used for playing the second video associated with the user's search keyword.
In some embodiments, after the user inputs the first keyword, there may be no content associated with the first keyword in the live record, or the live broadcast may not yet have progressed to the part related to the first object indicated by the first keyword. In this case, the live background may continuously monitor the video content of the first video to determine whether content related to the first object appears.
Illustratively, during the live broadcast, the live background continuously detects the live content. When it detects that the media content of the first video contains a pillow, for example when a pillow is recognized in a video picture of the first video or a keyword associated with "pillow" appears in the audio of the first video, prompt information is displayed, for example as a pop-up reminder, to prompt the user that content associated with the pillow has started to be broadcast.
For example, whether the live content contains the object reserved by the user can be determined by audio detection, such as detecting the anchor's voice; when the anchor's voice contains the term "pillow", the prompt information is displayed.
For example, whether the content contains the object reserved by the user may also be determined by image detection of the live picture; when a "pillow" object appears in the live picture, the prompt information is displayed.
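The two detection paths above (audio and image) can be sketched as a single combining check. This assumes a speech transcript and a list of detected object labels are produced by upstream recognizers; those recognizers and the input shapes are assumptions for illustration only.

```python
def contains_reserved_object(transcript, detected_labels, keyword):
    """Check whether live content mentions or shows the reserved object.

    `transcript` is the recognized anchor speech for the current time
    window, and `detected_labels` are object labels from an
    image-detection pass over the live picture. Both detectors are
    assumed to run upstream; this function only combines their results
    to decide whether the prompt information should be displayed.
    """
    heard = keyword in transcript          # audio detection path
    seen = keyword in detected_labels      # image detection path
    return heard or seen
```

Either path alone is sufficient to trigger the prompt, matching the "or" relationship between the audio and image examples.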
After the prompt information is displayed, the user can click it to display the first interface and the second interface in a split screen. The operation interface the user was previously using is retained in the first interface, while the edited second video is played in the second interface, providing the live content of interest in a timely manner without interrupting the user's original operation.
The embodiment of the application can display the live content of interest to the user in a timely manner without interrupting the user's original operation.
In some embodiments of the present application, in a case where the second video is played in the second interface, an audio output channel of the electronic device is allocated to the second interface;
the method further comprises the steps of:
Receiving a third input;
In response to the third input, an audio output channel of the electronic device is switched from the second interface to the first interface.
In the embodiment of the application, after the first interface and the second interface are displayed in a split screen, the audio output channel of the electronic device defaults to the second interface, that is, plays the audio of the second video. At this time, the user can hear the sound of the second video.
If the user wishes to play the audio of the first interface that was being browsed originally, a third input may be performed on the first interface, for example long-pressing the first interface or clicking an audio play control in it. The audio output channel of the electronic device is then switched from playing the audio of the second video to playing the audio of the first interface.
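The channel-switching behavior above can be sketched as a small piece of routing state: the single audio output channel defaults to the second (live) interface after the split screen is shown, and the third input moves it to the first interface. The class and its names are illustrative assumptions, not an API from the described device.

```python
class AudioRouter:
    """Route the device's single audio output channel between interfaces.

    A minimal sketch of the described behavior: after the split screen
    is displayed, audio defaults to the second interface (the live
    window); a third input switches it to the first interface.
    """

    def __init__(self):
        self.active_interface = "second"  # default after split screen

    def on_third_input(self):
        # Switch audio output from the live window to the first interface.
        self.active_interface = "first"
        return self.active_interface
```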
According to the embodiment of the application, the audio of different interfaces can be switched according to the user's needs, preventing the user from missing the live content of interest while not interrupting the user's current operation.
In some embodiments of the application, after the step of switching the audio output channel of the electronic device from the second interface to the first interface in response to the third input, the method further comprises:
and displaying video text information corresponding to the second video on the second interface.
In the embodiment of the application, after the user switches the audio output channel of the electronic device from the second interface playing the second video to the first interface through the third input, the user can no longer hear the audio of the second video.
At this time, to avoid the user missing the live content, video text information corresponding to the second video may be displayed on the second interface. Specifically, the video text information may be obtained by performing speech-to-text processing on the audio portion of the second video; after it is displayed on the second interface, the user can obtain the content of the live explanation by reading the subtitle information.
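The subtitle generation described above can be sketched as mapping timed audio chunks through a speech-to-text function. The `transcribe` callable is a placeholder for whatever recognition engine the device uses; both it and the segment representation are assumptions for illustration.

```python
def generate_subtitles(audio_segments, transcribe):
    """Turn the second video's audio into timed subtitle entries.

    `audio_segments` is a list of (start, end, audio_chunk) tuples and
    `transcribe` is a speech-to-text callable supplied by the caller
    (e.g. wrapping an on-device recognition engine). Returns subtitle
    entries the second interface can render in sync with the video.
    """
    return [
        {"start": start, "end": end, "text": transcribe(chunk)}
        for start, end, chunk in audio_segments
    ]
```

Because each entry keeps its start and end times, the subtitles can be displayed in step with the muted second video.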
According to the embodiment of the application, the audio output of the dual-screen interface can be switched freely, and after the live audio output stops, subtitle information is automatically generated and displayed in the live interface. The user can thus obtain the live explanation through the subtitle information, ensuring that the user acquires the complete live explanation without interrupting the user's original browsing content.
According to the video playing method provided by the embodiment of the application, the execution subject may be a video playing device. In the embodiment of the present application, a video playing method performed by a video playing device is taken as an example to describe the video playing device provided by the embodiment of the present application.
In some embodiments of the present application, a video playing device is provided, fig. 12 shows a block diagram of the video playing device according to some embodiments of the present application, and as shown in fig. 12, a video playing device 1200 includes:
A receiving module 1202 for receiving a first input in case of playing a first video; the first input corresponds to a first keyword;
A play module 1204 for playing the second video in response to the first input; the second video is determined according to the live video stream corresponding to the first keyword in the first video.
According to the embodiment of the application, when watching a live broadcast, a user can input a search keyword. Based on the search keyword, the live video stream content associated with the keyword is searched for in the history live record of the current live room or in the live content to be played, and a simplified media fragment, namely the second video, is generated for the user client to play. The user can thus quickly obtain the needed information when the current live content is irrelevant to the content of interest, improving the efficiency with which the user obtains effective information through live broadcast.
In some embodiments of the present application, the playing device further includes:
the editing module is used for editing the live video stream corresponding to the first keyword in the first video;
and the determining module is used for determining the second video according to the edited live video stream.
According to the embodiment of the application, a second video including only the object of interest to the user can be obtained by editing the live video stream of the first video, reducing useless information, helping the user quickly obtain the needed information, and improving the efficiency with which the user obtains effective information through live broadcast.
In some embodiments of the application, the receiving module is further configured to receive a second input; the second input corresponds to the first editing policy;
and the editing module is also used for responding to the second input and editing the live video stream corresponding to the first keyword according to the first editing strategy.
According to the embodiment of the application, a media fragment meeting the user's needs, namely the second video, is generated according to the editing strategy specified by the user, so that the second video watched by the user includes only the information of interest, the interference of redundant information is removed, and the user's information acquisition efficiency is improved.
In some embodiments of the application, the editing process is used to simplify video content, simplify audio content, and/or adjust volume.
The embodiment of the application reduces redundant information by simplifying the video content or audio of the live video, improving information acquisition efficiency, and appropriately adjusts the volume of the live video to suit different viewing scenes.
In some embodiments of the present application, the playing device further includes:
the display control module is used for responding to the first input and displaying the first interface and the second interface in a split screen mode under the condition that the first interface is displayed;
And the playing module is used for playing the second video in the second interface.
The embodiment of the application can display the live content of interest to the user in a timely manner without interrupting the user's original operation.
In some embodiments of the present application, in a case where the second video is played in the second interface, an audio output channel of the electronic device is allocated to the second interface;
the receiving module is also used for receiving a third input;
The playing device further comprises:
and the audio channel switching module is used for responding to the third input and switching the audio output channel of the electronic equipment from the second interface to the first interface.
According to the embodiment of the application, the audio of different interfaces can be switched according to the user's needs, preventing the user from missing the live content of interest while not interrupting the user's current operation.
In some embodiments of the present application, the display control module is further configured to display video text information corresponding to the second video on the second interface.
According to the embodiment of the application, the audio output of the dual-screen interface can be switched freely, and after the live audio output stops, subtitle information is automatically generated and displayed in the live interface. The user can thus obtain the live explanation through the subtitle information, ensuring that the user acquires the complete live explanation without interrupting the user's original browsing content.
The video playing device in the embodiment of the application can be an electronic device or a component in the electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. The electronic device may be a mobile phone, a tablet computer, a notebook computer, a palm computer, a vehicle-mounted electronic device, a Mobile Internet Device (MID), an augmented reality (AR)/virtual reality (VR) device, a robot, a wearable device, an ultra-mobile personal computer (UMPC), a netbook or a personal digital assistant (PDA), etc., and may also be a server, a network attached storage (NAS), a personal computer (PC), a television (TV), a teller machine, a self-service machine, etc., which are not particularly limited in the embodiments of the present application.
The video playing device in the embodiment of the application can be a device with an operating system. The operating system may be an Android operating system, an iOS operating system, or other possible operating systems, and the embodiment of the present application is not limited specifically.
The video playing device provided by the embodiment of the application can realize each process realized by the embodiment of the method, and in order to avoid repetition, the description is omitted.
Optionally, an embodiment of the present application further provides an electronic device, fig. 13 shows a block diagram of a structure of an electronic device according to an embodiment of the present application, as shown in fig. 13, an electronic device 1300 includes a processor 1302, a memory 1304, and a program or an instruction stored in the memory 1304 and capable of running on the processor 1302, where the program or the instruction is executed by the processor 1302 to implement each process of the foregoing method embodiment, and the same technical effects are achieved, and are not repeated herein.
The electronic device in the embodiment of the application includes the mobile electronic device and the non-mobile electronic device.
Fig. 14 is a schematic hardware structure of an electronic device implementing an embodiment of the present application.
The electronic device 1400 includes, but is not limited to: radio frequency unit 1401, network module 1402, audio output unit 1403, input unit 1404, sensor 1405, display unit 1406, user input unit 1407, interface unit 1408, memory 1409, and processor 1410.
Those skilled in the art will appreciate that the electronic device 1400 may also include a power source (e.g., a battery) for powering the various components, which may be logically connected to the processor 1410 by a power management system to perform functions such as managing charging, discharging, and power consumption by the power management system. The electronic device structure shown in fig. 14 does not constitute a limitation of the electronic device, and the electronic device may include more or less components than shown, or may combine certain components, or may be arranged in different components, which are not described in detail herein.
Wherein the user input unit 1407 is for receiving a first input in a case where the first video is played; the first input corresponds to a first keyword;
The processor 1410 is configured to play a second video in response to the first input; the second video is determined according to the live video stream corresponding to the first keyword in the first video.
According to the embodiment of the application, when watching a live broadcast, a user can input a search keyword. Based on the search keyword, the live video stream content associated with the keyword is searched for in the history live record of the current live room or in the live content to be played, and a simplified media fragment, namely the second video, is generated for the user client to play. The user can thus quickly obtain the needed information when the current live content is irrelevant to the content of interest, improving the efficiency with which the user obtains effective information through live broadcast.
Optionally, the processor 1410 is further configured to edit the live video stream in the first video that corresponds to the first keyword; and determining a second video according to the edited live video stream.
According to the embodiment of the application, a second video including only the object of interest to the user can be obtained by editing the live video stream of the first video, reducing useless information, helping the user quickly obtain the needed information, and improving the efficiency with which the user obtains effective information through live broadcast.
Optionally, the user input unit 1407 is further configured to receive a second input; the second input corresponds to the first editing policy;
the processor 1410 is further configured to perform editing processing on the live video stream corresponding to the first keyword according to the first editing policy in response to the second input.
According to the embodiment of the application, a media fragment meeting the user's needs, namely the second video, is generated according to the editing strategy specified by the user, so that the second video watched by the user includes only the information of interest, the interference of redundant information is removed, and the user's information acquisition efficiency is improved.
Optionally, the editing process is used to simplify video content, simplify audio content, and/or adjust volume.
The embodiment of the application reduces redundant information by simplifying the video content or audio of the live video, improving information acquisition efficiency, and appropriately adjusts the volume of the live video to suit different viewing scenes.
Optionally, the processor 1410 is further configured to, in response to the first input, display the first interface and the second interface in a split screen manner in the case that the first interface is displayed;
The display unit 1406 is further configured to play the second video in the second interface;
The embodiment of the application can display the live content of interest to the user in a timely manner without interrupting the user's original operation.
Optionally, under the condition that the second video is played in the second interface, an audio output channel of the electronic device is distributed to the second interface;
a user input unit 1407 for receiving a third input;
the processor 1410 is further configured to switch the audio output channel of the electronic device from the second interface to the first interface in response to the third input.
According to the embodiment of the application, the audio of different interfaces can be switched according to the user's needs, preventing the user from missing the live content of interest while not interrupting the user's current operation.
Optionally, a display unit 1406 is configured to display video text information corresponding to the second video on the second interface.
According to the embodiment of the application, the audio output of the dual-screen interface can be switched freely, and after the live audio output stops, subtitle information is automatically generated and displayed in the live interface. The user can thus obtain the live explanation through the subtitle information, ensuring that the user acquires the complete live explanation without interrupting the user's original browsing content.
It should be appreciated that in embodiments of the present application, the input unit 1404 may include a graphics processor (Graphics Processing Unit, GPU) 14041 and a microphone 14042, with the graphics processor 14041 processing image data of still pictures or video obtained by an image capture device (e.g., a camera) in a video capture mode or an image capture mode. The display unit 1406 may include a display panel 14061, and the display panel 14061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 1407 includes at least one of a touch panel 14071 and other input devices 14072. The touch panel 14071 is also referred to as a touch screen. The touch panel 14071 may include two parts, a touch detection device and a touch controller. Other input devices 14072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and so forth, which are not described in detail herein.
Memory 1409 may be used to store software programs as well as various data. The memory 1409 may mainly include a first memory area storing programs or instructions and a second memory area storing data, wherein the first memory area may store an operating system, application programs or instructions required for at least one function (such as a sound playing function, an image playing function, etc.), and the like. Further, the memory 1409 may include volatile memory or nonvolatile memory, or the memory 1409 may include both volatile and nonvolatile memory. The nonvolatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be random access memory (RAM), static RAM (SRAM), dynamic RAM (DRAM), synchronous dynamic random access memory (SDRAM), double data rate synchronous dynamic random access memory (DDR SDRAM), enhanced synchronous dynamic random access memory (ESDRAM), synch-link dynamic random access memory (SLDRAM), or direct random access memory (DRRAM). Memory 1409 in embodiments of the application includes, but is not limited to, these and any other suitable types of memory.
Processor 1410 may include one or more processing units; optionally, the processor 1410 integrates an application processor that primarily processes operations involving an operating system, user interface, application programs, etc., and a modem processor that primarily processes wireless communication signals, such as a baseband processor. It will be appreciated that the modem processor described above may not be integrated into the processor 1410.
The embodiment of the application also provides a readable storage medium, and the readable storage medium stores a program or an instruction, which when executed by a processor, implements each process of the above method embodiment, and can achieve the same technical effects, so that repetition is avoided, and no further description is provided herein.
The processor is a processor in the electronic device in the above embodiment. Readable storage media include computer readable storage media such as Read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic or optical disks, and the like.
The embodiment of the application further provides a chip, the chip comprises a processor and a communication interface, the communication interface is coupled with the processor, the processor is used for running programs or instructions, the processes of the embodiment of the method can be realized, the same technical effects can be achieved, and the repetition is avoided, and the description is omitted here.
It should be understood that the chips referred to in the embodiments of the present application may also be referred to as system-on-chip chips, chip systems, or system-on-chip chips, etc.
Embodiments of the present application provide a computer program product stored in a storage medium, where the program product is executed by at least one processor to implement the respective processes of the above method embodiments, and achieve the same technical effects, and for avoiding repetition, a detailed description is omitted herein.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element. Furthermore, it should be noted that the scope of the methods and apparatus in the embodiments of the present application is not limited to performing the functions in the order shown or discussed, but may also include performing the functions in a substantially simultaneous manner or in an opposite order depending on the functions involved, e.g., the described methods may be performed in an order different from that described, and various steps may be added, omitted, or combined. Additionally, features described with reference to certain examples may be combined in other examples.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in part in the form of a computer software product stored on a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method of the embodiments of the present application.
The embodiments of the present application have been described above with reference to the accompanying drawings, but the present application is not limited to the above-described embodiments, which are merely illustrative and not restrictive, and many forms may be made by those having ordinary skill in the art without departing from the spirit of the present application and the scope of the claims, which are to be protected by the present application.

Claims (10)

1. A video playing method, the method comprising:
receiving a first input under the condition of playing a first video; the first input corresponds to a first keyword;
Playing a second video in response to the first input; the second video is determined according to a live video stream corresponding to the first keyword in the first video.
2. The video playback method as recited in claim 1, wherein the method further comprises:
editing a live video stream corresponding to the first keyword in the first video;
And determining the second video according to the edited live video stream.
3. The video playing method according to claim 2, wherein before the step of editing the live video stream corresponding to the first keyword in the first video, the method further comprises:
receiving a second input; the second input corresponds to a first editing policy;
the editing processing of the live video stream corresponding to the first keyword in the first video includes:
And responding to the second input, and editing the live video stream corresponding to the first keyword according to the first editing strategy.
4. The video playback method as recited in claim 2, wherein the editing process is used to simplify video content, simplify audio content, and/or adjust volume.
5. The video playback method of claim 1, wherein playing a second video in response to the first input comprises:
Responding to the first input, and displaying a first interface and a second interface in a split screen mode under the condition that the first interface is displayed;
And playing the second video in the second interface.
6. The video playback method as recited in claim 5, wherein in the case of playing the second video in the second interface, an audio output channel of the electronic device is allocated to the second interface;
The method further comprises the steps of:
Receiving a third input;
And responding to the third input, and switching an audio output channel of the electronic device from the second interface to the first interface.
7. The video playback method of claim 6, wherein after the step of switching the audio output channel of the electronic device from the second interface to the first interface in response to the third input, the method further comprises:
And displaying video text information corresponding to the second video on the second interface.
8. A video playback device, the playback device comprising:
the receiving module is used for receiving a first input under the condition of playing the first video; the first input corresponds to a first keyword;
A play module for playing a second video in response to the first input; the second video is determined according to a live video stream corresponding to the first keyword in the first video.
9. An electronic device comprising a processor and a memory storing a program or instructions executable on the processor, which when executed by the processor, implement the steps of the video playback method of any one of claims 1 to 7.
10. A readable storage medium, characterized in that the readable storage medium has stored thereon a program or instructions which, when executed by a processor, implement the steps of the video playback method of any one of claims 1 to 7.
CN202410287562.2A 2024-03-13 2024-03-13 Video playing method and device, electronic equipment and readable storage medium Pending CN118175346A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410287562.2A CN118175346A (en) 2024-03-13 2024-03-13 Video playing method and device, electronic equipment and readable storage medium


Publications (1)

Publication Number Publication Date
CN118175346A 2024-06-11

Family

ID=91352270

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410287562.2A Pending CN118175346A (en) 2024-03-13 2024-03-13 Video playing method and device, electronic equipment and readable storage medium

Country Status (1)

Country Link
CN (1) CN118175346A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination