CN115955596A - Method, apparatus, device and medium for providing video related information - Google Patents


Info

Publication number
CN115955596A
CN115955596A (Application CN202211652440.6A)
Authority
CN
China
Prior art keywords
video, application, information, user, tag
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211652440.6A
Other languages
Chinese (zh)
Inventor
刘云
刘晗月
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202211652440.6A priority Critical patent/CN115955596A/en
Publication of CN115955596A publication Critical patent/CN115955596A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Methods, apparatuses, devices, and media for providing video-related information are disclosed. In one method, a video frame in a video played in an application is determined. At least one object in the video frame is identified. A tag associated with a target object of the at least one object is provided in the application, the text of the tag representing summary information of the target object and the tag being linked to detailed information of the target object. With exemplary implementations of the present disclosure, a user may obtain information associated with an object of interest in a video within a single application, in a simpler and more efficient manner.

Description

Method, apparatus, device and medium for providing video related information
Technical Field
Example implementations of the present disclosure relate generally to video processing and, more particularly, to methods, apparatuses, devices, and computer-readable storage media for providing video-related information.
Background
A large number of video-based applications are currently available in which a user can play videos. Objects of interest to the user (e.g., items, people, etc.) may appear in a video, and the user may wish to learn more about them. However, because such an object exists only within the video, the user can obtain only video and/or image data of the object, and it is difficult to formulate a search keyword for it. How to provide video-related information in a more convenient and efficient manner is therefore an urgent problem in the field of video processing.
Disclosure of Invention
In a first aspect of the disclosure, a method for providing video-related information is provided. In the method, a video frame in a video played in an application is determined. At least one object in the video frame is identified. A tag associated with a target object of the at least one object is provided in the application, the text of the tag representing summary information of the target object and the tag being linked to detailed information of the target object.
In a second aspect of the disclosure, an apparatus for providing video-related information is provided. The device includes: a determination module configured to determine a video frame in a video played in an application; an identification module configured to identify at least one object in a video frame; and a providing module configured to provide a tag associated with a target object of the at least one object in the application, the text of the tag representing summary information of the target object, and the tag being linked to detailed information of the target object.
In a third aspect of the disclosure, an electronic device is provided. The electronic device includes: at least one processing unit; and at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions, when executed by the at least one processing unit, causing the electronic device to perform the method according to the first aspect of the disclosure.
In a fourth aspect of the present disclosure, a computer-readable storage medium is provided, having stored thereon a computer program, which, when executed by a processor, causes the processor to carry out the method according to the first aspect of the present disclosure.
It should be understood that what is described in this section is not intended to limit key features or essential features of implementations of the disclosure, nor is it intended to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The above and other features, advantages and aspects of various implementations of the present disclosure will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings. In the drawings, like or similar reference characters designate like or similar elements, and wherein:
FIG. 1 illustrates a block diagram of a process for providing video-related information, according to one aspect;
fig. 2 illustrates a block diagram of a process for providing video-related information, in accordance with some implementations of the present disclosure;
FIG. 3 illustrates a block diagram for determining a set of images in a media library that are similar to an object in a video frame, according to some implementations of the present disclosure;
FIG. 4 illustrates a block diagram for determining tag-related summary and detail information in accordance with some implementations of the present disclosure;
FIG. 5 illustrates a block diagram for performing a search in a media library, according to some implementations of the present disclosure;
FIG. 6 illustrates a block diagram for presenting detailed information, in accordance with some implementations of the present disclosure;
FIG. 7 illustrates a block diagram of detailed information according to some implementations of the present disclosure;
FIG. 8 illustrates a block diagram that presents more detailed information based on interactions, in accordance with some implementations of the present disclosure;
fig. 9 illustrates a flow diagram of a method for providing video-related information, in accordance with some implementations of the present disclosure;
FIG. 10 illustrates a block diagram of an apparatus for providing video-related information in accordance with some implementations of the present disclosure; and
fig. 11 illustrates a block diagram of a device capable of implementing various implementations of the present disclosure.
Detailed Description
Implementations of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain implementations of the present disclosure are illustrated in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the implementations set forth herein, but rather are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and implementations of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
In describing implementations of the present disclosure, the terms "include," "including," and the like are to be construed as open-ended, i.e., "including, but not limited to." The term "based on" should be understood as "based at least in part on." The term "one implementation" or "the implementation" should be understood as "at least one implementation." The term "some implementations" should be understood as "at least some implementations." Other explicit and implicit definitions may also appear below. As used herein, the term "model" may represent an associative relationship between various data. For example, such an association may be obtained based on various technical solutions that are currently known and/or will be developed in the future.
It will be appreciated that the data involved in the subject technology, including but not limited to the data itself, the acquisition or use of the data, should comply with the requirements of the corresponding laws and regulations and related regulations.
It is understood that, before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with the relevant laws and regulations, of the type, scope of use, and usage scenarios of the personal information involved, and the user's authorization should be obtained.
For example, in response to receiving an active request from a user, a prompt message is sent to the user to explicitly inform the user that the requested operation would require the acquisition and use of the user's personal information. The user can thus autonomously choose, according to the prompt information, whether to provide personal information to the software or hardware (such as an electronic device, application, server, or storage medium) that performs the operations of the disclosed technical solution.
As an optional but non-limiting implementation, the prompt information sent to the user in response to the active request may be presented, for example, as text in a pop-up window. In addition, the pop-up window may carry a selection control allowing the user to choose "agree" or "disagree" to providing personal information to the electronic device.
It is understood that the above notification and user authorization process is only illustrative and not limiting, and other ways of satisfying relevant laws and regulations may be applied to the implementation of the present disclosure.
Example Environment
Fig. 1 shows a block diagram 100 of a process for providing video-related information according to one aspect. As shown in fig. 1, a user may play a video in an application 110, and each video frame in the video may include one or more objects. For example, the video frame 120 may include an object 130 (e.g., a flower). Assuming the user wishes to know the variety of the flower, the search process 140 is complex and involves multiple applications. For example, at block 142, the user may take a screenshot in the video application (i.e., application 110) to obtain an image including the object 130 to be searched. At block 144, the user may open an image editing application and perform a cropping operation to obtain an image of the region in which the object 130 is located. At block 146, the user may open a search application, upload the image to be searched, and use the search functionality provided by that application to obtain search results for the object 130.
In this approach, the user has to open multiple applications and perform a complicated series of operations in order to perform an image-based search on the video. It is therefore desirable to obtain video-related information in a more convenient and efficient manner.
Summary process for providing video related information
In order to solve the deficiencies in the above technical solutions, according to an exemplary implementation of the present disclosure, a method for providing video related information is provided. An overview of the method is described with reference to fig. 2, which fig. 2 shows a block diagram 200 of a process for providing video-related information, in accordance with some implementations of the present disclosure. As shown in fig. 2, a video frame 120 in a video presented in the application 110 may be determined. One or more objects in the video frame 120 may be identified. Further, a tag 210 associated with a target object (e.g., object 130) of the at least one object may be presented in the application 110. Here, the text of the tag 210 may represent summary information of the target object, and the tag 210 is linked to detailed information 220 of the target object.
As shown in fig. 2, the object 130 may be identified in the video frame 120. The text of the tag 210 of the object 130 may then indicate the summary information of the object 130, i.e., "gerbera". Further, the tag 210 may be linked to detailed information 220 of the object 130. In other words, if a user of the application 110 interacts with the tag 210, the detailed information 220 may be presented in the application 110. It will be understood that the detailed information may include multiple items. For example, an introduction to the gerbera may be presented in a "comprehensive" tab, including, for example, its scientific name, aliases, distribution area, morphological characteristics, and growth habit. As another example, an online shopping link for purchasing gerberas may be presented in a "goods" tab, and so on.
It will be understood that one or more objects may be included in the video frame 120. For example, an object 230 may be identified and the tag 212 of the object 230 presented. The text of the tag 212 may be "breakfast cup". When the user interacts with the tag 212, detailed information associated with the breakfast cup may be presented in the application 110.
With the exemplary implementation of the present disclosure, when a user is interested in a certain object in a video, there is no need to perform a complicated search operation; the user only needs to acquire the video frame 120 to be searched. The application 110 may then automatically identify the objects in the video frame 120 and present their tags in the application 110. The user can acquire basic information about an object from the text in its tag, and obtain detailed information by clicking or long-pressing the tag. In this way, the user does not need to switch among multiple applications; the efficiency of acquiring video-related information is greatly improved, the complexity of the user's manual operations is reduced, and the user's waiting time is shortened.
Detailed procedure for providing video-related information
Having described an overview according to one exemplary implementation of the present disclosure, more details of providing video-related information will be described below. The technical solution according to an exemplary implementation of the present disclosure may be implemented in various applications, for example in a video-based social networking application. In that case, the user may be a user of the social network; when the user browses videos from other users in the application, a video frame including an object of interest may be acquired. The social networking application may then automatically provide the user with a tag for the object. If the user interacts with the tag, detailed information about the object may be presented to the user.
According to one exemplary implementation of the present disclosure, further details regarding providing video-related information will be described, by way of example, in relation to a video-based social networking application. Alternatively and/or additionally, the above-described solution may be implemented in other applications. For example, the above-described technical solution may be implemented in a video playing application. As another example, the above-described solution may be implemented in a video-based shopping application, and so on. With exemplary implementations of the present disclosure, users of a variety of applications may be allowed to obtain more information about a viewed video in a simpler and more efficient manner.
According to one exemplary implementation of the present disclosure, a user may determine video frames in a video based on a variety of ways. For example, a user may perform a screen capture operation on a video in order to capture a video frame that includes an object of interest. For another example, the user may perform a pause operation during the playing of the video, and at this time, the current image displayed by the paused video is the video frame selected by the user. As another example, the user may perform predefined interactive operations with respect to the video. For example, the user may perform a long press, double click, swipe, or press a predefined button while playing the video to select a video frame in the video that includes the object of interest.
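The frame-selection step above can be sketched in a few lines. The following Python snippet is a minimal illustration, not part of the original disclosure: the function name and the 30 fps default are assumptions, and it simply maps a user's pause timestamp to the index of the frame currently on screen.

```python
# Hypothetical sketch: map a pause timestamp to the displayed frame.
# The 30 fps default and the function name are illustrative assumptions.

def frame_index_at(timestamp_s: float, fps: float = 30.0) -> int:
    """Return the index of the video frame shown at `timestamp_s`."""
    if timestamp_s < 0:
        raise ValueError("timestamp must be non-negative")
    return int(timestamp_s * fps)

# Pausing at 2.5 s in a 30 fps video selects frame 75.
print(frame_index_at(2.5))  # → 75
```

A screenshot- or gesture-based selection would resolve to the same frame index once the playback position is known.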
In the case where the video frame 120 has been determined, at least one object may be identified from it. For example, all potential objects in the video frame 120 may be determined based on a variety of image recognition techniques that are currently known and/or will be developed in the future. Alternatively and/or additionally, the user may specify certain object(s) in the video frame 120. For example, the user may select an object in the video frame 120 by clicking or double-clicking, or may select a target area in the video frame 120. In that case, the application 110 will only recognize objects in the target area and will not recognize objects in other areas. With exemplary implementations of the present disclosure, the user may select an object of interest in a more flexible manner. In this way, the flexibility of the interaction is increased, and the time and computational resources that would otherwise be spent processing multiple objects are reduced.
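The target-area restriction described above can be sketched as a simple filter over detector output. The box format `(x, y, w, h)` and the helper name are hypothetical; any object detector could supply the detections.

```python
# Hypothetical sketch: keep only detections whose bounding-box centre
# lies inside the user-selected target area.

def objects_in_region(detections, region):
    """detections: list of (label, (x, y, w, h)); region: (x, y, w, h)."""
    rx, ry, rw, rh = region
    kept = []
    for label, (x, y, w, h) in detections:
        cx, cy = x + w / 2, y + h / 2  # bounding-box centre
        if rx <= cx <= rx + rw and ry <= cy <= ry + rh:
            kept.append(label)
    return kept

detections = [("flower", (10, 10, 20, 20)), ("cup", (100, 100, 20, 20))]
print(objects_in_region(detections, (0, 0, 50, 50)))  # → ['flower']
```

Detections outside the selected region are simply dropped, so no further recognition work is spent on them.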
According to an example implementation of the present disclosure, after determining a target object in a video frame, a tag associated with the target object may be presented in the application 110. Here, the text of the tag represents summary information of the target object, and the tag may be linked to detailed information of the target object. In the context of the present disclosure, the summary information and the detailed information may be determined based on a variety of ways. For example, a set of images similar to an image region in which a target object in a video frame is located may be determined based on an image similarity comparison in a media library associated with the application 110.
Fig. 3 illustrates a block diagram 300 for determining a set of images 330 in a media library 310 that are similar to an object in a video frame, according to some implementations of the present disclosure. As shown in FIG. 3, there may be a media library 310 associated with the application 110, and the media library 310 may include a variety of content. In particular, the media library 310 may include a video library 340, an encyclopedia library 342, and a merchandise library 344 associated with the application. For example, the video library 340 may include a plurality of videos published by users of the application 110, the encyclopedia library 342 may include a plurality of encyclopedia entries, and the merchandise library 344 may include information on a plurality of goods sold in the mall of the application 110. These databases may provide a large number of videos 312, images 314, and other corresponding data (e.g., text, etc., not shown).
At this point, a set of images 330 that are similar to the image region in which the object 130 is located may be determined based on the image similarity comparison 320. The above-described process may be performed based on various image similarity algorithms that are currently known and/or will be developed in the future. For example, features of the respective images may be determined, and the set of images 330 obtained by comparing distances between those features. In this way, a set of images 330 similar to the object 130 may be obtained with greater accuracy. Further, the tag 210 of the object 130 may be determined based on the set of images 330.
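The feature-distance comparison might look like the following sketch. The cosine metric and the toy two-dimensional feature vectors are illustrative assumptions; the disclosure does not fix a particular similarity algorithm, and a real system would use learned embeddings.

```python
import math

# Sketch of the similarity comparison: rank library images by the cosine
# similarity between feature vectors. Tiny 2-D vectors stand in for real
# image embeddings.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def top_k_similar(query, library, k=3):
    """Return the names of the `k` library images closest to `query`."""
    ranked = sorted(library, key=lambda name: cosine(query, library[name]),
                    reverse=True)
    return ranked[:k]

library = {"img_a": [1.0, 0.0], "img_b": [0.0, 1.0], "img_c": [0.7, 0.7]}
print(top_k_similar([1.0, 0.0], library, k=2))  # → ['img_a', 'img_c']
```

The top-k result plays the role of the set of images 330 from which the tag text is then derived.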
Fig. 4 illustrates a block diagram 400 for determining tag-related summary and detailed information, in accordance with some implementations of the present disclosure. As shown in fig. 4, a set of images 330 may be determined from the media library 310 in the manner described above. Further, textual information 410 associated with the set of images 330 in the media library 310 may be obtained. The textual information 410 includes at least any one of: titles, descriptions, labels, comments, and the like associated with the set of images 330 in the media library 310. Assuming the obtained set of images 330 consists of images of "gerbera", it may be determined from the titles of these images that the summary information of the object 130 is "gerbera". In turn, the text "gerbera" may be presented in the tag 210. With an exemplary implementation of the present disclosure, text matching the object 130 may be determined in a media library 310 that includes a large amount of data. In this way, the summary information 420 of the object may be provided in the tag 210 in a more accurate manner.
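One simple, hypothetical way to derive the tag's summary text from the titles of the similar images, as described above, is a frequency count over title tokens. This is only a sketch; a real system would use proper keyword extraction and stop-word handling.

```python
from collections import Counter

# Hypothetical sketch: the most frequent token across the titles of the
# similar images becomes the tag's summary text.

def summary_from_titles(titles):
    tokens = [tok for title in titles for tok in title.lower().split()]
    return Counter(tokens).most_common(1)[0][0]

titles = ["gerbera bouquet", "my gerbera at home", "gerbera care tips"]
print(summary_from_titles(titles))  # → 'gerbera'
```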
According to an exemplary implementation of the present disclosure, the detailed information 220 of the object 130 may be determined based on the textual information 410. For example, the detailed information 220 may include: content associated with "gerbera" in the video library 340, content associated with "gerbera" in the encyclopedia library 342, content associated with "gerbera" in the merchandise library 344, and so on. In this manner, richer and more comprehensive information may be retrieved from the media library 310 and presented to the user in the application 110.
Alternatively and/or additionally, other video-related information may be considered in determining the set of images 330. For example, attribute information associated with the video may be obtained, and the set of images may be determined in the media library 310 based on that attribute information. Fig. 5 illustrates a block diagram 500 for performing a search in the media library 310, in accordance with some implementations of the present disclosure. As shown in fig. 5, a user may set attributes 520 for a video when publishing the video 510. The attributes 520 may include a title 521, a description 522, a tag 523, and the like, to help other users grasp the gist of the video. Further, the user may add text 525, background music, or a voice-over to the video 510. Other users may add comments while viewing the video 510, and so on.
For example, the title of the video may be "Sharing breakfast with gerberas," and a comment may read "The gerberas are beautiful!" The keyword "gerbera" may then be extracted to facilitate searching in the media library 310. Although the above only schematically illustrates determining keywords based on the title and comments and then performing a search in the media library 310, other attributes may be obtained in a similar manner, and a search may be performed based on both the obtained attributes and the similarity comparison. With the exemplary implementation of the present disclosure, the degree to which the determined set of images 330 matches the object 130 may be improved, making the summary information and detailed information determined from the set of images 330 more accurate.
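Combining the attribute keywords with the similarity comparison could look like the following sketch. The scoring weight, data shapes, and function name are assumptions for illustration only: candidates whose titles share a keyword with the video's title or comments are boosted above purely visual matches.

```python
# Hypothetical sketch: rerank similarity-search candidates using keywords
# extracted from the video's title/comments. The 0.1 bonus is arbitrary.

def rerank(candidates, keywords):
    """candidates: list of (title, similarity) pairs; returns titles, best first."""
    def score(item):
        title, sim = item
        bonus = 0.1 * sum(kw in title.lower() for kw in keywords)
        return sim + bonus
    return [title for title, _ in sorted(candidates, key=score, reverse=True)]

candidates = [("red rose close-up", 0.90), ("gerbera daisy macro", 0.85)]
print(rerank(candidates, {"gerbera"}))
# → ['gerbera daisy macro', 'red rose close-up']
```

With the keyword boost, the gerbera image outranks the visually similar rose, which is the matching-degree improvement the paragraph above describes.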
According to an exemplary implementation of the present disclosure, the tag 210 is interactive; if an interactive operation on the tag 210 by a user of the application 110 is detected, the detailed information 220 may be provided in the application 110. Fig. 6 illustrates a block diagram 600 for presenting the detailed information 220, according to some implementations of the present disclosure. As shown in FIG. 6, a first interaction 610 between the user and the tag 210 may be detected in the application 110; the first interaction 610 may include, for example and without limitation, a single click, a double click, a swipe, and the like. It will be appreciated that the tag 210 is linked to a page providing the detailed information 220, so that when the first interaction 610 is detected, the detailed information 220 about the object 130 may be presented in the application 110. In this way, the user can obtain more information about the object of interest 130 through a simple interaction.
According to an example implementation of the present disclosure, the detailed information 220 may include pages of a plurality of information items associated with the object 130. The pages may include a variety of content; for example, for the gerbera, a "comprehensive" page and a "goods" page may be provided. In particular, the "comprehensive" page may provide knowledge data about the object 130, such as its scientific name, aliases, distribution area, morphological characteristics, growth habit, and the like. The "goods" page may provide merchandise data for the object 130 associated with purchasing goods. With exemplary implementations of the present disclosure, the "comprehensive" page can provide knowledge about an object, thereby meeting the user's basic need to know what the object is. Further, the "goods" page provides a channel for the user to purchase goods related to the object, meeting the user's purchasing needs.
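The tag-to-detail linkage described above can be modelled with a minimal data structure (the class and field names are hypothetical, and the shop URL is a placeholder): the tag carries the summary text shown on the label, and maps each detail page to its content.

```python
from dataclasses import dataclass, field

# Minimal, hypothetical model of a tag: `text` is the summary information
# shown on the label, and `detail` maps each page name to its content.

@dataclass
class Tag:
    text: str                                    # summary shown on the label
    detail: dict = field(default_factory=dict)   # page name -> page content

tag = Tag(
    text="gerbera",
    detail={
        "comprehensive": "Scientific name, aliases, distribution area, ...",
        "goods": ["https://example.com/shop/gerbera"],  # placeholder link
    },
)
print(tag.text)            # → gerbera
print(sorted(tag.detail))  # → ['comprehensive', 'goods']
```

An interaction with the tag would then open the default page, e.g. `tag.detail["comprehensive"]`.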
The user may interact with the respective pages to display the corresponding data. As shown in FIG. 6, if a second interaction 620 between the user and the "comprehensive" page is detected, the detailed information shown in FIG. 6 may be presented. According to one exemplary implementation of the present disclosure, a default page of the detailed information may be provided; for example, either the "comprehensive" page or the "goods" page may be set as the default. Alternatively and/or additionally, the data in the two pages may be combined to provide a default page. With the exemplary implementation of the present disclosure, the user is allowed to interact with the detailed information 220, thereby obtaining more information of interest.
According to an exemplary implementation of the present disclosure, the user may click on "goods," at which point a "goods" page will be presented in the application 110. Fig. 7 illustrates a block diagram 700 of detailed information according to some implementations of the present disclosure. As shown in FIG. 7, the detailed information 710 may display a merchandise link for purchasing gerberas. The user may perform a further interaction 720 with the merchandise link, e.g., clicking it, at which point a shopping page for purchasing gerberas is presented in the application 110.
FIG. 8 illustrates a block diagram 800 of presenting more detailed information based on interactions, according to some implementations of the present disclosure. As shown in fig. 8, the detailed information 810 includes a product name, a price, and the like, and the user can go to the payment page by pressing the "purchase" button. The user is thus allowed to complete a purchase inside the single application 110. In a prior-art solution, by contrast, the user needs to follow a long operation path: screenshot in the application 110, cropping in an image processing application, image search in a shopping application, and order placement. The user has to switch among multiple applications, which is prone to operational errors and may result in failure to find the desired product. Compared with the prior art, the exemplary implementations of the present disclosure greatly reduce the complexity of the user's manual operations and the possibility of errors in a cumbersome process. Furthermore, the speed and accuracy with which the user obtains information can be greatly improved.
The foregoing has described the process of providing information related to the object 130 according to one exemplary implementation of the present disclosure. Alternatively and/or additionally, multiple objects may be identified from the video frame 120; returning to FIG. 2, the processing of further objects is described. For example, the object 230 may be identified from the video frame 120, and the media library 310 may be searched for a set of images corresponding to the object 230 based on an image similarity algorithm. Based on the title, description, label, comment, and other information associated with the found set of images, the text in the tag 212 of the object 230 may be determined to be "breakfast cup". Further, the tag 212 may be linked to a page providing detailed information of the object 230. In this manner, when the user clicks on the tag 212, detailed information on the "breakfast cup" may be presented in the application 110, e.g., a specific way to purchase it.
It will be appreciated that although the process of providing video-related information has been described above with a video-based social networking application as an example, the process may be performed in other applications. For example, in a video playback application, a user may watch a movie and select a video frame from it to obtain information about the objects in that frame. Assuming the selected video frame includes a person standing beside a vehicle, the identified objects may include both the vehicle and the person. The tag of the vehicle may represent summary information such as its make and model, and the tag of the person may represent the character's role in the movie, and so on. When the user clicks on the vehicle's tag, more detailed information about the vehicle may be provided, such as brand, model, origin, length, width, and height. When the user clicks on the person's tag, more detailed information about the character and the actor may be provided.
With the exemplary implementation of the present disclosure, the user neither has to switch between multiple applications nor perform a cumbersome series of operations. In the context of the present disclosure, the user may obtain information associated with an object of interest in a video within a single application. In other words, the user need only determine the video frame that includes the object of interest (e.g., by taking a screenshot during playback); the information-providing process described above may then be triggered automatically, and tags containing the summary information of the objects of interest are displayed at the corresponding locations in the video frame. Further, the user may obtain detailed information about an object of interest by clicking on its tag.
Example procedure
Fig. 9 illustrates a flow diagram of a method 900 for providing video-related information in accordance with some implementations of the present disclosure. At block 910, a video frame in a video playing in an application is determined. At block 920, at least one object in a video frame is identified. At block 930, a tag associated with a target object of the at least one object is provided in the application, the text of the tag representing summary information of the target object and the tag being linked to detailed information of the target object.
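The three blocks of method 900 can be sketched in Python as follows. This is an illustrative outline only; the `Tag` type and the `detect`/`summarize`/`detail_url` callables are hypothetical names, not part of the disclosure:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Tag:
    text: str        # summary information shown on the video frame
    link: str        # link to a page with detailed information
    position: tuple  # anchor position of the tag within the frame

def provide_video_information(frame, detect: Callable, summarize: Callable,
                              detail_url: Callable):
    """Blocks 910-930: the frame is given (910), objects are identified in it
    (920), and a tag with summary text and a detail link is provided for each
    target object (930)."""
    tags = []
    for obj in detect(frame):                 # block 920
        tags.append(Tag(text=summarize(obj),  # block 930
                        link=detail_url(obj),
                        position=tuple(obj["bbox"][:2])))
    return tags
```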
According to an exemplary implementation of the present disclosure, providing the tag includes: determining, in a media library associated with the application, a set of images that are similar to an image region in which the target object in the video frame is located based on an image similarity comparison; and providing the tag based on the set of images.
According to an exemplary implementation of the present disclosure, determining the set of images comprises: obtaining attribute information associated with the video, the attribute information including at least any one of: a title, description, label, and comment of the video, text in the video, and voice-over in the video; and retrieving the set of images in the media library based on the attribute information.
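One way to use the attribute information is as a coarse pre-filter before the more expensive image comparison. The following is a hypothetical sketch, where each library entry is assumed to carry a `keywords` list (a name introduced here for illustration):

```python
def retrieve_candidates(library, attributes):
    """Narrow the media library using the video's attribute information
    (title, description, labels, comments, text, voice-over) before
    running the image similarity comparison on the survivors."""
    terms = set()
    for value in attributes.values():
        terms.update(str(value).lower().split())
    # Keep entries sharing at least one keyword with the video's attributes.
    return [entry for entry in library
            if terms & {k.lower() for k in entry.get("keywords", [])}]
```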
According to an exemplary implementation of the present disclosure, providing the tag based on the set of images includes: obtaining textual information associated with the set of images, the textual information including at least any one of: titles, descriptions, tags, and comments in the media library associated with the set of images; and determining the summary information and the detailed information corresponding to the tag based on the textual information.
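One plausible way to turn the textual metadata of the similar images into a tag is sketched below; the policy (most frequent tag string becomes the summary, remaining text is bundled as detail) is an assumption for illustration, not the method of the disclosure:

```python
from collections import Counter

def build_tag_info(image_infos):
    """Derive the tag's summary text and detailed information from the
    titles, tags and comments of a set of similar library images."""
    counts = Counter(tag for info in image_infos for tag in info.get("tags", []))
    summary = counts.most_common(1)[0][0] if counts else ""
    detail = {
        "titles": [info.get("title", "") for info in image_infos],
        "comments": [c for info in image_infos for c in info.get("comments", [])],
    }
    return summary, detail
```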
According to an exemplary implementation of the present disclosure, the tag is interactable, and the method further comprises: providing the detailed information in the application in response to detecting a first interactive operation of a user of the application with respect to the tag.
According to an exemplary implementation of the present disclosure, providing detailed information in an application includes: providing a page of information items associated with the target object; and presenting the detailed information in the application in response to a second interactive operation of the user on the page.
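The two-step interaction (the first operation opens a page of information items, the second presents an item's detail) could be modeled as below; the class and method names are illustrative assumptions only:

```python
class TagInteraction:
    """Models the tag's interactions: the first operation (clicking the tag)
    opens a page of information items; the second operation (selecting an
    item on that page) presents the item's detailed information."""

    def __init__(self, items):
        self.items = items      # item name -> detailed information
        self.page_open = False

    def on_tag_click(self):
        """First interactive operation: open the information-item page."""
        self.page_open = True
        return list(self.items)

    def on_item_select(self, name):
        """Second interactive operation: present the selected item's detail."""
        if not self.page_open:
            raise RuntimeError("the information-item page is not open")
        return self.items[name]
```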
According to an exemplary implementation of the disclosure, the detailed information includes at least any one of: base data associated with the target object; commodity data associated with the target object.
According to an exemplary implementation of the present disclosure, identifying the at least one object includes at least any one of: determining the at least one object in the video frame based on image recognition; identifying the at least one object in a target region in the video frame in response to a third interactive operation of a user of the application with respect to the target region.
According to an exemplary implementation of the present disclosure, determining the video frame includes determining the video frame in response to at least any one of: a screen capture operation of a user of the application for the video; a pause operation of a user of the application for the video; a predefined operation of a user of the application for the video.
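The frame-determination triggers can be expressed as a small dispatch; `long_press` stands in for an arbitrary predefined operation and is an assumption of this sketch:

```python
TRIGGER_OPERATIONS = {"screenshot", "pause", "long_press"}

def maybe_capture_frame(operation, current_frame):
    """Return the frame to analyze when the user operation is one of the
    triggering operations; otherwise return None and playback continues."""
    return current_frame if operation in TRIGGER_OPERATIONS else None
```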
According to an exemplary implementation of the present disclosure, the media library comprises at least any one of: a video library, an encyclopedia library, and a commodity library associated with the application.
Example apparatus and devices
Fig. 10 illustrates a block diagram of an apparatus 1000 for providing video-related information according to some implementations of the present disclosure. The apparatus 1000 comprises: a determining module 1010 configured to determine a video frame in a video played in an application; an identification module 1020 configured to identify at least one object in the video frame; and a providing module 1030 configured to provide, in the application, a tag associated with a target object of the at least one object, the text of the tag representing summary information of the target object, and the tag being linked to detailed information of the target object.
According to an exemplary implementation of the present disclosure, the providing module 1030 includes: an image determination module configured to determine, in a media library associated with an application, a set of images that are similar to an image region in which a target object in a video frame is located based on an image similarity comparison; and a label providing module configured to provide a label based on the set of images.
According to an exemplary implementation of the present disclosure, the image determination module includes: an attribute information acquisition module configured to acquire attribute information associated with the video, the attribute information including at least any one of: a title, description, label, and comment of the video, text in the video, and voice-over in the video; and an image acquisition module configured to acquire the set of images in the media library based on the attribute information.
According to one exemplary implementation of the present disclosure, a tag providing module includes: a text information acquisition module configured to acquire text information associated with a set of images, the text information including at least any one of: titles, descriptions, tags, comments in the media library associated with a set of images; and an information determination module configured to determine summary information and detailed information corresponding to the tag based on the text information.
According to one exemplary implementation of the present disclosure, the tag is interactable, and the apparatus further comprises: a detailed information providing module configured to provide detailed information in the application in response to detecting a first interactive operation for the tag by a user of the application.
According to an exemplary implementation of the present disclosure, the detailed information providing module includes: a page providing module configured to provide a page of information items associated with the target object; and a presentation module configured to present the detailed information in the application in response to a second interactive operation of the user with respect to the page.
According to an exemplary implementation of the disclosure, the detailed information includes at least any one of: base data associated with the target object; merchandise data associated with the target object.
According to an exemplary implementation of the disclosure, the identification module comprises at least any one of: an object recognition module configured to determine at least one object in a video frame based on image recognition; an interaction-based object identification module configured to identify at least one object in a target region in the video frame in response to a third interaction operation by a user of the application with respect to the target region.
According to an exemplary implementation of the present disclosure, the determining module is further configured to determine the video frame in response to at least any one of: a screen capture operation of a user of the application for the video; a pause operation of a user of the application for the video; a predefined operation of a user of the application for the video.
According to an exemplary implementation of the present disclosure, the media library comprises at least any one of: a video library, an encyclopedia library, and a commodity library associated with the application.
Fig. 11 illustrates a block diagram of a device 1100 capable of implementing multiple implementations of the present disclosure. It should be understood that the computing device 1100 illustrated in FIG. 11 is merely exemplary and should not constitute any limitation as to the functionality or scope of the implementations described herein. The computing device 1100 shown in fig. 11 may be used to implement the methods described above.
As shown in fig. 11, computing device 1100 is in the form of a general purpose computing device. Components of computing device 1100 may include, but are not limited to, one or more processors or processing units 1111, memory 1120, storage device 1130, one or more communication units 1140, one or more input devices 1150, and one or more output devices 1160. The processing unit 1111 may be a real or virtual processor and can perform various processes according to programs stored in the memory 1120. In a multi-processor system, multiple processing units execute computer-executable instructions in parallel to increase the parallel processing capability of the computing device 1100.
Computing device 1100 typically includes a number of computer storage media. Such media may be any available media that is accessible by computing device 1100, including but not limited to volatile and non-volatile media, and removable and non-removable media. The memory 1120 may be volatile memory (e.g., registers, cache, Random Access Memory (RAM)), non-volatile memory (e.g., Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory), or some combination thereof. Storage device 1130 may be a removable or non-removable medium and may include a machine-readable medium, such as a flash drive, a magnetic disk, or any other medium that can be used to store information and/or data (e.g., training data for training) and that can be accessed within computing device 1100.
The computing device 1100 may further include additional removable/non-removable, volatile/nonvolatile storage media. Although not shown in FIG. 11, a magnetic disk drive for reading from and writing to a removable, non-volatile magnetic disk (e.g., a "floppy disk") and an optical disk drive for reading from or writing to a removable, non-volatile optical disk may be provided. In these cases, each drive may be connected to a bus (not shown) by one or more data media interfaces. The memory 1120 may include a computer program product 1125 having one or more program modules configured to perform the various methods or acts of the various implementations of the disclosure.
The communication unit 1140 enables communication with other computing devices over a communication medium. Additionally, the functionality of the components of computing device 1100 may be implemented in a single computing cluster or multiple computing machines, which are capable of communicating over a communications connection. Thus, the computing device 1100 may operate in a networked environment using logical connections to one or more other servers, network Personal Computers (PCs), or another network node.
The input device 1150 may be one or more input devices such as a mouse, keyboard, trackball, or the like. The output device 1160 may be one or more output devices such as a display, speakers, printer, or the like. Via the communication unit 1140, the computing device 1100 may also, as desired, communicate with one or more external devices (not shown) such as storage devices and display devices, with one or more devices that enable a user to interact with the computing device 1100, or with any device (e.g., a network card, a modem, etc.) that enables the computing device 1100 to communicate with one or more other computing devices. Such communication may be performed via an input/output (I/O) interface (not shown).
According to an exemplary implementation of the present disclosure, a computer-readable storage medium is provided, on which computer-executable instructions are stored, wherein the computer-executable instructions are executed by a processor to implement the method described above. According to an exemplary implementation of the present disclosure, a computer program product is also provided, which is tangibly stored on a non-transitory computer-readable medium and comprises computer-executable instructions that are executed by a processor to implement the method described above. According to an exemplary implementation of the present disclosure, a computer program is provided which, when executed by a processor, performs the method described above.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus, devices and computer program products implemented in accordance with the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processing unit of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processing unit of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various implementations of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing has described implementations of the present disclosure, and the above description is illustrative, not exhaustive, and not limited to the implementations disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described implementations. The terminology used herein was chosen in order to best explain the principles of the implementations, the practical application, or improvements to the technology in the marketplace, or to enable others of ordinary skill in the art to understand the various implementations disclosed herein.

Claims (13)

1. A method for providing video-related information, comprising:
determining a video frame in a video played in an application;
identifying at least one object in the video frame; and
providing a tag associated with a target object of the at least one object in the application, the text of the tag representing summary information of the target object and the tag being linked to detailed information of the target object.
2. The method of claim 1, wherein providing the tag comprises:
determining, in a media library associated with the application, a set of images that are similar to an image region in the video frame where the target object is located based on an image similarity comparison; and
providing the label based on the set of images.
3. The method of claim 2, wherein determining the set of images comprises:
obtaining attribute information associated with the video, the attribute information including at least any one of: title, description, label, comment of the video, text in the video, and voice-over in the video; and
retrieving the set of images in the media library based on the attribute information.
4. The method of claim 2, wherein providing the label based on the set of images comprises:
obtaining textual information associated with the set of images, the textual information including at least any one of: titles, descriptions, labels, comments in the media library associated with the set of images; and
determining the summary information and the detailed information corresponding to the tag based on the text information.
5. The method of claim 1, wherein the tag is interactable, and the method further comprises: providing the detailed information in the application in response to detecting a first interactive operation of a user of the application with respect to the tag.
6. The method of claim 5, wherein providing the detailed information in the application comprises:
providing a page of information items associated with the target object; and
presenting the detailed information in the application in response to a second interactive operation of the user with respect to the page.
7. The method of claim 6, wherein the detailed information comprises at least any one of:
base data associated with the target object;
merchandise data associated with the target object.
8. The method of claim 1, wherein identifying the at least one object comprises at least any one of:
determining the at least one object in the video frame based on image recognition;
identifying the at least one object in a target region in the video frame in response to a third interaction operation by the user of the application with respect to the target region.
9. The method of claim 1, wherein determining the video frame comprises determining the video frame in response to at least any one of:
a screen capture operation of a user of the application for the video;
a pause operation of a user of the application for the video;
a predefined operation of a user of the application for the video.
10. The method of claim 1, wherein the media library comprises at least any one of: a video library, an encyclopedia library, and a commodity library associated with the application.
11. An apparatus for providing video-related information, comprising:
a determination module configured to determine video frames in a video played in an application;
an identification module configured to identify at least one object in the video frame; and
a providing module configured to provide, in the application, a tag associated with a target object of the at least one object, a text of the tag representing summary information of the target object, and the tag being linked to detailed information of the target object.
12. An electronic device, comprising:
at least one processing unit; and
at least one memory coupled to the at least one processing unit and storing instructions for execution by the at least one processing unit, the instructions when executed by the at least one processing unit causing the electronic device to perform the method of any of claims 1-10.
13. A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, causes the processor to carry out the method according to any one of claims 1 to 10.
CN202211652440.6A 2022-12-21 2022-12-21 Method, apparatus, device and medium for providing video related information Pending CN115955596A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211652440.6A CN115955596A (en) 2022-12-21 2022-12-21 Method, apparatus, device and medium for providing video related information


Publications (1)

Publication Number Publication Date
CN115955596A true CN115955596A (en) 2023-04-11

Family

ID=87281800




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination