WO2022028177A1 - Information push and video processing methods and devices - Google Patents

Information push and video processing methods and devices

Info

Publication number
WO2022028177A1
WO2022028177A1 (PCT/CN2021/104450)
Authority
WO
WIPO (PCT)
Prior art keywords
item
appearing
information
video frame
video stream
Prior art date
Application number
PCT/CN2021/104450
Other languages
English (en)
French (fr)
Inventor
崔英林
Original Assignee
上海连尚网络科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海连尚网络科技有限公司
Publication of WO2022028177A1 publication Critical patent/WO2022028177A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/10Terrestrial scenes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/74Browsing; Visualisation therefor
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the embodiments of the present application relate to the field of computer technologies, and in particular, to methods and devices for information push and video processing.
  • video applications support more and more diverse functions, such as live broadcast functions, on-demand functions, and so on.
  • more and more users are attracted to watching videos, and viewing time keeps growing.
  • Various items, such as clothes, decorations, and food, often appear in videos. If users are interested in an item, they must first send the video application to the background, then open a search or shopping application and enter the item's name to search for its detailed information.
  • the embodiments of the present application propose methods and devices for information push and video processing.
  • an embodiment of the present application provides an information push method, including: performing code-stream conversion on video data to obtain the video stream and identification information of items appearing in the video stream; playing the video stream on a playback device; in response to determining that an item of interest to the user is present in the current video frame of the video stream, determining the identification information of the item of interest; and, based on the identification information of the item of interest, querying push information for the item of interest and presenting the push information.
  • determining that an item of interest to the user is present in the current video frame of the video stream includes: collecting the user's voice information; recognizing the voice information and determining the item name it contains; and, if the item name matches an item appearing in the current video frame, determining the matched item as the item of interest.
  • determining that an item of interest to the user is present in the current video frame of the video stream includes: setting a trigger area in each video frame of the video stream in which an item appears; and, in response to detecting that the user confirms a trigger area of the current video frame, determining the item corresponding to the confirmed trigger area as the item of interest.
  • the identification information includes coordinate information; and setting a trigger area in each video frame of the video stream in which an item appears includes: setting the area corresponding to the coordinate information as the trigger area.
  • the coordinate information is percentage coordinates
  • and setting the area corresponding to the coordinate information as the trigger area includes: calculating the lattice coordinates (i.e., pixel coordinates) of the item appearing in the current video frame based on the resolution of the playback device and the item's percentage coordinates; and setting the area corresponding to the lattice coordinates as the trigger area.
  • calculating the lattice coordinates of the item appearing in the current video frame includes: if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiplying the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the item's percentage coordinates to obtain the item's lattice coordinates.
  • calculating the lattice coordinates of the item appearing in the current video frame further includes: if the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, converting the percentage coordinates into the screen coordinate system to obtain converted percentage coordinates; and multiplying the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the converted percentage coordinates to obtain the item's lattice coordinates.
  • detecting that the user confirms a trigger area of the current video frame includes: if the user touches a trigger area of the current video frame, determining that the user confirms that trigger area.
  • detecting that the user confirms a trigger area of the current video frame includes: capturing the focus of the user's eyes; and, in response to determining that the focus falls on a trigger area of the current video frame, determining that the user confirms that trigger area.
  • capturing the focus of the user's eyes includes: using a camera of the playback device to emit a light beam toward the eyes; using photosensitive material on the screen of the playback device to sense the intensity of the beam reflected from the eyes; and determining a dark spot on the screen based on the beam intensity as the focus.
  • an embodiment of the present application provides a video processing method, including: performing item recognition on a video stream to determine the items appearing in the video stream; acquiring identification information for the appearing items; and adding the identification information of the appearing items to the corresponding video frame protocol to generate video data.
  • acquiring the identification information of an appearing item includes: performing position recognition on the video stream to determine the item's coordinate information; and adding the coordinate information to the item's identification information.
  • performing position recognition on the video stream to determine the coordinate information of an appearing item includes: simulating a trial playback of the video stream on a pilot (trial-playback) device; performing position recognition on the video stream to obtain the item's lattice coordinates; and determining the item's coordinate information based on the lattice coordinates.
  • determining the coordinate information of an appearing item based on its lattice coordinates includes: dividing the horizontal and vertical values of the item's lattice coordinates by the corresponding horizontal and vertical pixel values of the pilot device's resolution to obtain the item's percentage coordinates.
  • for an item appearing in consecutive video frames, the identification information added to the video frame protocol of the frame in which the item first appears includes the item name, coordinate information, brief information, and/or a web link.
  • the identification information added to the video frame protocol of frames in which the item appears again includes only the item name and coordinate information.
  • adding the identification information of an appearing item to the corresponding video frame protocol includes: extending the network abstraction layer information of the corresponding video frame protocol based on the item's identification information.
  • an embodiment of the present application provides an information push device, including: a conversion unit configured to perform code-stream conversion on video data to obtain the video stream and identification information of items appearing in the video stream; a playback unit configured to play the video stream on a playback device; a determination unit configured to determine, in response to determining that an item of interest to the user is present in the current video frame of the video stream, the identification information of the item of interest; and a presentation unit configured to query push information for the item of interest based on its identification information and to present the push information.
  • the determination unit is further configured to: collect the user's voice information; recognize the voice information and determine the item name it contains; and, if the item name matches an item appearing in the current video frame, determine the matched item as the item of interest.
  • the determination unit includes: a setting subunit configured to set a trigger area in each video frame of the video stream in which an item appears; and a determination subunit configured to determine, in response to detecting that the user confirms a trigger area of the current video frame, the item corresponding to the confirmed trigger area as the item of interest.
  • the identification information includes coordinate information; and the setting subunit includes: a setting module configured to set the area corresponding to the coordinate information as the trigger area.
  • the coordinate information is percentage coordinates
  • and the setting module includes: a calculation submodule configured to calculate the lattice coordinates of the item appearing in the current video frame based on the resolution of the playback device and the item's percentage coordinates; and a setting submodule configured to set the area corresponding to the lattice coordinates as the trigger area.
  • the calculation submodule is further configured to: if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiply the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the item's percentage coordinates to obtain the lattice coordinates of the item appearing in the current video frame.
  • the calculation submodule is further configured to: if the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, convert the percentage coordinates into the screen coordinate system to obtain converted percentage coordinates; and multiply the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the converted percentage coordinates to obtain the lattice coordinates of the item appearing in the current video frame.
  • the determination subunit is further configured to: if the user touches a trigger area of the current video frame, determine that the user confirms that trigger area.
  • the determination subunit includes: a capture module configured to capture the focus of the user's eyes; and a determination module configured to determine, in response to determining that the focus falls on a trigger area of the current video frame, that the user confirms that trigger area.
  • the capture module is further configured to: use the camera of the playback device to emit a light beam toward the eyes; use photosensitive material on the screen of the playback device to sense the intensity of the beam reflected from the eyes; and determine a dark spot on the screen based on the beam intensity as the focus.
  • an embodiment of the present application provides a video processing device, including: a determination unit configured to perform item recognition on a video stream and determine the items appearing in the video stream; an acquisition unit configured to acquire the identification information of the appearing items; and an adding unit configured to add the identification information of the appearing items to the corresponding video frame protocol to generate video data.
  • the acquisition unit includes: a determination subunit configured to perform position recognition on the video stream and determine the coordinate information of the appearing items; and an adding subunit configured to add the coordinate information of the appearing items to their identification information.
  • the determination subunit includes: a pilot module configured to simulate a trial playback of the video stream on a pilot device; a recognition module configured to perform position recognition on the video stream to obtain the lattice coordinates of the appearing items; and a determination module configured to determine the coordinate information of the appearing items based on their lattice coordinates.
  • the determination module is further configured to: divide the horizontal and vertical values of an appearing item's lattice coordinates by the corresponding horizontal and vertical pixel values of the pilot device's resolution to obtain the item's percentage coordinates.
  • for an item appearing in consecutive video frames, the identification information added to the video frame protocol of the frame in which the item first appears includes the item name, coordinate information, brief information, and/or a web link.
  • the identification information added to the video frame protocol of frames in which the item appears again includes only the item name and coordinate information.
  • the adding unit is further configured to: extend the network abstraction layer information of the corresponding video frame protocol based on the identification information of the appearing items.
  • an embodiment of the present application provides a computer device, including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect or the method described in any implementation of the second aspect.
  • an embodiment of the present application provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, it implements the method described in any implementation of the first aspect or the method described in any implementation of the second aspect.
  • code-stream conversion is first performed on the video data to obtain the video stream and the identification information of the items appearing in the video stream; the video stream is then played on the playback device; next, in response to determining that an item of interest to the user is present in the current video frame of the video stream, the identification information of the item of interest is determined; finally, based on the identification information of the item of interest, the push information of the item of interest is queried and presented.
  • FIG. 1 is an exemplary system architecture to which the present application may be applied;
  • FIG. 2 is a flowchart of an embodiment of an information push method according to the present application.
  • FIG. 3 is a flowchart of another embodiment of the information push method according to the present application.
  • FIG. 4 is a flowchart of yet another embodiment of the information push method according to the present application.
  • FIG. 5 is a flowchart of an embodiment of a video processing method according to the present application.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the computer device of the embodiment of the present application.
  • FIG. 1 shows an exemplary system architecture 100 to which embodiments of the information push and video processing methods of the present application may be applied.
  • the system architecture 100 may include devices 101 , 102 and a network 103 .
  • the network 103 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the devices 101, 102 may be hardware devices or software that support network connections to provide various network services.
  • the device can be a variety of electronic devices including, but not limited to, smart phones, tablet computers, laptop computers, desktop computers, and servers, among others.
  • as a hardware device, it can be implemented as a distributed device group composed of multiple devices, or as a single device.
  • if the device is software, it can be installed in the electronic devices listed above.
  • as software, it may be implemented as multiple pieces of software or software modules for providing distributed services, or as a single piece of software or software module. No specific limitation is made here.
  • the device can provide corresponding network services by installing a corresponding client application or server application.
  • after a client application is installed on the device, the device can act as a client in network communication.
  • after a server application is installed, the device can act as a server in network communication.
  • device 101 is embodied as a client, and device 102 is embodied as a server.
  • the device 101 may be a client of a video application, and the device 102 may be a server of the video application.
  • the information pushing method and the video processing method provided by the embodiments of the present application may be executed by the device 101 .
  • when the device 101 executes the information push method, it may be a playback device.
  • when the device 102 executes the video processing method, it may be a pilot device.
  • FIG. 2 shows a process 200 of an embodiment of the information push method according to the present application.
  • the information push method includes the following steps:
  • Step 201: Perform code-stream conversion on the video data to obtain the video stream and the identification information of the items appearing in the video stream.
  • the execution body of the information push method (for example, the device 101 shown in FIG. 1) may acquire video data from the back-end server of the video application (for example, the device 102 shown in FIG. 1), and perform code-stream conversion on the video data to obtain the video stream and the identification information of the items appearing in the video stream.
  • the video data may include the video stream and the identification information of the items appearing in the video stream.
  • Video streams are playable data, including but not limited to TV series, movies, live broadcasts, short videos, and so on.
  • the identification information of the item appearing in the video stream is unplayable data, which is used to identify the item appearing in the video stream, including but not limited to the item name, coordinate information, brief information, and web page link.
  • Appearing items may be items that appear in the video stream, such as clothing, decorations, food, and the like.
  • the code stream conversion may adopt a static transcoding method or a dynamic transcoding method.
  • taking H.264 as an example, the NAL (Network Abstraction Layer) information of the video frame protocol is extended to support adding the identification information; the NAL may include an NAL Header, an NAL Extension, and an NAL payload.
  • NAL Header can be used to store basic information of video frames.
  • the NAL payload can be used to store a binary stream of video frames.
  • NAL Extension can be used to store identification information. It should be noted that since the video frame itself is a highly compressed data body, the NAL Extension also needs to have high compression.
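  • as an illustrative sketch of this idea (not taken from the patent; the field layout, JSON serialization, and use of zlib are assumptions), the identification information could be serialized compactly and compressed before being placed in an extension unit:

```python
import json
import zlib

def pack_nal_extension(items):
    """Serialize item identification info compactly and compress it,
    since the NAL Extension must stay small next to compressed video."""
    payload = json.dumps(items, separators=(",", ":"), ensure_ascii=False)
    return zlib.compress(payload.encode("utf-8"), 9)

def unpack_nal_extension(blob):
    """Inverse of pack_nal_extension: decompress and parse the items."""
    return json.loads(zlib.decompress(blob).decode("utf-8"))

# Detailed-frame info for a first appearance: name, percentage
# coordinates, brief info, and a web link.
ext = pack_nal_extension([
    {"name": "watch", "coord": [0.42, 0.31], "brief": "brand-A watch",
     "link": "https://example.com/watch"},
])
assert unpack_nal_extension(ext)[0]["name"] == "watch"
```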
  • the same item can appear in multiple consecutive video frames.
  • for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears may be detailed information, including the item name, coordinate information, brief information, and/or a web link; the corresponding video frame is called a detailed frame;
  • the identification information added to the video frame protocol of frames in which the item appears again may be abbreviated information, including only the item name and coordinate information; the corresponding video frames are called abbreviated frames. This saves space.
  • while the video stream is played, the detailed information can be decoded and cached when a detailed frame is played; when an abbreviated frame is played later, if an item of interest to the user is detected in the current video frame,
  • the cached detailed information of the item of interest can then be retrieved by querying the cache with the item's abbreviated information, as sketched below.
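  • a minimal sketch of that cache behavior (the data structure is hypothetical; the patent does not prescribe one):

```python
class ItemInfoCache:
    """Cache detailed info from detailed frames; resolve abbreviated
    frames (name + coordinates only) against the cache later."""
    def __init__(self):
        self._by_name = {}

    def on_detailed_frame(self, items):
        for item in items:  # each item has name, coord, brief, link
            self._by_name[item["name"]] = item

    def on_abbreviated_frame(self, items):
        # An abbreviated entry carries only name and coord; pull the
        # rest of the detail cached from the item's first appearance.
        return [{**self._by_name.get(item["name"], {}), **item}
                for item in items]

cache = ItemInfoCache()
cache.on_detailed_frame([{"name": "watch", "coord": [0.42, 0.31],
                          "brief": "brand-A watch",
                          "link": "https://example.com/watch"}])
resolved = cache.on_abbreviated_frame([{"name": "watch",
                                        "coord": [0.40, 0.33]}])
assert resolved[0]["link"] == "https://example.com/watch"
```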
  • Step 202: Play the video stream on the playback device.
  • the above-mentioned execution body may play the video stream on the playback device.
  • the above-mentioned execution body may be a playback device on which a player is installed for playing the video stream.
  • the playback device usually plays the video stream while converting the code stream. Therefore, during the playback of the video stream, the identification information of the items appearing in the video stream can be successively obtained.
  • Step 203: In response to determining that an item of interest to the user is present in the current video frame of the video stream, determine the identification information of the item of interest.
  • the above execution body may determine whether an item of interest to the user is present in the current video frame of the video stream. If so, it determines the identification information of the item of interest; if not, it continues to play the video stream.
  • the user's item of interest may be determined by the above execution body based on the user's reaction while watching the video stream.
  • typically, users react in a distinctive way when they see an item they are interested in; for example, when an item of interest appears in the video stream, the user may say its name.
  • the above execution body may then collect the user's voice information, recognize it, and determine the item name it contains. If the item name matches an item appearing in the current video frame, the matched item is determined as the item of interest; if it does not match, the execution body continues to collect the user's voice information.
  • the current video frame is the video frame currently being played. Multiple items may appear in the same frame, and only the item matching the item name contained in the user's voice information is the user's item of interest. For example, if the user says "watch" and the items appearing in the current video frame include a brand-A watch, brand-B clothes, and brand-C shoes, only the brand-A watch matches "watch" and is the item of interest.
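  • a minimal sketch of this matching step (it assumes a speech recognizer has already returned the user's words as text; the function and field names are hypothetical):

```python
def find_item_of_interest(recognized_text, frame_items):
    """Match an item name heard in the user's speech against the items
    appearing in the current frame; return the match or None."""
    text = recognized_text.lower()
    for item in frame_items:
        if item["name"].lower() in text:
            return item
    return None

frame_items = [{"name": "watch", "brand": "A"},
               {"name": "clothes", "brand": "B"},
               {"name": "shoes", "brand": "C"}]
match = find_item_of_interest("oh I like that watch", frame_items)
assert match is not None and match["brand"] == "A"
```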
  • Step 204: Based on the identification information of the item of interest, query the push information of the item of interest and present the push information.
  • the above execution body may query the push information of the item of interest based on its identification information and present the push information.
  • the push information may be a link for the user to browse the detailed information of the item of interest or a link to purchase the item of interest.
  • the push information can be presented on the current video frame, especially in the vicinity of the item of interest in the current video frame. Subsequently, the user can perform corresponding operations based on the push information to view the detailed information of the item of interest or purchase the item of interest.
  • the above execution body can query the push information of the item of interest in various ways. For example, when the push information of a large number of items is stored locally, it searches for the item's push information locally. As another example, when the video application integrates a search or shopping function, it sends a push-information request to the back-end server of the video application based on the item's identification information and receives the item's push information returned by that server. As yet another example, it sends a push-information request to the back-end server of a search or shopping application based on the item's identification information and receives the push information returned by that server.
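  • a sketch of such a lookup chain (local store first, then a back-end server; the URL and field names are hypothetical):

```python
import json
import urllib.parse
import urllib.request

def query_push_info(item_name, local_store, backend_url):
    """Try the local push-info store first; otherwise ask a back-end
    server (of the video, search, or shopping application)."""
    if item_name in local_store:
        return local_store[item_name]
    query = urllib.parse.urlencode({"item": item_name})
    with urllib.request.urlopen(f"{backend_url}?{query}") as resp:
        return json.loads(resp.read().decode("utf-8"))

# Local hit, so no network round-trip is needed:
store = {"watch": {"link": "https://example.com/watch"}}
print(query_push_info("watch", store, "https://example.com/api"))
```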
  • code-stream conversion is first performed on the video data to obtain the video stream and the identification information of the items appearing in it; the video stream is then played on the playback device; next, in response to determining that an item of interest to the user is present in the current video frame of the video stream, the identification information of the item of interest is determined; finally, based on that identification information, the push information of the item of interest is queried and presented.
  • with further reference to FIG. 3, it shows a process 300 of another embodiment of the information push method according to the present application.
  • the information push method includes the following steps:
  • Step 301: Perform code-stream conversion on the video data to obtain the video stream and the identification information of the items appearing in the video stream.
  • Step 302: Play the video stream on the playback device.
  • steps 301-302 have been described in detail in steps 201-202 in the embodiment shown in FIG. 2, and are not repeated here.
  • Step 303: Set a trigger area in each video frame of the video stream in which an item appears.
  • the execution body of the information push method (for example, the device 101 shown in FIG. 1) may set a trigger area in each video frame of the video stream in which an item appears.
  • typically, the trigger area can be set near the appearing item in the video frame.
  • the identification information includes coordinate information
  • for example, the area corresponding to the coordinate information is set as the trigger area. It should be understood that when multiple items appear in a video frame, multiple trigger areas may be set, one trigger area per appearing item.
  • Step 304: In response to detecting that the user confirms a trigger area of the current video frame, determine the item corresponding to the confirmed trigger area as the item of interest.
  • the above execution body can detect whether the user confirms a trigger area of the current video frame. If it detects that the user confirms a trigger area of the current video frame, it determines the item corresponding to the confirmed trigger area as the item of interest; if not, it continues to play the video stream and keeps detecting.
  • when the user operates on a trigger area, the trigger area can be considered confirmed. The playback device needs corresponding hardware or plug-ins to detect the user's operations on the trigger areas, since the video stream itself has no monitoring or network-connection capability.
  • when the playback device has a touch screen, if the user touches a trigger area of the current video frame, it is determined that the user confirms that trigger area.
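  • a minimal hit-test sketch for the touchscreen case (rectangular trigger areas are an assumption; the patent only requires an area corresponding to the item's coordinates):

```python
def hit_trigger_area(touch_x, touch_y, trigger_areas):
    """Return the trigger area containing the touch point, if any.
    Each area is (left, top, right, bottom) in screen pixels."""
    for area in trigger_areas:
        left, top, right, bottom = area["rect"]
        if left <= touch_x <= right and top <= touch_y <= bottom:
            return area
    return None

areas = [{"item": "watch", "rect": (800, 300, 950, 420)}]
assert hit_trigger_area(850, 350, areas)["item"] == "watch"
assert hit_trigger_area(10, 10, areas) is None
```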
  • when the playback device has a camera, if the focus of the user's eyes is captured within a trigger area of the current video frame, it is determined that the user confirms that trigger area.
  • for example, the above execution body may analyze the viewing angle of the user's eyes in the user images collected by the camera to determine whether the focus of the user's eyes falls on a trigger area.
  • as another example, when the screen of the playback device is covered with photosensitive material, the above execution body can first use the camera to emit a light beam toward the user's eyes; then use the photosensitive material on the screen of the playback device to sense the intensity of the beam reflected from the eyes; and finally determine a dark spot on the screen based on the beam intensity as the focus of the user's eyes.
  • when the light beam hits the pupil of the eye, most of the beam is absorbed by the pupil, so the intensity reflected onto the screen is low and a dark spot appears.
  • when the light beam strikes a part other than the pupil, most of the beam is reflected onto the screen, the reflected intensity is high, and a bright spot appears.
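  • a minimal sketch of picking the darkest sensed point as the focus (treating the photosensitive readings as a 2-D intensity grid is an assumption):

```python
def find_focus(intensity):
    """Return (x, y) of the minimum of a 2-D reflected-intensity grid:
    the dark spot where the pupil absorbed most of the beam."""
    best, best_xy = float("inf"), None
    for y, row in enumerate(intensity):
        for x, value in enumerate(row):
            if value < best:
                best, best_xy = value, (x, y)
    return best_xy

grid = [[9, 9, 9],
        [9, 2, 9],   # the pupil reflects little light back
        [9, 9, 9]]
assert find_focus(grid) == (1, 1)
```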
  • Step 305: Determine the identification information of the item of interest.
  • Step 306: Based on the identification information of the item of interest, query the push information of the item of interest and present the push information.
  • steps 305-306 have been described in detail in steps 203-204 in the embodiment shown in FIG. 2, and are not repeated here.
  • the process 300 of the information push method in this embodiment highlights the step of determining the user's item of interest. The solution described in this embodiment sets trigger areas in the video frames where items appear and determines the item of interest based on the user's operations on the trigger areas, thereby improving the accuracy of determining the item of interest.
  • with further reference to FIG. 4, it shows a process 400 of yet another embodiment of the information push method according to the present application.
  • the information push method includes the following steps:
  • Step 401: Perform code-stream conversion on the video data to obtain the video stream and the identification information of the items appearing in the video stream.
  • Step 402: Play the video stream on the playback device.
  • steps 401-402 have been described in detail in steps 301-302 in the embodiment shown in FIG. 3, and are not repeated here.
  • Step 403: Based on the resolution of the playback device and the percentage coordinates of the items appearing in the current video frame, calculate the lattice coordinates of the items appearing in the current video frame.
  • when the coordinate information in the identification information is percentage coordinates, the execution body of the information push method (for example, the device 101 shown in FIG. 1) may calculate the lattice coordinates of the item appearing in the current video frame based on the resolution of the playback device and the item's percentage coordinates.
  • because different playback devices have different screen resolutions, the coordinate information in the identification information is stored as percentage coordinates to adapt to different resolutions.
  • determining the trigger area requires lattice coordinates, so the percentage coordinates must be converted into the corresponding lattice coordinates.
  • specifically, the above execution body can multiply the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the item's percentage coordinates to obtain the lattice coordinates of the item appearing in the current video frame.
  • for example, a video stream is played on a playback device with a resolution of A*B. If an item's percentage coordinates are (x/a, y/b), then its lattice coordinates are (x*A/a, y*B/b), where a, b, A, and B are positive integers, x is a positive integer not greater than a, y is a positive integer not greater than b, x/a and y/b are positive numbers not greater than 1, and x*A/a and y*B/b are positive integers.
  • typically, the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device: both take the upper-left corner as the origin, rightward as the positive direction of the horizontal axis, and downward as the positive direction of the vertical axis. In that case the multiplication above can be applied directly; otherwise the percentage coordinates are first converted into the screen coordinate system.
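  • a minimal sketch of this conversion (the axis flip shown for a differing coordinate system is one assumed convention; the patent only states that the percentage coordinates are converted first):

```python
def to_lattice(pct_x, pct_y, width, height, same_coord_system=True):
    """Convert percentage coordinates to lattice (pixel) coordinates
    for a playback device with resolution width*height."""
    if not same_coord_system:
        # Example conversion: origin at the bottom-left instead of the
        # top-left, so flip the vertical axis before scaling.
        pct_y = 1.0 - pct_y
    return round(width * pct_x), round(height * pct_y)

# An item at (x/a, y/b) = (0.5, 0.25) on a 1920*1080 screen:
assert to_lattice(0.5, 0.25, 1920, 1080) == (960, 270)
```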
  • Step 404: Set the area corresponding to the lattice coordinates as the trigger area.
  • the above-mentioned execution body may set the area corresponding to the lattice coordinates as the trigger area.
  • Step 405: In response to detecting that the user confirms a trigger area of the current video frame, determine the item corresponding to the confirmed trigger area as the item of interest.
  • Step 406: Determine the identification information of the item of interest.
  • Step 407: Based on the identification information of the item of interest, query the push information of the item of interest and present the push information.
  • steps 405-407 have been described in detail in steps 304-306 in the embodiment shown in FIG. 3, and are not repeated here.
  • the process 400 of the information push method in this embodiment highlights the step of setting the trigger area. In the solution described in this embodiment, the coordinate information in the identification information is percentage coordinates, and the corresponding lattice coordinates are obtained through coordinate conversion, thereby adapting to the different screen resolutions of different playback devices.
  • continuing to refer to FIG. 5, it shows a process 500 of an embodiment of the video processing method according to the present application. The video processing method includes the following steps:
  • Step 501: Perform item recognition on the video stream to determine the items appearing in the video stream.
  • the execution body of the video processing method (for example, the device 101 shown in FIG. 1) can perform item recognition on the video stream and determine the items appearing in the video stream.
  • the above execution body can determine the items appearing in the video stream in various ways. In some embodiments, a person skilled in the art can perform item recognition on the video stream and input the recognition results to the above execution body. In some embodiments, the above execution body may split the video stream into a series of video frames and perform item recognition on each video frame to determine the items appearing in the video stream.
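  • a minimal sketch of the frame-by-frame recognition variant (the detector is passed in as any callable; a real system would plug in an object-detection model here):

```python
def recognize_items(frames, detect_items):
    """Run an item detector over each frame; detect_items is any
    callable returning a list of item names for one frame."""
    appearances = {}
    for index, frame in enumerate(frames):
        for name in detect_items(frame):
            appearances.setdefault(name, []).append(index)
    return appearances

fake_frames = ["f0", "f1", "f2"]
detector = lambda frame: ["watch"] if frame != "f0" else []
assert recognize_items(fake_frames, detector) == {"watch": [1, 2]}
```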
  • Step 502: Acquire the identification information of the appearing items.
  • the above-mentioned execution subject may acquire the identification information of the appearing item.
  • the identification information of the appearing item is unplayable data, which is used to identify the article appearing in the video stream.
  • the identification information may include coordinate information.
  • specifically, the above execution body can perform position recognition on the video stream to determine the coordinate information of an appearing item, and add the coordinate information to the item's identification information.
  • the coordinate information may be determined by simulating a trial playback of the video stream on a pilot device. Specifically, the video stream is first trial-played on the pilot device; position recognition is then performed on the video stream to obtain the lattice coordinates of the appearing items; finally, the coordinate information of the appearing items is determined based on their lattice coordinates.
  • if most playback devices shared the pilot device's screen resolution, the coordinate information could simply be lattice coordinates.
  • in practice, however, different playback devices have different screen resolutions, so the coordinate information in the identification information is percentage coordinates. Specifically, the percentage coordinates of an appearing item are obtained by dividing the horizontal and vertical values of its lattice coordinates by the corresponding horizontal and vertical pixel values of the pilot device's resolution.
  • for example, a video stream is trial-played on a standard device with a resolution of a*b. If the lattice coordinates of an appearing item captured on the pilot device are (x, y), its percentage coordinates are (x/a, y/b), where a and b are positive integers, x is a positive integer not greater than a, y is a positive integer not greater than b, and x/a and y/b are positive numbers not greater than 1.
  • the resolution of the pilot device needs to match the resolution of the video, for example 16:9 for 720p and above and 4:3 below that. This keeps the error as small as possible.
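  • a minimal sketch of the inverse conversion performed on the pilot device (the helper names are hypothetical; the aspect-ratio choice mirrors the 16:9 / 4:3 guidance above):

```python
def to_percentage(px, py, pilot_width, pilot_height):
    """Divide lattice coordinates by the pilot device's resolution to
    get resolution-independent percentage coordinates."""
    return px / pilot_width, py / pilot_height

def pick_pilot_resolution(video_height):
    # 16:9 for 720p and above, 4:3 below, to keep the error small.
    return (1280, 720) if video_height >= 720 else (640, 480)

w, h = pick_pilot_resolution(1080)
assert to_percentage(640, 360, w, h) == (0.5, 0.5)
```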
  • Step 503: Add the identification information of the appearing items to the corresponding video frame protocol to generate video data.
  • the above-mentioned execution body may add the identification information of the appearing item to the corresponding video frame protocol to generate video data.
  • the identification information can be added to the original video frame protocol by performing code-stream encoding on the video frames containing items with identification information and modifying the video frame protocol.
  • the modification differs between protocol formats.
  • taking H.264 as an example, the NAL information of the corresponding video frame protocol is extended based on the identification information of the appearing items to support adding the identification information.
  • NAL can include NAL Header, NAL Extension and NAL payload.
  • NAL Header can be used to store basic information of video frames.
  • the NAL payload can be used to store a binary stream of video frames.
  • NAL Extension can be used to store identification information. It should be noted that since the video frame itself is a highly compressed data body, the NAL Extension also needs to have high compression.
  • the same item can appear in multiple consecutive video frames.
  • for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears may be detailed information, including the item name, coordinate information, brief information, and/or a web link; the corresponding video frame is called a detailed frame;
  • the identification information added to the video frame protocol of frames in which the item appears again may be abbreviated information, including only the item name and coordinate information; the corresponding video frames are called abbreviated frames. This saves space.
  • while the video stream is played, the detailed information can be decoded and cached when a detailed frame is played; when an abbreviated frame is played later, if an item of interest to the user is detected in the current video frame,
  • the cached detailed information of the item of interest can then be retrieved by querying the cache with the item's abbreviated information.
  • FIG. 6 shows a schematic structural diagram of a computer system 600 suitable for implementing a computer device (eg, the device 101 shown in FIG. 1 ) according to an embodiment of the present application.
  • the computer device shown in FIG. 6 is only an example, and should not impose any limitations on the functions and scope of use of the embodiments of the present application.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage section 608 into a random access memory (RAM) 603.
  • in the RAM 603, various programs and data required for the operation of the system 600 are also stored.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (I/O) interface 605 is also connected to bus 604 .
  • the following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, etc.; an output section 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, etc.; a storage section 608 including a hard disk, etc.; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet.
  • a drive 610 is also connected to the I/O interface 605 as needed.
  • a removable medium 611 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, etc., is mounted on the drive 610 as needed so that a computer program read therefrom is installed into the storage section 608 as needed.
  • embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method illustrated in the flowchart.
  • the computer program may be downloaded and installed from the network via the communication portion 609 and/or installed from the removable medium 611 .
  • the computer-readable medium described in this application may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the above two.
  • the computer-readable storage medium can be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium can be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code therein. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • a computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can transmit, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
  • Program code embodied on a computer readable medium may be transmitted using any suitable medium including, but not limited to, wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
  • computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or electronic device.
  • the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (e.g., through the Internet using an Internet service provider).
  • each block in the flowcharts or block diagrams may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved.
  • each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units involved in the embodiments of the present application may be implemented in a software manner, and may also be implemented in a hardware manner.
  • the described units may also be provided in a processor; for example, a processor may be described as including a conversion unit, a playback unit, a determination unit, and a presentation unit.
  • the names of these units do not in themselves limit the units in this case; for example, the conversion unit may also be described as "a unit that performs code-stream conversion on video data to obtain the video stream and the identification information of the items appearing in the video stream".
  • a processor includes a determination unit, an acquisition unit, and an addition unit.
  • the names of these units do not in themselves limit the units; for example, the determination unit may also be described as "a unit that performs item recognition on a video stream and determines the items appearing in the video stream".
  • the present application also provides a computer-readable medium.
  • the computer-readable medium may be included in the computer device described in the above embodiments; it may also exist independently without being assembled into the computer device.
  • the above computer-readable medium carries one or more programs; when the one or more programs are executed by the computer device, they cause the computer device to: perform code-stream conversion on video data to obtain the video stream and the identification information of the items appearing in the video stream; play the video stream on a playback device; in response to determining that an item of interest to the user is present in the current video frame of the video stream, determine the identification information of the item of interest; and, based on the identification information of the item of interest, query the push information of the item of interest and present the push information. Alternatively, they cause the computer device to: perform item recognition on a video stream to determine the items appearing in the video stream; acquire the identification information of the appearing items; and add the identification information of the appearing items to the corresponding video frame protocol to generate video data.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

Information push and video processing methods and devices. The information push method includes: performing code-stream conversion on video data to obtain a video stream and identification information of items appearing in the video stream (201); playing the video stream on a playback device (202); in response to determining that an item of interest to the user is present in the current video frame of the video stream, determining the identification information of the item of interest (203); and, based on the identification information of the item of interest, querying push information for the item of interest and presenting the push information (204). The method discovers the items a user is interested in among the many items appearing in a video stream and automatically presents their push information, thereby satisfying the user's need for detailed information about the items of interest, making them easier to purchase, and saving the user operating effort.

Description

Information push and video processing methods and devices
TECHNICAL FIELD
The embodiments of the present application relate to the field of computer technology, and in particular to information push and video processing methods and devices.
BACKGROUND
With the rapid development of the Internet, video applications support increasingly diverse functions, such as live streaming and video-on-demand. As a result, more and more users are drawn to watching videos, and their viewing time keeps growing. Various items often appear in videos, such as clothing, decorations, and food. If users are interested in an item, they must first send the video application to the background, then open a search or shopping application and enter the item's name to search for its detailed information.
SUMMARY
The embodiments of the present application propose information push and video processing methods and devices.
In a first aspect, an embodiment of the present application provides an information push method, including: performing code-stream conversion on video data to obtain a video stream and identification information of items appearing in the video stream; playing the video stream on a playback device; in response to determining that an item of interest to the user is present in the current video frame of the video stream, determining the identification information of the item of interest; and, based on the identification information of the item of interest, querying push information for the item of interest and presenting the push information.
In some embodiments, determining that an item of interest to the user is present in the current video frame of the video stream includes: collecting the user's voice information; recognizing the voice information and determining the item name it contains; and, if the item name matches an item appearing in the current video frame, determining the matched item as the item of interest.
In some embodiments, determining that an item of interest to the user is present in the current video frame of the video stream includes: setting a trigger area in each video frame of the video stream in which an item appears; and, in response to detecting that the user confirms a trigger area of the current video frame, determining the item corresponding to the confirmed trigger area as the item of interest.
In some embodiments, the identification information includes coordinate information; and setting a trigger area in each video frame of the video stream in which an item appears includes: setting the area corresponding to the coordinate information as the trigger area.
In some embodiments, the coordinate information is percentage coordinates; and setting the area corresponding to the coordinate information as the trigger area includes: calculating the lattice coordinates of the item appearing in the current video frame based on the resolution of the playback device and the item's percentage coordinates; and setting the area corresponding to the lattice coordinates as the trigger area.
In some embodiments, calculating the lattice coordinates of the item appearing in the current video frame based on the resolution of the playback device and the item's percentage coordinates includes: if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiplying the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the item's percentage coordinates to obtain the item's lattice coordinates.
In some embodiments, calculating the lattice coordinates of the item appearing in the current video frame based on the resolution of the playback device and the item's percentage coordinates further includes: if the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, converting the percentage coordinates into the screen coordinate system to obtain converted percentage coordinates; and multiplying the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the converted percentage coordinates to obtain the item's lattice coordinates.
In some embodiments, detecting that the user confirms a trigger area of the current video frame includes: if the user touches a trigger area of the current video frame, determining that the user confirms that trigger area.
In some embodiments, detecting that the user confirms a trigger area of the current video frame includes: capturing the focus of the user's eyes; and, in response to determining that the focus falls on a trigger area of the current video frame, determining that the user confirms that trigger area.
In some embodiments, capturing the focus of the user's eyes includes: using the camera of the playback device to emit a light beam toward the eyes; using photosensitive material on the screen of the playback device to sense the intensity of the beam reflected from the eyes; and determining a dark spot on the screen based on the beam intensity as the focus.
In a second aspect, an embodiment of the present application provides a video processing method, including: performing item recognition on a video stream to determine the items appearing in the video stream; acquiring identification information for the appearing items; and adding the identification information of the appearing items to the corresponding video frame protocol to generate video data.
In some embodiments, acquiring the identification information of an appearing item includes: performing position recognition on the video stream to determine the item's coordinate information; and adding the coordinate information to the item's identification information.
In some embodiments, performing position recognition on the video stream to determine the coordinate information of an appearing item includes: simulating a trial playback of the video stream on a pilot device; performing position recognition on the video stream to obtain the item's lattice coordinates; and determining the item's coordinate information based on the lattice coordinates.
In some embodiments, determining the coordinate information of an appearing item based on its lattice coordinates includes: dividing the horizontal and vertical values of the item's lattice coordinates by the corresponding horizontal and vertical pixel values of the pilot device's resolution to obtain the item's percentage coordinates.
In some embodiments, for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears includes the item name, coordinate information, brief information, and/or a web link, while the identification information added to the video frame protocol of subsequent frames includes only the item name and coordinate information.
In some embodiments, adding the identification information of an appearing item to the corresponding video frame protocol includes: extending the network abstraction layer information of the corresponding video frame protocol based on the item's identification information.
In a third aspect, an embodiment of the present application provides an information push device, including: a conversion unit configured to perform code-stream conversion on video data to obtain a video stream and identification information of items appearing in the video stream; a playback unit configured to play the video stream on a playback device; a determination unit configured to determine, in response to determining that an item of interest to the user is present in the current video frame of the video stream, the identification information of the item of interest; and a presentation unit configured to query push information for the item of interest based on its identification information and to present the push information.
In some embodiments, the determination unit is further configured to: collect the user's voice information; recognize the voice information and determine the item name it contains; and, if the item name matches an item appearing in the current video frame, determine the matched item as the item of interest.
In some embodiments, the determination unit includes: a setting subunit configured to set a trigger area in each video frame of the video stream in which an item appears; and a determination subunit configured to determine, in response to detecting that the user confirms a trigger area of the current video frame, the item corresponding to the confirmed trigger area as the item of interest.
In some embodiments, the identification information includes coordinate information; and the setting subunit includes: a setting module configured to set the area corresponding to the coordinate information as the trigger area.
In some embodiments, the coordinate information is percentage coordinates; and the setting module includes: a calculation submodule configured to calculate the lattice coordinates of the item appearing in the current video frame based on the resolution of the playback device and the item's percentage coordinates; and a setting submodule configured to set the area corresponding to the lattice coordinates as the trigger area.
In some embodiments, the calculation submodule is further configured to: if the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device, multiply the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the item's percentage coordinates to obtain the lattice coordinates of the item appearing in the current video frame.
In some embodiments, the calculation submodule is further configured to: if the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, convert the percentage coordinates into the screen coordinate system to obtain converted percentage coordinates; and multiply the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical values of the converted percentage coordinates to obtain the lattice coordinates of the item appearing in the current video frame.
In some embodiments, the determination subunit is further configured to: if the user touches a trigger area of the current video frame, determine that the user confirms that trigger area.
In some embodiments, the determination subunit includes: a capture module configured to capture the focus of the user's eyes; and a determination module configured to determine, in response to determining that the focus falls on a trigger area of the current video frame, that the user confirms that trigger area.
In some embodiments, the capture module is further configured to: use the camera of the playback device to emit a light beam toward the eyes; use photosensitive material on the screen of the playback device to sense the intensity of the beam reflected from the eyes; and determine a dark spot on the screen based on the beam intensity as the focus.
In a fourth aspect, an embodiment of the present application provides a video processing device, including: a determination unit configured to perform item recognition on a video stream and determine the items appearing in the video stream; an acquisition unit configured to acquire the identification information of the appearing items; and an adding unit configured to add the identification information of the appearing items to the corresponding video frame protocol to generate video data.
In some embodiments, the acquisition unit includes: a determination subunit configured to perform position recognition on the video stream and determine the coordinate information of the appearing items; and an adding subunit configured to add the coordinate information of the appearing items to their identification information.
In some embodiments, the determination subunit includes: a pilot module configured to simulate a trial playback of the video stream on a pilot device; a recognition module configured to perform position recognition on the video stream to obtain the lattice coordinates of the appearing items; and a determination module configured to determine the coordinate information of the appearing items based on their lattice coordinates.
In some embodiments, the determination module is further configured to: divide the horizontal and vertical values of an appearing item's lattice coordinates by the corresponding horizontal and vertical pixel values of the pilot device's resolution to obtain the item's percentage coordinates.
In some embodiments, for an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears includes the item name, coordinate information, brief information, and/or a web link, while the identification information added to the video frame protocol of subsequent frames includes only the item name and coordinate information.
In some embodiments, the adding unit is further configured to: extend the network abstraction layer information of the corresponding video frame protocol based on the identification information of the appearing items.
In a fifth aspect, an embodiment of the present application provides a computer device, including: one or more processors; and a storage device on which one or more programs are stored; when the one or more programs are executed by the one or more processors, the one or more processors implement the method described in any implementation of the first aspect or the method described in any implementation of the second aspect.
In a sixth aspect, an embodiment of the present application provides a computer-readable medium on which a computer program is stored; when the computer program is executed by a processor, it implements the method described in any implementation of the first aspect or the method described in any implementation of the second aspect.
With the information push and video processing methods and devices provided by the embodiments of the present application, code-stream conversion is first performed on video data to obtain the video stream and the identification information of the items appearing in it; the video stream is then played on a playback device; next, in response to determining that an item of interest to the user is present in the current video frame, the identification information of the item of interest is determined; finally, based on that identification information, the push information of the item of interest is queried and presented. Items the user is interested in are thus discovered among the many items appearing in the video stream and their push information is presented automatically, which satisfies the user's need for detailed information about the items of interest, makes them easier to purchase, and saves the user operating effort.
BRIEF DESCRIPTION OF THE DRAWINGS
Other features, objects, and advantages of the present application will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
FIG. 1 is an exemplary system architecture to which the present application may be applied;
FIG. 2 is a flowchart of an embodiment of the information push method according to the present application;
FIG. 3 is a flowchart of another embodiment of the information push method according to the present application;
FIG. 4 is a flowchart of yet another embodiment of the information push method according to the present application;
FIG. 5 is a flowchart of an embodiment of the video processing method according to the present application;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing the computer device of the embodiments of the present application.
DETAILED DESCRIPTION
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the relevant invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts relevant to the invention.
It should be noted that, where no conflict arises, the embodiments of the present application and the features in the embodiments may be combined with one another. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 shows an exemplary system architecture 100 in which embodiments of the information push and video processing methods of the present application may be applied.
As shown in FIG. 1, the system architecture 100 may include devices 101 and 102 and a network 103. The network 103 is the medium providing a communication link between the devices 101 and 102, and may include various connection types, such as wired or wireless communication links or fiber-optic cables.
The devices 101 and 102 may be hardware devices or software that support network connections and provide various network services. When a device is hardware, it may be any of various electronic devices, including but not limited to smartphones, tablets, laptops, desktop computers, and servers. In that case, as a hardware device, it may be implemented as a distributed device group composed of multiple devices or as a single device. When a device is software, it may be installed in the electronic devices listed above; as software, it may be implemented as multiple pieces of software or software modules for providing distributed services, or as a single piece of software or software module. No specific limitation is made here.
In practice, a device can provide its network services by installing the corresponding client or server application. After a client application is installed, the device acts as a client in network communication; after a server application is installed, it acts as a server.
As an example, in FIG. 1, the device 101 acts as the client and the device 102 as the server. For instance, the device 101 may be the client of a video application and the device 102 the server of that video application.
It should be noted that the information push method and the video processing method provided by the embodiments of the present application may be executed by the device 101. When the device 101 executes the information push method, it may be a playback device. When the device 102 executes the video processing method, it may be a pilot device.
It should be understood that the numbers of networks and devices in FIG. 1 are merely illustrative. Any number of networks and devices may be provided as needed.
Continuing to refer to FIG. 2, it shows a flow 200 of an embodiment of the information push method according to the present application. The information push method includes the following steps:
Step 201: Perform code-stream conversion on the video data to obtain the video stream and the identification information of the items appearing in the video stream.
In this embodiment, the execution body of the information push method (for example, the device 101 shown in FIG. 1) may acquire video data from the back-end server of a video application (for example, the device 102 shown in FIG. 1) and perform code-stream conversion on it to obtain the video stream and the identification information of the items appearing in the video stream.
The video data may include the video stream and the identification information of the items appearing in it. The video stream is playable data, including but not limited to TV series, movies, live streams, and short videos. The identification information of the appearing items is non-playable data used to identify the items appearing in the video stream, including but not limited to item names, coordinate information, brief information, and web links. Appearing items may be items that appear in the video stream, such as clothing, decorations, and food.
Not every video frame contains an item, and not every appearing item has identification information. Code-stream encoding is therefore applied only to the video frames containing items with identification information, modifying the video frame protocol to add the identification information to the original protocol. Video data to which non-playable data has been added cannot be played directly, so code-stream conversion must be performed on the video data to separate the playable video stream from the non-playable identification information. The code-stream conversion may use a static or a dynamic transcoding method.
It should be noted that the way the video frame protocol is modified differs between protocol formats. Taking H.264 as an example, the NAL (Network Abstraction Layer) information of the video frame protocol is extended to support adding the identification information. The NAL may include an NAL Header, an NAL Extension, and an NAL payload. The NAL Header may store the basic information of the video frame, the NAL payload may store the binary stream of the video frame, and the NAL Extension may store the identification information. Because a video frame is itself a highly compressed data body, the NAL Extension must also be highly compressed.
In practice, the same item can appear in multiple consecutive video frames. For an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears may be detailed information, including the item name, coordinate information, brief information, and/or a web link; the corresponding frame is called a detailed frame. The identification information added to the video frame protocol of subsequent frames may be abbreviated information, including only the item name and coordinate information; those frames are called abbreviated frames. This saves space. While the playback device plays the video stream, the detailed information can be decoded and cached when a detailed frame is played; when an abbreviated frame is played later, if an item of interest to the user is detected in the current video frame, the detailed information of the item can be obtained by querying the cache with its abbreviated information.
Step 202: Play the video stream on the playback device.
In this embodiment, the above execution body may play the video stream on the playback device.
Typically, when the above execution body is hardware, it may be a playback device with a player installed for playing the video stream.
It should be noted that the playback device usually plays the video stream while performing the code-stream conversion. The identification information of the items appearing in the video stream is therefore obtained progressively during playback.
Step 203: In response to determining that an item of interest to the user is present in the current video frame of the video stream, determine the identification information of the item of interest.
In this embodiment, the above execution body may determine whether an item of interest to the user is present in the current video frame of the video stream. If so, it determines the identification information of the item of interest; if not, it continues to play the video stream.
The user's item of interest may be determined by the above execution body based on the user's reaction while watching the video stream. Users usually react in a distinctive way when they see an item they are interested in; for example, when an item of interest appears in the video stream, the user may say its name. The execution body can then collect the user's voice information, recognize it, and determine the item name it contains. If the item name matches an item appearing in the current video frame, the matched item is determined as the item of interest; if not, it continues to collect the user's voice information. The current video frame is the frame currently being played. Multiple items may appear in the same frame, and only the item matching the item name contained in the user's voice information is the user's item of interest. For example, if the user says "watch" and the items appearing in the current frame include a brand-A watch, brand-B clothes, and brand-C shoes, only the brand-A watch matches "watch" and is the item of interest.
Step 204: Based on the identification information of the item of interest, query the push information of the item of interest and present the push information.
In this embodiment, the above execution body may query the push information of the item of interest based on its identification information and present the push information. The push information may be a link for the user to browse the item's details or a link to purchase it. Typically, the push information can be presented on the current video frame, in particular near the item of interest in the frame. The user can then act on the push information to view the item's details or purchase it.
Typically, the above execution body can query the push information of the item of interest in various ways. For example, when the push information of a large number of items is stored locally, it searches for the item's push information locally. As another example, when the video application integrates a search or shopping function, it sends a push-information request to the back-end server of the video application based on the item's identification information and receives the item's push information returned by that server. As yet another example, it sends a push-information request to the back-end server of a search or shopping application based on the item's identification information and receives the push information returned by that server.
With the information push method provided by this embodiment of the present application, code-stream conversion is first performed on the video data to obtain the video stream and the identification information of the items appearing in it; the video stream is then played on the playback device; next, in response to determining that an item of interest to the user is present in the current video frame, the identification information of the item of interest is determined; finally, based on that identification information, the push information of the item of interest is queried and presented. Items the user is interested in are thus discovered among the many items appearing in the video stream and their push information is presented automatically, which satisfies the user's need for detailed information about the items of interest, enables fast push for those items, and saves the user operating effort.
With further reference to FIG. 3, a flow 300 of another embodiment of the information push method according to the present application is shown. The information push method includes the following steps:
Step 301: Perform code stream conversion on video data to obtain a video stream and identification information of the items appearing in the video stream.
Step 302: Play the video stream on a playback device.
In this embodiment, the specific operations of steps 301-302 have been described in detail in steps 201-202 of the embodiment shown in FIG. 2 and are not repeated here.
Step 303: Set trigger regions in the video frames in which the items appearing in the video stream are located.
In this embodiment, the execution subject of the information push method (for example, the device 101 shown in FIG. 1) may set trigger regions in the video frames in which the items appearing in the video stream are located.
Generally, a trigger region may be set near an appearing item in a video frame. For example, when the identification information includes coordinate information, the region corresponding to the coordinate information is set as the trigger region. It should be understood that when multiple items appear in a video frame, multiple trigger regions may be set, with one trigger region corresponding to one appearing item.
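A sketch of trigger regions and the hit test behind them, assuming each region is an axis-aligned pixel rectangle derived from the item's coordinate information (the rectangular shape is an assumption of this sketch):

    from dataclasses import dataclass

    @dataclass
    class TriggerRegion:
        item_name: str
        left: int
        top: int
        right: int
        bottom: int

    def hit_test(regions: list, x: int, y: int) -> TriggerRegion | None:
        # Return the trigger region containing the point (x, y), if any.
        for r in regions:
            if r.left <= x <= r.right and r.top <= y <= r.bottom:
                return r
        return None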
Step 304: In response to detecting that the user confirms a trigger region of the current video frame, determine the appearing item corresponding to the confirmed trigger region as the item of interest.
In this embodiment, the above execution subject may detect whether the user confirms a trigger region of the current video frame. If it is detected that the user confirms a trigger region of the current video frame, the appearing item corresponding to the confirmed trigger region is determined as the item of interest; if no confirmation is detected, playback of the video stream continues and detection continues.
When the user operates on a trigger region, the trigger region may be considered confirmed. The playback device needs corresponding hardware or plug-ins to detect the user's operation on the trigger region, since the video stream itself has no monitoring or network connection capability.
In some embodiments, when the playback device has a touchscreen, if the user touches a trigger region of the current video frame, it is determined that the user confirms the trigger region.
In some embodiments, when the playback device has a camera, if the focus of the user's eyes is captured on a trigger region of the current video frame, it is determined that the user confirms the trigger region. For example, the above execution subject may analyze the viewing angle of the user's eyes in the user images captured by the camera to determine whether the focus of the user's eyes falls on a trigger region. As another example, when the screen of the playback device is covered with a photosensitive material, the above execution subject may first emit a light beam toward the user's eyes by means of the camera, then use the photosensitive material on the screen of the playback device to sense the intensity of the beam reflected from the eyes, and finally determine, based on the beam intensity, the dark spot on the screen as the focus of the user's eyes. When the beam strikes the pupil of an eye, the pupil absorbs most of the beam, so the intensity of the beam reflected onto the screen is relatively low and a dark spot appears; when the beam strikes parts other than the pupil, most of the beam is reflected onto the screen, so the beam intensity there is relatively high and a bright spot appears.
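Purely as an illustration of the dark-spot reading, assuming the photosensitive material yields a 2D grid of reflected-beam intensities (one value per sensor cell):

    import numpy as np

    def find_gaze_focus(intensity_map: np.ndarray) -> tuple:
        # The darkest cell is taken as the gaze focus, following the
        # pupil-absorption reasoning above.
        y, x = np.unravel_index(np.argmin(intensity_map), intensity_map.shape)
        return int(x), int(y)

The focus returned here can then be passed to the hit test sketched above to decide whether the user has confirmed a trigger region.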
Step 305: Determine the identification information of the item of interest.
Step 306: Query the push information of the item of interest based on the identification information of the item of interest, and present the push information.
In this embodiment, the specific operations of steps 305-306 have been described in detail in steps 203-204 of the embodiment shown in FIG. 2 and are not repeated here.
As can be seen from FIG. 3, compared with the embodiment corresponding to FIG. 2, the flow 300 of the information push method in this embodiment highlights the step of determining the user's item of interest. The solution described in this embodiment thus sets trigger regions in the video frames in which the appearing items are located and determines the item of interest based on the user's operation on a trigger region, thereby improving the accuracy of determining the item of interest.
With further reference to FIG. 4, a flow 400 of yet another embodiment of the information push method according to the present application is shown. The information push method includes the following steps:
Step 401: Perform code stream conversion on video data to obtain a video stream and identification information of the items appearing in the video stream.
Step 402: Play the video stream on a playback device.
In this embodiment, the specific operations of steps 401-402 have been described in detail in steps 301-302 of the embodiment shown in FIG. 3 and are not repeated here.
Step 403: Calculate the dot-matrix coordinates of the items appearing in the current video frame based on the resolution of the playback device and the percentage coordinates of the items appearing in the current video frame.
In this embodiment, when the coordinate information in the identification information consists of percentage coordinates, the execution subject of the information push method (for example, the device 101 shown in FIG. 1) may calculate the dot-matrix coordinates of the items appearing in the current video frame based on the resolution of the playback device and the percentage coordinates of those items.
Since different playback devices have different screen resolutions, the coordinate information in the identification information consists of percentage coordinates in order to adapt to different screen resolutions, whereas dot-matrix (pixel) coordinates are needed to determine the trigger region. The percentage coordinates therefore need to be converted into the corresponding dot-matrix coordinates. Specifically, the above execution subject may multiply the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical coordinate values of the percentage coordinates of an item appearing in the current video frame, obtaining the dot-matrix coordinates of that item.
For example, suppose a playback device with a resolution of A*B plays the video stream. If an appearing item's percentage coordinates are (x/a, y/b), its dot-matrix coordinates are (x*A/a, y*B/b), where a, b, A, and B are positive integers, x is a positive integer not greater than a, y is a positive integer not greater than b, x/a and y/b are positive numbers not greater than 1, and x*A/a and y*B/b are positive integers.
Generally, the coordinate system of the percentage coordinates is the same as the screen coordinate system of the playback device: the origin is the top-left corner, the positive direction of the horizontal axis points right, and the positive direction of the vertical axis points down. In this case, directly multiplying the horizontal and vertical pixel values of the playback device's resolution by the corresponding horizontal and vertical coordinate values of the percentage coordinates of the items appearing in the current video frame yields their dot-matrix coordinates. In special cases where the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, the coordinate system of the percentage coordinates must first be converted to obtain converted percentage coordinates in the screen coordinate system; the horizontal and vertical pixel values of the playback device's resolution are then multiplied by the corresponding horizontal and vertical coordinate values of the converted percentage coordinates to obtain the dot-matrix coordinates of the items appearing in the current video frame.
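A sketch of this conversion; the flip_y flag illustrates one possible coordinate-system mismatch (a bottom-left origin) and is an assumption of this sketch, since the method leaves the exact conversion open:

    def percent_to_dot_matrix(px: float, py: float,
                              width: int, height: int,
                              flip_y: bool = False) -> tuple:
        # px, py are percentage coordinates in [0, 1]; width x height is the
        # playback device's resolution, e.g. 1920 x 1080.
        if flip_y:
            py = 1.0 - py
        return round(px * width), round(py * height)

    # e.g. (0.5, 0.25) on a 1920x1080 screen -> (960, 270)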
Step 404: Set the region corresponding to the dot-matrix coordinates as the trigger region.
In this embodiment, the above execution subject may set the region corresponding to the dot-matrix coordinates as the trigger region.
Step 405: In response to detecting that the user confirms a trigger region of the current video frame, determine the appearing item corresponding to the confirmed trigger region as the item of interest.
Step 406: Determine the identification information of the item of interest.
Step 407: Query the push information of the item of interest based on the identification information of the item of interest, and present the push information.
In this embodiment, the specific operations of steps 405-407 have been described in detail in steps 304-306 of the embodiment shown in FIG. 3 and are not repeated here.
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 3, the flow 400 of the information push method in this embodiment highlights the step of setting the trigger region. In the solution described in this embodiment, the coordinate information in the identification information consists of percentage coordinates, and the corresponding dot-matrix coordinates are obtained through coordinate conversion, thereby adapting to the different screen resolutions of different playback devices.
With continued reference to FIG. 5, a flow 500 of an embodiment of the video processing method according to the present application is shown. The video processing method includes the following steps:
Step 501: Perform item recognition on a video stream to determine the items appearing in the video stream.
In this embodiment, the execution subject of the video processing method (for example, the device 101 shown in FIG. 1) may perform item recognition on the video stream to determine the items appearing in the video stream.
Generally, the above execution subject may determine the items appearing in the video stream in several ways. In some embodiments, a person skilled in the art may perform item recognition on the video stream and input the recognition result to the above execution subject. In some embodiments, the above execution subject may split the video stream into a series of video frames and perform item recognition on each frame to determine the items appearing in the video stream.
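A sketch of the frame-by-frame variant, assuming OpenCV for frame extraction and an arbitrary object-detection model behind the detector callable (the method does not prescribe any particular detector):

    import cv2  # OpenCV, assumed available

    def detect_items(video_path: str, detector) -> dict:
        # Split the video into frames and run the detector on each;
        # returns {frame_index: [item labels]} for frames with detections.
        items_per_frame = {}
        capture = cv2.VideoCapture(video_path)
        index = 0
        while True:
            ok, frame = capture.read()
            if not ok:
                break
            labels = detector(frame)
            if labels:
                items_per_frame[index] = labels
            index += 1
        capture.release()
        return items_per_frame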
Step 502: Acquire the identification information of the appearing items.
In this embodiment, the above execution subject may acquire the identification information of the appearing items. The identification information of an appearing item is non-playable data used to identify an item appearing in the video stream.
In some embodiments, the identification information may include coordinate information. Specifically, the above execution subject may perform position recognition on the video stream to determine the coordinate information of the appearing items, and add the coordinate information of each appearing item to its identification information. The coordinate information may be determined by simulating trial playback of the video stream on a trial playback device. Specifically, the video stream is first trial-played, in simulation, on the trial playback device; position recognition is then performed on the video stream to obtain the dot-matrix coordinates of the appearing items; and the coordinate information of the appearing items is finally determined based on their dot-matrix coordinates.
Generally, if most playback devices shared the screen resolution of the trial playback device, the coordinate information could be dot-matrix coordinates. In practical applications, however, different playback devices have different screen resolutions, so the coordinate information in the identification information consists of percentage coordinates in order to adapt to different screen resolutions. Specifically, the horizontal and vertical coordinate values of an appearing item's dot-matrix coordinates are divided by the corresponding horizontal and vertical pixel values of the trial playback device's resolution, yielding the item's percentage coordinates.
For example, suppose a standard device with a resolution of a*b is used to trial-play the video stream. If the dot-matrix coordinates of an appearing item captured on the trial playback device are (x, y), the item's percentage coordinates are (x/a, y/b), where a and b are positive integers, x is a positive integer not greater than a, y is a positive integer not greater than b, and x/a and y/b are positive numbers not greater than 1.
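The inverse of the playback-side conversion sketched earlier, again only an illustration:

    def dot_matrix_to_percent(x: int, y: int,
                              trial_width: int, trial_height: int) -> tuple:
        # (x, y) are pixel coordinates captured on the trial playback device,
        # whose resolution is trial_width x trial_height (a x b above).
        return x / trial_width, y / trial_height

    # e.g. (960, 270) captured on a 1920x1080 trial device -> (0.5, 0.25)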
It should be noted that the resolution of the trial playback device should be chosen to match the video resolution; for example, 16:9 is chosen for 720p and above, and 4:3 below that. In this way, the error can be kept as small as possible.
Step 503: Add the identification information of the appearing items to the corresponding video frame protocols to generate video data.
In this embodiment, the above execution subject may add the identification information of the appearing items to the corresponding video frame protocols to generate the video data.
Generally, by performing code stream encoding on the video frames containing items with identification information and modifying the video frame protocol, the identification information can be added to the original video frame protocol. When modifying the video frame protocol, the modification differs across protocol formats. Taking H.264 as an example, the NAL information of the corresponding video frame protocol is extended based on the identification information of the appearing items to support adding the identification information. The NAL may include a NAL Header, a NAL Extension, and a NAL payload. The NAL Header may be used to store basic information of the video frame. The NAL payload may be used to store the binary stream of the video frame. The NAL Extension may be used to store the identification information. It should be noted that, since a video frame itself is a highly compressed data body, the NAL Extension also needs to be highly compressed (see the packing sketch given above for the playback side; the same layout assumptions would apply on the generation side).
In practical applications, the same item may appear in multiple consecutive video frames. For an item appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which the item first appears may be detailed information, including the item name, coordinate information, brief information, and/or a web link; the corresponding video frame is called a detail frame. The identification information added to the video frame protocols of the frames in which the item does not appear for the first time may be thumbnail information, including only the item name and coordinate information; the corresponding video frames are called thumbnail frames. In this way, space can be saved. While the playback device plays the video stream, the detailed information may be decoded and cached when a detail frame is played; when a thumbnail frame is played later, if an item of interest to the user is detected in the current video frame, a cache query based on the thumbnail information of the item of interest yields the detailed information of the item of interest.
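On the generation side, assigning detail and thumbnail records might look like the following sketch; the record shape (name, coordinates, brief info, web link) is assumed for illustration:

    def build_frame_records(frame_indices: list, item: dict) -> dict:
        # Give the first frame the full detail record and the following
        # frames thumbnail records (name + coordinates only).
        records = {}
        for i, frame_index in enumerate(frame_indices):
            if i == 0:      # detail frame
                records[frame_index] = dict(item)
            else:           # thumbnail frame
                records[frame_index] = {
                    "name": item["name"],
                    "coords": item["coords"],
                }
        return records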
The video processing method provided by the embodiments of the present application first performs item recognition on a video stream to determine the items appearing in the video stream; then acquires the identification information of the appearing items; and finally adds the identification information of the appearing items to the corresponding video frame protocols to generate video data, thereby adding non-playable data to the video stream.
Referring now to FIG. 6, a schematic structural diagram of a computer system 600 suitable for implementing the computer device (for example, the device 101 shown in FIG. 1) of the embodiments of the present application is shown. The computer device shown in FIG. 6 is merely an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a keyboard, a mouse, and the like; an output portion 607 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a LAN card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage portion 608 as needed.
In particular, according to embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present disclosure includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the methods shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above functions defined in the methods of the present application are executed.
It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium containing or storing a program that can be used by or in combination with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and such a medium can send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted by any suitable medium, including but not limited to wireless, wireline, optical cable, RF, or any suitable combination of the above.
Computer program code for performing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may be executed entirely on a user computer, partly on a user computer, as a stand-alone software package, partly on a user computer and partly on a remote computer, or entirely on a remote computer or electronic device. In cases involving a remote computer, the remote computer may be connected to the user computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet by using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the present application. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical functions. It should also be noted that, in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present application may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including a conversion unit, a playback unit, a determination unit, and a presentation unit. The names of these units do not, in some cases, limit the units themselves; for example, the conversion unit may also be described as "a unit that performs code stream conversion on video data to obtain a video stream and identification information of items appearing in the video stream". As another example, a processor may be described as including a determination unit, an acquisition unit, and an addition unit. The names of these units do not, in some cases, limit the units themselves; for example, the determination unit may also be described as "a unit that performs item recognition on a video stream to determine the items appearing in the video stream".
As another aspect, the present application also provides a computer-readable medium. The computer-readable medium may be included in the computer device described in the above embodiments, or it may exist alone without being assembled into the computer device. The computer-readable medium carries one or more programs that, when executed by the computer device, cause the computer device to: perform code stream conversion on video data to obtain a video stream and identification information of the items appearing in the video stream; play the video stream on a playback device; in response to determining that an item of interest to the user exists in the current video frame of the video stream, determine the identification information of the item of interest; and query the push information of the item of interest based on its identification information and present the push information. Alternatively, the one or more programs cause the computer device to: perform item recognition on a video stream to determine the items appearing in the video stream; acquire the identification information of the appearing items; and add the identification information of the appearing items to the corresponding video frame protocols to generate video data.
The above description is merely a preferred embodiment of the present application and an explanation of the technical principles applied. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combinations of the above technical features, and should also cover other technical solutions formed by any combination of the above technical features or their equivalent features without departing from the above inventive concept, for example, technical solutions formed by replacing the above features with technical features having similar functions disclosed in (but not limited to) the present application.

Claims (18)

  1. An information push method, comprising:
    performing code stream conversion on video data to obtain a video stream and identification information of items appearing in the video stream;
    playing the video stream on a playback device;
    in response to determining that an item of interest to a user exists in a current video frame of the video stream, determining identification information of the item of interest; and
    querying push information of the item of interest based on the identification information of the item of interest, and presenting the push information.
  2. The method according to claim 1, wherein the determining that an item of interest to a user exists in a current video frame of the video stream comprises:
    collecting voice information of the user;
    recognizing the voice information to determine an item name contained in the voice information; and
    if the item name matches an item appearing in the current video frame, determining the matching appearing item as the item of interest.
  3. The method according to claim 1, wherein the determining that an item of interest to a user exists in a current video frame of the video stream comprises:
    setting trigger regions in video frames in which the items appearing in the video stream are located; and
    in response to detecting that the user confirms a trigger region of the current video frame, determining the appearing item corresponding to the confirmed trigger region as the item of interest.
  4. The method according to claim 3, wherein the identification information comprises coordinate information; and
    the setting trigger regions in video frames in which the items appearing in the video stream are located comprises:
    setting a region corresponding to the coordinate information as the trigger region.
  5. The method according to claim 4, wherein the coordinate information is percentage coordinates; and
    the setting a region corresponding to the coordinate information as the trigger region comprises:
    calculating dot-matrix coordinates of the items appearing in the current video frame based on a resolution of the playback device and the percentage coordinates of the items appearing in the current video frame; and
    setting a region corresponding to the dot-matrix coordinates as the trigger region.
  6. The method according to claim 5, wherein the calculating dot-matrix coordinates of the items appearing in the current video frame based on a resolution of the playback device and the percentage coordinates of the items appearing in the current video frame comprises:
    if a coordinate system of the percentage coordinates is the same as a screen coordinate system of the playback device, multiplying horizontal and vertical pixel values of the resolution of the playback device by the corresponding horizontal and vertical coordinate values of the percentage coordinates of the items appearing in the current video frame, to obtain the dot-matrix coordinates of the items appearing in the current video frame.
  7. The method according to claim 6, wherein the calculating dot-matrix coordinates of the items appearing in the current video frame based on a resolution of the playback device and the percentage coordinates of the items appearing in the current video frame further comprises:
    if the coordinate system of the percentage coordinates differs from the screen coordinate system of the playback device, converting the coordinate system of the percentage coordinates to obtain converted percentage coordinates in the screen coordinate system; and
    multiplying the horizontal and vertical pixel values of the resolution of the playback device by the corresponding horizontal and vertical coordinate values of the converted percentage coordinates of the items appearing in the current video frame, to obtain the dot-matrix coordinates of the items appearing in the current video frame.
  8. The method according to claim 3, wherein the detecting that the user confirms a trigger region of the current video frame comprises:
    if the user touches a trigger region of the current video frame, determining that the user confirms the trigger region.
  9. The method according to claim 3, wherein the detecting that the user confirms a trigger region of the current video frame comprises:
    capturing a focus of the user's eyes; and
    in response to determining that the focus is on a trigger region of the current video frame, determining that the user confirms the trigger region.
  10. The method according to claim 9, wherein the capturing a focus of the user's eyes comprises:
    emitting a light beam toward the eyes by a camera of the playback device;
    sensing, by a photosensitive material on a screen of the playback device, an intensity of the beam reflected from the eyes; and
    determining, based on the beam intensity, a dark spot on the screen as the focus.
  11. A video processing method, comprising:
    performing item recognition on a video stream to determine items appearing in the video stream;
    acquiring identification information of the appearing items; and
    adding the identification information of the appearing items to corresponding video frame protocols to generate video data.
  12. The method according to claim 11, wherein the acquiring identification information of the appearing items comprises:
    performing position recognition on the video stream to determine coordinate information of the appearing items; and
    adding the coordinate information of the appearing items to the identification information of the appearing items.
  13. The method according to claim 12, wherein the performing position recognition on the video stream to determine coordinate information of the appearing items comprises:
    simulating trial playback of the video stream on a trial playback device;
    performing position recognition on the video stream to obtain dot-matrix coordinates of the appearing items; and
    determining the coordinate information of the appearing items based on the dot-matrix coordinates of the appearing items.
  14. The method according to claim 13, wherein the determining the coordinate information of the appearing items based on the dot-matrix coordinates of the appearing items comprises:
    dividing horizontal and vertical coordinate values of the dot-matrix coordinates of the appearing items by the corresponding horizontal and vertical pixel values of a resolution of the trial playback device, to obtain percentage coordinates of the appearing items.
  15. The method according to claim 12, wherein, for items appearing in consecutive video frames of the video stream, the identification information added to the video frame protocol of the frame in which an item first appears comprises an item name, coordinate information, brief information, and/or a web link, and the identification information added to the video frame protocols of the frames in which the item does not appear for the first time comprises the item name and coordinate information.
  16. The method according to any one of claims 11-15, wherein the adding the identification information of the appearing items to the corresponding video frame protocols comprises:
    extending network abstraction layer information of the corresponding video frame protocols based on the identification information of the appearing items.
  17. A computer device, comprising:
    one or more processors; and
    a storage apparatus having one or more programs stored thereon;
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-10 or the method according to any one of claims 11-16.
  18. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the method according to any one of claims 1-10 or the method according to any one of claims 11-16.
PCT/CN2021/104450 2020-08-05 2021-07-05 Information push and video processing methods and devices WO2022028177A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010777098.7 2020-08-05
CN202010777098.7A CN111859158A (zh) 2020-08-05 Information push and video processing methods and devices

Publications (1)

Publication Number Publication Date
WO2022028177A1 true WO2022028177A1 (zh) 2022-02-10

Family

ID=72971071

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/104450 WO2022028177A1 (zh) 2020-08-05 2021-07-05 Information push and video processing methods and devices

Country Status (2)

Country Link
CN (1) CN111859158A (zh)
WO (1) WO2022028177A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111859158A (zh) * 2020-08-05 2020-10-30 上海连尚网络科技有限公司 信息推送、视频处理方法和设备
CN114742576B (zh) * 2022-03-17 2024-05-31 北京有竹居网络技术有限公司 信息推送方法、装置和电子设备
CN115334346A (zh) * 2022-08-08 2022-11-11 北京达佳互联信息技术有限公司 界面显示方法、视频发布方法、视频编辑方法及装置

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013040904A1 (zh) * 2011-09-22 2013-03-28 中兴通讯股份有限公司 Method and terminal for processing advertisements
CN105100944A (zh) * 2014-04-30 2015-11-25 广州市动景计算机科技有限公司 Item information output method and apparatus
CN107704076A (zh) * 2017-09-01 2018-02-16 广景视睿科技(深圳)有限公司 Dynamic projection object display system and method
CN110288400A (zh) * 2019-06-25 2019-09-27 联想(北京)有限公司 Information processing method, information processing apparatus, and information processing system
CN111859158A (zh) * 2020-08-05 2020-10-30 上海连尚网络科技有限公司 Information push and video processing methods and devices

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102591553A (zh) * 2011-01-13 2012-07-18 京宏科技股份有限公司 Video interaction method, system and apparatus, and video-associated tag generation apparatus and method
US20150193446A1 (en) * 2014-01-07 2015-07-09 Microsoft Corporation Point(s) of interest exposure through visual interface
CN109120954B (zh) * 2018-09-30 2021-09-07 武汉斗鱼网络科技有限公司 Video message push method, apparatus, computer device, and storage medium

Also Published As

Publication number Publication date
CN111859158A (zh) 2020-10-30

Similar Documents

Publication Publication Date Title
WO2022028177A1 (zh) Information push and video processing methods and devices
CN112261424B (zh) Image processing method, apparatus, electronic device, and computer-readable storage medium
US20210136455A1 (en) Communication apparatus, communication control method, and computer program
KR102114701B1 (ko) System and method for recognizing items in media data and delivering related information
US8170392B2 (en) Method and apparatus for generation, distribution and display of interactive video content
CN111523566A (zh) Target video clip positioning method and apparatus
US9854232B2 (en) Systems and methods for picture quality monitoring
KR102286410B1 (ko) Identification of previously streamed portions of a media title to avoid repetitive playback
US10999640B2 (en) Automatic embedding of information associated with video content
CN108235004B (zh) Video playback performance testing method, apparatus, and system
US20040250297A1 (en) Method, apparatus and system for providing access to product data
US10897658B1 (en) Techniques for annotating media content
US20230291772A1 (en) Filtering video content items
JP2023522092A (ja) Interaction record generation method, apparatus, device, and medium
JP2006285654A (ja) Product information retrieval system
WO2022012273A1 (zh) Method and device for item price comparison
CN109241344B (zh) Method and apparatus for processing information
CN114374853A (zh) Content display method, apparatus, computer device, and storage medium
US11531700B2 (en) Tagging an image with audio-related metadata
WO2023098576A1 (zh) Image processing method, apparatus, device, and medium
CN113298589A (zh) Product information processing method and apparatus, and information acquisition method and apparatus
CN114143568B (zh) Method and device for determining augmented reality live-streaming images
US11700285B2 (en) Filtering video content items
CN111859159B (zh) Information push and video processing methods and devices
KR102594976B1 (ko) Video content selection apparatus for augmented reality, user terminal, and video content provision method

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21852709

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 03.07.2023)

122 Ep: pct application non-entry in european phase

Ref document number: 21852709

Country of ref document: EP

Kind code of ref document: A1