CN108200446B - Online multimedia interaction system and method for an avatar - Google Patents


Info

Publication number
CN108200446B
CN108200446B (granted from application CN201810031218.1A)
Authority
CN
China
Prior art keywords
data
virtual
audio
image
actual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810031218.1A
Other languages
Chinese (zh)
Other versions
CN108200446A (en)
Inventor
刘岩
刘勇
Current Assignee
Beijing Mizhi Technology Co ltd
Original Assignee
Beijing Mizhi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Mizhi Technology Co ltd
Priority to CN201810031218.1A
Publication of CN108200446A
Application granted
Publication of CN108200446B
Legal status: Active
Anticipated expiration


Classifications

    • H ELECTRICITY > H04 ELECTRIC COMMUNICATION TECHNIQUE > H04N PICTORIAL COMMUNICATION, e.g. TELEVISION > H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/23412: Processing of video elementary streams for generating or manipulating the scene composition of objects, e.g. MPEG-4 objects (server side)
    • H04N 21/2187: Live feed as the source of audio or video content
    • H04N 21/233: Processing of audio elementary streams (server side)
    • H04N 21/4398: Processing of audio elementary streams involving reformatting operations of audio signals (client side)
    • H04N 21/44012: Rendering scenes according to scene graphs, e.g. MPEG-4 scene graphs (client side)
    • H04N 21/4882: Data services for displaying messages, e.g. warnings, reminders
    • H04N 21/643: Communication protocols for video distribution

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The embodiments of the present application provide an online multimedia interaction system and method for an avatar. An avatar importing module imports the avatar's virtual model; a data extraction module extracts the actual video data and actual audio data of a real character in real time; a data driving module maps that data onto the virtual model to generate the avatar's virtual audio-visual animation; and an avatar output module outputs the animation in real time to an online interactive service platform provided by a streaming media server. The avatar thus joins the platform for multimedia interactive live broadcast, making the live broadcast more engaging.

Description

Online multimedia interaction system and method for an avatar
Technical Field
The embodiments of the present application relate to virtual animation production technology and multimedia live broadcast technology, and in particular to an online multimedia interaction system and method for an avatar.
Background
Webcast technology is an Internet technology in which a server broadcasts an anchor user's live video data to many audience users for viewing, while also supporting interaction between the audience and the anchor.

In the prior art, however, a webcast usually presents the anchor's actual image. If a technical means could connect an avatar to the Internet's online interactive platforms, letting the avatar participate in live interaction in place of the actual image, the entertainment value of live broadcasts would be greatly improved.
Disclosure of Invention
In view of the above, the embodiments of the present invention mainly aim to provide an online multimedia interaction system and method for an avatar, which can connect an avatar, instead of the actual image, to an online interactive platform of the Internet for live broadcast, thereby making online live broadcasts more engaging.
An embodiment of the present application provides an online multimedia interaction system for an avatar, comprising: an avatar importing module for importing a virtual model of the avatar; a data extraction module for extracting actual video data and actual audio data of an actual character in real time; a data driving module for mapping the actual video data and the actual audio data onto the virtual model in real time to generate a virtual audio-visual animation of the avatar; and an avatar output module for outputting the virtual audio-visual animation in real time to an online interactive service platform provided by a streaming media server, so that the avatar is connected to the online interactive service platform for multimedia interactive live broadcast.
Optionally, in any embodiment of the present application, the data extraction module further includes: a video extracting unit for extracting the actual video data of the actual avatar according to a preset real-time image extraction rate; and an audio extracting unit for extracting the actual audio data of the actual character.
Optionally, in any embodiment of the present application, the real-time image extraction rate is set according to a network bandwidth, a processing performance of a computer, and a network transmission protocol.
Optionally, in any embodiment of the present application, the virtual audio-visual animation of the avatar is composed of virtual image data and virtual sound data of the avatar, and the data driving module further includes: a video data processing unit for decomposing motion data from the actual video data to generate motion driving data that drives the virtual model to perform motions, thereby generating the virtual image data of the avatar; and an audio data processing unit for performing processing including at least audio noise reduction, audio silence detection, and audio echo cancellation on the real-time audio data, thereby generating the virtual sound data of the avatar.
Optionally, in any embodiment of the present application, the avatar output module further includes: a video data encoding unit configured to perform encoding compression processing on the virtual video data to generate video compressed data; a sound data encoding unit for performing encoding compression processing for the virtual sound data to generate audio compression data; a data encapsulation unit, configured to encapsulate the video compressed data and the audio compressed data according to a message transmission protocol set by the streaming media server to generate a data packet; and the data transmission unit is used for transmitting the data packet to an online interactive service platform provided by the streaming media server based on the message transmission protocol.
Optionally, in any embodiment of the present application, the audio silence detection performed by the audio data processing unit includes detecting silent audio data carrying a silence flag in the real-time audio data, so that the sound data encoding unit does not perform encoding compression on that silent audio data.
Optionally, in any embodiment of the present application, the system further includes an information interaction processing module, configured to extract text information in the online interaction service platform, provide input feedback information for the extracted text information, and output the feedback information to the online interaction service platform, so as to provide online text information interaction operation.
Optionally, in any embodiment of the present application, the information interaction processing module extracts the text information in the online interaction service platform by using at least one of a web information crawling manner and a network packet capturing manner.
Optionally, in any embodiment of the present application, the system is applied to an electronic device, and the electronic device further includes multimedia live broadcast software connected to the online interactive service platform in a communication manner, the avatar output module is configured to access the virtual video animation to the multimedia live broadcast software, so that the multimedia live broadcast software outputs the virtual video animation to the online interactive service platform in real time, and the avatar is accessed to the online interactive service platform for multimedia interactive live broadcast.
Another embodiment of the present application provides an on-line multimedia interaction method for an avatar, comprising: importing a virtual model of the virtual image; extracting actual video data and actual audio data of an actual image in real time; mapping the actual video data and the actual audio data to the virtual model in real time to generate a virtual audio-video animation of the virtual image; and outputting the virtual video animation to an online interactive service platform provided by a streaming media server in real time, and accessing the virtual image to the online interactive service platform to perform multimedia interactive live broadcast.
Optionally, in any embodiment of the present application, the method further includes extracting the actual video data of the actual character according to a preset real-time image extraction rate.
Optionally, in any embodiment of the present application, the method further includes setting the real-time image extraction rate according to a network bandwidth, a processing performance of a computer, and a network transmission protocol.
Optionally, in any embodiment of the present application, the virtual video animation of the avatar is composed of virtual image data and virtual sound data of the avatar, and the method further includes: decomposing motion data from the actual video data to generate motion driving data and driving the virtual model to perform motion according to the motion driving data, thereby generating the virtual image data of the avatar; and processing including at least audio noise reduction, audio silence detection, and audio echo cancellation is performed respectively on the real-time audio data, thereby generating the virtual sound data of the avatar.
Optionally, in any embodiment of the present application, the method further includes: performing encoding compression processing on the virtual image data to generate video compression data; performing encoding compression processing on the virtual sound data to generate audio compression data; packaging the video compression data and the audio compression data according to a message transmission protocol set by the streaming media server to generate a data packet; and transmitting the data packet to an online interactive service platform provided by the streaming media server based on the message transmission protocol.
Optionally, in any embodiment of the present application, the audio silence detection operation includes detecting silence audio data with a silence flag in the real-time audio data, and the encoding compression processing is not performed on the silence audio data.
Optionally, in any embodiment of the present application, the method further includes extracting text information in the online interactive service platform, providing input feedback information for the extracted text information, and outputting the feedback information to the online interactive service platform to provide online text information interactive operation.
Optionally, in any embodiment of the present application, the method is applied to an electronic device installed with multimedia live broadcast software, and the method further includes accessing the virtual video animation to the multimedia live broadcast software, so that the multimedia live broadcast software outputs the virtual video animation to the online interactive service platform in real time, and the virtual image is accessed to the online interactive service platform for multimedia interactive live broadcast.
In summary, the present application provides an online multimedia interaction system and method for an avatar: a virtual model of the avatar is imported, and the actual video and audio data extracted in real time from the real character are mapped onto that virtual model to generate the avatar's audio-visual animation in real time; the animation is then transmitted to the online interactive service platform, so that the avatar, in place of the real character, joins the platform for multimedia interactive live broadcast. The avatar and its live interaction partners can thus perceive each other's sound, image, and text content, which makes the multimedia interactive live broadcast more engaging.
Drawings
Some specific embodiments of the present application will be described in detail hereinafter by way of illustration and not limitation with reference to the accompanying drawings. The same reference numbers in the drawings identify the same or similar elements or components. Those skilled in the art will appreciate that the drawings are not necessarily drawn to scale. In the drawings:
FIG. 1 is a diagram illustrating a basic architecture of an on-line multimedia interactive system of an avatar according to an embodiment of the present application;
FIG. 2 is a block diagram of an embodiment of the online multimedia interaction system of the avatar shown in FIG. 1;
fig. 3 is a basic flowchart illustrating an on-line multimedia interaction method of an avatar according to another embodiment of the present application; and
fig. 4 is a flowchart illustrating an embodiment of an online multimedia interaction method for the avatar shown in fig. 3.
Detailed Description
It is not necessary for any particular embodiment of the invention to achieve all of the above advantages at the same time.
In order to make those skilled in the art better understand the technical solutions in the embodiments of the present invention, the technical solutions in the embodiments of the present invention will be described clearly and completely with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all embodiments. All other embodiments obtained by a person skilled in the art based on the embodiments of the present invention shall fall within the scope of the protection of the embodiments of the present invention.
The following further describes specific implementation of the embodiments of the present invention with reference to the drawings.
Fig. 1 is a basic architecture diagram of an on-line multimedia interactive system of an avatar according to an embodiment of the present application. As shown in the figure, the on-line multimedia interactive system 1 of the avatar of the present application mainly includes an avatar importing module 11, a data extracting module 12, a data driving module 13, and an avatar output module 14.
The avatar importing module 11 is used to import a virtual model of an avatar. In this embodiment, the avatar may be, for example, an avatar of a human being, an avatar of an animal, or the like. And the virtual model of the avatar may, for example, include a facial model of the avatar reflecting real-time facial expressions of the avatar, a skeletal model reflecting actions performed by the avatar in real-time, and so on.
The data extraction module 12 is used for extracting actual video data and actual audio data of the actual image in real time. In this embodiment, the actual image may be an actual person or animal, for example.
Referring to fig. 2, in an embodiment, the data extraction module 12 further includes a video extraction unit 121 and an audio extraction unit 122.
The video extracting unit 121 is configured to extract the actual video data of the real character at a preset real-time image extraction rate, that is, to capture the real character's motion trajectory and generate motion data. In this embodiment, the video extracting unit 121 may be connected to a video capture device (not shown) to receive and extract the image data of the real character it captures; the capture device may be, for example, a video camera, a video recorder, an infrared sensor, or a computer camera, although the invention is not limited to these, and other types of video capture devices are equally applicable. Furthermore, the real-time image extraction rate is set according to the actual network bandwidth, the processing performance of the computer, and the network transmission protocol. Typically, the three-dimensional engine can render at 60 fps or 30 fps; the required image rate is determined from objective factors such as actual network bandwidth, computer processing performance, and the target transmission protocol, so that the engine renders the virtual model's image to a fixed target texture and that texture is read at fixed intervals (for example, 1000 ms ÷ 25 = 40 ms per frame for 25 fps), ensuring the real-time performance and smoothness of the virtual model's image frames.
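The fixed-interval texture read above can be sketched as follows. This is only an illustration of the pacing arithmetic, not the patent's implementation; the names `frame_interval_ms`, `capture_loop`, and `read_target_texture` are invented for this sketch.

```python
import time

def frame_interval_ms(target_fps: int) -> float:
    """Fixed read interval for the rendered target texture,
    e.g. 25 fps -> 1000 / 25 = 40 ms between reads."""
    return 1000.0 / target_fps

def capture_loop(read_target_texture, target_fps=25, duration_s=1.0):
    """Poll the engine's target texture on a fixed cadence so the
    avatar's frame rate stays steady regardless of render jitter."""
    interval = frame_interval_ms(target_fps) / 1000.0
    frames = []
    next_deadline = time.monotonic()
    end = next_deadline + duration_s
    while next_deadline < end:
        frames.append(read_target_texture())  # read the rendered frame
        next_deadline += interval
        # sleep to the next deadline rather than a fixed amount,
        # so timing drift does not accumulate
        time.sleep(max(0.0, next_deadline - time.monotonic()))
    return frames
```

The deadline-based sleep is the usual way to keep long-run pacing exact even when individual reads take variable time.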
The audio extracting unit 122 is configured to extract actual audio data of an actual image, in this embodiment, the audio extracting unit 122 may be connected to an audio collecting device (not shown) to extract sound data collected by the audio collecting device, where the audio collecting device may be, for example, an independent microphone, or a microphone integrally installed on an electronic device (e.g., a computer, a video camera), but the above list is not limited thereto, and other types of recording devices may also be applicable to the present application. Further, the audio extracting unit 122 may be used to extract an environmental sound of an environment in which the real character is located, in addition to the sound emitted by the real character.
In addition, the video extraction unit 121 and the audio extraction unit 122 operate independently of each other, performing their respective data-extraction operations in parallel while keeping the two streams synchronized.
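One common way to realize this independent-but-alignable extraction is to run each unit on its own thread and tag every captured sample with a shared clock, so the streams can be synchronized downstream. The sketch below assumes caller-supplied `grab_video_frame` / `grab_audio_chunk` callables; all names are illustrative, not from the patent.

```python
import queue
import threading
import time

def run_extractors(grab_video_frame, grab_audio_chunk, n=5):
    """Run video and audio extraction on independent threads; each
    sample is tagged with a monotonic capture timestamp so the two
    streams can be aligned later despite running independently."""
    video_q, audio_q = queue.Queue(), queue.Queue()

    def worker(grab, out_q):
        for _ in range(n):
            out_q.put((time.monotonic(), grab()))  # (timestamp, sample)

    threads = [
        threading.Thread(target=worker, args=(grab_video_frame, video_q)),
        threading.Thread(target=worker, args=(grab_audio_chunk, audio_q)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return list(video_q.queue), list(audio_q.queue)
```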
The data driving module 13 is configured to map the extracted actual video data and actual audio data to a virtual model in real time, so as to generate a virtual video animation of the virtual image. In the embodiment of the present application, the generated virtual video animation of the avatar is composed of the virtual image data and the virtual sound data of the avatar.
Similar to the data extraction module 12, the data driving module 13 also performs independent processing on the video portion and the audio portion of the avatar, and the two portions may also be in a synchronous processing mode, specifically, referring to fig. 2, in this embodiment, the data driving module 13 further includes a video data processing unit 131 and an audio data processing unit 132.
The video data processing unit 131 is configured to decompose the motion data of the real character from the real video data extracted by the video extracting unit 121 to generate motion driving data, and to drive the virtual model to perform a corresponding motion by using the motion driving data through the three-dimensional engine, thereby generating virtual image data of the virtual character.
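A minimal illustration of "decomposing motion data to generate motion driving data" is to convert tracked keypoint positions into joint angles that a skeletal model can consume. The patent does not specify this representation; the 2D keypoint-to-angle scheme below is an assumption for illustration only.

```python
import math

def joint_angle(parent, joint, child):
    """Angle at `joint` (degrees) formed by parent->joint->child,
    usable as a rotation target for the corresponding virtual bone."""
    ax, ay = parent[0] - joint[0], parent[1] - joint[1]
    bx, by = child[0] - joint[0], child[1] - joint[1]
    dot = ax * bx + ay * by
    na, nb = math.hypot(ax, ay), math.hypot(bx, by)
    return math.degrees(math.acos(dot / (na * nb)))

def motion_driving_data(keypoints, skeleton):
    """Map tracked keypoints onto named bones; `skeleton` is a dict of
    bone name -> (parent, joint, child) keypoint names."""
    return {bone: joint_angle(keypoints[p], keypoints[j], keypoints[c])
            for bone, (p, j, c) in skeleton.items()}
```

A fully extended arm yields 180 degrees and a right-angle bend yields 90, which the engine can then apply to the avatar's bone each frame.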
In one embodiment, the video data processing unit 131 further separates static images and dynamic images from the actual video data. The static images include image information such as subtitle images and static special-effect images; both the static and the dynamic images are superimposed onto the virtual image rendered by the three-dimensional engine to form the avatar's picture.
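The superimposition step amounts to compositing a layer over the rendered frame. As a toy sketch (grayscale pixel rows, uniform alpha; a real pipeline would blend RGBA per pixel, and `overlay` is an invented name):

```python
def overlay(base, layer, alpha):
    """Alpha-blend a static layer (subtitles, effects) over the frame
    rendered by the 3D engine; pixels are rows of 0-255 gray values."""
    return [[round((1 - alpha) * b + alpha * l)
             for b, l in zip(brow, lrow)]
            for brow, lrow in zip(base, layer)]
```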
In addition, the video data processing unit 131 also has a function of a video real-time filter, that is, it can perform adjustment processing for the effect of the image generated after the three-dimensional engine rendering, for example, the shading adjustment processing is realized by a shader language.
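In shader terms, such a post-render adjustment is often a per-pixel affine transform of the color value. The patent names only "shading adjustment via a shader language"; the gain/bias form below is an assumed, CPU-side stand-in for that idea.

```python
def adjust_shading(frame, gain=1.0, bias=0):
    """Post-render shading tweak analogous to a fragment shader's
    `color * gain + bias`, clamped to the 0-255 range."""
    def clamp(v):
        return max(0, min(255, round(v)))
    return [[clamp(p * gain + bias) for p in row] for row in frame]
```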
The audio data processing unit 132 is configured to perform relevant processing operations, including audio noise reduction, audio silence detection, and audio echo cancellation, on the real-time audio data extracted by the audio extracting unit 122, so as to generate virtual sound data of the avatar. In this embodiment, the audio data processing unit 132 performs noise reduction, silence monitoring and echo cancellation processing on the original real-time audio data by using the audio noise reduction function, the audio silence detection function and the echo cancellation function provided by the third-party open source SDK, respectively.
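The patent delegates noise reduction, silence detection, and echo cancellation to a third-party SDK. As a conceptual sketch only (not the SDK's algorithms), a noise gate plus an RMS-energy silence flag might look like this; the thresholds and function names are invented:

```python
def rms(samples):
    """Root-mean-square energy of an audio chunk."""
    return (sum(s * s for s in samples) / len(samples)) ** 0.5

def process_chunk(samples, noise_floor=0.02, silence_rms=0.01):
    """Minimal stand-ins for the SDK's noise reduction and silence
    detection: gate out sub-floor samples, then flag silent chunks so
    downstream encoding can skip them."""
    gated = [0.0 if abs(s) < noise_floor else s for s in samples]
    return {"samples": gated, "silent": rms(gated) < silence_rms}
```

The `silent` flag corresponds to the silence mark that the encoding stage checks so that silent audio is never compressed.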
The avatar output module 14 is configured to output the virtual video animation of the avatar generated by the data driving module 13 to the online interactive service platform 20 provided by the streaming server 2 in real time, so as to access the online interactive service platform 20 with the avatar instead of the traditional real avatar for multimedia interactive live broadcast. In the present embodiment, the online interactive service platform 20 includes, but is not limited to, conventional video live broadcasting, real-time video call, real-time video conference, and the like.
In an embodiment, the system 1 of the present application is applied to an electronic device, such as a computer, a smart phone, and the like. The electronic device is also pre-installed with multimedia live broadcast software which can connect and communicate with the online interactive service platform, and the avatar output module 14 is used for accessing the generated virtual audio/video animation of the avatar into the multimedia live broadcast software, so that the multimedia live broadcast software can output the virtual audio/video animation to the online interactive service platform 20 in real time, and the avatar can perform multimedia interactive live broadcast in the online interactive service platform 20.
Ordinarily, the multimedia live broadcast software installed on the electronic device receives the real character's image from the computer camera and uploads it to the online live platform as the real character's image data. The avatar output module 14 of the present application can instead simulate the computer camera and interface directly with the multimedia live broadcast software installed on the computer, so that the avatar's animation, rather than the real character's picture, is delivered to the online interactive service platform 20 for video interaction with the communication partner.
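Conceptually, "simulating the computer camera" means exposing a camera-like read interface whose frames come from the avatar renderer. Registering as an OS-level virtual camera is driver-specific and outside this sketch; `SimulatedCamera` is a hypothetical stand-in that only shows the interface shape.

```python
class SimulatedCamera:
    """Hypothetical stand-in for a virtual-camera driver: software
    that polls a webcam receives avatar frames instead."""

    def __init__(self, frame_source, fps=25):
        self.frame_source = frame_source  # callable producing avatar frames
        self.fps = fps

    def read(self):
        # Called by the live-broadcast software exactly like a real
        # webcam read; returns the avatar frame, not a camera image.
        return self.frame_source()
```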
Referring to fig. 2, in an embodiment, the avatar output module 14 further includes an image data encoding unit 141, a sound data encoding unit 142, a data packing unit 143, and a data transmission unit 144.
The video data encoding unit 141 is configured to perform encoding and compression processing on the virtual video data to generate video compressed data. In this embodiment, the image data encoding unit 141 may use the third-party open source SDK to select a hardware compression encoding method or a software compression encoding method according to the supported format of the actual virtual image data to perform encoding and compression processing on the virtual image data.
The sound data encoding unit 142 is configured to perform encoding compression processing on the virtual sound data to generate audio compressed data. In addition, in the embodiment, the sound data encoding unit 142 does not perform the audio encoding compression process for the part of the mute audio data detected by the audio data processing unit 132 from the real-time audio data, so as to reduce the system resource consumption.
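Skipping the encoder for silence-flagged chunks can be expressed as a filter in front of the codec. The chunk dictionaries and the pluggable `encode` callable below are illustrative assumptions, not the patent's data structures.

```python
def encode_stream(chunks, encode):
    """Encode only non-silent chunks; each chunk carries the silence
    flag set during audio processing, so flagged data never reaches
    the encoder and no compression cycles are spent on it."""
    packets = []
    for chunk in chunks:
        if chunk.get("silent"):
            continue  # silent audio is dropped before encoding
        packets.append(encode(chunk["samples"]))
    return packets
```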
The data encapsulation unit 143 is configured to perform encapsulation processing on the video compressed data and the audio compressed data according to the message transmission protocol set by the streaming server 2, so as to generate a data packet.
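The patent does not name the message transmission protocol, so as a generic illustration the sketch below encapsulates compressed data under a hypothetical layout: a 1-byte stream type (0 = video, 1 = audio), a 4-byte millisecond timestamp, and a 4-byte payload length, followed by the payload. Real protocols such as RTMP or RTP use their own richer headers.

```python
import struct

def encapsulate(kind: int, timestamp_ms: int, payload: bytes) -> bytes:
    """Pack one compressed chunk into a packet: type, timestamp,
    length, then payload (big-endian header)."""
    return struct.pack(">BII", kind, timestamp_ms, len(payload)) + payload

def decapsulate(packet: bytes):
    """Inverse of encapsulate, as the streaming server would parse it."""
    kind, ts, length = struct.unpack(">BII", packet[:9])
    return kind, ts, packet[9:9 + length]
```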
The data transmission unit 144 is configured to transmit the data packet to the online interactive service platform 20 provided by the streaming server 2 through the internet system based on the message transmission protocol set by the streaming server 2, so as to implement online live broadcast interaction with an avatar.
Referring to fig. 2, in another embodiment of the present application, the online multimedia interactive system 1 further includes an information interaction processing module 15, configured to extract text information from the online interactive service platform 20, accept feedback information entered for the extracted text, and output that feedback to the online interactive service platform 20, thereby providing online text-information interaction. In other words, the information interaction processing module 15 extracts the messages that audience members post on the online interactive service platform 20 so that the actual operator at the computer can enter relevant feedback, which is then uploaded to the platform, realizing text interaction between the avatar and the actual audience. In a specific embodiment, the information interaction processing module 15 may extract audience messages from the online interactive service platform 20 by web-page information crawling or by network packet capture.
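In the web-crawling variant, message extraction reduces to parsing chat markup out of the platform's pages. The `<li class="msg">` structure below is entirely hypothetical (each platform's markup differs, and production code would use a proper HTML parser rather than a regex), as is the pairing of messages with operator replies:

```python
import html
import re

def extract_messages(page_html: str):
    """Crawl-style extraction: pull viewer messages out of chat markup,
    assuming a hypothetical <li class="msg">...</li> structure."""
    return [html.unescape(m) for m in
            re.findall(r'<li class="msg">(.*?)</li>', page_html)]

def feedback_for(messages, replies):
    """Pair each extracted message with operator-entered feedback to be
    posted back to the platform."""
    return [{"msg": m, "reply": replies.get(m, "")} for m in messages]
```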
Fig. 3 is a basic flowchart illustrating an on-line multimedia interaction method of an avatar according to another embodiment of the present application. As shown in the figure, the on-line multimedia interaction method of the virtual image of the present application mainly includes the following processing steps:
step S31, importing a virtual model of the avatar.
In this embodiment, the avatar may be, for example, an avatar of a human being, an avatar of an animal, or the like.
The virtual model of the avatar may include a facial model and a skeletal model of the avatar, wherein the facial model is used to reflect real-time facial expressions of the avatar, and the skeletal model is used to reflect real-time actions performed by the avatar.
Step S32, extracting the actual video data and the actual audio data of the actual character in real time.
In this embodiment, the actual character may be, for example, a real person or an animal. A real-time image extraction rate may be set according to the actual network bandwidth, the processing performance of the computer, and the network transmission protocol, and the actual video data of the actual character is then extracted at that rate. Typically, a three-dimensional engine offers rendering rates such as 60 fps or 30 fps; the present application therefore determines the required image rate from factors such as the actual network bandwidth, the computer's processing performance, and the target transmission protocol, and has the three-dimensional engine render the image of the virtual model to a fixed target texture, which is then read at fixed intervals (e.g., 1000 ms / 25 = 40 ms for 25 fps), thereby ensuring the real-time performance and fluency of the rendered image of the virtual model.
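The rate selection and fixed-interval texture read described above can be sketched as follows; the per-fps bandwidth constant and the `choose_target_fps` policy are illustrative assumptions, not part of the patent:

```python
def frame_interval_ms(target_fps: int) -> float:
    """Interval between target-texture reads, e.g. 1000 / 25 = 40 ms at 25 fps."""
    return 1000.0 / target_fps

def choose_target_fps(bandwidth_kbps: float, engine_rates=(60, 30, 25)) -> int:
    """Hypothetical policy: pick the highest engine-supported rate the link
    can sustain, assuming roughly 80 kbps of video payload per fps."""
    for rate in engine_rates:
        if bandwidth_kbps >= rate * 80:
            return rate
    return engine_rates[-1]  # fall back to the lowest supported rate
```

For a 25 fps target, the render loop would read the fixed target texture every 40 ms regardless of how fast the engine itself renders.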
Step S33, mapping the actual video data and the actual audio data to the virtual model in real time to generate the virtual audio-video animation of the avatar.
In this embodiment, the virtual audio-video animation of the avatar is composed of virtual image data and virtual sound data of the avatar. The present application decomposes motion data from the actual video data to generate motion-driving data, and drives the virtual model to perform the corresponding motion, thereby generating the virtual image data of the avatar. Specifically, a static image and a dynamic image can be decomposed from the actual video data and superimposed onto the virtual image rendered by the three-dimensional engine: the static image contains image information such as a subtitle picture and a static special-effect picture, and the decomposed dynamic image is likewise superimposed onto the rendered virtual image, together forming the virtual image picture of the avatar. In addition, the present application can adjust the effect of the image rendered by the three-dimensional engine; for example, the degree of shading can be adjusted through a shader language, thereby providing a real-time video filter function.
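The superimposition of decomposed layers and the shader-style shading adjustment can be illustrated with a minimal per-pixel sketch; this pure-Python stand-in only mimics what a three-dimensional engine or shader would do on the GPU:

```python
def overlay(base, layer, alpha):
    """Blend a decomposed static/dynamic layer onto the engine-rendered frame:
    out = (1 - alpha) * base + alpha * layer, per channel.
    `base` and `layer` are equal-length lists of (r, g, b) pixels."""
    return [
        tuple(round((1 - alpha) * b + alpha * l) for b, l in zip(bp, lp))
        for bp, lp in zip(base, layer)
    ]

def shade(frame, factor):
    """Crude stand-in for a shader-language shading adjustment (real-time
    filter): scale each channel and clamp to the 8-bit range."""
    return [tuple(min(255, round(c * factor)) for c in px) for px in frame]
```

A subtitle or special-effect layer would be composited with `overlay` after the engine renders the avatar, then `shade` (or a real shader) applies the filter.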
Meanwhile, the present application also performs processing that includes at least audio noise reduction, audio silence detection, and audio echo cancellation on the real-time audio data, thereby generating the virtual sound data of the avatar. In an embodiment of the present application, the audio noise reduction, audio silence detection, and audio echo cancellation may be performed using the relevant functions provided by a third-party open-source SDK.
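The patent defers noise reduction, silence detection, and echo cancellation to a third-party SDK. As a hedged illustration, silence detection alone can be approximated with a simple RMS energy threshold; the threshold value below is an arbitrary assumption:

```python
import math

def rms(samples):
    """Root-mean-square energy of a frame of PCM samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def tag_silence(frames, threshold=500.0):
    """Attach a silence flag to each audio frame (a list of 16-bit PCM
    samples); flagged frames can later be skipped by the encoder."""
    return [(frame, rms(frame) < threshold) for frame in frames]
```

The silence flags produced here correspond to the "mute audio data" that step S341 below excludes from encoding.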
Step S34, outputting the virtual audio-video animation in real time to the online interactive service platform provided by the streaming media server, and accessing the avatar to the online interactive service platform for multimedia interactive live broadcast.
In a specific embodiment, the method can be applied to an electronic device provided with multimedia live-broadcast software. The virtual audio-video animation is fed into the multimedia live-broadcast software so that the software outputs it to the online interactive service platform in real time, whereby the avatar, rather than the traditional actual character, is accessed to the online interactive service platform for multimedia interactive live broadcast.
Referring to fig. 4, in another embodiment of the present application, the step S34 specifically includes the following processing steps:
step S341, performing encoding compression processing on the virtual image data to generate video compressed data, and performing encoding compression processing on the virtual sound data to generate audio compressed data.
In this embodiment, a third-party open-source SDK may be selected to encode and compress the virtual image data, in either a hardware or a software compression-encoding manner, according to the format actually supported for the virtual image data.
In addition, it should be noted that, for the aforementioned mute audio data detected in the real-time audio data, audio encoding compression is not executed during the audio compression process, so as to reduce system resource consumption.
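Skipping the encoder for frames flagged as silent, as described above, might look like the following sketch; the `encode` callable stands in for whatever hardware or software codec is actually used:

```python
def encode_stream(flagged_frames, encode):
    """Run `encode` only on frames not flagged as silent: silent audio is
    never compressed, saving CPU and bandwidth."""
    packets = []
    for frame, silent in flagged_frames:
        if not silent:
            packets.append(encode(frame))
    return packets
```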
Step S342, the video compressed data and the audio compressed data are encapsulated according to the message transmission protocol set by the streaming media server to generate a data packet.
Step S343, transmitting the generated data packet, based on the message transmission protocol set by the streaming media server, to the online interactive service platform provided by the streaming media server, so that the avatar performs online live-broadcast interaction on the online interactive service platform.
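Steps S342-S343 amount to wrapping the compressed audio and video data in a protocol-defined header and sending the result. As a simplified sketch (a real protocol such as RTMP uses a much richer header; the 9-byte layout here is illustrative only, with tag-type values borrowed from FLV):

```python
import struct

AUDIO, VIDEO = 0x08, 0x09  # tag types as used by FLV/RTMP

def make_packet(kind: int, timestamp_ms: int, payload: bytes) -> bytes:
    """Prefix compressed data with a minimal header: 1-byte type,
    4-byte timestamp, 4-byte payload length (big-endian)."""
    return struct.pack(">BII", kind, timestamp_ms, len(payload)) + payload

def parse_packet(packet: bytes):
    """Inverse of make_packet, as the streaming server would apply it."""
    kind, ts, n = struct.unpack(">BII", packet[:9])
    return kind, ts, packet[9:9 + n]
```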
In addition, in another embodiment of the present application, the method may further include, during the live-broadcast interaction, extracting text information from the online interactive service platform in real time, accepting feedback information entered in response to the extracted text information, and outputting the feedback information to the online interactive service platform, thereby providing text interaction between the avatar and the actual audience. In a specific embodiment, the viewers' messages can be extracted from the online interactive service platform by crawling its web pages or by capturing network packets.
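The extraction mechanism itself depends on the platform; once a chat payload has been obtained (by web crawling or packet capture), parsing it might look like the following sketch, where the `{"messages": [...]}` JSON shape is purely hypothetical:

```python
import json

def extract_messages(captured: str):
    """Pull (user, text) pairs out of a captured chat payload so the
    operator can read them and type feedback for the avatar to post."""
    doc = json.loads(captured)
    return [(m["user"], m["text"]) for m in doc.get("messages", [])]
```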
The online multimedia interaction system and method for an avatar provided by the embodiments of the present application map the actual video data and actual audio data of the actual character, extracted in real time, onto the virtual model to generate a virtual audio-video animation of the avatar that is synchronized with the actual character, and then transmit the animation to the online interactive service platform, so that the avatar rather than the actual character performs the multimedia interactive live broadcast on the platform, thereby making the live broadcast more engaging.
Moreover, the present application calculates the real-time image extraction rate according to the actual network bandwidth, the processing performance of the computer, and the network transmission protocol, extracts the video data at that rate, and encapsulates and transmits the virtual audio-video animation data based on the message transmission protocol of the online interactive service platform, so as to ensure the fluency and real-time performance of the interactive live broadcast.
In addition, the present application can also extract the text information that viewers enter on the online interactive service platform, accept feedback information entered in response to the extracted text, and output the feedback to the platform, so that the avatar and the actual audience communicate through text.
The above-described embodiments of the apparatus are merely illustrative, wherein the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general-purpose hardware platform, and certainly can also be implemented by hardware. With this understanding, the above technical solutions, or the portions thereof that contribute to the prior art, may be embodied in the form of a software product stored on a computer-readable storage medium, which includes any mechanism for storing or transmitting information in a form readable by a machine (e.g., a computer). For example, a machine-readable medium includes read-only memory (ROM), random-access memory (RAM), magnetic disk storage media, optical storage media, flash storage media, and electrical, optical, acoustical, or other forms of propagated signals (e.g., carrier waves, infrared signals, digital signals). The computer software product includes instructions for causing a computing device (which may be a personal computer, a server, a network device, etc.) to perform the methods described in the various embodiments or in portions thereof.
Finally, it should be noted that: the above embodiments are only used for illustrating the technical solutions of the embodiments of the present application, and are not limited thereto; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, apparatus (device), or computer program product. Accordingly, embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
Embodiments of the present invention are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (devices) and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

Claims (17)

1. An on-line multimedia interactive system for an avatar, comprising:
the virtual image importing module is used for importing a virtual model of the virtual image;
the data extraction module is used for extracting actual video data and actual audio data of an actual image in real time;
the data driving module is used for mapping the actual video data and the actual audio data to the virtual model in real time to generate a virtual audio-video animation of the virtual image; and
the virtual image output module is used for outputting the virtual audio-video animation to an online interactive service platform provided by a streaming media server in real time, and accessing the virtual image to the online interactive service platform for multimedia interactive live broadcast;
the data driving module comprises a video data processing unit, wherein the video data processing unit is used for decomposing a static image and a dynamic image from the actual video data, and superposing the decomposed static image and dynamic image to a virtual image rendered by a three-dimensional engine to form a virtual audio-video picture, and the static image comprises a subtitle picture and a static special effect picture.
2. The system of claim 1, wherein the data extraction module further comprises:
a video extracting unit for extracting the actual video data of the actual avatar according to a preset real-time image extraction rate; and
an audio extracting unit for extracting the actual audio data of the actual character.
3. The system of claim 2, wherein the real-time image extraction rate is set according to a network bandwidth, a processing capability of a computer, and a network transmission protocol.
4. The on-line multimedia interactive system for avatars according to claim 2, wherein the virtual video animation of the avatar is composed of virtual image data and virtual sound data of the avatar, and the data driving module further comprises:
a video data processing unit for decomposing motion data from the actual video data to generate motion driving data to drive the virtual model to perform a motion, thereby generating the virtual image data of the avatar; and
an audio data processing unit for performing processing including at least audio noise reduction, audio silence detection, and audio echo cancellation, respectively, with respect to the actual audio data, thereby generating the virtual sound data of the avatar.
5. The on-line multimedia interactive system for avatars according to claim 4, wherein said avatar output module further comprises:
a video data encoding unit configured to perform encoding compression processing on the virtual video data to generate video compressed data;
a sound data encoding unit for performing encoding compression processing for the virtual sound data to generate audio compression data;
a data encapsulation unit, configured to encapsulate the video compressed data and the audio compressed data according to a message transmission protocol set by the streaming media server to generate a data packet; and
the data transmission unit is used for transmitting the data packet to an online interactive service platform provided by the streaming media server based on the message transmission protocol.
6. The system of claim 5, wherein the audio silence detecting operation performed by the audio data processing unit includes detecting silence audio data having a silence flag in the actual audio data, so that the sound data encoding unit does not perform the encoding compression process on the silence audio data.
7. The on-line multimedia interactive system for avatars as claimed in claim 1, further comprising an information interaction processing module for extracting text information in the on-line interactive service platform, providing input feedback information for the extracted text information, and outputting the feedback information to the on-line interactive service platform to provide on-line text information interaction operation.
8. The system of claim 7, wherein the information interaction processing module extracts the text information in the online interaction service platform by at least one of a web information crawling method and a network packet capturing method.
9. The system of claim 1, wherein the system is applied to an electronic device, and a multimedia live broadcast software communicatively connected to the online interactive service platform is further installed in the electronic device, the avatar output module is configured to access the virtual video animation to the multimedia live broadcast software, so that the multimedia live broadcast software can output the virtual video animation to the online interactive service platform in real time, and the virtual image is accessed to the online interactive service platform for multimedia interactive live broadcast.
10. An on-line multimedia interaction method for an avatar, comprising:
importing a virtual model of the virtual image;
extracting actual video data and actual audio data of an actual image in real time;
mapping the actual video data and the actual audio data to the virtual model in real time to generate a virtual audio-video animation of the virtual image; and
outputting the virtual video animation to an online interactive service platform provided by a streaming media server in real time, and accessing the virtual image to the online interactive service platform to perform multimedia interactive live broadcast;
the real-time mapping of the actual video data and the actual audio data to the virtual model to generate the virtual image animation of the virtual image comprises:
decomposing a static image and a dynamic image from the actual video data, and superposing the decomposed static image and dynamic image onto a virtual image rendered by a three-dimensional engine to form a virtual image picture of the virtual image, wherein the static image comprises a subtitle picture and a static special effect picture.
11. The method of on-line multimedia interaction of an avatar according to claim 10, further comprising:
extracting the actual video data of the actual image according to a preset real-time image extraction rate.
12. The method of on-line multimedia interaction of an avatar according to claim 11, further comprising setting the real-time image extraction rate according to network bandwidth, computer processing capability and network transmission protocol.
13. The method of claim 11, wherein the avatar animation is composed of avatar image data and avatar audio data, the method further comprising:
decomposing motion data from the actual video data to generate motion driving data and driving the virtual model to perform motion according to the motion driving data, thereby generating the virtual image data of the avatar; and
performing processing including at least audio noise reduction, audio silence detection, and audio echo cancellation on the actual audio data, thereby generating the virtual sound data of the avatar.
14. The method of on-line multimedia interaction of an avatar according to claim 13, further comprising:
performing encoding compression processing on the virtual image data to generate video compression data;
performing encoding compression processing on the virtual sound data to generate audio compression data;
packaging the video compression data and the audio compression data according to a message transmission protocol set by the streaming media server to generate a data packet; and
transmitting the data packet to an online interactive service platform provided by the streaming media server based on the message transmission protocol.
15. The method of on-line multimedia interaction of an avatar according to claim 14, wherein said audio silence detecting operation includes detecting silence audio data having a silence flag in said actual audio data, and said encoding compression process is not performed with respect to said silence audio data.
16. The method of claim 10, further comprising extracting text information in the online interactive service platform, providing input feedback information for the extracted text information, and outputting the feedback information to the online interactive service platform to provide online text information interactive operation.
17. The method of claim 10, wherein the method is applied to an electronic device installed with multimedia live broadcast software, and the method further comprises feeding the virtual video animation into the multimedia live broadcast software, so that the multimedia live broadcast software can output the virtual video animation to the online interactive service platform in real time, and the virtual image can be accessed to the online interactive service platform for multimedia interactive live broadcast.
CN201810031218.1A 2018-01-12 2018-01-12 On-line multimedia interaction system and method of virtual image Active CN108200446B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810031218.1A CN108200446B (en) 2018-01-12 2018-01-12 On-line multimedia interaction system and method of virtual image


Publications (2)

Publication Number Publication Date
CN108200446A CN108200446A (en) 2018-06-22
CN108200446B true CN108200446B (en) 2021-04-30

Family

ID=62588894

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810031218.1A Active CN108200446B (en) 2018-01-12 2018-01-12 On-line multimedia interaction system and method of virtual image

Country Status (1)

Country Link
CN (1) CN108200446B (en)

Families Citing this family (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109240709A (en) * 2018-07-26 2019-01-18 北京运多多网络科技有限公司 SDK cut-in method and device for live streaming
CN108986192B (en) * 2018-07-26 2024-01-30 北京运多多网络科技有限公司 Data processing method and device for live broadcast
CN109448467A (en) * 2018-11-01 2019-03-08 深圳市木愚科技有限公司 A kind of virtual image teacher teaching program request interaction systems
CN110139115B (en) * 2019-04-30 2020-06-09 广州虎牙信息科技有限公司 Method and device for controlling virtual image posture based on key points and electronic equipment
CN110071938B (en) * 2019-05-05 2021-12-03 广州虎牙信息科技有限公司 Virtual image interaction method and device, electronic equipment and readable storage medium
CN110225400B (en) * 2019-07-08 2022-03-04 北京字节跳动网络技术有限公司 Motion capture method and device, mobile terminal and storage medium
CN110910691B (en) * 2019-11-28 2021-09-24 深圳市木愚科技有限公司 Personalized course generation method and system
CN110971930B (en) * 2019-12-19 2023-03-10 广州酷狗计算机科技有限公司 Live virtual image broadcasting method, device, terminal and storage medium
CN112164128B (en) * 2020-09-07 2024-06-11 广州汽车集团股份有限公司 Vehicle-mounted multimedia music visual interaction method and computer equipment
CN112634684B (en) * 2020-12-11 2023-05-30 深圳市木愚科技有限公司 Intelligent teaching method and device
CN112788355B (en) * 2020-12-30 2022-08-23 北京达佳互联信息技术有限公司 Information processing method, device and storage medium
CN112732084A (en) * 2021-01-13 2021-04-30 西安飞蝶虚拟现实科技有限公司 Future classroom interaction system and method based on virtual reality technology
CN113132741A (en) * 2021-03-03 2021-07-16 广州鑫泓设备设计有限公司 Virtual live broadcast system and method
CN114786023A (en) * 2022-03-28 2022-07-22 南京小灿灿网络科技有限公司 AR live broadcast system based on virtual reality
CN115113963B (en) * 2022-06-29 2023-04-07 北京百度网讯科技有限公司 Information display method and device, electronic equipment and storage medium
CN116320521A (en) * 2023-03-24 2023-06-23 吉林动画学院 Three-dimensional animation live broadcast method and device based on artificial intelligence

Citations (3)

Publication number Priority date Publication date Assignee Title
CN105657294A (en) * 2016-03-09 2016-06-08 北京奇虎科技有限公司 Method and device for presenting virtual special effect on mobile terminal
CN106383587A (en) * 2016-10-26 2017-02-08 腾讯科技(深圳)有限公司 Augmented reality scene generation method, device and equipment
CN106648120A (en) * 2017-02-21 2017-05-10 戴雨霖 Training system for escape from fire based on virtual reality and somatosensory technology

Family Cites Families (10)

Publication number Priority date Publication date Assignee Title
US20050130725A1 (en) * 2003-12-15 2005-06-16 International Business Machines Corporation Combined virtual and video game
CN101494547A (en) * 2009-03-05 2009-07-29 广东威创视讯科技股份有限公司 Method and system for implementing conference combining local conference and network conference equipment
CN102377745A (en) * 2010-08-19 2012-03-14 上海济丽信息技术有限公司 Interactive collaboration system based on large spliced screen and interactive collaboration method
US9268406B2 (en) * 2011-09-30 2016-02-23 Microsoft Technology Licensing, Llc Virtual spectator experience with a personal audio/visual apparatus
CN105828106B (en) * 2016-04-15 2019-01-04 山东大学苏州研究院 A kind of non-integral multiple frame per second method for improving based on motion information
CN106937154A (en) * 2017-03-17 2017-07-07 北京蜜枝科技有限公司 Process the method and device of virtual image
CN106993195A (en) * 2017-03-24 2017-07-28 广州创幻数码科技有限公司 Virtual portrait role live broadcasting method and system
CN107274464A (en) * 2017-05-31 2017-10-20 珠海金山网络游戏科技有限公司 A kind of methods, devices and systems of real-time, interactive 3D animations
CN107170030A (en) * 2017-05-31 2017-09-15 珠海金山网络游戏科技有限公司 A kind of virtual newscaster's live broadcasting method and system
CN107438183A (en) * 2017-07-26 2017-12-05 北京暴风魔镜科技有限公司 A kind of virtual portrait live broadcasting method, apparatus and system


Also Published As

Publication number Publication date
CN108200446A (en) 2018-06-22

Similar Documents

Publication Publication Date Title
CN108200446B (en) On-line multimedia interaction system and method of virtual image
US9210372B2 (en) Communication method and device for video simulation image
CN107241646B (en) Multimedia video editing method and device
CN108401192A (en) Video stream processing method, device, computer equipment and storage medium
WO2016150317A1 (en) Method, apparatus and system for synthesizing live video
CN107979763B (en) Virtual reality equipment video generation and playing method, device and system
KR20220077132A (en) Method and system for generating binaural immersive audio for audiovisual content
EP4099709A1 (en) Data processing method and apparatus, device, and readable storage medium
CN110401810B (en) Virtual picture processing method, device and system, electronic equipment and storage medium
KR101915786B1 (en) Service System and Method for Connect to Inserting Broadcasting Program Using an Avata
US9473810B2 (en) System and method for enhancing live performances with digital content
CN112272327B (en) Data processing method, device, storage medium and equipment
CN114095744B (en) Video live broadcast method and device, electronic equipment and readable storage medium
CN110933485A (en) Video subtitle generating method, system, device and storage medium
CN110809173A (en) Virtual live broadcast method and system based on AR augmented reality of smart phone
CN111464828A (en) Virtual special effect display method, device, terminal and storage medium
CN112135155A (en) Audio and video connecting and converging method and device, electronic equipment and storage medium
CN114286021B (en) Rendering method, rendering device, server, storage medium, and program product
KR101915792B1 (en) System and Method for Inserting an Advertisement Using Face Recognition
CN114531564A (en) Processing method and electronic equipment
CN109862385B (en) Live broadcast method and device, computer readable storage medium and terminal equipment
US10762913B2 (en) Image-based techniques for audio content
CN116962742A (en) Live video image data transmission method, device and live video system
CN108495163B (en) Video barrage reading device, system, method and computer readable storage medium
CN112423108B (en) Method and device for processing code stream, first terminal, second terminal and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant