CN118120238A - Video playing method, system and storage medium

Info

Publication number
CN118120238A
Authority
CN
China
Prior art keywords
video
video data
information
server
decryption information
Prior art date
2022-10-28
Legal status
Pending
Application number
CN202280063551.5A
Other languages
Chinese (zh)
Inventor
谭红平
刘文泽
吴杰
Current Assignee
Streamax Technology Co Ltd
Original Assignee
Streamax Technology Co Ltd
Priority date
2022-10-28
Filing date
2022-10-28
Publication date
2024-05-31
Application filed by Streamax Technology Co Ltd
Publication of CN118120238A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2347 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs, involving video stream encryption


Abstract

The application relates to the field of video and provides a video playing method, system and storage medium. The method includes: receiving a video playing request from a browser side and sending a video-opening signaling to a device side that captures video; receiving first video data uploaded by the device side, encrypting the audio frames in the first video data with predetermined first encryption information, expanding the video frame structure in the first video data, and adding first decryption information to the expanded video frame structure; and sending the expanded video frames and the encrypted audio frames to the browser side, which decrypts and plays them according to the first decryption information carried in the expanded video frame structure. By expanding the video frame structure, the application avoids a separate call to the server interface to query the first decryption information, which helps relieve the access pressure on the server and reduce the video playing delay.

Description

Video playing method, system and storage medium
Technical Field
The present application relates to the field of video, and in particular, to a video playing method, system and storage medium.
Background
In recent years, with the rapid development of technologies such as computing, big data, image analysis and network transmission, video surveillance has been widely applied in fields such as traffic, finance, public security, electric power, water conservancy and hotels as an important component of the public security system, and its market scale keeps growing year by year. The captured video, however, contains a great deal of personal biometric information and organizational confidential information. If the video is leaked, personal privacy may be violated, business secrets may be exposed, and adverse social effects may follow. In the current environment where network security and privacy protection are increasingly important, the security of video access is therefore critical.
To ensure the security of video access, the common industry solution is for the server to encrypt the audio and video data and for the client to decrypt and play them (if the client is a browser, a plug-in, such as an ActiveX plug-in, must also be installed). The main flow is as follows:
1. A user starts a client for playing video;
2. The client queries the server for the decryption information of the video to be played;
3. The server checks the access legitimacy of the client and, after the check passes, returns the decryption information to the client;
4. The client pulls the video stream from the server, decrypts the encrypted audio and video data with the decryption information, and then decodes and plays them.
Although this approach improves the security of audio and video data to some extent, the browser must install a plug-in to play video, and the decryption information must be queried from the server at playback time. When a large number of users play videos simultaneously, this places great access pressure on the server and limits the concurrency of the platform; the extra round trip to query the decryption information also takes time, so the display of video pictures is delayed, which harms the experience of low-latency video playing.
Technical problem
In view of the above, embodiments of the present application provide a video playing method, system and storage medium to solve the problems in the prior art that, when a client plays a video, the decryption information must be queried from the server, which places great access pressure on the server, limits the concurrency of the platform, delays the display of video pictures, and degrades low-latency video playing.
Technical solution
A first aspect of an embodiment of the present application provides a video playing method, where the method is applied to a server, and the method includes:
Receiving a video playing request of a browser end, and sending a video opening signaling to a device end for collecting videos;
receiving first video data uploaded by a device side, encrypting an audio frame in the first video data through preset first encryption information, expanding a video frame structure in the first video data, and adding first decryption information in the expanded video frame structure, wherein the first decryption information corresponds to the first encryption information;
And packaging the expanded video frames and the encrypted audio frames into second video data, and sending the second video data to a browser end, wherein the second video data is used for decrypting and playing the second video data by the browser end according to the first decryption information in the expanded video frame structure.
With reference to the first aspect, in a first possible implementation manner of the first aspect, before sending signaling for opening the video to a device side for capturing the video, the method further includes:
establishing a communication link with a device end, and receiving registration information of the device end through the communication link;
and carrying out identity authentication on the equipment end according to the registration information, and determining the online state of the equipment after the authentication is passed.
With reference to the first aspect, in a second possible implementation manner of the first aspect, the first video data uploaded by the device side is encrypted video data, and before receiving the first video data uploaded by the device side, the method further includes:
and sending the preset first encryption information to the equipment end, wherein the first encryption information is used for encrypting the third video data acquired by the equipment end.
With reference to the first aspect, in a third possible implementation manner of the first aspect, adding first decryption information in the extended video frame structure includes:
Encrypting the first decryption information according to predetermined second encryption information;
And adding the encrypted first decryption information to a network abstraction layer unit in the expanded video frame structure.
In a second aspect, an embodiment of the present application provides a video playing method, where the method is applied to a browser, and the method includes:
sending a video playing request to a server, wherein the video playing request comprises equipment information of a video requested to be played;
Receiving second video data returned by a server, wherein the second video data comprises video frames and audio frames in first video data sent by a device, the video frames are video frames added with preset first decryption information after structural expansion, and the audio frames are audio frames encrypted by first encryption information corresponding to the first decryption information;
And decrypting the encrypted audio frame according to the first decryption information in the second video data, and playing the video according to the video frame and the decrypted audio frame.
With reference to the second aspect, in a first possible implementation manner of the second aspect, decrypting the encrypted audio frame according to the first decryption information in the second video data includes:
decapsulating the second video data through a decryption library compiled, in a WebAssembly coding manner, into the bytecode format of a low-level virtual machine, to obtain the video frames with the expanded video frame structure and the encrypted audio frames contained in the second video data;
analyzing the video frame with the expanded video frame structure to obtain first decryption information included in the video frame;
and decrypting the encrypted audio frame according to the first decryption information.
With reference to the first possible implementation manner of the second aspect, in a second possible implementation manner of the second aspect, after obtaining the first decryption information included in the video frame, the method further includes:
Detecting an encryption state of the video frame;
And decrypting the video frame through the first decryption information when the video frame is in an encrypted state.
With reference to the first possible implementation manner of the second aspect, in a third possible implementation manner of the second aspect, parsing the video frame with the extended video frame structure to obtain first decryption information included in the video frame includes:
And analyzing the video frame with the expanded video frame structure, and decrypting the expanded information of the video frame through preset second decryption information to obtain first decryption information included in the video frame.
A third aspect of an embodiment of the present application provides a server, including a memory, a processor, a communication unit, and a computer program stored in the memory and executable on the processor, wherein:
The communication unit is used for receiving and transmitting data or instructions to a browser end or a device end;
The processor is configured to perform the video playing method according to any one of the first aspect.
A fourth aspect of the embodiments of the present application provides a browser side, including a memory, a processor, a communication unit, and a computer program stored in the memory and executable on the processor, wherein:
the communication unit is used for receiving and transmitting data or instructions to the server;
the processor is configured to perform the video playing method according to any one of the second aspects.
A fifth aspect of the embodiment of the present application provides a video playing system, where the system includes a browser side, a server side, and a device side, where:
the server is configured to execute the video playing method according to any one of the first aspect;
the browser side is configured to execute the video playing method according to any one of the second aspects.
A sixth aspect of the embodiments of the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the method according to any one of the first or second aspects.
Advantageous effects
Compared with the prior art, the embodiments of the present application have the following beneficial effects. In the video playing method of the embodiments of the present application, when the server side receives a video playing request from the browser side, it sends a video-opening signaling to the device side, receives the first video data uploaded by the device side, expands the video frame structure of the video frames in the first video data, adds the first decryption information to the expanded video frame structure, encrypts the audio frames in the first video data with the first encryption information, encapsulates the expanded video frames and the encrypted audio frames into second video data, and sends the second video data to the browser side for playing. As a result, the browser side does not need to call a server interface separately to query the first decryption information when playing the video, which fundamentally removes one service interaction and helps relieve the access pressure on the server; and because the first decryption information is carried in the expanded video frame structure, no extra access time is consumed, so the video playing delay can be reduced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the embodiments or in the description of the prior art are briefly introduced below. It is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a schematic diagram of a video playing system according to a method provided by an embodiment of the present application;
fig. 2 is a schematic diagram of video play library construction at a browser end according to an embodiment of the present application;
Fig. 3 is a schematic implementation flow chart of a video playing method according to an embodiment of the present application;
fig. 4 is an expanded schematic diagram of a video frame structure according to an embodiment of the present application;
fig. 5 is a schematic flow chart of implementing decryption playing at a browser end according to an embodiment of the present application;
fig. 6 is a schematic diagram of an interaction flow of video playing according to an embodiment of the present application;
fig. 7 is a schematic diagram of a video playing device according to an embodiment of the present application;
Fig. 8 is a schematic diagram of an electronic device according to an embodiment of the present application.
Embodiments of the invention
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
In order to illustrate the technical scheme of the application, the following description is made by specific examples.
Fig. 1 is a schematic diagram of a video playing system in which the video playing method according to an embodiment of the present application can be implemented. As shown in fig. 1, the video playing system includes a browser side, a server side and a device side.
The device side is the producer of the video content and may include one or more webcams or other video capture devices for capturing third video data. After the device side is started, it automatically connects to the server side and establishes a communication link with it. The device side then initiates an authentication request to the server side, and after the server side authenticates the device side, it sets the authenticated device side to the online state. When the device side is online, the browser side can play the video through the server side.
When the device side is running normally, it captures third video data in real time through a microphone and an image sensor, encodes and compresses the captured third video data with a preset encoding method, such as H264 or H265, and can then encrypt it with a preset first encryption algorithm, such as AES (Advanced Encryption Standard) or RSA (Rivest-Shamir-Adleman). The encrypted video data may be stored in a memory on the device side. When a user requests playback through the browser, the encrypted third video data can be uploaded to the server side for real-time playing, or the unencrypted third video data can be uploaded to the server side for real-time playing.
The server side coordinates the data and signaling between the browser side and the device side so that the browser side can play the third video data captured by the device side. Coordinating the device side includes managing its online and offline states; when the device side is online, the latest first encryption information, such as a key, is sent to it. Coordinating the browser side means that, when a video playing request sent by a user through the browser is received, the server side notifies the device side in real time to upload the first video data, expands the video frame structure in the uploaded first video data, and adds extension information, which includes the first decryption information corresponding to the first encryption information, to the expanded video frame structure. The expanded first video data is then sent to the browser side.
The browser side is a terminal with a browser installed, which can request playback of the video captured by the device side through the installed browser. The browser side is the playing client of the video. When the user plays a video, the browser side is responsible for pulling the encrypted second video data from the server side, parsing the second video data to obtain the first decryption information, and decrypting the second video data, thereby playing the video.
To achieve plug-in-free audio and video playback at the browser side, the browser side implements a video play library (videoSdk) that can be called directly by the web page program without installing any plug-in. As shown in fig. 2, the video play library includes two sub-libraries: a decryption library (videoPlayer.wasm) in wasm (WebAssembly) format and a play library (videoPlayer.js) in js format.
For the video play sub-library in wasm format, the WebAssembly method may be adopted to compile programs such as the video data processing module, the multimedia video processing tool FFmpeg and the decryption algorithms (including AES, RSA, etc.) into a wasm-format video play sub-library that implements core service logic such as decapsulation, decryption, FFmpeg soft decoding and FMP4 (Fragmented MP4) file encapsulation of the video data. The WebAssembly method can compile C/C++ programs into LLVM (Low Level Virtual Machine) bytecode, an encoding format that can only be understood by a computer and that offers high security and high running speed.
For the js-format play library, the JavaScript language can be used to implement a video streaming module, an MSE (Media Source Extensions) play module and a WebGL (Web Graphics Library) play module. When the user requests video, the streaming module pulls the video data from the server side and feeds it into videoPlayer.wasm for processing. H.264 video frames are directly encapsulated as FMP4 and fed into the MSE play module for decoding and rendering, while H.265 video frames are first soft-decoded by FFmpeg inside videoPlayer.wasm and the decoded video data is then fed into the WebGL play module for rendering. In this way the browser can decode and play the audio and video data normally, and the user can hear the sound and see the video pictures. A sketch of the MSE branch follows.
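As an illustration of the MSE branch just described, the following browser-side TypeScript sketch appends FMP4 segments produced by the wasm layer to a SourceBuffer so the browser's native decoder plays them. The codec string and the `segments` source are assumptions for illustration and are not defined by the patent.
```typescript
// Minimal MSE playback sketch: feed FMP4 segments into a SourceBuffer.
async function playFmp4Stream(
  video: HTMLVideoElement,
  segments: AsyncIterable<Uint8Array>, // e.g. FMP4 fragments emitted by the wasm layer
): Promise<void> {
  const mime = 'video/mp4; codecs="avc1.64001f, mp4a.40.2"'; // assumed H.264 + AAC profile
  if (!MediaSource.isTypeSupported(mime)) throw new Error('codec not supported by MSE');

  const mediaSource = new MediaSource();
  video.src = URL.createObjectURL(mediaSource);
  await new Promise<void>((resolve) =>
    mediaSource.addEventListener('sourceopen', () => resolve(), { once: true }));

  const sourceBuffer = mediaSource.addSourceBuffer(mime);
  for await (const segment of segments) {
    // A SourceBuffer accepts only one append at a time; wait for the previous one to finish.
    if (sourceBuffer.updating) {
      await new Promise<void>((resolve) =>
        sourceBuffer.addEventListener('updateend', () => resolve(), { once: true }));
    }
    sourceBuffer.appendBuffer(segment);
  }
}
```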
Fig. 3 is a schematic implementation flow chart of a video playing method according to an embodiment of the present application, as shown in fig. 3, where the method includes:
in S301, a video playing request from a browser is received, and a signaling for opening a video is sent to a device for capturing a video.
In a possible implementation manner, before the server side receives the video playing request from the browser side, the method may further include a step in which the device side registers online with the server side.
A staff member or user can enter the device information, including the device serial number, on the server side in advance. When the device side is started, it establishes a communication link with the server side using its preset server access information, for example a TLS (Transport Layer Security) link, which secures the transmission process through transport encryption and identity authentication on top of HTTP (i.e., HTTPS).
After the communication link is established, the device side sends a registration packet to the server side, and the server side authenticates the device side according to the registration packet. After the authentication passes, the server side sets the state of the device side to online.
To ensure the security of the first video data transmitted between the server side and the device side, after the communication link is established the server side may send first encryption information to the device side, including key information of an encryption algorithm such as AES or RSA. When the encryption algorithm is a symmetric encryption algorithm, the first encryption information may be the encryption parameters; when the encryption algorithm is an asymmetric encryption algorithm, the first encryption information may be the public key information.
In a possible implementation manner, the same device side may include multiple video channels, and different first encryption information, for example different encryption algorithms or encryption parameters, may be configured for different video channels to further protect the video content; a configuration sketch follows. After receiving the issued first encryption information, the device side can encrypt the corresponding video channel and reply to the server side with result information indicating whether the setting succeeded.
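A minimal sketch of how such per-channel encryption configuration might be represented on the server side; the field names and values are illustrative assumptions, not structures defined by the patent.
```typescript
// Per-channel first encryption information, keyed by "serial:channel".
interface ChannelEncryption {
  deviceSerial: string;
  channel: number;
  algorithm: 'aes-256-cbc' | 'rsa-oaep'; // symmetric or asymmetric choice per channel
  key: string;                           // hex key (symmetric) or PEM public key (asymmetric)
  iv?: string;                           // hex IV for symmetric modes
}

const channelKeys = new Map<string, ChannelEncryption>([
  ['DEV-0001:1', { deviceSerial: 'DEV-0001', channel: 1, algorithm: 'aes-256-cbc', key: 'replace-with-hex-key', iv: 'replace-with-hex-iv' }],
  ['DEV-0001:2', { deviceSerial: 'DEV-0001', channel: 2, algorithm: 'aes-256-cbc', key: 'replace-with-hex-key', iv: 'replace-with-hex-iv' }],
]);
```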
After the device side is online, a user can send a video playing request to the server side through the browser side. The video playing request may include device information and, optionally, video channel information. The media protocol between the browser side and the server side can be the standard HTTP-FLV (Flash Video over HTTP) transport protocol, as sketched below.
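The following browser-side sketch shows one way the streaming module could pull the HTTP-FLV stream for a given device and channel; the URL shape and the chunk handler are assumptions for illustration.
```typescript
// Pull an HTTP-FLV stream and hand each received chunk to the demux/decrypt layer.
async function pullHttpFlv(
  baseUrl: string,                        // e.g. "https://media.example.com" (illustrative)
  deviceId: string,
  channel: number,
  onChunk: (chunk: Uint8Array) => void,   // e.g. the videoSdk input function
): Promise<void> {
  const response = await fetch(`${baseUrl}/live/${deviceId}/${channel}.flv`);
  if (!response.ok || !response.body) throw new Error(`stream request failed: ${response.status}`);

  const reader = response.body.getReader();
  for (;;) {
    const { value, done } = await reader.read();
    if (done) break;
    onChunk(value); // FLV tags are parsed downstream; this module only moves bytes
  }
}
```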
After receiving the video playing request from the browser, the server sends a video opening signaling through a communication link (such as a TLS communication link) established between the device and the server, wherein the signaling can carry information of a channel where the video to be opened is located, an IP address of the server, a media port and the like.
After receiving the video opening signaling sent by the server, the device establishes a media link (such as a TLS media link) between the device and the server through the server IP address and the media port information carried in the signaling. Meanwhile, the device side also collects third video data in real time through the microphone and the camera, carries out H264 or H265 coding compression on the video data, and can also carry out encryption processing on video frames and audio frames in the video data by using the first encryption information issued by the server side. After the encryption processing is completed, the device side can upload the encrypted first video data through the established TLS media link.
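A minimal sketch of the device-side encryption step, assuming the first encryption information issued by the server side is an AES-256 key; the actual algorithm and parameters are whatever the server side configured for the channel.
```typescript
import { createCipheriv, randomBytes } from 'node:crypto';

// Encrypt one encoded (H264/H265 or audio) frame payload with AES-256-CBC.
function encryptFramePayload(encodedFrame: Buffer, key: Buffer /* 32 bytes */): { iv: Buffer; data: Buffer } {
  const iv = randomBytes(16); // fresh IV per frame, carried alongside the ciphertext
  const cipher = createCipheriv('aes-256-cbc', key, iv);
  const data = Buffer.concat([cipher.update(encodedFrame), cipher.final()]);
  return { iv, data }; // uploaded over the TLS media link together with the frame headers
}
```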
In S302, first video data uploaded by a device side is received, an audio frame in the first video data is encrypted by predetermined first encryption information, a video frame structure in the first video data is expanded, first decryption information is added to the expanded video frame structure, and the first decryption information corresponds to the first encryption information.
After the media port of the server receives the first video data, the video frame structure of the first video data can be expanded, and the first decryption information is added in the video frame structure.
When the first video data is encrypted video data, decapsulation processing can be performed on it to restore the H264/H265 video frames and the original audio frames.
For an I frame among the video frames, an SEI (Supplemental Enhancement Information, a mechanism of the H264/H265 video compression standards for attaching additional information to the video bitstream) NALU (Network Abstraction Layer Unit, used to encapsulate the data produced by the video coding layer for network transmission) is added as extension information in front of the I frame of the original video data, and the first decryption information is added in the NALU unit body. In this way the original frame structure of the H264/H265 video is not damaged, and the audio and video data can still be distributed over standard protocols such as HTTP-FLV, HLS (HTTP Live Streaming) or RTMP (Real Time Messaging Protocol, an open protocol for transmitting audio, video and data between a Flash player and a server). For an audio frame in the first video data, the audio frame can be encrypted directly with the first encryption information.
As shown in fig. 4, an embodiment of the present application provides an extended video frame structure. Each NALU unit consists of a NALU header and a NALU body. In the NALU bodies of the NALU units of the video data before extension, information such as the SPS (Sequence Parameter Set, carrying parameters that apply to the whole image sequence), the PPS (Picture Parameter Set, carrying parameters that apply to individual pictures) and the coded picture data is included. The NALU body of the extended NALU unit carries the supplemental enhancement information SEI, i.e. the first decryption information. A construction sketch for the H.264 case follows.
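A minimal sketch of the frame-structure extension for H.264 Annex-B streams: the already-encrypted first decryption information is wrapped in a user_data_unregistered SEI NAL unit (NAL type 6, payload type 5) and prepended to the I-frame access unit, leaving the original frames untouched. The 16-byte UUID and the payload layout are illustrative assumptions.
```typescript
import { Buffer } from 'node:buffer';

const START_CODE = Buffer.from([0, 0, 0, 1]);
// Illustrative 16-byte UUID marking "this SEI carries key information".
const KEY_INFO_UUID = Buffer.from('9a21f3b54cde4f0e8d11a7c2305566ff', 'hex');

// Insert emulation-prevention bytes so the payload cannot imitate a start code.
function toEbsp(rbsp: Buffer): Buffer {
  const out: number[] = [];
  let zeros = 0;
  for (const byte of rbsp) {
    if (zeros >= 2 && byte <= 3) { out.push(3); zeros = 0; } // 00 00 0x -> 00 00 03 0x
    out.push(byte);
    zeros = byte === 0 ? zeros + 1 : 0;
  }
  return Buffer.from(out);
}

// Build one user_data_unregistered SEI NAL unit carrying the encrypted key info.
function buildKeyInfoSei(encryptedKeyInfo: Buffer): Buffer {
  const payload = Buffer.concat([KEY_INFO_UUID, encryptedKeyInfo]);
  const sizeBytes: number[] = [];
  let remaining = payload.length;
  while (remaining >= 255) { sizeBytes.push(255); remaining -= 255; } // ff-continued size coding
  sizeBytes.push(remaining);
  const rbsp = Buffer.concat([
    Buffer.from([5]),        // payloadType = 5 (user_data_unregistered)
    Buffer.from(sizeBytes),  // payloadSize
    payload,
    Buffer.from([0x80]),     // rbsp_trailing_bits
  ]);
  return Buffer.concat([START_CODE, Buffer.from([0x06]), toEbsp(rbsp)]); // 0x06 = SEI NAL header
}

// Prepend the SEI to an Annex-B access unit that starts with an I frame.
function extendIFrame(iFrameAccessUnit: Buffer, encryptedKeyInfo: Buffer): Buffer {
  return Buffer.concat([buildKeyInfoSei(encryptedKeyInfo), iFrameAccessUnit]);
}
```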
To ensure the security of the extension information, the first decryption information may be secondarily encrypted with predetermined second encryption information. When the browser side receives the second video data, it can decrypt the secondarily encrypted first decryption information based on predetermined second decryption information to obtain the first decryption information. A sketch of this two-layer wrapping follows.
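A minimal sketch of the secondary protection step, assuming AES-256-CBC as the second encryption algorithm and a small JSON structure for the first decryption information; both are illustrative choices, not values fixed by the patent. In the described design, the unwrap side would run inside videoPlayer.wasm; it is shown here in the same Node-style TypeScript for symmetry.
```typescript
import { createCipheriv, createDecipheriv } from 'node:crypto';

// First decryption information as it might be serialized before secondary encryption.
interface FirstDecryptionInfo { algorithm: string; key: string; iv: string; } // hex-encoded fields

// Sender: encrypt the first decryption information with the second encryption information.
function wrapKeyInfo(info: FirstDecryptionInfo, secondKey: Buffer, secondIv: Buffer): Buffer {
  const cipher = createCipheriv('aes-256-cbc', secondKey, secondIv);
  return Buffer.concat([cipher.update(JSON.stringify(info), 'utf8'), cipher.final()]);
}

// Receiver: recover the first decryption information with the second decryption information.
function unwrapKeyInfo(blob: Buffer, secondKey: Buffer, secondIv: Buffer): FirstDecryptionInfo {
  const decipher = createDecipheriv('aes-256-cbc', secondKey, secondIv);
  return JSON.parse(Buffer.concat([decipher.update(blob), decipher.final()]).toString('utf8'));
}
```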
In S303, the extended video frame and the encrypted audio frame are encapsulated into second video data, and the second video data is sent to the browser end, where the second video data is used for decrypting and playing the second video data according to the first decryption information in the extended video frame structure by the browser end.
In a possible implementation manner, the decapsulated video frame may be an unencrypted video frame in H264 or H265 format. After the video frame structure is extended, the result may be the extension information together with the original video frame encrypted with the first encryption information, or an extended but unencrypted video frame; the extended video frames are then encapsulated together with the encrypted audio frames into the second video data.
Alternatively, the encryption state of the video frames in the second video data can be switched according to how long the second video data has been sent to the browser side. For example, within a first predetermined time period from the start of the transmission of the second video data, the transmitted video frames may be unencrypted video frames, and the first decryption information may be parsed based on the second decryption information, or obtained directly from the extension information.
After the first predetermined time period has elapsed, the video frames in the second video data are encrypted video frames, and the browser side can decrypt the video frames and the audio frames in the second video data based on the first decryption information acquired during the first predetermined time period, obtaining the video frames and audio frames for playing.
The server may distribute the encrypted audio frames and the video frames of the extension SEI to the browser side via the TLS media link.
In a possible implementation manner, as shown in the browser-side video processing flow diagram of fig. 5, the play library of the JS layer at the browser side pulls the audio/video stream, decapsulates the audio/video data through the integrated video play library (videoSdk), and parses the decryption algorithm and parameters, i.e. the encrypted first decryption information, from the SEI extension information of the video I frames. Once the encrypted first decryption information is acquired, it can be decrypted according to the predetermined second decryption information, yielding the first decryption information, which includes the encryption algorithm type, the decryption parameters and other information for decrypting the audio frames, or the audio frames and the video frames. A parsing sketch follows.
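A minimal sketch of the parsing step on the receiving side: walk the Annex-B NAL units of an access unit, locate the user_data_unregistered SEI whose UUID matches the one the server used, remove the emulation-prevention bytes, and return the still-encrypted first decryption information. The UUID and the single-SEI-message assumption are illustrative.
```typescript
// Drop emulation-prevention bytes (00 00 03 -> 00 00) from a NAL payload.
function removeEmulationPrevention(ebsp: Uint8Array): Uint8Array {
  const out: number[] = [];
  let zeros = 0;
  for (const byte of ebsp) {
    if (zeros >= 2 && byte === 3) { zeros = 0; continue; }
    out.push(byte);
    zeros = byte === 0 ? zeros + 1 : 0;
  }
  return Uint8Array.from(out);
}

// Return the encrypted key info carried by a matching SEI, or null if none is present.
function extractKeyInfoSei(accessUnit: Uint8Array, uuid: Uint8Array): Uint8Array | null {
  for (let i = 0; i + 3 < accessUnit.length; i++) {
    // Find a 00 00 01 start code (this also matches the tail of a 00 00 00 01 start code).
    if (accessUnit[i] !== 0 || accessUnit[i + 1] !== 0 || accessUnit[i + 2] !== 1) continue;
    const nalStart = i + 3;
    if ((accessUnit[nalStart] & 0x1f) !== 6) continue; // not an H.264 SEI NAL unit

    // Find the start of the next NAL unit (or the end of the buffer).
    let next = nalStart + 1;
    while (next + 2 < accessUnit.length &&
           !(accessUnit[next] === 0 && accessUnit[next + 1] === 0 && accessUnit[next + 2] === 1)) {
      next++;
    }
    const end = next + 2 < accessUnit.length ? next : accessUnit.length;
    const rbsp = removeEmulationPrevention(accessUnit.subarray(nalStart + 1, end));

    // Assume a single SEI message: payloadType, ff-continued payloadSize, then UUID + data.
    let p = 0;
    if (rbsp[p++] !== 5) continue; // 5 = user_data_unregistered
    let size = 0;
    while (rbsp[p] === 255) { size += 255; p++; }
    size += rbsp[p++];
    const payload = rbsp.subarray(p, p + size);
    if (payload.length >= 16 && uuid.every((v, k) => v === payload[k])) {
      return payload.subarray(16); // encrypted first decryption information
    }
  }
  return null;
}
```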
The WASM-layer video play library (videoSdk) may decrypt the audio frames and the video frames using the decryption algorithm type and the decryption parameters in the parsed first decryption information to obtain the original audio frame data and video frame data. If the video frame is encrypted, decrypting it with the first decryption information yields a video frame that can be encapsulated and played; if the video frame is not encrypted, the unencrypted video frame can be used directly.
For H.264 video frames, the video play library (videoSdk) encapsulates the original audio frames and video frames in FMP4 format and decodes and plays them through the browser's MSE mechanism. For H.265 video frames, the video play library decodes the H.265 stream into YUV data by WebAssembly soft decoding and then invokes the browser's WebGL interface to render it, so that the browser can normally display the video pictures and play the sound. A rendering sketch for the WebGL branch follows.
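The following browser-side sketch illustrates the WebGL half of the H.265 branch: decoded planar YUV420 frames are uploaded as three luminance textures and converted to RGB in a fragment shader. The BT.601-style conversion coefficients and the assumption that plane strides equal the plane widths are simplifications for illustration.
```typescript
const VERTEX_SRC = `attribute vec2 pos; varying vec2 uv;
void main() { uv = vec2((pos.x + 1.0) / 2.0, (1.0 - pos.y) / 2.0); gl_Position = vec4(pos, 0.0, 1.0); }`;
const FRAGMENT_SRC = `precision mediump float; varying vec2 uv;
uniform sampler2D texY; uniform sampler2D texU; uniform sampler2D texV;
void main() {
  float y = texture2D(texY, uv).r;
  float u = texture2D(texU, uv).r - 0.5;
  float v = texture2D(texV, uv).r - 0.5;
  gl_FragColor = vec4(y + 1.403 * v, y - 0.344 * u - 0.714 * v, y + 1.770 * u, 1.0);
}`;

// Returns a function that uploads one YUV420 frame and draws it onto the canvas.
function createYuvRenderer(canvas: HTMLCanvasElement) {
  const gl = canvas.getContext('webgl')!;
  gl.pixelStorei(gl.UNPACK_ALIGNMENT, 1); // planes may have odd widths

  const compile = (type: number, src: string): WebGLShader => {
    const shader = gl.createShader(type)!;
    gl.shaderSource(shader, src);
    gl.compileShader(shader);
    return shader;
  };
  const program = gl.createProgram()!;
  gl.attachShader(program, compile(gl.VERTEX_SHADER, VERTEX_SRC));
  gl.attachShader(program, compile(gl.FRAGMENT_SHADER, FRAGMENT_SRC));
  gl.linkProgram(program);
  gl.useProgram(program);

  // Full-screen quad.
  gl.bindBuffer(gl.ARRAY_BUFFER, gl.createBuffer());
  gl.bufferData(gl.ARRAY_BUFFER, new Float32Array([-1, -1, 1, -1, -1, 1, 1, 1]), gl.STATIC_DRAW);
  const posLoc = gl.getAttribLocation(program, 'pos');
  gl.enableVertexAttribArray(posLoc);
  gl.vertexAttribPointer(posLoc, 2, gl.FLOAT, false, 0, 0);

  // One luminance texture per plane, bound to texture units 0..2.
  const textures = ['texY', 'texU', 'texV'].map((name, unit) => {
    const tex = gl.createTexture()!;
    gl.activeTexture(gl.TEXTURE0 + unit);
    gl.bindTexture(gl.TEXTURE_2D, tex);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_MIN_FILTER, gl.LINEAR);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_S, gl.CLAMP_TO_EDGE);
    gl.texParameteri(gl.TEXTURE_2D, gl.TEXTURE_WRAP_T, gl.CLAMP_TO_EDGE);
    gl.uniform1i(gl.getUniformLocation(program, name), unit);
    return tex;
  });

  return (y: Uint8Array, u: Uint8Array, v: Uint8Array, width: number, height: number): void => {
    const planes: Array<[Uint8Array, number, number]> =
      [[y, width, height], [u, width >> 1, height >> 1], [v, width >> 1, height >> 1]];
    planes.forEach(([data, w, h], unit) => {
      gl.activeTexture(gl.TEXTURE0 + unit);
      gl.bindTexture(gl.TEXTURE_2D, textures[unit]);
      gl.texImage2D(gl.TEXTURE_2D, 0, gl.LUMINANCE, w, h, 0, gl.LUMINANCE, gl.UNSIGNED_BYTE, data);
    });
    gl.drawArrays(gl.TRIANGLE_STRIP, 0, 4);
  };
}
```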
Fig. 6 is an interaction schematic diagram of video playing according to an embodiment of the present application, which is described in detail below:
The first step: the user enters the device information on the server side. When the device side is started, it establishes a TLS communication link with the server side and sends a registration packet. The server side authenticates the device information and places the device side in the online state after the authentication passes.
The second step: the server side issues to the device side the first encryption information (e.g. keys of the AES or RSA encryption algorithms) used to encrypt the audio frames and video frames. When one device supports multiple camera channels, different encryption algorithms or encryption parameters may be configured for different channels to further protect the video content. After receiving the newly issued first encryption information, the device side encrypts the audio frames and video frames in the third video data with it and replies to the server side with result information indicating whether the setting succeeded.
The third step: after the device side is online, the user requests the server side through the browser side to play the video, optionally carrying parameters such as the requested device information and channel information; the media protocol between the browser side and the server side can be the standard HTTP-FLV protocol.
The fourth step: after receiving the video playing request sent by the user through the browser side, the server side sends a video-opening signaling over the TLS communication link established between the device side and the server side; the signaling carries the channel information, the server IP, the media port and other information.
The fifth step: after receiving the audio/video-opening signaling sent by the server side, the device side establishes a TLS media link with the server side using the server IP and media port carried in the signaling. At the same time, the device side captures third video data in real time through the microphone or camera, compresses it with H264 or H265 encoding, and then encrypts it with the first encryption information issued by the server side.
The sixth step: the device side uploads the encrypted third video data, i.e. the first video data, over the established TLS media link.
The seventh step: after the media port of the server side receives the first video data, it decapsulates the audio frames and video frames and restores the H264/H265 video frames and the original audio frames. During the video frame structure expansion, an SEI NALU unit is inserted as extension information in front of the original video data of each I frame, and the secondarily encrypted key information is added in the NALU unit body.
The eighth step: the server side distributes the encrypted audio frames and the video frames with the extended SEI, i.e. the second video data, to the browser side over the TLS media link.
The ninth step: the browser side decapsulates the second video data through the integrated video play library videoSdk, obtains the encrypted first decryption information from the SEI extension data of the I frames, and then decrypts it to obtain the first decryption information, i.e. the original encryption algorithm type and decryption parameters.
The tenth step: the video play library videoSdk decrypts the audio frames or video frames of the second video data with the parsed algorithm type and decryption parameters of the first decryption information to obtain the original audio frame data and video frame data.
The eleventh step: for H.264 video frames, the video play library (videoSdk) encapsulates the original audio frames and video frames in FMP4 format and decodes and plays them through the browser's MSE mechanism; for H.265 video frames, the video play library decodes the H.265 stream into YUV data by WebAssembly soft decoding and then invokes the browser's WebGL interface to render, so that the browser can display the video pictures and play the sound normally.
The embodiment of the application can achieve the following effects:
(1) The access pressure of the server side in the high concurrency scene is reduced.
The decryption algorithm type and decryption parameters in the first decryption information are transmitted in the extended video frame structure, for example in the extended I-frame SEI, so the browser does not need to call a server interface to query them. This fundamentally removes one service interaction, so encrypting the third video data brings no extra access pressure to the server.
(2) The first screen play delay of the video is reduced.
The decryption algorithm type and decryption parameters in the first decryption information are transmitted by extending the video frame structure, for example by extending the video I-frame SEI, so the browser does not need to call a server interface separately to query them. This fundamentally removes one service interaction and consumes no extra access time.
Meanwhile, the video play library (videoSdk) based on the WebAssembly method has the characteristic of high execution speed, and does not bring extra performance loss, so that play experience is not influenced.
(3) The safety is high.
In terms of data transmission, the device side and the server side can communicate and upload audio and video data in a TLS mode, the browser side and the server side can communicate and pull video data in an HTTPS mode, and network transmission is safe and reliable.
In terms of data content, the first video data uploaded to the server side by the device side is video data subjected to encryption processing. The second video data forwarded to the browser end by the server end is also video data subjected to encryption processing, and even if the video data is captured without data decryption, the video data cannot be played.
In terms of video playing, the video play library (videoSdk) restores the first decryption information and performs decryption and playback in the bytecode format of a low-level virtual machine, for example in the decryption library videoPlayer.wasm, whose bytecode can only be understood by a computer and thus offers higher security.
(4) No plug-in is needed at the browser side, which helps reduce operation and maintenance cost.
When the browser side plays videos, only the integrated video play library (videoSdk) is required. It can be released together with the web page program, and no other plug-in needs to be installed separately. When the library needs to be updated, only the web page program needs to be re-released, which brings no extra cost to later operation and maintenance.
(5) The streaming media transmission protocol standards are not broken, and the extensibility is good.
The first decryption information is transmitted by extending the SEI NALU units of I frames in the video frame structure, so the frame structure is not damaged, and standard streaming media protocols such as HTTP-FLV, HLS and RTMP can still be used between the browser side and the server side, giving the solution good universality and extensibility.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and does not constitute any limitation on the implementation process of the embodiments of the present application.
Fig. 7 is a schematic diagram of a video playing device applied to a server according to an embodiment of the present application, as shown in fig. 7, where the device includes:
The signaling sending unit 701 is configured to receive a video playing request from a browser, and send a signaling for opening a video to a device for capturing a video.
The expansion unit 702 is configured to receive first video data uploaded by the device side, encrypt an audio frame in the first video data through predetermined first encryption information, expand a video frame structure in the first video data, and add first decryption information in the expanded video frame structure, where the first decryption information corresponds to the first encryption information.
The playing unit 703 is configured to encapsulate the expanded video frame and the encrypted audio frame into second video data, and send the second video data to the browser, where the second video data is used for decrypting and playing the second video data by the browser according to the first decryption information in the expanded video frame structure.
The video playback apparatus shown in fig. 7 corresponds to the video playback method shown in fig. 3.
In a possible implementation manner, a browser-side video playing apparatus may further be provided, including:
A request sending unit, configured to send a video playing request to a server, where the video playing request includes device information of a video that is requested to be played;
The second video data receiving unit is used for receiving second video data returned by the server, wherein the second video data comprises video frames and audio frames in the first video data sent by the equipment, the video frames are video frames added with preset first decryption information after structural expansion, and the audio frames are audio frames encrypted by first encryption information corresponding to the first decryption information;
And the decryption unit is used for decrypting the encrypted audio frame according to the first decryption information in the second video data and playing the video according to the video frame and the decrypted audio frame.
The browser-side video playing apparatus corresponds to the browser-side video playing method described above.
Fig. 8 is a schematic diagram of an electronic device according to an embodiment of the present application. The electronic device may be the browser side or the server side, and its communication unit is used to send and receive data and signaling, including between the browser side and the server side or between the server side and the device side. As shown in fig. 8, the electronic device 8 of this embodiment includes a processor 80, a communication unit, a memory 81, and a computer program 82, such as a video playing program, stored in the memory 81 and executable on the processor 80. The steps of the video playing method embodiments described above are implemented when the processor 80 executes the computer program 82; alternatively, the processor 80, when executing the computer program 82, implements the functions of the modules/units in the apparatus embodiments described above.
By way of example, the computer program 82 may be partitioned into one or more modules/units that are stored in the memory 81 and executed by the processor 80 to complete the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing the specified functions, which instruction segments are used to describe the execution of the computer program 82 in the electronic device 8.
The electronic device may include, but is not limited to, a processor 80, a memory 81. It will be appreciated by those skilled in the art that fig. 8 is merely an example of an electronic device 8 and is not meant to be limiting as to the electronic device 8, and may include more or fewer components than shown, or may combine certain components, or different components, e.g., the electronic device may also include input-output devices, network access devices, buses, etc.
The processor 80 may be a central processing unit (CPU), another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, discrete hardware components, or the like. A general purpose processor may be a microprocessor, or the processor may be any conventional processor.
The memory 81 may be an internal storage unit of the electronic device 8, such as a hard disk or memory of the electronic device 8. The memory 81 may also be an external storage device of the electronic device 8, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card or a flash card provided on the electronic device 8. Further, the memory 81 may include both an internal storage unit and an external storage device of the electronic device 8. The memory 81 is used to store the computer program and other programs and data required by the electronic device, and may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
In the foregoing embodiments, each embodiment is described with its own emphasis; for parts that are not described or detailed in one embodiment, reference may be made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other manners. For example, the apparatus/terminal device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical function division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.
The integrated modules/units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on this understanding, the present application may implement all or part of the procedures of the methods in the above embodiments by instructing relevant hardware through a computer program, which may be stored in a computer readable storage medium and, when executed by a processor, implements the steps of the respective method embodiments. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so on. It should be noted that the content contained in the computer readable medium may be increased or decreased as appropriate according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer readable media do not include electrical carrier signals and telecommunication signals.
The above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application, and are intended to be included in the scope of the present application.

Claims (12)

  1. The video playing method is characterized by being applied to a server, and comprises the following steps:
    Receiving a video playing request of a browser end, and sending a video opening signaling to a device end for collecting videos;
    receiving first video data uploaded by a device side, encrypting an audio frame in the first video data through preset first encryption information, expanding a video frame structure in the first video data, and adding first decryption information in the expanded video frame structure, wherein the first decryption information corresponds to the first encryption information;
    And packaging the expanded video frames and the encrypted audio frames into second video data, and sending the second video data to a browser end, wherein the second video data is used for decrypting and playing the second video data by the browser end according to the first decryption information in the expanded video frame structure.
  2. The method of claim 1, wherein prior to sending signaling to open video to the device side for capturing video, the method further comprises:
    establishing a communication link with a device end, and receiving registration information of the device end through the communication link;
    and carrying out identity authentication on the equipment end according to the registration information, and determining the online state of the equipment after the authentication is passed.
  3. The method of claim 1, wherein the first video data uploaded by the device side is encrypted video data, and wherein prior to receiving the first video data uploaded by the device side, the method further comprises:
    and sending the preset first encryption information to the equipment end, wherein the first encryption information is used for encrypting the third video data acquired by the equipment end.
  4. The method of claim 1, wherein adding first decryption information to the expanded video frame structure comprises:
    Encrypting the first decryption information according to predetermined second encryption information;
    And adding the encrypted first decryption information to a network abstraction layer unit in the expanded video frame structure.
  5. The video playing method is characterized by being applied to a browser end, and comprises the following steps:
    sending a video playing request to a server, wherein the video playing request comprises equipment information of a video requested to be played;
    Receiving second video data returned by a server, wherein the second video data comprises video frames and audio frames in first video data sent by a device, the video frames are video frames added with preset first decryption information after structural expansion, and the audio frames are audio frames encrypted by first encryption information corresponding to the first decryption information;
    And decrypting the encrypted audio frame according to the first decryption information in the second video data, and playing the video according to the video frame and the decrypted audio frame.
  6. The method of claim 5, wherein decrypting the encrypted audio frame based on the first decryption information in the second video data comprises:
    decapsulating the second video data through a decryption library compiled, in a WebAssembly coding manner, into the bytecode format of a low-level virtual machine, to obtain the video frames with the expanded video frame structure and the encrypted audio frames contained in the second video data;
    analyzing the video frame with the expanded video frame structure to obtain first decryption information included in the video frame;
    and decrypting the encrypted audio frame according to the first decryption information.
  7. The method of claim 6, wherein after obtaining the first decryption information included in the video frame, the method further comprises:
    Detecting an encryption state of the video frame;
    And decrypting the video frame through the first decryption information when the video frame is in an encrypted state.
  8. The method of claim 6, wherein parsing the video frame with the extended video frame structure to obtain the first decryption information included in the video frame, comprises:
    And analyzing the video frame with the expanded video frame structure, and decrypting the expanded information of the video frame through preset second decryption information to obtain first decryption information included in the video frame.
  9. A server comprising a memory, a processor, a communication unit, and a computer program stored in the memory and executable on the processor, wherein:
    The communication unit is used for receiving and transmitting data or instructions to a browser end or a device end;
    The processor is configured to perform the video playback method of any one of claims 1-4.
  10. A browser side, comprising a memory, a processor, a communication unit, and a computer program stored in the memory and executable on the processor, wherein:
    the communication unit is used for receiving and transmitting data or instructions to the server;
    The processor is configured to perform the video playback method of any one of claims 5-8.
  11. The video playing system is characterized by comprising a browser end, a service end and a device end, wherein:
    The server is used for executing the video playing method according to any one of claims 1-4;
    The browser side is configured to execute the video playing method according to any one of claims 5 to 8.
  12. A computer readable storage medium storing a computer program, characterized in that the computer program when executed by a processor implements the steps of the method according to any one of claims 1-4, or 5-8.
CN202280063551.5A (filed 2022-10-28, priority 2022-10-28) - Video playing method, system and storage medium - Pending - published as CN118120238A

Applications Claiming Priority (1)

Application Number: PCT/CN2022/128383 (published as WO2024087208A1); Priority Date: 2022-10-28; Filing Date: 2022-10-28; Title: Video playback method and system, and storage medium

Publications (1)

Publication Number: CN118120238A; Publication Date: 2024-05-31

Family

ID=90829780

Family Applications (1)

Application Number: CN202280063551.5A; Title: Video playing method, system and storage medium; Priority Date: 2022-10-28; Filing Date: 2022-10-28; Status: Pending; Publication: CN118120238A

Country Status (2)

Country Link
CN (1) CN118120238A (en)
WO (1) WO2024087208A1 (en)


Also Published As

Publication Number: WO2024087208A1; Publication Date: 2024-05-02


Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination