CN116156215B - VOLTE network-based video stream file compression and efficient transmission system and method - Google Patents

Info

Publication number
CN116156215B
Authority
CN
China
Prior art keywords
video
audio
original
privacy
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310430464.5A
Other languages
Chinese (zh)
Other versions
CN116156215A (en)
Inventor
樊金礽
王增林
管权
陶涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Shumai Power Information Technology Co ltd
Original Assignee
Nanjing Shumai Power Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Shumai Power Information Technology Co ltd filed Critical Nanjing Shumai Power Information Technology Co ltd
Priority to CN202310430464.5A priority Critical patent/CN116156215B/en
Publication of CN116156215A publication Critical patent/CN116156215A/en
Application granted granted Critical
Publication of CN116156215B publication Critical patent/CN116156215B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H04N21/2335 Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/236 Assembling of a multiplex stream, e.g. transport stream, by combining a video stream with other content or additional data, e.g. inserting a URL [Uniform Resource Locator] into a video stream, multiplexing software data into a video stream; Remultiplexing of multiplex streams; Insertion of stuffing bits into the multiplex stream, e.g. to obtain a constant bit-rate; Assembling of a packetised elementary stream
    • H04N21/2368 Multiplexing of audio and video streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60 Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client
    • H04N21/61 Network physical structure; Signal processing
    • H04N21/6106 Network physical structure; Signal processing specially adapted to the downstream path of the transmission network
    • H04N21/6131 Network physical structure; Signal processing specially adapted to the downstream path of the transmission network involving transmission via a mobile phone network

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention discloses a video stream file compression and efficient transmission system and method based on a VOLTE network, relating to the technical field of image communication. Coding parameters are extracted from a confirmed media video file; the audio and video streams of the input media file are separated and decoded to extract an original audio stream and an original video stream; the original video stream is imported into a video privacy value calculation strategy to detect a video privacy value, and the original audio stream is imported into an audio privacy value calculation strategy to detect an audio privacy value; the video privacy value and the audio privacy value are each compared with the corresponding set privacy threshold to judge whether privacy information should be fed back to the publisher, thereby avoiding leakage of the publisher's privacy caused by releasing the video.

Description

VOLTE network-based video stream file compression and efficient transmission system and method
Technical Field
The invention relates to the technical field of image communication, in particular to a video stream file compression and efficient transmission system and method based on a VOLTE network.
Background
Because many coding standards exist, the media files owned by users differ in file format and in audio and video coding format. To reduce file size, the file must first be parsed and its audio and video decoded; once the original audio and video streams are obtained, suitable audio and video transcoders and transcoding parameters are selected to generate a new, smaller media file. Compressing a media file therefore requires first analyzing its format information, so that the transcoding parameters can be determined from the efficiency of the codec used by the source file and from its encoding parameters. Existing video compression tools often require users to have in-depth professional knowledge of media in order to set transcoding parameters that give a good compression result. In addition, existing video stream file compression and efficient transmission methods cannot collect and check the privacy information in the transmitted video stream, so privacy leakage easily occurs when a publisher releases a video carelessly.
For example, Chinese patent application publication No. CN102055966A discloses a method for compressing media files, comprising: extracting coding parameters from an input media file, and separating and decoding the audio and video streams of the input media file to extract an original audio stream and an original video stream; calculating the transcoding parameters required for compression according to the coding parameters; encoding the original audio stream according to the transcoding parameters to output a new compressed audio stream, and encoding the original video stream to output a new compressed video stream; and combining the new compressed audio stream and the new compressed video stream to generate a new media file. That application also provides a system for compressing media files. With the method and system, a user can simply and quickly compress media files in various formats without professional knowledge of media, which saves storage space and makes the media files convenient to carry, transmit and share.
As another example, Chinese patent publication No. CN114513669A discloses a method, device and system for video coding and video playing, which relate to the technical field of video processing and can be applied to surround playing scenes. The method comprises: acquiring a plurality of original video streams captured by a plurality of cameras for the same spatial region in the same time period; generating at least one target video stream from the plurality of original video streams, wherein the target video stream is obtained by selecting a certain number of frame images from the original video stream corresponding to each camera according to a set direction; and compressing the at least one target video stream. With this method, the transmission code rate of the video stream is reduced in the video playing stage.
Both of the above patents share the following shortcoming: existing video stream file compression and efficient transmission methods cannot collect and check the privacy information in the transmitted video stream, so privacy leakage easily occurs when a publisher releases a video carelessly.
Disclosure of Invention
The invention mainly aims to provide a video stream file compression and efficient transmission system and method based on a VOLTE network, which can effectively solve the problems in the background technology: the existing video stream file compression and efficient transmission method cannot collect and check privacy information in a transmission video stream, so that privacy leakage is easily caused in the process of carelessly releasing video by a publisher.
In order to achieve the above purpose, the technical scheme adopted by the invention is as follows:
A video stream file compression and efficient transmission method based on a VOLTE network specifically comprises the following steps:
S1, identifying an input file and confirming that it is a video file;
S2, extracting coding parameters from the confirmed media video file, and separating and decoding the audio and video streams of the input media file to extract an original audio stream and an original video stream;
S3, importing the original video stream into a video privacy value calculation strategy to detect the video privacy value, and importing the original audio stream into an audio privacy value calculation strategy to detect the audio privacy value;
S4, comparing the video privacy value and the audio privacy value with the corresponding set privacy thresholds respectively;
S5, judging whether privacy information is to be fed back to the publisher, wherein the judgment condition is: the video privacy value is greater than the set video privacy threshold and/or the audio privacy value is greater than the set audio privacy threshold; if the condition is met, step S6 is executed, and if it is not met, step S7 is executed directly;
S6, feeding back the privacy information to the publisher for publication confirmation;
S7, calculating the transcoding parameters required for compression according to the coding parameters;
S8, according to the transcoding parameters, encoding the original audio stream to output a new compressed audio stream and/or encoding the original video stream to output a new compressed video stream;
S9, combining and transmitting the compressed video stream and the compressed audio stream.
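The branch taken at steps S4 to S6 can be illustrated with a minimal Python sketch. The class name `PrivacyCheck`, the function `next_step` and the numeric values are hypothetical and are not part of the disclosed method; only the strict "greater than" comparison and the "and/or" condition are taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class PrivacyCheck:
    """Holds the two detected privacy values and their set thresholds (S4)."""
    video_value: float
    audio_value: float
    video_threshold: float
    audio_threshold: float

    def needs_confirmation(self) -> bool:
        # S5: feedback is required if either value exceeds its own threshold.
        return (self.video_value > self.video_threshold
                or self.audio_value > self.audio_threshold)


def next_step(check: PrivacyCheck) -> str:
    """Return the step that follows S5: "S6" (ask the publisher) or "S7" (transcode)."""
    return "S6" if check.needs_confirmation() else "S7"


if __name__ == "__main__":
    # Illustrative values only; the video value exceeds its threshold.
    print(next_step(PrivacyCheck(0.6, 0.0, 0.5, 0.1)))
```

A value exactly equal to its threshold does not trigger feedback in this sketch, because the text states "larger than" rather than "larger than or equal to".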
In a further improvement of the invention, the coding parameters in S2 comprise video coding parameters and audio coding parameters. The video coding parameters comprise: an original video encoder type, an original video coding rate, an original video coding frame rate and an original video resolution. The audio coding parameters comprise: an original audio encoder type, an original audio coding rate, an original number of audio channels and an original audio sampling rate. The transcoding parameters in S7 comprise video transcoding parameters and audio transcoding parameters. The video transcoding parameters comprise: a target video encoder type, a target video coding rate, a target video coding frame rate and a target video resolution. The audio transcoding parameters comprise: a target audio encoder type, a target audio coding rate, a target number of audio channels and a target audio sampling rate.
In a further improvement of the invention, the video privacy value calculation strategy comprises the following specific steps:
S301, dividing the video into frames and extracting each frame image in the video; extracting the face picture in the image, and collecting the eye length (Figure SMS_3), eye diameter (Figure SMS_6) and eyeball chromaticity (Figure SMS_9) of the face; calculating the eye proportion value (Figure SMS_1); extracting the standard facial feature image of the publisher, and extracting the eyeball chromaticity (Figure SMS_5) and eye proportion value (Figure SMS_8) in the standard image; calculating the similarity between the face picture in the image and the standard picture of the publisher according to the calculation formula shown in Figure SMS_10, wherein k is the similarity, Figure SMS_2 is the proportion coefficient of the eye proportion value, and Figure SMS_4 is the proportion coefficient of the eyeball chromaticity (see also Figure SMS_7); comparing the calculated similarity with a similarity threshold: if the calculated similarity is greater than or equal to the similarity threshold, the face image in the image is the face image of the publisher, and if the calculated similarity is less than the similarity threshold, the face image in the image is not the face image of the publisher;
S302, extracting the single-frame images in the video that contain the face image of the publisher, and obtaining the number of such single-frame images (Figure SMS_11); acquiring the total number of frames (Figure SMS_12); acquiring the eye length of the face in the single-frame image (Figure SMS_13) and the eye length in the standard image (Figure SMS_14); extracting the person distance (Figure SMS_15); and calculating the distance of the person in the actual scene (Figure SMS_16);
S303, for the single-frame images containing the face image of the publisher, extracting the distance in the actual scene, extracting the total number of frames, and substituting them into the video privacy value calculation formula (Figure SMS_17) to calculate the video privacy value, wherein n is the number of frame images in which the distance of the person in the actual scene is greater than the set safety distance, Figure SMS_18 is the p-th actual-scene distance that is greater than the set safety distance, and Figure SMS_19 is the set safety distance.
In a further improvement of the invention, the specific steps of the audio privacy value calculation strategy are as follows:
S304, the publisher sets in advance the privacy words that may appear in the audio; the audio stream in the original video is extracted and imported into speech-to-text software, which converts the audio into text and outputs it; the output text is then segmented into words;
S305, comparing the resulting set of word tokens with the privacy words, searching for the privacy words in the token set, and taking the ratio of the number of privacy words found in the token set to the total number of words in the token set as the audio privacy value.
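The ratio defined in S305 can be sketched in Python as follows. The sketch assumes that speech-to-text conversion and word segmentation (S304) have already produced a list of tokens; the function name `audio_privacy_value` and the sample words are hypothetical.

```python
from typing import Iterable, List, Set


def audio_privacy_value(tokens: List[str], privacy_words: Iterable[str]) -> float:
    """Ratio of privacy-word occurrences to all tokens (S305).

    Returns 0.0 for an empty token list so that silent audio never
    exceeds a privacy threshold (an assumption, not stated in the source).
    """
    if not tokens:
        return 0.0
    vocabulary: Set[str] = set(privacy_words)
    hits = sum(1 for token in tokens if token in vocabulary)
    return hits / len(tokens)


if __name__ == "__main__":
    # Hypothetical transcript tokens and publisher-defined privacy words.
    tokens = ["my", "address", "is", "twelve", "main", "street"]
    print(audio_privacy_value(tokens, {"address", "street"}))
```

Each occurrence of a privacy word is counted, so a word repeated in the audio raises the value; the source does not state whether repeated words should be counted once or every time, and counting every occurrence is the choice made here.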
In a further improvement of the invention, the specific content of S6 is as follows:
transmitting to the publisher the frame images in which the distance of the person in the actual scene is greater than the set safety distance together with the corresponding audio stream segments, and the privacy words found in the token set together with the corresponding video stream segments; the publisher checks these segments and decides whether to confirm publication.
In a further improvement of the invention, in step S7 the target video coding frame rate required for compression is calculated according to the following principle: the original video coding frame rate is compared with 30 fps; if the original video coding frame rate is greater than 30 fps, the target video coding frame rate is determined to be 30 fps; if the original video coding frame rate is less than or equal to 30 fps, the target video coding frame rate is determined to be the same as the original video coding frame rate. The target video resolution required for compression is calculated according to the following principle: the original video resolution is compared with 640×480; if the original video resolution does not exceed 640×480, the target video resolution is determined to be the same as the original video resolution; if the original video resolution exceeds 640×480, the resolution is reduced by scaling toward 640×480 while keeping the original image aspect ratio. The target video encoder type required for compression is determined according to the following principle: if the original video encoder type is RM, VC1 or H.264, the target video encoder type is determined to be H.264; otherwise, the target video encoder type is determined to be Moving Picture Experts Group MPEG4.
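The three selection rules above can be sketched in Python. The function names are hypothetical. The sketch assumes that a resolution "exceeds 640×480" when either its width or its height is larger than the limit, and it uses integer arithmetic to fit the frame inside 640×480; the source does not specify either detail.

```python
from typing import Tuple


def target_frame_rate(orig_fps: float, cap: float = 30.0) -> float:
    """Cap the frame rate at 30 fps; lower rates are kept unchanged."""
    return min(orig_fps, cap)


def target_resolution(width: int, height: int,
                      max_w: int = 640, max_h: int = 480) -> Tuple[int, int]:
    """Keep small frames as-is; otherwise fit inside max_w x max_h, keeping aspect ratio."""
    if width <= max_w and height <= max_h:
        return width, height
    if width * max_h >= height * max_w:
        # Width is the limiting dimension.
        return max_w, height * max_w // width
    # Height is the limiting dimension.
    return width * max_h // height, max_h


def target_video_encoder(orig_encoder: str) -> str:
    """RM, VC1 and H.264 sources map to H.264; everything else maps to MPEG4."""
    return "H.264" if orig_encoder.upper() in {"RM", "VC1", "H.264"} else "MPEG4"


if __name__ == "__main__":
    print(target_frame_rate(60.0), target_resolution(1920, 1080),
          target_video_encoder("VC1"))
```

A production encoder may additionally require even frame dimensions; that adjustment is left out here because the source does not mention it.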
In a further improvement of the invention, in step S7 the target video coding rate required for compression is calculated according to the following principle: the ratio K_fps of the target video coding frame rate to the original video coding frame rate and the ratio K_pix of the target video resolution to the original video resolution are calculated, and a candidate target video coding rate is obtained according to the formula: candidate target video coding rate = K_fps × K_pix × K_br × original video coding rate, wherein K_br is a predetermined target rate reduction coefficient. The closest reference resolution is then selected from a preset reference table according to the target video resolution, the reference code rate corresponding to the selected reference resolution is compared with the candidate target video coding rate, and the smaller of the two is selected as the target video coding rate required for compression.
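The code rate rule can be sketched in Python as follows. The reference table `REFERENCE_BITRATES_KBPS`, the default value of `k_br` and the function name are hypothetical placeholders, since the source gives no concrete values. The sketch also assumes that K_pix is the ratio of pixel counts and that the "closest" reference resolution is the one with the nearest pixel count.

```python
from typing import Dict, Tuple

Resolution = Tuple[int, int]

# Hypothetical reference table: resolution -> reference code rate in kbps.
REFERENCE_BITRATES_KBPS: Dict[Resolution, float] = {
    (320, 240): 300.0,
    (640, 360): 600.0,
    (640, 480): 800.0,
}


def target_video_bitrate(orig_bitrate: float,
                         orig_fps: float, target_fps: float,
                         orig_res: Resolution, target_res: Resolution,
                         k_br: float = 0.8,
                         table: Dict[Resolution, float] = REFERENCE_BITRATES_KBPS) -> float:
    """Return min(candidate rate, reference rate of the closest reference resolution)."""
    k_fps = target_fps / orig_fps
    target_pixels = target_res[0] * target_res[1]
    k_pix = target_pixels / (orig_res[0] * orig_res[1])
    candidate = k_fps * k_pix * k_br * orig_bitrate
    # Closest reference resolution, measured by pixel count (an assumption).
    closest = min(table, key=lambda r: abs(r[0] * r[1] - target_pixels))
    return min(candidate, table[closest])


if __name__ == "__main__":
    # A 4000 kbps, 60 fps, 1920x1080 source compressed to 30 fps at 640x360.
    print(target_video_bitrate(4000.0, 60.0, 30.0, (1920, 1080), (640, 360)))
```

Taking the minimum means the reference table acts as an upper bound: a high-rate source is clamped to the reference code rate, while a source that is already compact keeps its scaled-down candidate rate.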
In a further improvement of the invention, in step S7 the audio transcoding parameters required for compression are calculated according to the following principles: the target audio sampling rate is determined to be the same as the original audio sampling rate, and the target number of audio channels is determined to be the same as the original number of audio channels; the target audio encoder type is determined to be Advanced Audio Coding (AAC); the ratio of the original audio coding rate to the original video coding rate is calculated and compared with 1/2; if the ratio is greater than 1/2, the target audio coding rate is lowered by one or two levels; otherwise, the target audio coding rate is determined to be the same as the original audio coding rate.
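The audio rule can be sketched in Python. The source does not define what a "level" is, so the ladder `AAC_BITRATE_LADDER_KBPS` below is a hypothetical list of code rate levels, and `steps_down` (1 or 2) is a caller-supplied choice; the function name is likewise hypothetical.

```python
from typing import Dict, List, Union

# Hypothetical code rate levels in kbps; the source does not list them.
AAC_BITRATE_LADDER_KBPS: List[int] = [32, 48, 64, 96, 128, 192, 256, 320]


def target_audio_params(orig_audio_bitrate: float, orig_video_bitrate: float,
                        sample_rate: int, channels: int,
                        steps_down: int = 1,
                        ladder: List[int] = AAC_BITRATE_LADDER_KBPS
                        ) -> Dict[str, Union[str, float, int]]:
    """Keep sampling rate and channels, force AAC, lower the rate if audio dominates."""
    target_bitrate: float = orig_audio_bitrate
    if orig_video_bitrate > 0 and orig_audio_bitrate / orig_video_bitrate > 0.5:
        # Highest ladder level not above the original rate, then step down.
        idx = max((i for i, b in enumerate(ladder) if b <= orig_audio_bitrate), default=0)
        lowered = ladder[max(0, idx - steps_down)]
        # Never raise the rate above the original.
        target_bitrate = min(orig_audio_bitrate, lowered)
    return {
        "encoder": "AAC",
        "bitrate": target_bitrate,
        "sample_rate": sample_rate,
        "channels": channels,
    }


if __name__ == "__main__":
    # Audio at 192 kbps against video at 300 kbps: the ratio exceeds 1/2.
    print(target_audio_params(192, 300, 44100, 2))
```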
The invention further provides a video stream file compression and efficient transmission system based on a VOLTE network, which specifically comprises a video file identification module, a coding parameter extraction module, an audio splitting module, an audio detection module, a video detection module, a detection feedback module, a transcoding calculation module, a file compression module and a merging transmission module;
the video file identification module is used for identifying the input file and confirming that it is a video file;
the coding parameter extraction module is used for extracting coding parameters from the confirmed media video file;
the audio splitting module is used for separating and decoding the audio and video streams of the input media file to extract an original audio stream and an original video stream;
the audio detection module is used for importing the original audio stream into the audio privacy value calculation strategy to detect the audio privacy value, and for comparing the audio privacy value with the corresponding set privacy threshold;
the video detection module is used for importing the original video stream into the video privacy value calculation strategy to detect the video privacy value;
the detection feedback module is used for judging whether privacy information is to be fed back to the publisher;
the transcoding calculation module is used for calculating the transcoding parameters required for compression according to the coding parameters;
the file compression module is used for encoding the original audio stream to output a new compressed audio stream and/or encoding the original video stream to output a new compressed video stream according to the transcoding parameters;
the merging transmission module is used for merging and transmitting the compressed video stream and the compressed audio stream.
Compared with the prior art, the invention has the following beneficial effects:
the method and the device extract coding parameters from the confirmed media video file, separate and decode the audio and video streams of the input media file, extract an original audio stream and an original video stream, guide the original video stream into a video privacy value calculation strategy to detect video privacy values, guide the original audio stream into an audio privacy value calculation strategy to detect audio privacy values, respectively compare the video privacy values and the audio privacy values with the set corresponding privacy thresholds, and judge whether to feed back privacy information to a publisher or not, so that privacy leakage of publishers caused by video release is avoided.
Drawings
Fig. 1 is a schematic diagram of an overall flow frame of a video stream file compression and efficient transmission method based on a VOLTE network according to the present invention.
Fig. 2 is a schematic diagram of a video stream file compression and efficient transmission system framework based on a VOLTE network according to the present invention.
Detailed Description
To make the technical means, creative features, objects and effects of the present invention easy to understand, it should be noted that, in the description of the present invention, terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" indicate orientations or positional relationships based on those shown in the drawings. They are used merely for convenience in describing the present invention and simplifying the description, and do not indicate or imply that the apparatus or elements referred to must have a specific orientation or be constructed and operated in a specific orientation; they should therefore not be construed as limiting the present invention. Furthermore, the terms "a", "an" and "the" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. The invention is further described below in conjunction with the detailed description.
Example 1
The embodiment proposes that coding parameters are extracted from a confirmed media video file, audio and video stream separation and decoding are carried out on the input media file, an original audio stream and an original video stream are extracted, the original video stream is imported into a video privacy value calculation strategy to carry out video privacy value detection, the original audio stream is imported into the audio privacy value calculation strategy to carry out audio privacy value detection, the video privacy value and the audio privacy value are respectively compared with a set corresponding privacy threshold value, whether privacy information is fed back to a publisher is judged, and privacy leakage of publishers caused by video release is avoided.
S1, identifying an input file and confirming that it is a video file;
S2, extracting coding parameters from the confirmed media video file, and separating and decoding the audio and video streams of the input media file to extract an original audio stream and an original video stream;
S3, importing the original video stream into a video privacy value calculation strategy to detect the video privacy value, and importing the original audio stream into an audio privacy value calculation strategy to detect the audio privacy value;
S4, comparing the video privacy value and the audio privacy value with the corresponding set privacy thresholds respectively;
S5, judging whether privacy information is to be fed back to the publisher, wherein the judgment condition is: the video privacy value is greater than the set video privacy threshold and/or the audio privacy value is greater than the set audio privacy threshold; if the condition is met, step S6 is executed, and if it is not met, step S7 is executed directly;
S6, feeding back the privacy information to the publisher for publication confirmation;
S7, calculating the transcoding parameters required for compression according to the coding parameters;
S8, according to the transcoding parameters, encoding the original audio stream to output a new compressed audio stream and/or encoding the original video stream to output a new compressed video stream;
S9, combining and transmitting the compressed video stream and the compressed audio stream.
In this embodiment, the coding parameters in S2 comprise video coding parameters and audio coding parameters. The video coding parameters comprise: an original video encoder type, an original video coding rate, an original video coding frame rate and an original video resolution. The audio coding parameters comprise: an original audio encoder type, an original audio coding rate, an original number of audio channels and an original audio sampling rate. The transcoding parameters in S7 comprise video transcoding parameters and audio transcoding parameters. The video transcoding parameters comprise: a target video encoder type, a target video coding rate, a target video coding frame rate and a target video resolution. The audio transcoding parameters comprise: a target audio encoder type, a target audio coding rate, a target number of audio channels and a target audio sampling rate.
In this embodiment, the video privacy value calculation strategy includes the following specific steps:
S301, dividing the video into frames and extracting each frame image in the video; extracting the face picture in the image, and collecting the eye length (Figure SMS_21), eye diameter (Figure SMS_25) and eyeball chromaticity (Figure SMS_28) of the face; calculating the eye proportion value (Figure SMS_22); extracting the standard facial feature image of the publisher, and extracting the eyeball chromaticity (Figure SMS_23) and eye proportion value (Figure SMS_26) in the standard image; calculating the similarity between the face picture in the image and the standard picture of the publisher according to the calculation formula shown in Figure SMS_29, wherein k is the similarity, Figure SMS_20 is the proportion coefficient of the eye proportion value, and Figure SMS_24 is the proportion coefficient of the eyeball chromaticity (see also Figure SMS_27); comparing the calculated similarity with a similarity threshold: if the calculated similarity is greater than or equal to the similarity threshold, the face image in the image is the face image of the publisher, and if the calculated similarity is less than the similarity threshold, the face image in the image is not the face image of the publisher;
S302, extracting the single-frame images in the video that contain the face image of the publisher, and obtaining the number of such single-frame images (Figure SMS_30); acquiring the total number of frames (Figure SMS_31); acquiring the eye length of the face in the single-frame image (Figure SMS_32) and the eye length in the standard image (Figure SMS_33); extracting the person distance (Figure SMS_34); and calculating the distance of the person in the actual scene (Figure SMS_35);
S303, for the single-frame images containing the face image of the publisher, extracting the distance in the actual scene, extracting the total number of frames, and substituting them into the video privacy value calculation formula (Figure SMS_36) to calculate the video privacy value, wherein n is the number of frame images in which the distance of the person in the actual scene is greater than the set safety distance, Figure SMS_37 is the p-th actual-scene distance that is greater than the set safety distance, and Figure SMS_38 is the set safety distance.
In this embodiment, the specific steps of the audio privacy value calculation strategy are as follows:
S304, the publisher sets in advance the privacy words that may appear in the audio; the audio stream in the original video is extracted and imported into speech-to-text software, which converts the audio into text and outputs it; the output text is then segmented into words;
S305, comparing the resulting set of word tokens with the privacy words, searching for the privacy words in the token set, and taking the ratio of the number of privacy words found in the token set to the total number of words in the token set as the audio privacy value.
In this embodiment, the specific content of S6 is:
transmitting to the publisher the frame images in which the distance of the person in the actual scene is greater than the set safety distance together with the corresponding audio stream segments, and the privacy words found in the token set together with the corresponding video stream segments; the publisher checks these segments and decides whether to confirm publication.
In this embodiment, in S7, the target video coding frame rate required for compression is calculated according to the following principle: the original video coding frame rate is compared with 30 fps; if the original video coding frame rate is greater than 30 fps, the target video coding frame rate is determined to be 30 fps; if the original video coding frame rate is less than or equal to 30 fps, the target video coding frame rate is determined to be the same as the original video coding frame rate. The target video resolution required for compression is calculated according to the following principle: the original video resolution is compared with 640×480; if the original video resolution does not exceed 640×480, the target video resolution is determined to be the same as the original video resolution; if the original video resolution exceeds 640×480, the resolution is reduced by scaling toward 640×480 while keeping the original image aspect ratio. The target video encoder type required for compression is calculated according to the following principle: if the original video encoder type is RM, VC1, or H.264, the target video encoder type is determined to be H.264; otherwise, the target video encoder type is determined to be Moving Picture Experts Group MPEG4.
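The three selection rules above can be sketched as follows. This is an illustrative reading rather than the patent's own code: the function names are hypothetical, the threshold of 30 is read as frames per second, "does not exceed 640×480" is assumed to mean that both width and height fit inside that box, and rounding to the nearest integer pixel is an added assumption.

```python
from typing import Tuple

MAX_FPS = 30.0
MAX_WIDTH, MAX_HEIGHT = 640, 480


def target_frame_rate(original_fps: float) -> float:
    """Cap the frame rate at 30; lower rates are kept unchanged."""
    return min(original_fps, MAX_FPS)


def target_resolution(width: int, height: int) -> Tuple[int, int]:
    """Keep small videos as-is; otherwise scale into 640x480, preserving aspect ratio."""
    if width <= MAX_WIDTH and height <= MAX_HEIGHT:
        return width, height
    # The smaller of the two scale factors makes both dimensions fit.
    scale = min(MAX_WIDTH / width, MAX_HEIGHT / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


def target_video_encoder(original_encoder: str) -> str:
    """RM, VC1, and H.264 sources map to H.264; everything else maps to MPEG4."""
    return "H.264" if original_encoder.upper() in {"RM", "VC1", "H.264"} else "MPEG4"


print(target_frame_rate(60.0), target_resolution(1920, 1080), target_video_encoder("VC1"))
```

A 1920×1080 source is limited by its width here, so it scales by one third in both dimensions.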
In this embodiment, in S7, the target video coding rate required for compression is calculated according to the following principle: the ratio K_fps of the target video coding frame rate to the original video coding frame rate and the ratio K_pix of the target video resolution to the original video resolution are calculated, and a candidate target video coding rate is obtained according to the following formula: candidate target video coding rate = K_fps × K_pix × K_br × original video coding rate, wherein K_br represents a predetermined target rate reduction coefficient. The closest reference resolution is selected from a preset reference correspondence table according to the target video resolution, the reference code rate corresponding to the selected reference resolution is compared with the candidate target video coding rate, and the smaller of the two is selected as the target video coding rate required for compression.
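A minimal sketch of this bitrate rule follows, under stated assumptions: K_pix is taken as the ratio of pixel counts, "closest reference resolution" is taken as the table entry with the nearest pixel count, and the table `REFERENCE_BITRATES_KBPS` and its values are hypothetical placeholders, since the patent does not list the preset table.

```python
from typing import Dict, Tuple

Resolution = Tuple[int, int]

# Hypothetical reference table: resolution -> reference bitrate in kbps.
REFERENCE_BITRATES_KBPS: Dict[Resolution, float] = {
    (176, 144): 128.0,
    (320, 240): 300.0,
    (640, 480): 800.0,
}


def pixels(resolution: Resolution) -> int:
    return resolution[0] * resolution[1]


def candidate_bitrate(original_kbps: float, original_fps: float, target_fps: float,
                      original_res: Resolution, target_res: Resolution,
                      k_br: float) -> float:
    """K_fps * K_pix * K_br * original video coding rate."""
    k_fps = target_fps / original_fps
    k_pix = pixels(target_res) / pixels(original_res)  # assumption: pixel-count ratio
    return k_fps * k_pix * k_br * original_kbps


def target_bitrate(original_kbps: float, original_fps: float, target_fps: float,
                   original_res: Resolution, target_res: Resolution,
                   k_br: float) -> float:
    """Smaller of the candidate rate and the closest reference rate."""
    candidate = candidate_bitrate(original_kbps, original_fps, target_fps,
                                  original_res, target_res, k_br)
    closest = min(REFERENCE_BITRATES_KBPS,
                  key=lambda ref: abs(pixels(ref) - pixels(target_res)))
    return min(candidate, REFERENCE_BITRATES_KBPS[closest])


print(target_bitrate(4000.0, 60.0, 30.0, (1920, 1080), (640, 360), 0.8))
```

Taking the minimum means the reference table acts as a ceiling: a very high-bitrate source cannot push the output above the reference rate for its resolution class.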
In this embodiment, in S7, the audio transcoding parameters required for compression are calculated according to the following principles: the target audio sampling rate is determined to be the same as the original audio sampling rate, and the number of target audio channels is determined to be the same as the number of original audio channels; the target audio encoder type is determined to be Advanced Audio Coding (AAC); the ratio of the original audio coding rate to the original video coding rate is calculated, and it is judged whether the ratio is greater than 1/2; if so, the target audio coding rate is lowered by one or two levels; otherwise, the target audio coding rate is determined to be the same as the original audio coding rate.
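The audio rule can be sketched as below. The patent does not define what a "level" is, so the bitrate ladder `AAC_BITRATE_LADDER_KBPS`, the `steps_down` parameter, and the function name are all hypothetical; the sketch only illustrates the 1/2 ratio test and the pass-through of sampling rate and channel count.

```python
from typing import Dict, Union

# Hypothetical ladder of audio bitrate levels in kbps; not taken from the patent.
AAC_BITRATE_LADDER_KBPS = [32, 48, 64, 96, 128, 192, 256]


def target_audio_params(original_audio_kbps: float, original_video_kbps: float,
                        sample_rate: int, channels: int,
                        steps_down: int = 1) -> Dict[str, Union[str, float, int]]:
    """Keep sample rate and channels; lower the bitrate if audio exceeds half the video rate."""
    if original_video_kbps <= 0:
        raise ValueError("original video bitrate must be positive")
    if original_audio_kbps / original_video_kbps > 0.5:
        # Find the highest ladder level not above the original, then step down.
        at_or_below = [i for i, level in enumerate(AAC_BITRATE_LADDER_KBPS)
                       if level <= original_audio_kbps]
        index = at_or_below[-1] if at_or_below else 0
        bitrate = AAC_BITRATE_LADDER_KBPS[max(0, index - steps_down)]
    else:
        bitrate = original_audio_kbps
    return {"encoder": "AAC", "bitrate_kbps": bitrate,
            "sample_rate": sample_rate, "channels": channels}


print(target_audio_params(128, 200, 44100, 2))
```

Passing `steps_down=2` models the "two levels" case; which of the two applies in practice is left open by the text.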
Example 2
The video stream file compression and efficient transmission system based on the VOLTE network is realized based on the video stream file compression and efficient transmission method based on the VOLTE network, and specifically comprises a video file identification module, a coding parameter extraction module, an audio splitting module, an audio detection module, a video detection module, a detection feedback module, a transcoding calculation module, a file compression module and a merging transmission module;
the video file identification module is used for identifying the input file and confirming the video file;
the coding parameter extraction module is used for extracting coding parameters from the confirmed media video file;
the audio splitting module is used for separating and decoding the audio and video streams of the input media file to extract an original audio stream and an original video stream;
the audio detection module is used for guiding the original audio stream into an audio privacy value calculation strategy to detect the audio privacy value, and simultaneously comparing the audio privacy value with a set corresponding privacy threshold value;
the video detection module is used for guiding the original video stream into a video privacy value calculation strategy to detect the video privacy value;
the detection feedback module is used for judging whether privacy information is fed back to the publisher or not;
the transcoding calculation module is used for calculating transcoding parameters required by compression according to the coding parameters;
the file compression module and the merging transmission module are used for encoding the original audio stream to output a new compressed audio stream and/or encoding the original video stream to output a new compressed video stream according to the transcoding parameters;
the merging transmission module is used for merging and transmitting the compressed video stream and the compressed audio stream.
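How the detection feedback module gates the flow between detection and transcoding can be sketched as follows, based on the judgment condition of step S5 in the method (either privacy value strictly greater than its threshold leads to publisher confirmation, otherwise transcoding starts directly). The function name `next_step` and the returned step labels are illustrative assumptions.

```python
def next_step(video_privacy_value: float, audio_privacy_value: float,
              video_threshold: float, audio_threshold: float) -> str:
    """Return 'S6' (publisher confirmation) or 'S7' (transcoding parameter calculation)."""
    exceeds = (video_privacy_value > video_threshold
               or audio_privacy_value > audio_threshold)
    return "S6" if exceeds else "S7"


print(next_step(0.6, 0.1, 0.5, 0.2))
```

Values equal to the threshold pass straight to S7, since the condition is stated as "greater than".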
It is important to note that the construction and arrangement of the invention as shown in the various exemplary embodiments is illustrative only. Although only a few embodiments have been described in detail in this disclosure, those skilled in the art who review this disclosure will readily appreciate that many modifications are possible (e.g., variations in sizes, dimensions, structures, shapes and proportions of the various elements, values of parameters (e.g., temperature, pressure, etc.), mounting arrangements, use of materials, colors, orientations, etc.) without materially departing from the novel teachings and advantages of the subject matter described in this disclosure. For example, elements shown as integrally formed may be constructed of multiple parts or elements, the position of elements may be reversed or otherwise varied, and the nature or number of discrete elements or positions may be altered or varied. Accordingly, all such modifications are intended to be included within the scope of present invention. The order or sequence of any process or method steps may be varied or re-sequenced according to alternative embodiments. In the claims, any means-plus-function clause is intended to cover the structures described herein as performing the recited function and not only structural equivalents but also equivalent structures. Other substitutions, modifications, changes and omissions may be made in the design, operating conditions and arrangement of the exemplary embodiments without departing from the scope of the present inventions. Therefore, the invention is not limited to the specific embodiments, but extends to various modifications that nevertheless fall within the scope of the appended claims.
Furthermore, in order to provide a concise description of the exemplary embodiments, all features of an actual implementation may not be described (i.e., those not associated with the best mode presently contemplated for carrying out the invention, or those not associated with practicing the invention).
It should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions may be made. Such a development effort might be complex and time consuming, but would nevertheless be a routine undertaking of design, fabrication, and manufacture for those of ordinary skill having the benefit of this disclosure.
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered in the scope of the claims of the present invention.

Claims (6)

1. A video stream file compression and efficient transmission method based on a VOLTE network, characterized in that the method specifically comprises the following steps:
S1, identifying an input file and confirming the video file;
S2, extracting coding parameters from the confirmed media video file, and separating and decoding the audio and video streams of the input media file to extract an original audio stream and an original video stream;
S3, importing the original video stream into a video privacy value calculation strategy to detect a video privacy value, and importing the original audio stream into an audio privacy value calculation strategy to detect an audio privacy value;
S4, comparing the video privacy value and the audio privacy value with the corresponding set privacy thresholds respectively;
S5, judging whether privacy information is to be fed back to the publisher, wherein the judgment condition is: the video privacy value is greater than the set video privacy threshold and/or the audio privacy value is greater than the set audio privacy threshold; if the judgment condition is met, S6 is executed, and if it is not met, S7 is executed directly;
S6, feeding the privacy information back to the publisher for publication confirmation;
S7, calculating the transcoding parameters required for compression according to the coding parameters;
S8, encoding the original audio stream to output a new compressed audio stream and/or encoding the original video stream to output a new compressed video stream according to the transcoding parameters;
S9, merging and transmitting the compressed video stream and the compressed audio stream;
the coding parameters in S2 include video coding parameters and audio coding parameters, wherein the video coding parameters include: an original video encoder type, an original video coding rate, an original video coding frame rate, and an original video resolution; the audio coding parameters include: an original audio encoder type, an original audio coding rate, an original audio channel number, and an original audio sampling rate;
the transcoding parameters in S7 include video transcoding parameters and audio transcoding parameters, wherein the video transcoding parameters include: a target video encoder type, a target video coding rate, a target video coding frame rate, and a target video resolution; the audio transcoding parameters include: a target audio encoder type, a target audio coding rate, a target audio channel number, and a target audio sampling rate;
the video privacy value calculation strategy comprises the following specific steps:
S301, splitting the video into frames, extracting each frame image in the video, extracting the face picture in the image, collecting data on the eye length (Figure QLYQS_2), eye diameter (Figure QLYQS_5), and eyeball chromaticity (Figure QLYQS_7) of the face, and calculating the eye proportion value (Figure QLYQS_3); extracting the standard facial feature image of the publisher, and extracting the eyeball chromaticity (Figure QLYQS_4) and the eye proportion value (Figure QLYQS_9) in the standard image; calculating the similarity between the face picture in the image and the standard picture of the publisher, wherein the calculation formula is (Figure QLYQS_10), where k is the similarity, (Figure QLYQS_1) is the proportion coefficient of the eye proportion value, and (Figure QLYQS_6) is the proportion coefficient of the eyeball chromaticity, (Figure QLYQS_8); comparing the calculated similarity with a similarity threshold; if the calculated similarity is greater than or equal to the similarity threshold, the face image in the image is the face image of the publisher, and if the calculated similarity is less than the similarity threshold, the face image in the image is not the face image of the publisher;
S302, extracting the single-frame videos in the video that contain the publisher's face image, acquiring the number of single-frame videos that contain the publisher's face image (Figure QLYQS_11), acquiring the total frame number (Figure QLYQS_12), simultaneously acquiring the eye length of the face in the single-frame video (Figure QLYQS_13), acquiring the eye length in the standard image (Figure QLYQS_14), extracting the personnel distance (Figure QLYQS_15), and calculating the distance of the person in the actual scene (Figure QLYQS_16);
S303, extracting the spacing in the actual scene from each single-frame image containing the publisher's face image, extracting the total frame number, and substituting these into the video privacy value calculation formula to calculate the video privacy value, wherein the video privacy value calculation formula is (Figure QLYQS_17), where n is the number of frame images in which the spacing of the person in the actual scene is greater than the set safety spacing, (Figure QLYQS_18) is the p-th spacing in the actual scene that is greater than the set safety spacing, and (Figure QLYQS_19) is the set safety spacing; the specific steps of the audio privacy value calculation strategy are as follows:
S304, the publisher sets in advance the privacy vocabulary that may appear in the audio; the audio stream in the original video is extracted and imported into speech-to-text software, which converts the audio into text and outputs it, and the output text is segmented into words;
S305, comparing the segmented word element set with the privacy vocabulary, searching for privacy words in the word element set, dividing the number of privacy words in the word element set by the total number of words in the word element set, and setting the resulting ratio as the audio privacy value.
2. The video stream file compression and efficient transmission method based on the VOLTE network according to claim 1, wherein: the specific content of the S6 is as follows:
transmitting to the publisher the frame images in which the spacing of the person in the actual scene is greater than the set safety spacing, together with the corresponding audio stream segments, and the privacy words found in the word element set, together with the corresponding video stream segments; the publisher reviews these segments and decides whether to confirm publication.
3. The video stream file compression and efficient transmission method based on the VOLTE network according to claim 2, characterized in that: in S7, the target video coding frame rate required for compression is calculated according to the following principle: the original video coding frame rate is compared with 30 fps; if the original video coding frame rate is greater than 30 fps, the target video coding frame rate is determined to be 30 fps; if the original video coding frame rate is less than or equal to 30 fps, the target video coding frame rate is determined to be the same as the original video coding frame rate; the target video resolution required for compression is calculated according to the following principle: the original video resolution is compared with 640×480; if the original video resolution does not exceed 640×480, the target video resolution is determined to be the same as the original video resolution; if the original video resolution exceeds 640×480, the resolution is reduced by scaling toward 640×480 while keeping the original image aspect ratio; the target video encoder type required for compression is calculated according to the following principle: if the original video encoder type is RM, VC1, or H.264, the target video encoder type is determined to be H.264; otherwise, the target video encoder type is determined to be Moving Picture Experts Group MPEG4.
4. The video stream file compression and efficient transmission method based on the VOLTE network according to claim 3, characterized in that: in S7, the target video coding rate required for compression is calculated according to the following principle: the ratio K_fps of the target video coding frame rate to the original video coding frame rate and the ratio K_pix of the target video resolution to the original video resolution are calculated, and a candidate target video coding rate is obtained according to the following formula: candidate target video coding rate = K_fps × K_pix × K_br × original video coding rate, wherein K_br represents a predetermined target rate reduction coefficient; the closest reference resolution is selected from a preset reference correspondence table according to the target video resolution, the reference code rate corresponding to the selected reference resolution is compared with the candidate target video coding rate, and the smaller of the two is selected as the target video coding rate required for compression.
5. The video stream file compression and efficient transmission method based on the VOLTE network according to claim 4, characterized in that: in S7, the audio transcoding parameters required for compression are calculated according to the following principles: the target audio sampling rate is determined to be the same as the original audio sampling rate, and the number of target audio channels is determined to be the same as the number of original audio channels; the target audio encoder type is determined to be Advanced Audio Coding (AAC); the ratio of the original audio coding rate to the original video coding rate is calculated, and it is judged whether the ratio is greater than 1/2; if so, the target audio coding rate is lowered by one or two levels; otherwise, the target audio coding rate is determined to be the same as the original audio coding rate.
6. A video stream file compression and efficient transmission system based on a VOLTE network, which is implemented based on a video stream file compression and efficient transmission method based on a VOLTE network according to any one of claims 1 to 5, characterized in that: the system specifically comprises a video file identification module, a coding parameter extraction module, an audio shunt module, an audio detection module, a video detection module, a detection feedback module, a transcoding calculation module, a file compression module and a merging transmission module;
the video file identification module is used for identifying the input file and confirming the video file;
the coding parameter extraction module is used for extracting coding parameters from the confirmed media video file;
the audio splitting module is used for separating and decoding audio and video streams of an input media file to extract an original audio stream and an original video stream;
the audio detection module is used for guiding the original audio stream into an audio privacy value calculation strategy to detect the audio privacy value, and simultaneously comparing the audio privacy value with a set corresponding privacy threshold value;
the video detection module is used for guiding the original video stream into a video privacy value calculation strategy to detect the video privacy value;
the detection feedback module is used for judging whether privacy information is fed back to the publisher or not;
the transcoding calculation module is used for calculating transcoding parameters required by compression according to the coding parameters;
the file compression module and the merging transmission module are used for encoding and outputting a new compressed audio stream for the original audio stream and/or encoding and outputting a new compressed video stream for the original video stream according to the transcoding parameters;
the merging transmission module is used for merging and transmitting the compressed video stream and the compressed audio stream.
CN202310430464.5A 2023-04-21 2023-04-21 VOLTE network-based video stream file compression and efficient transmission system and method Active CN116156215B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310430464.5A CN116156215B (en) 2023-04-21 2023-04-21 VOLTE network-based video stream file compression and efficient transmission system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310430464.5A CN116156215B (en) 2023-04-21 2023-04-21 VOLTE network-based video stream file compression and efficient transmission system and method

Publications (2)

Publication Number Publication Date
CN116156215A CN116156215A (en) 2023-05-23
CN116156215B true CN116156215B (en) 2023-07-07

Family

ID=86354705

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310430464.5A Active CN116156215B (en) 2023-04-21 2023-04-21 VOLTE network-based video stream file compression and efficient transmission system and method

Country Status (1)

Country Link
CN (1) CN116156215B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116915402B (en) * 2023-09-05 2023-11-21 南京铂航电子科技有限公司 Data security transmission method and system based on quantum encryption

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102055966B (en) * 2009-11-04 2013-03-20 腾讯科技(深圳)有限公司 Compression method and system for media file
CN108256513A (en) * 2018-03-23 2018-07-06 中国科学院长春光学精密机械与物理研究所 A kind of intelligent video analysis method and intelligent video record system
CN110087099B (en) * 2019-03-11 2020-08-07 北京大学 Monitoring method and system for protecting privacy
CN112491669A (en) * 2020-11-17 2021-03-12 珠海格力电器股份有限公司 Data processing method, device and system
CN115723529A (en) * 2021-08-31 2023-03-03 上海博泰悦臻网络技术服务有限公司 In-vehicle privacy protection method, system, medium, and apparatus

Also Published As

Publication number Publication date
CN116156215A (en) 2023-05-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant