CN109168032A - Processing method, terminal, server and storage medium of video data - Google Patents
Processing method, terminal, server and storage medium of video data
- Publication number
- CN109168032A (application number CN201811337105.0A)
- Authority
- CN
- China
- Prior art keywords
- video data
- target area
- data stream
- video image
- area information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/43—Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
- H04N21/44—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
- H04N21/4402—Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
The invention discloses a video data processing method, terminal, server and storage medium, belonging to the technical field of data processing. In the embodiments of the invention, a first device obtains the target area information of at least one frame of original video image and, while generating a video data stream, makes the generated stream carry the corresponding target area information. After receiving the video data stream, a second device can extract the required target area information directly from the stream, sparing it the complex process of deriving the target area information from the relevant video images again. This greatly reduces data processing time and lightens the system load.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a video data processing method, terminal, server and storage medium.
Background art
With the continuous development of data processing technology, methods for processing video data keep multiplying. For example, to accommodate different network bandwidths or the processing capabilities of different terminals, video data may need to be transcoded; to serve different user demands, several video data streams may need to be mixed. While processing video data, a target area of the corresponding video image can be identified on demand — for instance, a region of interest can be recognized so that more bits are allocated to it during video encoding, improving the encoding quality.
At present, a common video data processing method works as follows: according to a preset recognition rule, target area recognition is performed on at least one frame of original video image, and the at least one frame of original video image is encoded based on the recognized target area so that more bits are allocated to the target area during encoding, yielding a video data stream. When this video data stream is later transcoded, it is first decoded to recover the corresponding video images; target area recognition is then performed on those images again according to the recognition rule, and, based on the newly recognized target area, the images are re-encoded at a different target bitrate, finally producing a target video data stream corresponding to the target bitrate.
With this processing method, target area recognition must be performed on the video images repeatedly while the at least one frame of original video image is encoded and re-encoded. Since target area recognition is complex and time-consuming, performing it repeatedly considerably increases the system load.
Summary of the invention
The embodiments of the invention provide a video data processing method, terminal, server and storage medium, which can solve the problem of having to perform target area recognition on video images repeatedly. The technical solution is as follows:
In one aspect, a video data processing method is provided, the method comprising:
obtaining at least one frame of original video image;
obtaining, based on the at least one frame of original video image, the target area information of the at least one frame of original video image;
encoding the at least one frame of original video image based on its target area information to generate a video data stream, the video data stream carrying the target area information of the at least one frame of original video image;
sending the video data stream to a second device.
In one possible implementation, encoding the at least one frame of original video image based on its target area information to generate a video data stream carrying that target area information comprises:
encoding the target area information together with the at least one frame of original video image to generate at least one first data packet carrying at least one target area identifier, the at least one target area identifier being obtained by encoding the at least one frame of original video image;
generating the video data stream based on the at least one first data packet carrying the at least one target area identifier.
In one possible implementation, encoding the at least one frame of original video image based on its target area information to generate a video data stream carrying that target area information comprises:
encoding the target area information of the at least one frame of original video image to generate at least one second data packet;
encoding the at least one frame of original video image to generate at least one first data packet;
inserting one second data packet after every preset number of first data packets to generate the video data stream.
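The second scheme above — a stream in which one target-area ("second") packet follows every preset number of encoded-video ("first") packets — can be sketched roughly as follows. The packet framing, type tags, and JSON payload here are illustrative assumptions for the sketch, not the patent's actual bitstream format:

```python
import json

VIDEO_PACKET = 0x01  # "first" packet: encoded video data (assumed type tag)
ROI_PACKET = 0x02    # "second" packet: encoded target-area info (assumed)

def make_packet(ptype: int, payload: bytes) -> bytes:
    # Assumed framing: 1-byte type + 4-byte big-endian length + payload
    return bytes([ptype]) + len(payload).to_bytes(4, "big") + payload

def build_stream(video_packets, roi_infos, preset_number: int) -> list:
    """Interleave one ROI packet after every `preset_number` video packets."""
    stream, roi_iter = [], iter(roi_infos)
    for i, vp in enumerate(video_packets, start=1):
        stream.append(make_packet(VIDEO_PACKET, vp))
        if i % preset_number == 0:
            roi = next(roi_iter, None)
            if roi is not None:
                stream.append(make_packet(ROI_PACKET,
                                          json.dumps(roi).encode()))
    return stream
```

With `preset_number=2` and four video packets, the resulting order is video, video, ROI, video, video, ROI — the second device can then locate the target area information without re-running recognition.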
In another aspect, a video data processing method is provided, the method comprising:
receiving a video data stream, the video data stream carrying the target area information of at least one frame of original video image;
extracting the target area information of the at least one frame of original video image from the video data stream;
decoding the video data stream to generate the video images corresponding to the video data stream;
re-encoding the video images corresponding to the video data stream based on the target area information corresponding to the video data stream, to generate a target video data stream.
In one possible implementation, extracting the target area information of the at least one frame of original video image based on the video data stream comprises:
extracting at least one target area identifier based on at least one field of at least one first data packet in the video data stream;
decoding the at least one target area identifier to generate the target area information of the at least one frame of original video image.
In one possible implementation, extracting the target area information of the at least one frame of original video image based on the video data stream comprises:
based on the at least one first data packet and the at least one second data packet in the video data stream, decoding, after every preset number of first data packets, the second data packet that follows them, to generate the target area information of the at least one frame of original video image.
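The receiving side of such an interleaved layout can be sketched as the inverse operation: walk the stream, route each packet by its type tag, and decode the target-area payloads. This sketch assumes an illustrative framing (1-byte type tag of 0x01 for video and 0x02 for ROI, 4-byte big-endian length, then the payload) and a JSON ROI payload; none of these details come from the patent itself:

```python
import json

def split_stream(data: bytes):
    """Separate encoded-video ("first") packets from ROI ("second") packets."""
    video, roi = [], []
    i = 0
    while i < len(data):
        ptype = data[i]
        length = int.from_bytes(data[i + 1:i + 5], "big")
        payload = data[i + 5:i + 5 + length]
        # Route by the assumed type tag: 0x01 = video, anything else = ROI info
        (video if ptype == 0x01 else roi).append(payload)
        i += 5 + length
    return video, [json.loads(p) for p in roi]
```

The video packets then go to the decoder as usual, while the already-extracted ROI dictionaries feed the re-encoder directly — which is exactly the saving the abstract describes.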
In another aspect, a video data processing method is provided, the method comprising:
receiving at least two video data streams, each video data stream carrying the target area information of at least one frame of original video image;
extracting, from the at least two video data streams, the target area information of the at least one frame of original video image corresponding to each video data stream;
decoding each video data stream to generate the video images corresponding to the at least two video data streams;
merging the video images corresponding to the at least two video data streams to generate a target video image;
re-encoding the target video image based on the target area information corresponding to the at least two video data streams, to generate a target video data stream.
In one possible implementation, extracting, based on the at least two video data streams, the target area information of the at least one frame of original video image corresponding to each video data stream comprises:
extracting at least one target area identifier corresponding to the at least two video data streams, based on at least one field of at least one first data packet in each video data stream;
decoding the at least one target area identifier corresponding to each video data stream, to generate the target area information of the at least one frame of original video image corresponding to the at least two video data streams.
In one possible implementation, extracting, based on the at least two video data streams, the target area information of the at least one frame of original video image corresponding to each video data stream comprises:
based on the at least one first data packet and the at least one second data packet in each video data stream, decoding, after every preset number of first data packets, the second data packet that follows them, to generate the target area information of the at least one frame of original video image in the at least two video data streams.
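The merge-then-re-encode flow of this aspect can be illustrated with a toy side-by-side fusion of two decoded streams. Representing images as plain 2D lists of pixel values and target areas as (x, y, w, h) rectangles is a simplifying assumption for the sketch; the patent does not prescribe a fusion layout or ROI encoding:

```python
def mix_side_by_side(img_a, img_b, rois_a, rois_b):
    """Merge two equal-height images left/right and remap stream B's ROIs."""
    assert len(img_a) == len(img_b), "streams must share the same height"
    width_a = len(img_a[0])
    merged = [row_a + row_b for row_a, row_b in zip(img_a, img_b)]
    # Stream A's ROIs keep their coordinates; stream B's shift right by
    # width_a so the re-encoder can still weight them inside the fused
    # target video image.
    merged_rois = list(rois_a) + [(x + width_a, y, w, h)
                                  for (x, y, w, h) in rois_b]
    return merged, merged_rois
```

The point of carrying the ROI rectangles through the merge is that step 305 (re-encoding the target video image) can allocate extra bits to both streams' target areas without re-detecting them.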
In another aspect, a video data processing apparatus is provided, the apparatus comprising:
an obtaining module, configured to obtain at least one frame of original video image;
the obtaining module being further configured to obtain, based on the at least one frame of original video image, the target area information of the at least one frame of original video image;
a generating module, configured to encode the at least one frame of original video image based on its target area information to generate a video data stream, the video data stream carrying the target area information of the at least one frame of original video image;
a sending module, configured to send the video data stream to a second device.
In one possible implementation, the generating module is configured to:
encode the target area information together with the at least one frame of original video image to generate at least one first data packet carrying at least one target area identifier, the at least one target area identifier being obtained by encoding the at least one frame of original video image;
generate the video data stream based on the at least one first data packet carrying the at least one target area identifier.
In one possible implementation, the generating module is configured to:
encode the target area information of the at least one frame of original video image to generate at least one second data packet;
encode the at least one frame of original video image to generate at least one first data packet;
insert one second data packet after every preset number of first data packets to generate the video data stream.
In another aspect, a video data processing apparatus is provided, the apparatus comprising:
a receiving module, configured to receive a video data stream, the video data stream carrying the target area information of at least one frame of original video image;
an extracting module, configured to extract the target area information of the at least one frame of original video image based on the video data stream;
a decoding module, configured to decode the video data stream to generate the video images corresponding to the video data stream;
a re-encoding module, configured to re-encode the video images corresponding to the video data stream based on the target area information of the at least one frame of original video image, to generate a target video data stream.
In one possible implementation, the extracting module is configured to:
extract at least one target area identifier based on at least one field of at least one first data packet in the video data stream;
decode the at least one target area identifier to generate the target area information of the at least one frame of original video image.
In one possible implementation, the extracting module is configured to:
based on the at least one first data packet and the at least one second data packet in the video data stream, decode, after every preset number of first data packets, the second data packet that follows them, to generate the target area information of the at least one frame of original video image.
In another aspect, a video data processing apparatus is provided, the apparatus comprising:
a receiving module, configured to receive at least two video data streams, each video data stream carrying the target area information of at least one frame of original video image;
an extracting module, configured to extract, based on the at least two video data streams, the target area information of the at least one frame of original video image corresponding to each video data stream;
a decoding module, configured to decode each video data stream to generate the video images corresponding to the at least two video data streams;
a merging module, configured to merge the video images corresponding to the at least two video data streams to generate a target video image;
a re-encoding module, configured to re-encode the target video image based on the target area information corresponding to the at least two video data streams, to generate a target video data stream.
In one possible implementation, the extracting module is configured to:
extract at least one target area identifier corresponding to the at least two video data streams, based on at least one field of at least one first data packet in each video data stream;
decode the at least one target area identifier corresponding to each video data stream, to generate the target area information of the at least one frame of original video image corresponding to the at least two video data streams.
In one possible implementation, the extracting module is configured to:
based on the at least one first data packet and the at least one second data packet in each video data stream, decode, after every preset number of first data packets, the second data packet that follows them, to generate the target area information of the at least one frame of original video image in the at least two video data streams.
In another aspect, a terminal is provided, the terminal comprising a processor and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor to perform the operations of the above video data processing method.
In another aspect, a server is provided, the server comprising a processor and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor to perform the operations of the above video data processing method.
In another aspect, a computer-readable storage medium is provided, the storage medium storing at least one instruction, the instruction being loaded and executed by a processor to perform the operations of the above video data processing method.
In the embodiments of the present invention, a first device obtains the target area information of at least one frame of original video image and, while generating a video data stream, makes the generated stream carry the corresponding target area information. After receiving the video data stream, a second device can extract the required target area information directly from the stream, sparing it the complex process of deriving the target area information from the relevant video images again, which greatly reduces data processing time and lightens the system load.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the invention; a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of a video data processing method provided by an embodiment of the present invention;
Fig. 2 is a flowchart of a video data processing method provided by an embodiment of the present invention;
Fig. 3 is a flowchart of a video data processing method provided by an embodiment of the present invention;
Fig. 4 is a flowchart of a video data processing method provided by an embodiment of the present invention;
Fig. 5 is a flowchart of encoding and transcoding video images provided by an embodiment of the present invention;
Fig. 6 is a flowchart of a video data processing method provided by an embodiment of the present invention;
Fig. 7 is a flowchart of encoding and mixing video images provided by an embodiment of the present invention;
Fig. 8 is a schematic structural diagram of a video data processing apparatus provided by an embodiment of the present invention;
Fig. 9 is a schematic structural diagram of a video data processing apparatus provided by an embodiment of the present invention;
Fig. 10 is a schematic structural diagram of a video data processing apparatus provided by an embodiment of the present invention;
Fig. 11 is a structural block diagram of a terminal provided by an embodiment of the present invention;
Fig. 12 is a schematic structural diagram of a server provided by an embodiment of the present invention.
Detailed description of embodiments
To make the objectives, technical solutions and advantages of the present invention clearer, the embodiments of the invention are described in further detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of a video data processing method provided by an embodiment of the present invention; the method can be applied to a first device. Referring to Fig. 1, the method includes:
101. Obtain at least one frame of original video image.
102. Based on the at least one frame of original video image, obtain the target area information of the at least one frame of original video image.
103. Encode the at least one frame of original video image based on its target area information to generate a video data stream, the video data stream carrying the target area information of the at least one frame of original video image.
104. Send the video data stream to a second device.
In some embodiments, encoding the at least one frame of original video image based on its target area information to generate a video data stream carrying that target area information includes:
encoding the target area information together with the at least one frame of original video image to generate at least one first data packet carrying at least one target area identifier, the at least one target area identifier being obtained by encoding the at least one frame of original video image;
generating the video data stream based on the at least one first data packet carrying the at least one target area identifier.
In some embodiments, encoding the at least one frame of original video image based on its target area information to generate a video data stream carrying that target area information includes:
encoding the target area information of the at least one frame of original video image to generate at least one second data packet;
encoding the at least one frame of original video image to generate at least one first data packet;
inserting one second data packet after every preset number of first data packets to generate the video data stream.
All of the above optional solutions can be combined in any manner to form optional embodiments of the present invention, which are not described one by one here.
Fig. 2 is a flowchart of a video data processing method provided by an embodiment of the present invention; the method can be applied to a second device. Referring to Fig. 2, the method includes:
201. Receive a video data stream, the video data stream carrying the target area information of at least one frame of original video image.
202. Extract the target area information of the at least one frame of original video image from the video data stream.
203. Decode the video data stream to generate the video images corresponding to the video data stream.
204. Re-encode the video images corresponding to the video data stream based on the target area information carried by the video data stream, to generate a target video data stream.
In some embodiments, extracting the target area information of the at least one frame of original video image based on the video data stream includes:
extracting at least one target area identifier based on at least one field of at least one first data packet in the video data stream;
decoding the at least one target area identifier to obtain the target area information of the at least one frame of original video image.
In some embodiments, extracting the target area information of the at least one frame of original video image based on the video data stream includes:
based on the at least one first data packet and the at least one second data packet in the video data stream, decoding, after every preset number of first data packets, the second data packet that follows them, to generate the target area information of the at least one frame of original video image.
All of the above optional solutions can be combined in any manner to form optional embodiments of the present invention, which are not described one by one here.
Fig. 3 is a flowchart of a video data processing method provided by an embodiment of the present invention; the method can be applied to a second device. Referring to Fig. 3, the method includes:
301. Receive at least two video data streams, each video data stream carrying the target area information of at least one frame of original video image.
302. Extract, from the at least two video data streams, the target area information of the at least one frame of original video image corresponding to each video data stream.
303. Decode each video data stream to generate the video images corresponding to the at least two video data streams.
304. Merge the video images corresponding to the at least two video data streams to generate a target video image.
305. Re-encode the target video image based on the target area information corresponding to the at least two video data streams, to generate a target video data stream.
In some embodiments, extracting, based on the at least two video data streams, the target area information of the at least one frame of original video image corresponding to each video data stream includes:
extracting at least one target area identifier corresponding to the at least two video data streams, based on at least one field of at least one first data packet in each video data stream;
decoding the at least one target area identifier corresponding to each video data stream, to obtain the target area information of the at least one frame of original video image corresponding to the at least two video data streams.
In some embodiments, extracting, based on the at least two video data streams, the target area information of the at least one frame of original video image corresponding to each video data stream includes:
based on the at least one first data packet and the at least one second data packet in each video data stream, decoding, after every preset number of first data packets, the second data packet that follows them, to generate the target area information of the at least one frame of original video image in the at least two video data streams.
All of the above optional solutions can be combined in any manner to form optional embodiments of the present invention, which are not described one by one here.
Fig. 4 is a flowchart of a video data processing method provided by an embodiment of the present invention. The method is illustrated by an example in which a first device and a second device interact, where the first device has an encoding function and the second device has a transcoding function. Referring to Fig. 4, the method includes:
401. The first device obtains at least one frame of original video image.
In the embodiment of the present invention, the first device has a video-image acquisition function and an encoding function, and can obtain at least one frame of original video image through the acquisition function. The at least one frame of original video image is the video image originally captured by the first device, which has not yet been encoded and is waiting to be processed.
Taking the first device being a terminal as an example, a multimedia client, such as a live-streaming client, can be installed on the terminal, and the client can collect at least one frame of original video image in real time through the terminal's camera. The terminal may first obtain the at least one frame of original video image and then encode it. Alternatively, every time the terminal obtains one original video image, it may immediately encode that image through the corresponding encoding function.
The first device may also be a server, which can receive at least one frame of original video image sent by any terminal and encode the received images in real time based on the server's encoding function. The server may likewise first obtain the at least one frame of original video image and then encode it. The embodiment of the present invention does not limit here the concrete form of the first device or the specific process of obtaining the at least one frame of original video image.
402. The first device obtains, based on the at least one frame of raw video image, the target area information of the at least one frame of raw video image.
In this embodiment of the present invention, a target area is an image region on each raw video image that requires emphasis during processing. Based on the target area, when encoding each raw video image, the first device can analyze the target area with emphasis and allocate more bitrate to it, thereby increasing the encoding precision of the target area and improving the encoding quality. The target area information is information about the target area: it may indicate whether a corresponding macroblock on each raw video image belongs to the target area, or it may indicate the importance or weight of the corresponding macroblock on each raw video image. Of course, the target area information may also be other information related to the target area; the specific content of the target area information is not limited in this embodiment of the present invention.
Specifically, the target area may be a region of interest (ROI), that is, a region that the user needs to focus on, or the main part of the corresponding image. For example, the region of interest may be a face; of course, the target area may also be another preset region, which is not limited in this embodiment of the present invention. As shown in Fig. 5, during image processing, the first device may run a corresponding target area recognition algorithm on the above at least one frame of raw video image to identify the target area in each raw video image. The first device may outline the recognized target area in each raw video image with a box, a circle, an irregular polygon, or the like. In turn, the first device may extract, based on the recognized target area in each raw video image, the target area information corresponding to each target area.
Taking the Selective Search algorithm as an example, the extraction of target area information is described as follows: the first device may run the Selective Search algorithm on the above at least one frame of raw video image to perform an initial image segmentation on each raw video image, dividing each raw video image into at least one small candidate region, and then screen and merge the at least one candidate region corresponding to each raw video image, deleting candidate regions that do not meet the target area requirement and merging candidate regions that do.
For example, the similarity between the at least one candidate region may be computed based on parameters such as color, texture, size, and spatial overlap, and the similarity between each candidate region and the target areas stored in a database may also be computed; for instance, the similarity between each candidate region and the face regions stored in the database may be computed to determine whether the candidate region is a required face region. Finally, the target area corresponding to each raw video image may be obtained based on the candidate regions with higher similarity. In turn, the target area information corresponding to each target area may be obtained based on each target area. For example, if the similarity between a target area and the target areas stored in the database is high, that target area may be determined to be a more important target area, and the target area information corresponding to it may then indicate that it is an important area.
Of course, in other embodiments, the first device may also identify the target area in the above at least one frame of raw video image through another target area recognition algorithm and obtain the corresponding target area information, which may likewise be other information. The specific recognition algorithm and the specific content and form of the target area information are not limited in this embodiment of the present invention.
It should be noted that the first device may obtain the target area information of a raw video image through the corresponding target area recognition algorithm each time one raw video image is obtained. Of course, the first device may also first obtain some or all of the raw video images to be processed and then obtain their target area information through the corresponding target area recognition algorithm; this is not limited in this embodiment of the present invention.
403. The first device encodes the target area information of the at least one frame of raw video image, generating at least one target area identifier.
In this embodiment of the present invention, based on the target area information of the at least one raw video image obtained in step 402, the first device makes the video data stream generated after encoding the at least one frame of raw video image carry the target area information corresponding to each raw video image. In this way, in subsequent processing of the video data stream, when a relevant device needs the corresponding target area information, the required target area information can be extracted directly from the video data stream, avoiding the complex process of running the target area recognition algorithm on the associated video images again, greatly reducing the processing time of the video data and lowering the processing load of the system.
In one embodiment, the first device may encode the target area information corresponding to the at least one frame of raw video image and merge the generated at least one target area identifier into the final video data stream, so that the video data stream carries the target area information of the at least one frame of raw video image.
Specifically, the first device may compress the target area information corresponding to each raw video image, converting the target area information into corresponding binary digits, where the binary digits are the target area identifier corresponding to the target area information of each raw video image. The target area identifier may be used to indicate the importance level of the corresponding target area. For example, when the target area information indicates that the corresponding target area is the most important region, the target area identifier generated by encoding that target area information may be the digit "1"; when the target area information indicates that the corresponding target area is a normal region, the target area identifier may be the digit "0".
Of course, in other embodiments, the above target area identifier may also be used to indicate other target area information of the corresponding target area, and the corresponding target area information may also be identified in other ways; the specific content expressed by the target area identifier and the specific identification manner are not limited in this embodiment of the present invention.
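The mapping just described, from target area information to a binary target area identifier ("1" for a most important region, "0" for a normal region), can be sketched as follows; the bit-packing layout is an illustrative assumption:

```python
from enum import IntEnum

class RoiLevel(IntEnum):
    NORMAL = 0     # normal region -> identifier "0"
    IMPORTANT = 1  # most important region -> identifier "1"

def encode_roi_identifiers(levels: list[RoiLevel]) -> bytes:
    """Pack one 1-bit target area identifier per frame into bytes,
    MSB first; the last byte is zero-padded."""
    out = bytearray()
    for i, level in enumerate(levels):
        if i % 8 == 0:
            out.append(0)
        out[-1] |= int(level) << (7 - i % 8)
    return bytes(out)

def decode_roi_identifiers(data: bytes, count: int) -> list[RoiLevel]:
    """Inverse of encode_roi_identifiers for the first `count` frames."""
    return [RoiLevel((data[i // 8] >> (7 - i % 8)) & 1)
            for i in range(count)]
```

A richer identifier (e.g. a multi-bit importance weight per macroblock) would follow the same pattern with a wider field per entry.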
404. The first device encodes the at least one frame of raw video image, generating at least one first data packet.
In this embodiment of the present invention, based on the at least one frame of raw video image obtained in step 401, the first device may encode each raw video image, compressing the at least one frame of raw video image, which has a large data volume, into a video data stream with a smaller data volume, which is convenient for the transmission system to transmit and saves transmission time.
Specifically, through the encoding function, the first device may remove the redundant information of the above at least one frame of raw video image. For example, the first device may remove the spatial redundancy, temporal redundancy, and visual redundancy of each raw video image to compress the at least one frame of raw video image. The encoding may specifically include processes such as prediction, transform, quantization, and entropy coding, through which the first device can obtain at least one code corresponding to each raw video image.
Based on the at least one obtained code, the first device may arrange a set number of codes together according to a corresponding rule and packetize them, for example by performing NAL (Network Abstraction Layer) packetization on the codes, to form the first data packets. From the at least one code generated by encoding the at least one frame of raw video image, at least one corresponding first data packet can be obtained, where each first data packet may include at least one code; the number of codes in each first data packet is not limited in this embodiment of the present invention.
405. The first device inserts the corresponding at least one target area identifier into the at least one first data packet, generating the video data stream.
In this embodiment of the present invention, based on the at least one target area identifier obtained in step 403 and the at least one first data packet obtained in step 404, the first device may fill each target area identifier into the corresponding first data packet, so that the corresponding first data packet carries the corresponding target area identifier, and generate the video data stream based on the at least one target area identifier and the at least one first data packet, thereby achieving the purpose of carrying the target area information of the at least one frame of raw video image in the video data stream.
Specifically, the at least one first data packet generated by the first device includes first data packets generated from target areas and first data packets generated from non-target areas, and the first device may insert the at least one target area identifier into the first data packets generated from the target areas. For example, the first device may merge each target area identifier into the first data packet corresponding to it; of course, the first device may also insert each target area identifier at the last position of the corresponding first data packet, so that each first data packet generated from a target area carries the corresponding target area identifier.
Based on the above process, after the first device has inserted each generated target area identifier into the corresponding position of the corresponding first data packet, the first device may splice and packetize the at least one target area identifier and the at least one first data packet through a series of processes, ultimately generating the corresponding video data stream, which then carries the corresponding target area identifiers. During encoding, the first device may allocate more bitrate to the target areas in the at least one frame of raw video image, so that the encoding quality of the target areas is higher.
The above steps 403 to 405 constitute the process in which the first device generates the corresponding at least one target area identifier based on the target area information of the at least one frame of raw video image, generates the corresponding at least one first data packet based on the at least one frame of raw video image, and inserts the at least one target area identifier into the corresponding at least one first data packet, so that the generated video data stream carries the corresponding target area information.
Besides the process of steps 403 to 405, another process by which the generated video data stream can carry the corresponding target area information is introduced below:
(1) The first device encodes the target area information of the at least one frame of raw video image obtained in step 402, generating at least one second data packet. Each second data packet is composed of at least one corresponding code, where a code is the data obtained through compression by the encoding function of the first device. Specifically, the first device may apply processes such as prediction, transform, quantization, and entropy coding to the target area information of the at least one frame of raw video image, so as to remove the redundant information in the target area information and obtain at least one code corresponding to each piece of target area information; the first device may then arrange the at least one code corresponding to each piece of target area information together according to a set rule and packetize them, obtaining at least one second data packet corresponding to the at least one piece of target area information. The specific arrangement rule of the at least one code is not limited in the present invention.
(2) The first device encodes the at least one frame of raw video image obtained in step 401, generating at least one first data packet. This process is similar to step 404 above and is not repeated here.
(3) The first device inserts, based on the at least one first data packet, one second data packet after every preset number of first data packets, generating the video data stream. Specifically, based on the at least one second data packet obtained in step (1) and the at least one first data packet obtained in step (2), the first device may insert one second data packet at the last position of every preset number of first data packets, so that every preset number of first data packets carries one second data packet, where the preset number may be any positive integer set by the first device. Of course, some first data packets may also be set not to carry a second data packet. The specific value of the preset number, and which first data packets carry second data packets, are not limited in this embodiment of the present invention.
Based on the above process, after the first device has inserted each generated second data packet at the corresponding position after every preset number of first data packets, the first device may splice and packetize the at least one second data packet and the at least one first data packet through a series of processes, ultimately generating the corresponding video data stream, which then carries the corresponding second data packets. During encoding, the first device may allocate more bitrate to the target areas in the at least one frame of raw video image, so that the encoding quality of the target areas is higher.
The above steps (1) to (3) constitute the process in which the first device generates the corresponding at least one second data packet based on the target area information of the at least one frame of raw video image and inserts the at least one second data packet into the at least one first data packet generated from the at least one frame of raw video image, so that the generated video data stream finally carries the corresponding target area information.
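Steps (1) to (3) above amount to interleaving one second data packet after every preset number of first data packets; a minimal sketch, with the packet contents abstracted as byte strings:

```python
def interleave(first_packets: list[bytes], second_packets: list[bytes],
               preset_number: int) -> list[bytes]:
    """After every `preset_number` first data packets, insert the next
    second data packet (if any remain), mirroring steps (1)-(3)."""
    stream, si = [], 0
    for i, pkt in enumerate(first_packets, start=1):
        stream.append(pkt)
        if i % preset_number == 0 and si < len(second_packets):
            stream.append(second_packets[si])
            si += 1
    return stream
```

With preset number 2, two second packets are spread across five first packets, and any trailing first packets simply carry no second packet, as the description allows.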
In other embodiments, besides the above two methods, the generated video data stream may also be made to carry the corresponding target area information in other manners, which are not enumerated here one by one in the embodiments of the present invention.
It should be noted that the process of steps 403 to 405 encodes the target area information of the at least one frame of raw video image together with the at least one frame of raw video image, generating at least one first data packet that carries at least one target area identifier. In this process, the first device may generate the at least one first data packet based on the at least one frame of raw video image and its corresponding target area information at the same time; that is, the first device may insert the at least one target area identifier into the at least one code corresponding to the at least one frame of raw video image, so that the first device can packetize the at least one code and the at least one target area identifier together to generate the at least one first data packet. The specific manner of generating the at least one first data packet is not limited in this embodiment of the present invention.
406. The first device sends the video data stream to the second device.
In this embodiment of the present invention, as shown in Fig. 5, based on the video data stream obtained in step 405, the first device may send the video data stream to any second device. The first device may transmit the video data stream to the corresponding second device through a corresponding transmission system, which may be the Internet, terrestrial wireless broadcasting, satellite, or the like. Transmitting data in the form of a video data stream makes the data faster to transmit and more convenient to store, lightening the burden on the transmission system.
It should be noted that the second device may have a storage function, a decoding function, and a re-encoding function. The second device may be a terminal, in which case the terminal may decode and re-encode the video data stream through an application program having the decoding and re-encoding functions. The second device may also be a server, in which case the server may obtain the corresponding video data stream in real time and process the received video data stream in real time through the decoding and re-encoding processes on the server. The specific form of the second device is not limited in this embodiment of the present invention.
407. The second device receives the video data stream, which carries the target area information of the at least one frame of raw video image.
In this embodiment of the present invention, it can be seen from steps 401 to 405 that, in the process of encoding the at least one frame of raw video image, the first device also encodes the target area information extracted from the at least one frame of raw video image into the corresponding video data stream, so that the generated video data stream carries the target area information of the at least one frame of raw video image. Therefore, while receiving the video data stream from the first device, the second device also receives the target area information of the at least one frame of raw video image merged into the video data stream.
It should be noted that the second device may receive the video data stream in real time; that is, the second device may process the received video data stream synchronously while receiving it. Of course, the second device may also first receive all the video data streams sent by the first device and then process them accordingly; this is not limited in this embodiment of the present invention.
408. The second device decodes the at least one target area identifier in the video data stream, obtaining the target area information of the at least one frame of raw video image.
In this embodiment of the present invention, as shown in Fig. 5, the second device may transcode the received video data stream based on its decoding function and re-encoding function, where transcoding refers to converting the video data stream generated by the first device into another video data stream, so as to adapt to different network bandwidths, different terminal processing capabilities, different user demands, and so on. For example, the second device may transcode the above video data stream into a video data stream of a different video format, such as converting a video data stream in MPEG-2 (Moving Picture Experts Group) format into one in H.264 format; the second device may also change the bit rate of the video data stream received from the first device, so as to meet the playback demands of different devices; in addition, the second device may transcode the received video data stream so that the resolution of the video images corresponding to the video data stream changes before and after transcoding, for example converting high-definition video into standard-definition video. The specific use of the transcoding process is not limited in this embodiment of the present invention.
The essence of the above transcoding process is to first decode the received video data stream and then re-encode the decoded data. As can be seen from steps 403 to 405, the video data stream received by the second device includes both the data obtained by encoding the at least one frame of raw video image and the data obtained by encoding the corresponding target area information. Therefore, the second device can extract the corresponding target area information based on the video data stream, where extracting the target area information is a process of decoding the video data stream, that is, a process of decompressing the relevant data in the video data stream.
Corresponding to step 405, in one embodiment, the video data stream received by the second device may include at least one target area identifier, each obtained by compressing the target area information of the corresponding raw video image. Therefore, when the second device needs the corresponding target area information, it can decode the at least one target area identifier in the video data stream to extract the required target area information.
Specifically, each first data packet in the above video data stream includes at least one field, and the at least one field includes a header part and a body part, where the header may be the corresponding target area identifier. In the process of extracting the target area information of the at least one frame of raw video image from the video data stream, the second device may first extract the target area identifier corresponding to the header of the at least one field and then decode the target area identifier, so as to extract the target area information corresponding to each target area. The specific process by which the second device decodes the at least one target area identifier in the video data stream is not limited in this embodiment of the present invention.
The above process is described by taking as an example decoding the at least one target area identifier in the video data stream to extract the target area information of the corresponding at least one frame of raw video image. Another method of extracting the target area information of the at least one frame of raw video image from the video data stream is described below:
Corresponding to steps (1) to (3) in step 405, in one embodiment, the second device may, based on the at least one first data packet and the at least one second data packet in the video data stream, decode the second data packet after every preset number of first data packets, generating the target area information of the at least one frame of raw video image. Specifically, the video data stream may include at least one first data packet and at least one second data packet, where the at least one second data packet is obtained by encoding the target area information of the at least one frame of raw video image; the second device can obtain the required target area information by decoding the at least one second data packet.
The second device may inspect the above video data stream and detect the corresponding second data packet after every preset number of first data packets. Specifically, after every N first data packets, the second device may detect that the (N+1)-th data packet is a second data packet, where N may be any positive integer. Of course, the corresponding second data packet may also be located at another position among every preset number of first data packets, and the number of first data packets between every two second data packets detected by the second device may be any other quantity; this is not limited in this embodiment of the present invention. Based on its decoding function, the second device may decompress the second data packets, restoring them to the corresponding target area information and thereby achieving the purpose of extracting the target area information of the at least one frame of raw video image.
Based on the above process, the second device can extract the target area information carried in the video data stream more quickly, avoiding having to run the target area recognition algorithm on the video images again in subsequent processing to obtain the required target area information, which greatly reduces the data processing time and lowers the computational burden of the second device.
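The detection rule described above (after every N first data packets, the next packet is a second data packet) can be sketched as the inverse of the insertion in step (3); packet contents are abstracted as byte strings:

```python
def split_stream(stream: list[bytes], preset_number: int
                 ) -> tuple[list[bytes], list[bytes]]:
    """Walk the packet sequence: after every `preset_number` first data
    packets, the next packet is treated as a second data packet carrying
    encoded target area information; everything else is a first packet."""
    firsts, seconds, run = [], [], 0
    for pkt in stream:
        if run == preset_number:
            seconds.append(pkt)
            run = 0
        else:
            firsts.append(pkt)
            run += 1
    return firsts, seconds
```

Decompressing each entry of `seconds` would then yield the per-frame target area information, while `firsts` feeds the ordinary video decoding path.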
It should be noted that, besides the two methods introduced above by which the second device extracts the corresponding target area information from the received video data stream, the second device may also extract the corresponding target area information through other methods; the specific extraction method of the second device is not limited in this embodiment of the present invention.
409. The second device decodes the at least one first data packet in the video data stream, obtaining the video images corresponding to the at least one first data packet.
In this embodiment of the present invention, the video data stream is obtained by the first device encoding the at least one frame of raw video image it obtained. In the process of transcoding the received video data stream, the second device needs to decode the video data stream, restoring the at least one first data packet in the video data stream to the corresponding video images, and then process the corresponding video images based on parameters such as the resolution or format set by the second device, so as to obtain video images that meet the demand.
Specifically, corresponding to step 404, the second device may decode the at least one first data packet in the above video data stream through a corresponding decoding algorithm; for example, the second device may decode the at least one first data packet through the H.264 decoding algorithm. The second device may call the relevant functions of the decoding algorithm to obtain the packetization information of the video data stream, so as to read and parse the at least one first data packet in the video data stream, find the leading identifier of each first data packet, and then decode the data between every two leading identifiers, finally obtaining the video image corresponding to each piece of data. Based on the above process, the second device can successively restore the at least one first data packet in the video data stream to the corresponding at least one frame of video image, achieving the purpose of decoding the video data stream.
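The scan for leading identifiers described above resembles locating Annex B start codes in an H.264 byte stream; a minimal sketch assuming the three-byte start code 00 00 01 (the source does not commit to this exact framing):

```python
START_CODE = b"\x00\x00\x01"  # H.264 Annex B NAL unit start code

def split_nal_units(stream: bytes) -> list[bytes]:
    """Locate every leading identifier (start code) and return the data
    between consecutive ones -- one NAL unit payload per entry."""
    starts = []
    i = stream.find(START_CODE)
    while i != -1:
        starts.append(i)
        i = stream.find(START_CODE, i + len(START_CODE))
    units = []
    for n, s in enumerate(starts):
        end = starts[n + 1] if n + 1 < len(starts) else len(stream)
        units.append(stream[s + len(START_CODE):end])
    return units
```

Each returned payload would then be handed to the entropy decoder and reconstruction stages to recover the corresponding video image.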
The above steps 408 to 409 constitute the process in which the second device decodes the received video data stream and generates the video images corresponding to the video data stream; the process includes both decoding the video data stream to obtain the corresponding target area information and decoding the data in the video data stream to obtain the corresponding video images. Of course, in other embodiments, the second device may also decode the video data stream through other decoding algorithms; the specific process of decoding the video data stream is not limited in this embodiment of the present invention.
It should be noted that, in the process of decoding the at least one first data packet in steps 408 to 409, the second device may simultaneously obtain the video images corresponding to the at least one first data packet and the corresponding target area information. The order in which the second device obtains the above video images and their corresponding target area information is not limited in this embodiment of the present invention.
410. The second device re-encodes the video images corresponding to the video data stream based on the target area information of the at least one frame of raw video image, generating a target video data stream.
In this embodiment of the present invention, based on the target area information of the at least one frame of raw video image obtained in step 408 and the corresponding video images obtained by decoding the video data stream in step 409, the second device may re-encode the corresponding video images. The re-encoding is ROI (Region of Interest) encoding: according to the above target area information, during re-encoding, more bitrate is allocated to the target areas in the video images that correspond to the target area information, so as to generate a target video data stream of higher quality.
Specifically, similar to the encoding process in step 404, the second device may apply processes such as prediction, transform, quantization, and entropy coding to the video images corresponding to the above video data stream according to parameters such as the set target format or target resolution, so as to remove the redundant information of the video images; finally, the second device can compress the video images corresponding to the above video data stream into at least one target code corresponding to parameters such as the set target format or target resolution.
Based on the at least one target code obtained above, the second device may arrange the at least one target code according to a corresponding rule and packetize it, ultimately generating the target video data stream corresponding to parameters such as the set target format or target resolution, thereby realizing the transcoding of the video data stream corresponding to the at least one frame of raw video image.
It should be noted that, in the process of re-encoding the above video images, the second device may also re-encode based on other parameters; the specific re-encoding parameters and process of the second device are not limited in this embodiment of the present invention.
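One common way to allocate more bitrate to the target area during re-encoding is to lower the quantization parameter (QP) for macroblocks that overlap the ROI; a minimal sketch under that assumption (the 16x16 macroblock size matches H.264, but the QP values and offsets are illustrative, not from the source):

```python
def qp_map(roi_mask, base_qp=30, roi_offset=-6, bg_offset=+2,
           block=16):
    """Per-macroblock QP map: macroblocks overlapping the ROI get a
    lower QP (finer quantization, more bits); background macroblocks
    get a higher QP. roi_mask is an HxW grid of 0/1 pixel flags."""
    h = (len(roi_mask) + block - 1) // block
    w = (len(roi_mask[0]) + block - 1) // block
    qps = [[base_qp + bg_offset] * w for _ in range(h)]
    for by in range(h):
        for bx in range(w):
            rows = roi_mask[by * block:(by + 1) * block]
            if any(any(r[bx * block:(bx + 1) * block]) for r in rows):
                qps[by][bx] = base_qp + roi_offset
    return qps
```

Feeding such a map into the quantization stage of the re-encoder realizes the "more bitrate to the target area" behavior described above without touching prediction, transform, or entropy coding.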
The above steps 407 to 410 constitute the process in which the second device transcodes the received video data stream. As shown in Fig. 5, during the transcoding, the second device can extract the corresponding target area information directly from the video data stream, avoiding the process of rerunning the target area recognition algorithm and greatly improving system performance. Of course, besides the transcoding process mentioned above, the second device may also implement transcoding through other methods, as long as the second device can extract the corresponding target area information directly from the video data stream; this is not limited in this embodiment of the present invention.
In the embodiments of the present invention, the first device obtains the target area information of at least one frame of original video image and, when generating the video data stream, makes the generated video data stream carry the corresponding target area information, so that after the second device receives the video data stream it can extract the required target area information directly from the video data stream. This spares the second device the complex process of deriving the target area information from the relevant video images again, greatly reducing the data processing time and the system load.
The above embodiment can be applied to a live video streaming scenario. Specifically, in live streaming, the streaming client can acquire at least one frame of original video image in real time through the camera of the terminal; the terminal can perform target area recognition on the at least one frame of original video image and encode it based on the obtained target area information. The terminal can send the video data stream generated by the above encoding to a server; the server can decode the video data stream based on the target area information carried in it, obtain the corresponding video images and the target area information they carry, and re-encode the video images, thereby transcoding the video data stream so that the video resolution, video format, or the like of the target video data stream generated by transcoding is changed to suit the different demands of users. The server can also send the format-converted target video data stream to other terminals, to suit the video playback and processing capabilities of different terminals. Besides the above live streaming scenario, the transcoding process can also be applied to other scenarios; the embodiments of the present invention do not limit the specific use of the transcoding process here.
All of the above optional technical solutions can be combined in any manner to form optional embodiments of the present invention, which are not described here one by one.
Fig. 6 is a flowchart of a video data processing method provided by an embodiment of the present invention. The video data processing method is described through the interaction between a first device and a second device, where the first device has an encoding function and the second device has a stream-mixing function. Referring to Fig. 6, the method includes:
601. The first device obtains at least one frame of original video image.
602. Based on the at least one frame of original video image, the first device obtains the target area information of the at least one frame of original video image.
603. The first device encodes the target area information of the at least one frame of original video image to generate at least one target area identifier.
604. The first device encodes the at least one frame of original video image to generate at least one first data packet.
605. The first device inserts the corresponding at least one target area identifier into the at least one first data packet to generate the video data stream.
606. The first device sends the video data stream to the second device.
In the embodiments of the present invention, as shown in Fig. 7, steps 601 to 606 are similar to steps 401 to 406, and the embodiments of the present invention do not describe them again here.
607. The second device receives at least two video data streams, each video data stream carrying the target area information of at least one frame of original video image.
In the embodiments of the present invention, as shown in Fig. 7, the second device can have a storage function, a decoding function, a merging function and a re-encoding function. The second device can receive at least two video data streams from at least one first device, so as to perform stream-mixing processing on the received at least two video data streams. Stream-mixing processing refers to merging the video images corresponding to the at least two video data streams from different sources, finally combining the at least two video data streams into a single video data stream so as to meet the demands of users. In essence, stream-mixing processing is the process of decoding, merging and re-encoding the at least two video data streams.
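The decode-merge-re-encode essence of stream-mixing described above can be sketched as a simple pipeline. This is an illustrative skeleton only; `decode`, `merge` and `encode` are placeholders standing in for the real codec operations, which the patent does not specify at this level.

```python
# Hypothetical sketch of the stream-mixing pipeline: decode each input stream,
# merge the decoded frames position by position, re-encode into one stream.
def mix_streams(streams, decode, merge, encode):
    """streams: at least two input video data streams."""
    frame_lists = [decode(s) for s in streams]                 # decode each stream
    # zip pairs up the frames at the same position in every stream
    merged = [merge(frames) for frames in zip(*frame_lists)]   # merge frame-by-frame
    return encode(merged)                                      # re-encode into one stream
```

With toy stand-ins, `mix_streams(["ab", "cd"], list, "".join, "|".join)` would pair up `'a'` with `'c'` and `'b'` with `'d'` before "encoding" the result, mirroring how frames at the same position in each stream are merged.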
The second device can be a server with a stream-mixing function. The server can receive at least two video data streams from different multimedia clients, decode, merge and re-encode the at least two video data streams, and thereby mix them into a single target video data stream. Of course, the second device can also be a terminal, which can receive at least two video data streams sent by any other devices and merge them into a single target video data stream. The embodiments of the present invention do not limit the specific form of the second device here.
The second device may receive at least two video data streams from different first devices in real time and perform stream-mixing processing on them synchronously; that is, the second device can perform stream-mixing processing on the video data streams already received while continuing to receive video data streams from different sources. Of course, the second device may also first receive all the video data streams from the different sources and then perform stream-mixing processing based on all the received video data streams. The embodiments of the present invention do not limit the order in which the second device receives the video data streams and performs the stream-mixing processing.
It should be noted that one and the same second device may have both the stream-mixing function and the transcoding function. For example, a single second device may contain both a mixing system and a transcoding system, where the mixing system can perform stream-mixing processing on the received at least two video data streams and the transcoding system can perform transcoding processing on each received video data stream. Of course, the mixing system and the transcoding system may also be located in different second devices, where the second device with the mixing system can perform stream-mixing processing on the received at least two video data streams, and the second device with the transcoding system can perform transcoding processing on each received video data stream. The embodiments of the present invention do not limit whether a second device has both the mixing function and the transcoding function.
608. The second device decodes the at least one target area identifier in each video data stream to obtain the target area information of the at least one frame of original video image in the at least two video data streams.
609. The second device decodes the at least one first data packet in each video data stream to obtain the video image corresponding to the at least one first data packet in the at least two video data streams.
In steps 608 and 609 the second device needs to process all the received video data streams accordingly, as shown in Fig. 7, where the processing of each video data stream is similar to steps 408 and 409, and the embodiments of the present invention do not describe it again here.
610. The second device merges the video images corresponding to the at least two video data streams to generate a target video image.
In the embodiments of the present invention, as shown in Fig. 7, based on the video images corresponding to each of the at least two video data streams obtained in step 609, the second device can combine the video images corresponding to the at least two video data streams through a corresponding merging function, so that the video images of the at least two video data streams are combined into a whole; that is, a corresponding target video image is generated based on the at least one frame of video image.
Specifically, based on the at least one frame of video image corresponding to the at least two video data streams, the second device can, starting from the first video image of each video data stream, correspondingly merge the video images at the same position in the at least two video data streams. In addition, the second device can also correspondingly merge every N corresponding video images in the at least two video data streams, where N is a positive integer. Further, the second device can merge the at least one frame of video image corresponding to the at least two video data streams left-right or top-bottom, and it can also merge the at least one frame of video image in a "large frame with small frame" manner, so that the generated target video image takes a "picture-in-picture" form. Of course, besides the above merging methods, the second device can also merge the video images corresponding to the at least two video data streams in other ways to generate the target video image; the embodiments of the present invention do not limit the specific manner in which the second device generates the target video image.
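The two layouts mentioned above, side-by-side merging and "picture-in-picture", can be sketched on frames represented as 2D lists of pixel values. This representation and the function names are assumptions for illustration; a real implementation would operate on decoded frame buffers.

```python
# Hypothetical sketch of two of the merge layouts described above.
def merge_side_by_side(left, right):
    """Place two equal-height frames next to each other (left-right merge)."""
    return [lr + rr for lr, rr in zip(left, right)]

def merge_picture_in_picture(big, small, top=0, left=0):
    """Overlay a small frame onto a big one, yielding a 'picture-in-picture'."""
    out = [row[:] for row in big]            # copy so the big frame is untouched
    for r, row in enumerate(small):
        out[top + r][left:left + len(row)] = row
    return out
```

A top-bottom merge would simply concatenate the row lists of the two frames instead.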
611. Based on the target area information corresponding to the at least two video data streams, the second device re-encodes the target video image to generate a target video data stream.
In the embodiments of the present invention, as shown in Fig. 7, step 611 is similar to step 410 described above, and the embodiments of the present invention do not describe it again here.
In the embodiments of the present invention, the first device obtains the target area information of at least one frame of original video image and, when generating the video data stream, makes the generated video data stream carry the corresponding target area information, so that after the second device receives the video data stream it can extract the required target area information directly from the video data stream. This spares the second device the complex process of deriving the target area information from the relevant video images again, greatly reducing the data processing time and the system load.
The above embodiment can be applied to a live video streaming scenario. Specifically, during live streaming, stream-mixing processing can be applied to processes such as video interaction between a streamer and other users. In this process, the server can receive the video data streams sent by different multimedia clients and perform the above stream-mixing processing on the received video data streams from different sources, so that the video data streams from the different sources are merged into a single target video data stream. Besides the above live streaming scenario, the stream-mixing process can also be applied to other scenarios; the embodiments of the present invention do not limit the specific use of the stream-mixing processing here.
All of the above optional technical solutions can be combined in any manner to form optional embodiments of the present invention, which are not described here one by one.
Fig. 8 is a schematic structural diagram of a video data processing apparatus provided by an embodiment of the present invention. Referring to Fig. 8, the apparatus includes: an obtaining module 801, a generation module 802 and a sending module 803.
The obtaining module 801 is configured to obtain at least one frame of original video image;
the obtaining module 801 is also configured to obtain the target area information of the at least one frame of original video image based on the at least one frame of original video image;
the generation module 802 is configured to encode the at least one frame of original video image based on the target area information of the at least one frame of original video image to generate a video data stream, the video data stream carrying the target area information of the at least one frame of original video image;
the sending module 803 is configured to send the video data stream to a second device.
In some embodiments, the generation module 802 is configured to:
encode the target area information of the at least one frame of original video image and the at least one frame of original video image to generate at least one first data packet carrying at least one target area identifier, the at least one target area identifier being obtained by encoding for the at least one frame of original video image;
generate the video data stream based on the at least one first data packet carrying the at least one target area identifier.
In some embodiments, the generation module 802 is configured to:
encode the target area information of the at least one frame of original video image to generate at least one second data packet;
encode the at least one frame of original video image to generate at least one first data packet;
insert one second data packet after every preset number of first data packets to generate the video data stream.
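The packet layout just described, one second data packet (carrying the target area information) after every preset number of first data packets, can be sketched as a simple interleaving. The packet contents below are placeholders; a real stream would use the codec's binary packet syntax.

```python
# Hypothetical sketch: interleave area-information packets ("second data
# packets") into the coded-video packets ("first data packets") so that one
# second packet follows every `preset` first packets.
def interleave_packets(first_packets, second_packets, preset):
    stream, second = [], iter(second_packets)
    for i, pkt in enumerate(first_packets, 1):
        stream.append(pkt)
        if i % preset == 0:
            stream.append(next(second, None))  # insert an area-info packet
    # Drop the None fillers if the second packets run out early
    return [p for p in stream if p is not None]

stream = interleave_packets(["f1", "f2", "f3", "f4"], ["roi1", "roi2"], 2)
```

On the receiving side, a decoder can locate the area information at fixed intervals without scanning the whole stream, which is the point of the "every preset number" layout.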
In the embodiments of the present invention, the first device obtains the target area information of at least one frame of original video image and, when generating the video data stream, makes the generated video data stream carry the corresponding target area information, so that after the second device receives the video data stream it can extract the required target area information directly from the video data stream. This spares the second device the complex process of deriving the target area information from the relevant video images again, greatly reducing the data processing time and the system load.
It should be understood that when the video data processing apparatus provided by the above embodiment processes video data, the division into the above functional modules is only an example; in practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the video data processing apparatus provided by the above embodiment and the embodiments of the video data processing method belong to the same concept; for the specific implementation process, refer to the method embodiments, which are not described again here.
Fig. 9 is a schematic structural diagram of a video data processing apparatus provided by an embodiment of the present invention. Referring to Fig. 9, the apparatus includes: a receiving module 901, an extraction module 902, a decoding module 903 and a re-encoding module 904.
The receiving module 901 is configured to receive a video data stream, the video data stream carrying the target area information of at least one frame of original video image;
the extraction module 902 is configured to extract the target area information of the at least one frame of original video image based on the video data stream;
the decoding module 903 is configured to decode the video data stream to generate the video image corresponding to the video data stream;
the re-encoding module 904 is configured to re-encode the video image corresponding to the video data stream based on the target area information of the at least one frame of original video image and a target bitrate, to generate a target video data stream.
In some embodiments, the extraction module 902 is configured to:
extract at least one target area identifier based on at least one field of at least one first data packet in the video data stream;
decode the at least one target area identifier to obtain the target area information of the at least one frame of original video image.
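Extracting the target area identifier from a field of a first data packet, as the extraction module does above, can be sketched as follows. The packet layout (a dict with an assumed `"sei"` field) is purely illustrative; a real stream would carry the identifier in a binary syntax element of the packet.

```python
# Hypothetical sketch: collect target-area identifiers carried in a field of
# the first data packets. The "sei" field name is an assumption.
def extract_area_ids(packets, field="sei"):
    """Return the identifiers from the packets that carry one."""
    return [p[field] for p in packets if field in p]

ids = extract_area_ids([{"data": b"frame-bytes", "sei": 7},
                        {"data": b"frame-bytes"}])
```

Because the identifier sits in a packet field, the receiver reads it while demultiplexing, without decoding the video payload first, which is why the embodiment can skip rerunning target area recognition.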
In some embodiments, the extraction module 902 is configured to:
based on the at least one first data packet and at least one second data packet in the video data stream, decode the second data packet that follows every preset number of first data packets, to generate the target area information of the at least one frame of original video image.
In the embodiments of the present invention, the first device obtains the target area information of at least one frame of original video image and, when generating the video data stream, makes the generated video data stream carry the corresponding target area information, so that after the second device receives the video data stream it can extract the required target area information directly from the video data stream. This spares the second device the complex process of deriving the target area information from the relevant video images again, greatly reducing the data processing time and the system load.
It should be understood that when the video data processing apparatus provided by the above embodiment processes video data, the division into the above functional modules is only an example; in practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the video data processing apparatus provided by the above embodiment and the embodiments of the video data processing method belong to the same concept; for the specific implementation process, refer to the method embodiments, which are not described again here.
Fig. 10 is a schematic structural diagram of a video data processing apparatus provided by an embodiment of the present invention. Referring to Fig. 10, the apparatus includes: a receiving module 1001, an extraction module 1002, a decoding module 1003, a merging module 1004 and a re-encoding module 1005.
The receiving module 1001 is configured to receive at least two video data streams, each video data stream carrying the target area information of at least one frame of original video image;
the extraction module 1002 is configured to extract the target area information of the at least one frame of original video image corresponding to each video data stream, based on the at least two video data streams;
the decoding module 1003 is configured to decode each video data stream to generate the video images corresponding to the at least two video data streams;
the merging module 1004 is configured to merge the video images corresponding to the at least two video data streams to generate a target video image;
the re-encoding module 1005 is configured to re-encode the target video image based on the target area information corresponding to the at least two video data streams, to generate a target video data stream.
In some embodiments, the extraction module 1002 is configured to:
extract the at least one target area identifier corresponding to the at least two video data streams, based on at least one field of at least one first data packet in each video data stream;
decode the at least one target area identifier corresponding to each video data stream to obtain the target area information of the at least one frame of original video image corresponding to the at least two video data streams.
In some embodiments, the extraction module 1002 is configured to:
based on the at least one first data packet and at least one second data packet in each video data stream, decode the second data packet that follows every preset number of first data packets, to generate the target area information of the at least one frame of original video image in the at least two video data streams.
In the embodiments of the present invention, the first device obtains the target area information of at least one frame of original video image and, when generating the video data stream, makes the generated video data stream carry the corresponding target area information, so that after the second device receives the video data stream it can extract the required target area information directly from the video data stream. This spares the second device the complex process of deriving the target area information from the relevant video images again, greatly reducing the data processing time and the system load.
It should be understood that when the video data processing apparatus provided by the above embodiment processes video data, the division into the above functional modules is only an example; in practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the video data processing apparatus provided by the above embodiment and the embodiments of the video data processing method belong to the same concept; for the specific implementation process, refer to the method embodiments, which are not described again here.
Fig. 11 is a structural block diagram of a terminal 1100 provided by an embodiment of the present invention. The terminal 1100 may be: a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a laptop computer or a desktop computer. The terminal 1100 may also be called user equipment, a portable terminal, a laptop terminal, a desktop terminal, or by other names.
In general, the terminal 1100 includes: a processor 1101 and a memory 1102.
The processor 1101 may include one or more processing cores, for example a 4-core processor or an 8-core processor. The processor 1101 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array) and PLA (Programmable Logic Array). The processor 1101 may also include a main processor and a coprocessor; the main processor is a processor for processing data in the awake state, also called a CPU (Central Processing Unit), and the coprocessor is a low-power processor for processing data in the standby state. In some embodiments, the processor 1101 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 1101 may also include an AI (Artificial Intelligence) processor for handling computing operations related to machine learning.
The memory 1102 may include one or more computer-readable storage media, which may be non-transitory. The memory 1102 may also include a high-speed random access memory and a non-volatile memory, such as one or more disk storage devices or flash storage devices. In some embodiments, the non-transitory computer-readable storage medium in the memory 1102 is used to store at least one instruction, the at least one instruction being executed by the processor 1101 to implement the video data processing method provided by the method embodiments of the present invention.
In some embodiments, the terminal 1100 optionally further includes: a peripheral device interface 1103 and at least one peripheral device. The processor 1101, the memory 1102 and the peripheral device interface 1103 may be connected by buses or signal lines. Each peripheral device may be connected to the peripheral device interface 1103 by a bus, a signal line or a circuit board. Specifically, the peripheral devices include at least one of: a radio frequency circuit 1104, a touch display screen 1105, a camera 1106, an audio circuit 1107, a positioning component 1108 and a power supply 1109.
The peripheral device interface 1103 can be used to connect at least one I/O (Input/Output)-related peripheral device to the processor 1101 and the memory 1102. In some embodiments, the processor 1101, the memory 1102 and the peripheral device interface 1103 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1101, the memory 1102 and the peripheral device interface 1103 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 1104 is used to receive and transmit RF (Radio Frequency) signals, also called electromagnetic signals. The radio frequency circuit 1104 communicates with communication networks and other communication devices through electromagnetic signals. The radio frequency circuit 1104 converts electrical signals into electromagnetic signals for transmission, or converts received electromagnetic signals into electrical signals. Optionally, the radio frequency circuit 1104 includes: an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card and the like. The radio frequency circuit 1104 can communicate with other terminals through at least one wireless communication protocol. The wireless communication protocol includes but is not limited to: metropolitan area networks, the mobile communication networks of each generation (2G, 3G, 4G and 5G), wireless local area networks and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 1104 may also include an NFC (Near Field Communication) related circuit, which is not limited in the present invention.
The display screen 1105 is used to display a UI (User Interface). The UI may include graphics, text, icons, video and any combination thereof. When the display screen 1105 is a touch display screen, the display screen 1105 also has the ability to acquire touch signals on or above the surface of the display screen 1105. The touch signal can be input to the processor 1101 as a control signal for processing. At this time, the display screen 1105 can also be used to provide virtual buttons and/or a virtual keyboard, also called soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 1105, arranged on the front panel of the terminal 1100; in other embodiments, there may be at least two display screens 1105, arranged respectively on different surfaces of the terminal 1100 or in a folding design; in still other embodiments, the display screen 1105 may be a flexible display screen arranged on a curved or folding surface of the terminal 1100. The display screen 1105 may even be set to a non-rectangular irregular shape, that is, a shaped screen. The display screen 1105 may be made of materials such as LCD (Liquid Crystal Display) and OLED (Organic Light-Emitting Diode).
The camera component 1106 is used to capture images or video. Optionally, the camera component 1106 includes a front camera and a rear camera. In general, the front camera is arranged on the front panel of the terminal and the rear camera is arranged on the back of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blurring function, the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting functions, or other fused shooting functions are realized. In some embodiments, the camera component 1106 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash refers to the combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 1107 may include a microphone and a speaker. The microphone is used to collect sound waves from the user and the environment, convert the sound waves into electrical signals and input them to the processor 1101 for processing, or input them to the radio frequency circuit 1104 to realize voice communication. For the purpose of stereo collection or noise reduction, there may be multiple microphones, arranged respectively at different parts of the terminal 1100. The microphone may also be an array microphone or an omnidirectional collection microphone. The speaker is used to convert electrical signals from the processor 1101 or the radio frequency circuit 1104 into sound waves. The speaker may be a traditional thin-film speaker or a piezoelectric ceramic speaker. When the speaker is a piezoelectric ceramic speaker, it can not only convert electrical signals into sound waves audible to humans, but also convert electrical signals into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 1107 may also include a headphone jack.
The positioning component 1108 is used to locate the current geographic position of the terminal 1100 to implement navigation or LBS (Location Based Service). The positioning component 1108 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, the GLONASS system of Russia or the Galileo system of the European Union.
The power supply 1109 is used to supply power to the various components in the terminal 1100. The power supply 1109 may be an alternating current, a direct current, a disposable battery or a rechargeable battery. When the power supply 1109 includes a rechargeable battery, the rechargeable battery can support wired charging or wireless charging. The rechargeable battery can also be used to support fast charging technology.
In some embodiments, the terminal 1100 further includes one or more sensors 1110. The one or more sensors 1110 include but are not limited to: an acceleration sensor 1111, a gyroscope sensor 1112, a pressure sensor 1113, a fingerprint sensor 1114, an optical sensor 1115 and a proximity sensor 1116.
The acceleration sensor 1111 can detect the magnitude of acceleration on the three coordinate axes of the coordinate system established with the terminal 1100. For example, the acceleration sensor 1111 can be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 1101 can control the touch display screen 1105 to display the user interface in a landscape view or a portrait view according to the gravitational acceleration signal collected by the acceleration sensor 1111. The acceleration sensor 1111 can also be used to collect game or user motion data.
The gyroscope sensor 1112 can detect the body direction and rotation angle of the terminal 1100, and the gyroscope sensor 1112 can cooperate with the acceleration sensor 1111 to collect the user's 3D actions on the terminal 1100. Based on the data collected by the gyroscope sensor 1112, the processor 1101 can implement the following functions: motion sensing (for example, changing the UI according to the user's tilt operation), image stabilization during shooting, game control and inertial navigation.
The pressure sensor 1113 may be disposed on a side frame of the terminal 1100 and/or a lower layer of the touch display screen 1105. When the pressure sensor 1113 is disposed on the side frame of the terminal 1100, the user's gripping signal on the terminal 1100 may be detected, and the processor 1101 performs left/right-hand recognition or a shortcut operation according to the gripping signal acquired by the pressure sensor 1113. When the pressure sensor 1113 is disposed on the lower layer of the touch display screen 1105, the processor 1101 controls an operable control on the UI interface according to the user's pressure operation on the touch display screen 1105. The operable controls include at least one of a button control, a scroll-bar control, an icon control, and a menu control.
The fingerprint sensor 1114 is used to acquire the user's fingerprint: either the processor 1101 identifies the user's identity according to the fingerprint acquired by the fingerprint sensor 1114, or the fingerprint sensor 1114 itself identifies the user's identity according to the acquired fingerprint. When the identified identity is a trusted identity, the processor 1101 authorizes the user to perform relevant sensitive operations, including unlocking the screen, viewing encrypted information, downloading software, making payments, changing settings, and the like. The fingerprint sensor 1114 may be disposed on the front, the back, or a side of the terminal 1100. When a physical button or a manufacturer logo is provided on the terminal 1100, the fingerprint sensor 1114 may be integrated with the physical button or the manufacturer logo.
The optical sensor 1115 is used to acquire the ambient light intensity. In one embodiment, the processor 1101 may control the display brightness of the touch display screen 1105 according to the ambient light intensity acquired by the optical sensor 1115. Specifically, when the ambient light intensity is high, the display brightness of the touch display screen 1105 is turned up; when the ambient light intensity is low, the display brightness of the touch display screen 1105 is turned down. In another embodiment, the processor 1101 may also dynamically adjust the shooting parameters of the camera assembly 1106 according to the ambient light intensity acquired by the optical sensor 1115.
The proximity sensor 1116, also referred to as a distance sensor, is generally disposed on the front panel of the terminal 1100. The proximity sensor 1116 is used to acquire the distance between the user and the front of the terminal 1100. In one embodiment, when the proximity sensor 1116 detects that the distance between the user and the front of the terminal 1100 gradually decreases, the processor 1101 controls the touch display screen 1105 to switch from the screen-on state to the screen-off state; when the proximity sensor 1116 detects that the distance between the user and the front of the terminal 1100 gradually increases, the processor 1101 controls the touch display screen 1105 to switch from the screen-off state to the screen-on state.
Those skilled in the art will understand that the structure shown in Figure 11 does not constitute a limitation on the terminal 1100, which may include more or fewer components than illustrated, combine certain components, or adopt a different component arrangement.
Figure 12 is a structural schematic diagram of a server provided by an embodiment of the present invention. The server 1200 may vary considerably depending on configuration or performance, and may include one or more processors (central processing units, CPU) 1201 and one or more memories 1202, where the memory 1202 stores at least one instruction, and the at least one instruction is loaded and executed by the processor 1201 to implement the video data processing method provided by each of the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for performing input and output, and may further include other components for implementing device functions, which are not described in detail here.
In an exemplary embodiment, a computer-readable storage medium is further provided, for example a memory including instructions, where the above instructions may be executed by a processor in a terminal to complete the video data processing method in the above embodiments. For example, the computer-readable storage medium may be a read-only memory (ROM), a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above embodiments may be implemented by hardware, or by a program instructing the relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The above are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.
Claims (12)
1. A video data processing method, applied to a first device, the method comprising:
obtaining at least one frame of raw video image;
obtaining, based on the at least one frame of raw video image, target area information of the at least one frame of raw video image;
encoding the at least one frame of raw video image based on the target area information of the at least one frame of raw video image to generate a video data stream, the video data stream carrying the target area information of the at least one frame of raw video image; and
sending the video data stream to a second device.
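The first-device flow of claim 1 can be sketched as follows. This is a minimal illustration under two assumptions not made in the claim itself: that the target area information is a per-frame bounding box, and that the encoded stream can be modelled as a list of packets. Every name here (`detect_target_area`, `encode_with_target_areas`, the packet fields) is hypothetical, not the patent's actual implementation.

```python
def detect_target_area(frame):
    # Stand-in detector; a real system might run face or object detection.
    return {"x": 0, "y": 0, "w": 16, "h": 16}

def encode_with_target_areas(frames):
    """Encode each frame and attach its target-area info to the stream."""
    stream = []
    for index, frame in enumerate(frames):
        area = detect_target_area(frame)
        # One packet per frame, carrying both the payload and its area info.
        stream.append({"frame_index": index, "payload": frame, "area": area})
    return stream  # this stream would then be sent to the second device

stream = encode_with_target_areas([b"frame-0", b"frame-1"])
```

The point of the claim is that the area information travels inside the same stream as the video data, so the receiver never needs a side channel.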
2. The method according to claim 1, wherein the encoding the at least one frame of raw video image based on the target area information of the at least one frame of raw video image to generate a video data stream carrying the target area information of the at least one frame of raw video image comprises:
encoding the target area information of the at least one frame of raw video image together with the at least one frame of raw video image to generate at least one first data packet carrying at least one target area mark, the at least one target area mark being obtained by encoding the at least one frame of raw video image; and
generating the video data stream based on the at least one first data packet carrying the at least one target area mark.
3. The method according to claim 1, wherein the encoding the at least one frame of raw video image based on the target area information of the at least one frame of raw video image to generate a video data stream carrying the target area information of the at least one frame of raw video image comprises:
encoding the target area information of the at least one frame of raw video image to generate at least one second data packet;
encoding the at least one frame of raw video image to generate at least one first data packet; and
inserting one second data packet after every preset number of first data packets to generate the video data stream.
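The packet layout of claim 3 — one second (metadata) packet inserted after every `preset_number` first (video) packets — can be sketched as below. The string packet representation is an assumption for illustration only.

```python
def interleave_packets(first_packets, second_packets, preset_number):
    """Insert one second (metadata) packet after every `preset_number`
    first (video) packets, yielding the combined video data stream."""
    stream, metadata = [], iter(second_packets)
    for count, packet in enumerate(first_packets, start=1):
        stream.append(packet)
        if count % preset_number == 0:
            meta = next(metadata, None)  # tolerate fewer metadata packets
            if meta is not None:
                stream.append(meta)
    return stream

stream = interleave_packets(["v1", "v2", "v3", "v4"], ["m1", "m2"], 2)
# stream == ["v1", "v2", "m1", "v3", "v4", "m2"]
```

A fixed interleaving period like this lets the receiver locate the metadata packets by counting, without any extra signalling.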
4. A video data processing method, applied to a second device, the method comprising:
receiving a video data stream, the video data stream carrying target area information of at least one frame of raw video image;
extracting, based on the video data stream, the target area information of the at least one frame of raw video image;
decoding the video data stream to generate a video image corresponding to the video data stream; and
re-encoding the video image corresponding to the video data stream based on the target area information of the at least one frame of raw video image to generate a target video data stream.
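The second-device flow of claim 4 can be sketched as follows, assuming a hypothetical stream format of one packet per frame with an `area` field attached; the re-encode step is stubbed, since the claim does not fix a codec.

```python
def process_stream(stream):
    """Extract target-area info, decode the frames, then re-encode them
    using that info. The re-encode is stubbed as tagging each decoded
    frame with its region of interest."""
    areas = [packet["area"] for packet in stream]      # extraction step
    frames = [packet["payload"] for packet in stream]  # decode step (stub)
    # Re-encode: a real encoder might spend more bits inside each area.
    return [{"frame": f, "roi": a} for f, a in zip(frames, areas)]

target = process_stream([
    {"payload": b"frame-0", "area": (0, 0, 16, 16)},
    {"payload": b"frame-1", "area": (4, 4, 16, 16)},
])
```

The design choice the claim captures is that the receiver reuses the sender's area analysis instead of re-detecting regions of interest itself.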
5. The method according to claim 4, wherein the extracting, based on the video data stream, the target area information of the at least one frame of raw video image comprises:
extracting at least one target area mark based on at least one field of at least one first data packet in the video data stream; and
decoding the at least one target area mark to obtain the target area information of the at least one frame of raw video image.
6. The method according to claim 4, wherein the extracting, based on the video data stream, the target area information of the at least one frame of raw video image comprises:
based on at least one first data packet and at least one second data packet in the video data stream, decoding, after every preset number of first data packets, the second data packet that follows the preset number of first data packets, to generate the target area information of the at least one frame of raw video image.
7. A video data processing method, applied to a second device, the method comprising:
receiving at least two video data streams, each video data stream carrying target area information of at least one frame of raw video image;
extracting, based on the at least two video data streams, the target area information of the at least one frame of raw video image corresponding to each video data stream;
decoding each video data stream to obtain video images corresponding to the at least two video data streams;
merging the video images corresponding to the at least two video data streams to generate a target video image; and
re-encoding the target video image based on the target area information corresponding to the at least two video data streams to generate a target video data stream.
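The merge step of claim 7 can be sketched as follows. The claim does not specify how the images are combined; side-by-side stitching is one plausible choice, shown here with a frame modelled as a list of pixel rows. All names and the frame model are illustrative assumptions.

```python
def merge_streams(decoded_a, decoded_b):
    """Stitch each pair of decoded frames side by side into one target
    image, keeping both streams' target-area info for the re-encode."""
    merged = []
    for a, b in zip(decoded_a, decoded_b):
        # Concatenate rows horizontally to form the target video image.
        frame = [row_a + row_b for row_a, row_b in zip(a["frame"], b["frame"])]
        merged.append({"frame": frame, "areas": [a["area"], b["area"]]})
    return merged

merged = merge_streams(
    [{"frame": [[1, 2]], "area": (0, 0, 2, 1)}],
    [{"frame": [[3, 4]], "area": (0, 0, 2, 1)}],
)
# merged[0]["frame"] == [[1, 2, 3, 4]]
```

Note that in a real merge the second stream's area coordinates would need to be offset by the first frame's width; that bookkeeping is omitted here.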
8. The method according to claim 7, wherein the extracting, based on the at least two video data streams, the target area information of the at least one frame of raw video image corresponding to each video data stream comprises:
extracting, based on at least one field of at least one first data packet in each video data stream, at least one target area mark corresponding to the at least two video data streams; and
decoding the at least one target area mark corresponding to each video data stream to obtain the target area information of the at least one frame of raw video image corresponding to the at least two video data streams.
9. The method according to claim 7, wherein the extracting, based on the at least two video data streams, the target area information of the at least one frame of raw video image corresponding to each video data stream comprises:
based on at least one first data packet and at least one second data packet in each video data stream, decoding, after every preset number of first data packets, the second data packet that follows the preset number of first data packets, to generate the target area information of the at least one frame of raw video image in the at least two video data streams.
10. A terminal, comprising a processor and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor to implement the operations performed by the video data processing method according to any one of claims 1 to 9.
11. A server, comprising a processor and a memory, the memory storing at least one instruction, the instruction being loaded and executed by the processor to implement the operations performed by the video data processing method according to any one of claims 1 to 9.
12. A computer-readable storage medium, storing at least one instruction, the instruction being loaded and executed by a processor to implement the operations performed by the video data processing method according to any one of claims 1 to 9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811337105.0A CN109168032B (en) | 2018-11-12 | 2018-11-12 | Video data processing method, terminal, server and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811337105.0A CN109168032B (en) | 2018-11-12 | 2018-11-12 | Video data processing method, terminal, server and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109168032A true CN109168032A (en) | 2019-01-08 |
CN109168032B CN109168032B (en) | 2021-08-27 |
Family
ID=64877084
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811337105.0A Active CN109168032B (en) | 2018-11-12 | 2018-11-12 | Video data processing method, terminal, server and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109168032B (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110019953A (en) * | 2019-04-16 | 2019-07-16 | 中国科学院国家空间科学中心 | A kind of real-time quick look system of payload image data |
CN110602398A (en) * | 2019-09-17 | 2019-12-20 | 北京拙河科技有限公司 | Ultrahigh-definition video display method and device |
CN112468845A (en) * | 2020-11-16 | 2021-03-09 | 维沃移动通信有限公司 | Processing method and processing device for screen projection picture |
CN113096201A (en) * | 2021-03-30 | 2021-07-09 | 上海西井信息科技有限公司 | Embedded video image deep learning system, method, equipment and storage medium |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1291314A (en) * | 1998-03-20 | 2001-04-11 | 马里兰大学 | Method and apparatus for compressing and decompressing image |
CN101742289A (en) * | 2008-11-14 | 2010-06-16 | 北京中星微电子有限公司 | Method, system and device for compressing video code stream |
CN103024445A (en) * | 2012-12-13 | 2013-04-03 | 北京百度网讯科技有限公司 | Cloud video transcode method and cloud server |
CN104185078A (en) * | 2013-05-20 | 2014-12-03 | 华为技术有限公司 | Video monitoring processing method, device and system thereof |
CN104365095A (en) * | 2012-03-30 | 2015-02-18 | 阿尔卡特朗讯公司 | Method and apparatus for encoding a selected spatial portion of a video stream |
CN104427337A (en) * | 2013-08-21 | 2015-03-18 | 杭州海康威视数字技术股份有限公司 | Region of interest (ROI) video coding method and apparatus based on object detection |
WO2015041652A1 (en) * | 2013-09-19 | 2015-03-26 | Entropic Communications, Inc. | A progressive jpeg bitstream transcoder and decoder |
US20150365687A1 (en) * | 2013-01-18 | 2015-12-17 | Canon Kabushiki Kaisha | Method of displaying a region of interest in a video stream |
CN105493509A (en) * | 2013-08-12 | 2016-04-13 | 索尼公司 | Transmission apparatus, transmission method, reception apparatus, and reception method |
CN105898313A (en) * | 2014-12-15 | 2016-08-24 | 江南大学 | Novel video synopsis-based monitoring video scalable video coding technology |
CN105917649A (en) * | 2014-02-18 | 2016-08-31 | 英特尔公司 | Techniques for inclusion of region of interest indications in compressed video data |
CN107210041A (en) * | 2015-02-10 | 2017-09-26 | 索尼公司 | Dispensing device, sending method, reception device and method of reseptance |
CN108429921A (en) * | 2017-02-14 | 2018-08-21 | 北京金山云网络技术有限公司 | A kind of video coding-decoding method and device |
Patent Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1291314A (en) * | 1998-03-20 | 2001-04-11 | 马里兰大学 | Method and apparatus for compressing and decompressing image |
CN101742289A (en) * | 2008-11-14 | 2010-06-16 | 北京中星微电子有限公司 | Method, system and device for compressing video code stream |
CN104365095A (en) * | 2012-03-30 | 2015-02-18 | 阿尔卡特朗讯公司 | Method and apparatus for encoding a selected spatial portion of a video stream |
CN103024445A (en) * | 2012-12-13 | 2013-04-03 | 北京百度网讯科技有限公司 | Cloud video transcode method and cloud server |
US20150365687A1 (en) * | 2013-01-18 | 2015-12-17 | Canon Kabushiki Kaisha | Method of displaying a region of interest in a video stream |
CN104185078A (en) * | 2013-05-20 | 2014-12-03 | 华为技术有限公司 | Video monitoring processing method, device and system thereof |
CN105493509A (en) * | 2013-08-12 | 2016-04-13 | 索尼公司 | Transmission apparatus, transmission method, reception apparatus, and reception method |
CN104427337A (en) * | 2013-08-21 | 2015-03-18 | 杭州海康威视数字技术股份有限公司 | Region of interest (ROI) video coding method and apparatus based on object detection |
WO2015041652A1 (en) * | 2013-09-19 | 2015-03-26 | Entropic Communications, Inc. | A progressive jpeg bitstream transcoder and decoder |
CN105917649A (en) * | 2014-02-18 | 2016-08-31 | 英特尔公司 | Techniques for inclusion of region of interest indications in compressed video data |
CN105898313A (en) * | 2014-12-15 | 2016-08-24 | 江南大学 | Novel video synopsis-based monitoring video scalable video coding technology |
CN107210041A (en) * | 2015-02-10 | 2017-09-26 | 索尼公司 | Dispensing device, sending method, reception device and method of reseptance |
CN108429921A (en) * | 2017-02-14 | 2018-08-21 | 北京金山云网络技术有限公司 | A kind of video coding-decoding method and device |
Non-Patent Citations (1)
Title |
---|
王明慧: "基于H.264的感兴趣区域视频编码研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110019953A (en) * | 2019-04-16 | 2019-07-16 | 中国科学院国家空间科学中心 | A kind of real-time quick look system of payload image data |
CN110602398A (en) * | 2019-09-17 | 2019-12-20 | 北京拙河科技有限公司 | Ultrahigh-definition video display method and device |
CN112468845A (en) * | 2020-11-16 | 2021-03-09 | 维沃移动通信有限公司 | Processing method and processing device for screen projection picture |
CN113096201A (en) * | 2021-03-30 | 2021-07-09 | 上海西井信息科技有限公司 | Embedded video image deep learning system, method, equipment and storage medium |
CN113096201B (en) * | 2021-03-30 | 2023-04-18 | 上海西井信息科技有限公司 | Embedded video image deep learning method, equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN109168032B (en) | 2021-08-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109168032A (en) | Processing method, terminal, server and the storage medium of video data | |
CN104350745B (en) | 3D video coding based on panorama | |
JP7085014B2 (en) | Video coding methods and their devices, storage media, equipment, and computer programs | |
CN110213616B (en) | Video providing method, video obtaining method, video providing device, video obtaining device and video providing equipment | |
CN108810538A (en) | Method for video coding, device, terminal and storage medium | |
CN110062252A (en) | Live broadcasting method, device, terminal and storage medium | |
CN103460250A (en) | Object of interest based image processing | |
CN110062246B (en) | Method and device for processing video frame data | |
CN108966008A (en) | Live video back method and device | |
CN109120933A (en) | Dynamic adjusts method, apparatus, equipment and the storage medium of code rate | |
CN109285178A (en) | Image partition method, device and storage medium | |
CN108769826A (en) | Live media stream acquisition methods, device, terminal and storage medium | |
CN110493626A (en) | Video data handling procedure and device | |
CN110121084A (en) | The methods, devices and systems of port switching | |
CN108769738A (en) | Method for processing video frequency, device, computer equipment and storage medium | |
CN110149517A (en) | Method, apparatus, electronic equipment and the computer storage medium of video processing | |
CN111586413B (en) | Video adjusting method and device, computer equipment and storage medium | |
CN110572710B (en) | Video generation method, device, equipment and storage medium | |
CN110049326A (en) | Method for video coding and device, storage medium | |
CN111107357B (en) | Image processing method, device, system and storage medium | |
CN110177275B (en) | Video encoding method and apparatus, and storage medium | |
CN116703995B (en) | Video blurring processing method and device | |
CN111478915A (en) | Live broadcast data stream pushing method and device, terminal and storage medium | |
CN110087077A (en) | Method for video coding and device, storage medium | |
CN109714628A (en) | Method, apparatus, equipment, storage medium and the system of playing audio-video |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||