WO2011030680A1 - Dispositif, procédé, et logiciel de codage - Google Patents


Info

Publication number
WO2011030680A1
WO2011030680A1 (PCT/JP2010/064603)
Authority
WO
WIPO (PCT)
Prior art keywords
picture
encoding
reference picture
pictures
network
Prior art date
Application number
PCT/JP2010/064603
Other languages
English (en)
Japanese (ja)
Inventor
天野勝博
Original Assignee
ブラザー工業株式会社
Priority date
Filing date
Publication date
Application filed by ブラザー工業株式会社 filed Critical ブラザー工業株式会社
Publication of WO2011030680A1

Classifications

    • H: ELECTRICITY
      • H04: ELECTRIC COMMUNICATION TECHNIQUE
        • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
          • H04N7/00: Television systems
            • H04N7/14: Systems for two-way working
              • H04N7/141: Systems for two-way working between two video terminals, e.g. videophone
                • H04N7/147: Communication arrangements, e.g. identifying the communication as a video-communication, intermediate storage of the signals
              • H04N7/15: Conference systems
          • H04N19/00: Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
            • H04N19/10: using adaptive coding
              • H04N19/102: characterised by the element, parameter or selection affected or controlled by the adaptive coding
                • H04N19/103: Selection of coding mode or of prediction mode
                  • H04N19/107: Selection of coding mode or of prediction mode between spatial and temporal predictive coding, e.g. picture refresh
                • H04N19/115: Selection of the code volume for a coding unit prior to coding
                • H04N19/132: Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
              • H04N19/134: characterised by the element, parameter or criterion affecting or controlling the adaptive coding
                • H04N19/164: Feedback from the receiver or from the transmission channel
              • H04N19/169: characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
                • H04N19/17: the unit being an image region, e.g. an object
                  • H04N19/176: the region being a block, e.g. a macroblock

Definitions

  • the present invention relates to an encoding device, an encoding method, and an encoding program for compressing and encoding image data and outputting them to a network.
  • An object of the present invention is to provide an encoding device, an encoding method, and an encoding program that can change the bit rate of data output to a network by quickly following changes in the bandwidth of the network.
  • The encoding device encodes image data input by an image input means into reference pictures, each a picture that is referred to when another picture is decoded, and non-reference pictures, each a picture that is not referred to when any other picture is decoded, and outputs the result to a network. The device includes: detection means for detecting information on the bandwidth of the network; determination means for determining generation conditions for non-reference pictures based on the bandwidth information detected by the detection means; encoding means for encoding the image data under the generation conditions determined by the determination means; deletion means for deleting non-reference pictures from among the pictures encoded by the encoding means according to the bandwidth information detected by the detection means; and output means for outputting to the network the pictures remaining after the non-reference pictures have been deleted by the deletion means.
  • According to this configuration, the encoding apparatus can determine, from the network bandwidth information, the generation conditions for non-reference pictures, which have little influence on video quality even if deleted. The generated non-reference pictures can then be deleted, according to the bandwidth information, before they are output to the network. Therefore, the bit rate of the output data can be changed so as to quickly follow fluctuations in the network bandwidth, without the time delay that re-encoding or similar processing would cause.
  • the detection means may include first detection means for detecting a fluctuation amount of the network bandwidth as the bandwidth information.
  • The determining means may determine, as the generation condition, the ratio of the number of non-reference pictures to the number of pictures, reference and non-reference, generated within a predetermined period, based on the fluctuation amount detected by the first detection means.
  • The determination unit may determine, as the generation condition, the ratio of non-reference pictures generated by inter-frame encoding, that is, by encoding the prediction error with respect to another picture, to the number of pictures generated within the predetermined period.
  • An I picture (Intra-coded Picture) generated by intra-frame coding, a P picture (Predictive-coded Picture) generated by inter-frame coding, or a B picture (Bidirectional-coded Picture) generated by inter-frame coding may each be a non-reference picture.
  • The data size of a picture generated by inter-frame coding is smaller than the data size of an I picture generated by intra-frame coding. Therefore, when the determining unit determines the generation ratio of non-reference pictures generated by inter-frame coding, the data size of the generated pictures does not increase rapidly even when that ratio is changed, so the load on the network bandwidth does not increase.
  • the detection means may include second detection means for detecting the bandwidth of the network as the bandwidth information.
  • The deletion unit may delete non-reference pictures when the bandwidth detected by the second detection unit decreases. In this case, the encoding device can quickly reduce the bit rate of the output data when the bandwidth decreases, so congestion of the network can be appropriately prevented.
  • the encoding device may include storage control means for storing the picture encoded by the encoding means in a buffer for temporarily storing data.
  • the deletion unit may delete the non-reference picture stored in the buffer. In this case, since the encoding apparatus can appropriately delete the non-reference picture stored in the buffer, the deletion process can be easily performed.
  • the detection means may include third detection means for detecting a fluctuation amount of the free capacity of the buffer as the band information.
  • the determination unit may determine the generation condition based on the fluctuation amount of the free space of the buffer detected by the third detection unit.
  • the encoding apparatus can measure network bandwidth information including internal factors in the apparatus from the buffer, and determine the generation condition of the non-reference picture. Therefore, a non-reference picture can be generated more appropriately.
  • It is desirable that the determination means determine the generation conditions so as to minimize the deviation in the number of reference pictures located between one non-reference picture and the next among the plurality of repeatedly generated pictures. In this case, the non-reference pictures among consecutive pictures are generated in a distributed manner rather than being concentrated in one part of the sequence. Therefore, even when the encoding apparatus deletes non-reference pictures, the possibility that the video appears interrupted during reproduction can be reduced.
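The even-spacing rule above can be sketched in Python. This is an illustrative reconstruction, not code from the patent; the function name, the placement formula, and the representation of a GOP as a boolean list are assumptions.

```python
def place_non_reference_pictures(gop_size, num_non_ref):
    """Mark num_non_ref of gop_size picture slots as non-reference (False),
    spacing them as evenly as possible so that the number of reference
    pictures between consecutive non-reference pictures deviates minimally."""
    is_reference = [True] * gop_size
    for k in range(1, num_non_ref + 1):
        # Spread non-reference slots evenly over positions 1..gop_size-1;
        # the first picture (the I picture) always stays a reference picture.
        is_reference[(k * (gop_size - 1)) // num_non_ref] = False
    return is_reference
```

With a 10-picture GOP and 5 non-reference pictures this yields an alternating pattern, consistent with the arrangement described later for FIG. 5; with a single non-reference picture, only the last picture of the GOP is marked, as in the FIG. 4 case.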
  • the deleting means deletes a non-reference picture that is an I picture in preference to a non-reference picture that is a P picture.
  • The encoding method encodes image data input by an image input means into reference pictures, each a picture that is referred to when another picture is decoded, and non-reference pictures, each a picture that is not referred to when any other picture is decoded, and outputs the result to a network. The method includes a detection step of detecting information on the bandwidth of the network, a determination step of determining generation conditions for non-reference pictures based on the detected bandwidth information, an encoding step of encoding the image data under the determined generation conditions, a deletion step of deleting non-reference pictures from among the encoded pictures according to the detected bandwidth information, and an output step of outputting the remaining pictures to the network.
  • According to the encoding method, the generation conditions for non-reference pictures, which have little influence on video quality even if deleted, can be determined from the network bandwidth information. The generated non-reference pictures can then be deleted before output according to that information. Therefore, the bit rate of the output data can be changed so as to quickly follow fluctuations in the network bandwidth, without the time delay that re-encoding or similar processing would cause.
  • The encoding program causes a computer to encode image data input by an image input means into reference pictures, each a picture that is referred to when another picture is decoded, and non-reference pictures, each a picture that is not referred to when any other picture is decoded, to output the result to a network, and to execute the detection, determination, encoding, deletion, and output steps described above.
  • According to the encoding program, the generation conditions for non-reference pictures, which have little influence on video quality even if deleted, can be determined from the network bandwidth information. The generated non-reference pictures can then be deleted before output according to that information. Therefore, the bit rate of the output data can be changed so as to quickly follow fluctuations in the network bandwidth, without the time delay that re-encoding or similar processing would cause.
  • FIG. 2 is a block diagram showing an electrical configuration of the video conference apparatus 1.
  • A functional block diagram of the video conference apparatus 1 according to the first embodiment.
  • A flowchart of the main process performed by the video conference apparatus 1.
  • This is an example of a generation condition determined when the rate of variation is 0% and the number of pictures in a GOP is 10.
  • A video conference apparatus 1 according to a first embodiment, which embodies the encoding apparatus of the present invention, will be described with reference to the drawings (see FIG. 1).
  • The drawings referred to are used to explain technical features that the present invention can adopt.
  • The device configurations, the flowcharts of the various processes, and the like described in the drawings are merely illustrative examples and are not intended to be limiting.
  • the video conference apparatus 1 is connected to another video conference apparatus 1 via the network 8 (see FIG. 1).
  • Each video conference device 1 inputs and outputs image data and audio data.
  • Users at multiple sites can thereby share video and audio. Therefore, even when the users are not all at the same site, they can conduct the conference smoothly.
  • the video conference apparatus 1 includes a CPU 10 that controls the video conference apparatus 1.
  • a ROM 11, a RAM 12, a hard disk drive (hereinafter referred to as “HDD”) 13, and an input / output interface 19 are connected to the CPU 10 via a bus 18.
  • the ROM 11 stores a program for operating the video conference device 1 and initial values.
  • the RAM 12 temporarily stores various information used in the control program.
  • the HDD 13 is a non-volatile storage device that stores various types of information. Instead of the HDD 13, a storage device such as an EEPROM or a memory card may be used.
  • the input / output interface 19 is connected to an audio input processing unit 21, an audio output processing unit 22, a video input processing unit 23, a video output processing unit 24, an operation unit 25, and an external communication I / F 26.
  • The audio input processing unit 21 processes the input of audio data from the microphone 31.
  • the audio output processing unit 22 processes the operation of the speaker 32 that outputs audio.
  • the video input processing unit 23 processes input of video data (moving image data) from the camera 33 that captures video.
  • the video output processing unit 24 processes the operation of the display device 34 that displays video.
  • the operation unit 25 is used for a user to input various instructions to the video conference apparatus 1.
  • the external communication I / F 26 connects the video conference device 1 to the network 8.
  • the RAM 12 will be described in detail.
  • the RAM 12 is provided with various storage areas such as a work area 121 and a FIFO buffer area 122 (hereinafter also referred to as “FIFO buffer 122”).
  • the work area 121 stores various data such as flags necessary for processing.
  • In the FIFO buffer area 122, encoded data, that is, encoded image data, is temporarily stored before being output to the network 8.
  • The FIFO buffer is a buffer that outputs stored data in the order in which it was stored.
  • The video conference apparatus 1 compresses and encodes the image data input from the camera 33 based on the H.264 standard to generate encoded data.
  • the video conference apparatus 1 outputs the generated encoded data to another video conference apparatus 1 via the network 8. Note that the video conference device 1 decodes the encoded data input from the other video conference device 1 via the network 8 and causes the display device 34 to display the decoded data.
  • Image compression coding includes intra-frame coding and inter-frame coding.
  • Intra-frame coding is coding performed by intra-picture prediction within a single frame of the image data of the plurality of consecutive frames input from the camera.
  • An I picture (Intra-coded Picture) that is encoded data generated by intra-frame encoding is decoded independently without referring to other pictures.
  • In inter-frame coding, prediction data is calculated by referring to the data of a frame different from the frame to be encoded among the consecutive frames, and the calculated prediction error is encoded.
  • Coded data generated by interframe coding includes a P picture (Predictive-coded Picture) and a B picture (Bidirectional-coded Picture).
  • P pictures generated by referring to past pictures are mainly used.
  • To decode a P picture, the picture that was referenced at the time of encoding is required. However, the data amount of a P picture is smaller than that of an I picture, which is decoded on its own.
  • a picture that is referred to when any other P picture is decoded is referred to as a “reference picture”.
  • a picture that is not referred to when any other P picture is decoded is referred to as a “non-reference picture”.
  • the I picture is often used as a reference picture.
  • However, an I picture that is not referred to when any P picture is decoded becomes a non-reference picture.
  • When decoding a P picture, either an I picture or a P picture may be referred to.
  • a P picture can be either a reference picture or a non-reference picture.
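The reference/non-reference distinction can be illustrated with a short sketch. Representing each picture by the index of the picture it refers to is an assumption made for illustration only, not a structure taken from the patent.

```python
def classify_pictures(refs):
    """refs[i] is the index of the picture that picture i refers to
    (None for an I picture, which refers to no other picture).
    A picture is a 'reference' picture iff at least one other picture
    refers to it when decoding; otherwise it is 'non-reference'."""
    referenced = {r for r in refs if r is not None}
    return ["reference" if i in referenced else "non-reference"
            for i in range(len(refs))]
```

For the sequence I, I, P, I, I, P in which each P picture refers to the immediately preceding I picture (an example the description uses later), the first I picture of each pair and both P pictures are classified as non-reference.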
  • DCT / quantization 42 is performed on the input image 41 from the camera 33.
  • the coefficient transformed by DCT is quantized according to the quantization parameter.
  • inverse quantization / inverse DCT 43 is performed on a part of the quantized data.
  • the data subjected to the inverse quantization / inverse DCT 43 is subjected to the deblocking filter 44 and stored in the frame memory 45.
  • intra-frame prediction 46 is performed on the data stored in the frame memory 45, and further DCT / quantization 42 is performed.
  • Entropy encoding 47 is performed on the quantized data.
  • the encoded data 48 generated by the entropy encoding 47 is input to the FIFO buffer 122.
  • Of the encoded data input to the FIFO buffer 122, only the encoded data that has not been deleted by the non-reference picture deleting means 49, described later, is subjected to the network output 50.
  • In inter-frame coding, motion prediction 51 is performed using the input image 41, and motion compensation 52 is performed based on the predicted image stored in the frame memory 45.
  • the prediction error calculated by the motion compensation 52 is subjected to weighted prediction 53 using a weighting factor related to brightness, and further DCT / quantization 42 is performed.
  • Entropy encoding 47 is performed on the quantized data, and encoded data 48 is generated.
  • the subsequent flow is the same as in the case of intra-frame coding.
  • If the bit rate cannot be changed quickly enough to follow fluctuations in the bandwidth, then, for example, when a video conference is being held, reproduction of the video is delayed and the user cannot conduct the conference smoothly.
  • the video conference device 1 performs the available bandwidth measurement 55 and the FIFO buffer monitoring 56 of the network 8. Then, the non-reference picture deleting unit 49 deletes a part of the encoded data in the FIFO buffer 122 when the available bandwidth of the network 8 is reduced. By deleting a part of the encoded data to be output, the bit rate of output to the network 8 can be quickly reduced when the available bandwidth is reduced.
  • the device that receives the encoded data cannot decode not only the deleted picture but also the P picture that needs to refer to the deleted picture at the time of decoding. Therefore, non-reference pictures should be deleted in order to prevent significant degradation of video quality. If the non-reference picture that can be deleted is not stored in the FIFO buffer 122 when the available bandwidth decreases, the bit rate cannot be changed in accordance with the decrease in the available bandwidth. Therefore, the video conference apparatus 1 measures the fluctuation amount of the usable bandwidth of the network 8 and the fluctuation amount of the free capacity of the FIFO buffer 122. The picture mode control block 58 changes the P picture generation ratio for all the pictures based on the measured variation.
  • the reference picture control block 59 selects the reference picture of the P picture based on the measured variation, thereby changing the ratio of the non-reference picture to all the pictures. When the available bandwidth decreases, the non-reference picture is deleted. Details of the above processing will be described below.
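The deletion step can be sketched as follows. This is a simplified illustration, not the patent's implementation: it deletes non-reference pictures in queue order, whereas the description later prefers deleting larger non-reference I pictures first. The tuple representation of queued pictures is an assumption.

```python
from collections import deque

def delete_non_reference(fifo, bytes_to_free):
    """Remove non-reference pictures from the FIFO until roughly
    bytes_to_free bytes of output have been avoided. Each queued entry
    is (size_bytes, is_reference). Reference pictures are never deleted,
    because other pictures need them for decoding."""
    freed = 0
    kept = deque()
    while fifo:
        size, is_ref = fifo.popleft()
        if not is_ref and freed < bytes_to_free:
            freed += size          # drop this non-reference picture
        else:
            kept.append((size, is_ref))
    fifo.extend(kept)              # remaining pictures, original order
    return freed
```

Because reference pictures pass through untouched, every picture left in the queue can still be decoded by the receiver.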
  • a main process performed by the video conference apparatus 1 will be described with reference to FIGS.
  • the main process is executed by the CPU 10 in accordance with a program stored in the ROM 11.
  • the main process is started when an instruction to execute transmission / reception of image data is input.
  • the usable bandwidth (W) of the network 8 and the variation (A) of the usable bandwidth are measured (S2).
  • To measure the available bandwidth, a known bandwidth measurement technique may be used: for example, pathload, which transfers probe packets by a packet-train method and estimates the available bandwidth from the increasing tendency of the one-way transfer delay between probe packets, or cprobe, which continuously transmits ICMP (Internet Control Message Protocol) ECHO REQUEST packets and obtains the available bandwidth by observing the packet intervals of the response packets.
  • the fluctuation amount includes both an increase amount and a decrease amount.
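As a rough illustration of the dispersion idea behind tools such as cprobe (this is not an implementation of pathload or cprobe, and the function name is an assumption), the available bandwidth can be estimated from the arrival spacing of back-to-back probe packets:

```python
def estimate_bandwidth(arrival_times, packet_size_bytes):
    """Crude packet-train estimate: the dispersion (arrival gap) of
    back-to-back probe packets reflects the available bandwidth,
    estimated here as bits delivered divided by elapsed time."""
    if len(arrival_times) < 2:
        raise ValueError("need at least two probe packets")
    elapsed = arrival_times[-1] - arrival_times[0]
    total_bits = (len(arrival_times) - 1) * packet_size_bytes * 8
    return total_bits / elapsed   # bits per second
```

Five 1500-byte packets arriving 1 ms apart, for instance, imply roughly 12 Mbit/s of available bandwidth.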
  • the non-reference picture generation condition mainly indicates a ratio of the number of non-reference pictures to a preset number of pictures generated within a predetermined time (for example, 1 second).
  • The video conference apparatus 1 determines the ratio of non-reference pictures by determining the number of pictures in the GOP (Group of Pictures) and which reference pictures are referred to when each P picture is encoded and decoded.
  • a GOP is a set of a preset number of pictures generated within a predetermined period in order to efficiently manage a plurality of data.
  • By reducing the number of pictures that are used as reference pictures, the ratio of non-reference pictures to the number of pictures in the GOP can be increased. Further, as described below, the proportion of non-reference pictures can be controlled by appropriately selecting which reference picture each P picture uses.
  • the reference picture control block 59 (see FIG. 2) described above controls the motion compensation 52 based on the determined relationship between the P picture and the reference picture.
  • FIG. 4 is an example of non-reference picture generation conditions that are finally determined when the rate of change in the available bandwidth is 0% and the number of pictures in the GOP is 10. If the rate of change in the usable bandwidth is 0%, the possibility that the usable bandwidth is suddenly reduced is low. Therefore, it is rare to rapidly reduce the bit rate by deleting a large number of pictures. Therefore, it is not necessary to increase the ratio of non-reference pictures. If it is not necessary to increase the number of non-reference pictures, the reference picture of a P picture is preferably the picture immediately before the P picture. This is because the prediction error is reduced and the amount of data is reduced by using the immediately preceding picture as a reference picture. Therefore, as shown in FIG. 4, all the reference pictures of the P picture are pictures immediately before the P picture. As a result, the non-reference picture is only the last picture in the GOP.
  • FIG. 5 is an example of non-reference picture generation conditions determined when the rate of change in the available bandwidth is 50% and the number of pictures in the GOP is 10.
  • The video conference apparatus 1 determines the generation conditions for non-reference pictures so that the ratio of non-reference pictures among the pictures in the GOP is closest to the rate of change of the available bandwidth. Therefore, in the example shown in FIG. 5, since the rate of change of the available bandwidth is 50%, the generation conditions are determined so that 5 of the 10 pictures are non-reference pictures.
  • the video conference apparatus 1 determines the generation condition so that the reference picture and the non-reference picture are arranged equally. In other words, the generation condition is determined so that the deviation in the number of reference pictures located between the two most recent non-reference pictures is minimized. When this deviation is large, the reference picture and the non-reference picture are not evenly arranged. Therefore, there is a high possibility that problems such as video interruption occur when non-reference pictures are deleted. Therefore, in the example illustrated in FIG. 5, the video conference apparatus 1 alternately arranges reference pictures and non-reference pictures. As a result, there is no bias in the number of reference pictures located between non-reference pictures.
  • the closest (newest) picture among the reference pictures before the P picture is determined as the reference picture of the P picture. That is, by making the reference pictures of a plurality of P pictures the same picture, the proportion of non-reference pictures that are P pictures can be increased.
  • FIG. 6 shows an example of generation conditions determined when the rate of change in the available bandwidth is 40% and the number of pictures in the GOP is 10. If the rate of change in the usable bandwidth is 40%, the video conference apparatus 1 determines the generation condition so that four out of ten pictures are non-reference pictures. Then, the arrangement of the reference picture and the non-reference picture is determined so that the number of reference pictures located between the non-reference pictures is “2”, “1”, “2”, and “1” in order.
  • FIG. 7 shows an example of generation conditions determined when the rate of change in the available bandwidth is 90% and the number of pictures in the GOP is 10. If the rate of change in the available bandwidth is 90%, the video conference apparatus 1 determines the generation condition so that nine out of ten pictures are non-reference pictures. In this case, the reference pictures of all P pictures are the first I picture in the GOP. In this way, by setting the reference pictures of all P pictures in the GOP as I pictures, the ratio of non-reference pictures that are P pictures can be maximized.
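The patterns of FIGS. 4 to 7 can be reproduced with a small sketch that combines the two rules described above: spread the non-reference pictures evenly, and let every P picture refer to the most recent reference picture before it. The function name and the exact placement formula are assumptions for illustration.

```python
def build_gop(gop_size, variation_rate):
    """Sketch of the generation conditions in FIGS. 4-7: the share of
    non-reference pictures tracks the bandwidth variation rate, the
    non-reference pictures are spread evenly, and every P picture
    refers to the most recent reference picture before it."""
    # At least the last picture of the GOP is non-reference (FIG. 4 case).
    num_non_ref = max(1, round(variation_rate * gop_size))
    is_reference = [True] * gop_size
    for k in range(1, num_non_ref + 1):
        is_reference[(k * (gop_size - 1)) // num_non_ref] = False
    refs = [None]            # picture 0 is the I picture, no reference
    last_ref = 0
    for i in range(1, gop_size):
        refs.append(last_ref)   # refer to the nearest preceding reference picture
        if is_reference[i]:
            last_ref = i
    return is_reference, refs
```

A variation rate of 0% leaves only the last picture non-reference, with each P picture referring to its immediate predecessor (the FIG. 4 case), while 90% makes the opening I picture the sole reference picture for all nine P pictures (the FIG. 7 case).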
  • In this manner, the video conference apparatus 1 can determine, in the process of S5, the rate at which non-reference pictures that are P pictures are generated. The ratio of non-reference pictures could also be increased by increasing the ratio of I pictures to all pictures; however, the data size of an I picture is larger than that of a P picture, so the load on the network 8 would increase. By instead changing the ratio of non-reference pictures that are P pictures, the video conference apparatus 1 can change the generation ratio of non-reference pictures without rapidly increasing the load on the network 8.
  • the process for each frame is performed (S6).
  • In the process for each frame, the image data is encoded according to the determined generation conditions, and non-reference pictures are deleted based on the network bandwidth information.
  • the amount of change (C) in the free capacity of the FIFO buffer area 122 is measured (S16).
  • the fluctuation amount (C) is the difference between the previous free capacity of the FIFO buffer area 122 and the current free capacity of the FIFO buffer area 122.
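The measurement in S16 can be sketched as a small monitor class; the class name and the byte-based interface are assumptions for illustration.

```python
class FifoMonitor:
    """Tracks the FIFO buffer's free capacity. The fluctuation amount C
    is the difference between the previous and the current free capacity,
    so C is positive while free space is shrinking."""
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.prev_free = capacity_bytes

    def fluctuation(self, used_bytes):
        current_free = self.capacity - used_bytes
        c = self.prev_free - current_free
        self.prev_free = current_free   # remember for the next measurement
        return c
```

A growing positive C signals that encoded data is accumulating faster than it is being output, an internal symptom of shrinking network bandwidth.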
  • the number of non-reference pictures corresponding to the amount of decrease in available bandwidth (W) is deleted.
  • an I picture having a larger data amount is preferentially deleted.
  • an I picture (I) and a P picture (P) are generated in the order of I / I / P / I / I / P and the P picture refers to the immediately preceding I picture, The I picture two frames before is preferentially deleted.
  • Since the H.264 standard is adopted, an access unit (the unit of a picture defined in H.264) composed only of non-reference pictures is deleted.
  • In order to determine the generation conditions for non-reference pictures, the difference between the fluctuation amount (A) of the available bandwidth last measured in S2 (see FIG. 3) and the fluctuation amount (C) of the free capacity of the FIFO buffer area 122 measured this time in S16 is calculated (S19).
  • the remaining encoded data that has not been deleted in the FIFO buffer area 122 is sequentially output to the network 8 so that the bit rate of data output follows the available bandwidth (W) of the network 8 (S21).
  • It is then determined whether the absolute value of the difference in fluctuation amounts calculated in S19 is equal to or larger than the absolute value of the change in the output bit rate that updating the generation conditions can produce (S22). For example, when processing is performed with 15 pictures in a GOP, each time one more non-reference picture in the GOP is deleted, or one fewer, the average output bit rate changes by about 6.7%. Therefore, in this case, updating the generation conditions so that the number of non-reference pictures in the GOP increases by one raises the changeable value of the output bit rate by about 6.7%.
  • Conversely, updating the generation conditions so that the number of non-reference pictures in the GOP decreases by one reduces the changeable value of the output bit rate by about 6.7%.
  • the video conference apparatus 1 does not update the generation condition when the change in the amount of change is small, and updates only when the change in the amount of change is large. Therefore, in S22, the absolute value of the difference between the changeable value of the output bit rate before the generation condition is updated and the changeable value of the output bit rate after the update is calculated. If the absolute value of the difference in variation calculated in S19 is smaller than the absolute value of the difference in output bit rate changeable value (S22: NO), the process directly returns to S11.
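The S22 comparison can be sketched as follows, assuming (as the text's 6.7% example does, with 15 pictures per GOP and roughly equal picture sizes) that adding or removing one non-reference picture per GOP shifts the average output bit rate by 100/GOP percent. The function name is an assumption.

```python
def should_update_generation_condition(diff_a_minus_c, gop_size):
    """S22 check (sketch): update the generation conditions only when the
    variation difference |A - C| (in percent) is at least the bit-rate
    step that one non-reference picture per GOP can achieve. With 15
    pictures per GOP that step is 100/15, about 6.7%, as in the text."""
    step_percent = 100.0 / gop_size
    return abs(diff_a_minus_c) >= step_percent
```

Small variations therefore leave the current conditions in place, avoiding needless re-tuning of the encoder for changes the deletion step cannot express anyway.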
  • If the absolute value is equal to or larger (S22: YES), the process returns to the main process (see FIG. 3) in order to update the generation conditions.
  • the main process when the frame-by-frame process (S6) is completed, the process returns to S2, and the non-reference picture generation conditions are updated (S2 to S5).
  • the video conference apparatus 1 determines the non-reference picture generation condition based on the band information of the network 8.
  • the image data is encoded with the determined generation conditions.
  • the non-reference pictures are deleted from the encoded pictures, and the remaining pictures are output to the network. Therefore, the video conference apparatus 1 determines, according to the band information of the network 8, the generation condition of non-reference pictures that have little influence on video quality even if deleted, and can delete the generated non-reference pictures before they are output to the network. The video conference apparatus 1 can therefore quickly follow fluctuations in the bandwidth of the network 8 and change the output bit rate without incurring a time delay due to encoding processing, buffering, or the like.
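A heavily simplified sketch of this per-frame flow (toy Python; the picture representation and the drop policy are illustrative assumptions, not the patent's exact procedure):

```python
def encode(index, non_ref_positions):
    # Toy "encoder": only records whether the picture is a reference
    # picture under the current generation condition; real H.264
    # encoding is out of scope for this sketch.
    return {"index": index, "referenced": index not in non_ref_positions}

def frame_by_frame_step(frame_indices, non_ref_positions, bandwidth_dropped):
    # Encode every frame (S11), then, if the available bandwidth has
    # dropped (S17: YES), delete the buffered non-reference pictures
    # (S18) and output only the remainder (S21).
    buffered = [encode(i, non_ref_positions) for i in frame_indices]
    if bandwidth_dropped:
        buffered = [p for p in buffered if p["referenced"]]
    return buffered
```

Because the droppable pictures are marked at encoding time, lowering the output bit rate is a simple filter over already-encoded data, with no re-encoding delay.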
  • By increasing the ratio of non-reference pictures, the video conference apparatus 1 can quickly reduce the output bit rate.
  • When the proportion of non-reference pictures is reduced, the reference picture of a P picture can be placed as close as possible to that P picture, so the encoding process can be performed efficiently.
  • the video conference apparatus 1 determines, as a generation condition, the ratio of non-reference pictures to all pictures from the fluctuation amount of the available bandwidth of the network 8. Therefore, non-reference pictures that have little influence on video quality even if deleted can be generated appropriately in accordance with the amount of change in the available bandwidth. When the available bandwidth actually decreases, the output bit rate can be quickly decreased by deleting the generated non-reference pictures.
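One possible way to derive the ratio from the fluctuation amount is a capped linear mapping (the linear form and the 50% cap are assumptions for illustration; the patent only states that the ratio is determined from the fluctuation amount):

```python
def non_reference_ratio(fluctuation, max_fluctuation, max_ratio=0.5):
    # Larger bandwidth fluctuation -> larger share of droppable
    # (non-reference) pictures, so more headroom for rate reduction.
    # The mapping saturates at max_ratio to keep enough reference
    # pictures for efficient inter-frame prediction.
    share = min(abs(fluctuation) / max_fluctuation, 1.0)
    return share * max_ratio
```

A stable network yields few non-reference pictures (efficient prediction), while a volatile one yields many (fast rate adaptation).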
  • There are cases where an I picture generated by intra-frame coding is a non-reference picture, and cases where a P picture or a B picture generated by inter-frame coding is a non-reference picture.
  • the data size of a picture generated by interframe coding is smaller than the data size of an I picture generated by intraframe coding.
  • the video conference apparatus 1 can appropriately determine the generation ratio of non-reference pictures generated by inter-frame coding (non-reference pictures that are P pictures). As a result, even when the generation ratio is changed, the total data size of the generated pictures does not increase rapidly, so the load on the network bandwidth does not increase.
  • the video conference apparatus 1 temporarily stores the generated encoded data in the FIFO buffer area 122. Accordingly, the deletion process can be easily performed by appropriately deleting the non-reference pictures stored in the FIFO buffer area 122. It is also possible to delete a plurality of non-reference pictures at once.
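The buffering and bulk deletion could look like this minimal sketch (a hypothetical class modeled loosely on the FIFO buffer area 122; the interface is an assumption):

```python
from collections import deque

class EncodedPictureFifo:
    # Minimal stand-in for the FIFO buffer area 122: encoded pictures
    # enter at the tail, and all buffered non-reference pictures can
    # be deleted at once before the buffer is drained for output.
    def __init__(self):
        self._queue = deque()

    def push(self, picture):
        self._queue.append(picture)

    def delete_non_reference(self):
        # Drop every buffered non-reference picture in one pass,
        # which is what makes deleting several pictures at once easy.
        self._queue = deque(p for p in self._queue if p["referenced"])

    def pop_all(self):
        out, self._queue = list(self._queue), deque()
        return out
```

Since deletion happens inside the buffer, output order is preserved and no already-sent data needs to be recalled.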
  • the video conference apparatus 1 acquires network bandwidth information including internal factors of the video conference apparatus 1 by measuring the amount of change in the free capacity of the FIFO buffer area 122. When the obtained amount of variation becomes large, the non-reference picture generation conditions can be updated. Therefore, a non-reference picture can be generated by appropriately reflecting internal factors of the video conference apparatus 1 itself.
  • the video conference apparatus 1 can generate a plurality of non-reference pictures by distributing them without bias. Therefore, even when a plurality of non-reference pictures are deleted, the possibility that the video appears interrupted on playback can be reduced. Further, when deleting non-reference pictures, the video conference apparatus 1 can reduce the output bit rate more quickly and efficiently by preferentially deleting the I picture, which has a larger data amount than the P picture.
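Even spacing is one way to distribute non-reference pictures "without bias" across a GOP (an illustrative policy, not the patent's stated algorithm):

```python
def non_reference_positions(gop_size, count):
    # Spread `count` non-reference pictures evenly across a GOP so
    # that deleting all of them never removes a long run of
    # consecutive pictures, which would make playback look
    # interrupted.
    if count == 0:
        return []
    step = gop_size / count
    return [int(i * step) for i in range(count)]
```

For a 15-picture GOP and three non-reference pictures, this yields positions 0, 5, and 10, so at most one picture in any five consecutive pictures is droppable.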
  • the video conference device 1 corresponds to the “encoding device” of the present invention.
  • the camera 33 corresponds to “image input means”.
  • the usable bandwidth (W) of the network 8, the usable bandwidth variation (A), and the free capacity variation (C) of the FIFO buffer area 122 correspond to “network bandwidth information”.
  • the CPU 10 that detects band information in S2 of FIG. 3 and S15 and S16 of FIG. 8 functions as a “detection unit”.
  • the CPU 10 that determines the non-reference picture generation conditions in S5 of FIG. 3 functions as a “determination unit”.
  • the CPU 10 that encodes the image data in S11 of FIG. 8 functions as an “encoding unit”.
  • the CPU 10 that deletes the non-reference picture in S18 of FIG. 8 functions as a “deleting unit”.
  • the CPU 10 that outputs the encoded data in S21 of FIG. 8 functions as an “output unit”.
  • the CPU 10 that measures the available bandwidth (W) of the network 8 in S2 of FIG. 3 and S15 of FIG. 8 functions as a “second detection unit”.
  • the FIFO buffer area 122 of the RAM 12 corresponds to a “buffer”.
  • the CPU 10 that inputs the encoded data to the FIFO buffer area 122 in S12 of FIG. 8 functions as a “storage control unit”.
  • the CPU 10 that measures the fluctuation amount of the free capacity of the FIFO buffer area 122 in S16 of FIG. 8 functions as a “third detection unit”.
  • the process of detecting band information in S2 of FIG. 3 and S15 and S16 of FIG. 8 corresponds to the “detection step” of the present invention.
  • the process of determining the non-reference picture generation condition in S5 of FIG. 3 corresponds to the “determination step”.
  • the process of encoding the image data in S11 of FIG. 8 corresponds to the “encoding step”.
  • the process of deleting the non-reference picture in S18 of FIG. 8 corresponds to a “deletion step”.
  • the process of outputting the encoded data in S21 of FIG. 8 corresponds to an “output step”.
  • the video conference apparatus 101 according to the second embodiment differs from the video conference apparatus 1 according to the first embodiment only in that the encoded data is not buffered. Therefore, the same reference numerals are attached to the same components.
  • the video conference apparatus 101 does not include the FIFO buffer 122 (see FIG. 2).
  • the picture mode control block 58 and the reference picture control block 59 generate a non-reference picture using information obtained by the available bandwidth measurement 55 of the network 8.
  • the present invention can be implemented without using a buffer for buffering encoded data. Details of the processing will be described below.
  • the CPU 10 of the video conference apparatus 101 starts a main process when an instruction to execute transmission / reception of image data is input.
  • the main process is the same as the main process (see FIG. 3) performed by the video conference apparatus 1 of the first embodiment except for the frame-by-frame process described below. Therefore, the description of the main process is omitted.
  • the image data is encoded one picture at a time (S11) according to the determined quantization parameter (S3, see FIG. 3) and the non-reference picture generation conditions (S5, see FIG. 3).
  • It is determined whether there is a non-reference picture in the encoded data to be output to the network 8 (S113). If there is no non-reference picture (S113: NO), the process proceeds to S121.
  • the available bandwidth (W) of the network 8 is measured (S15).
  • the fluctuation amount (B) of the usable bandwidth of the network 8 is measured (S116). It is determined whether the available bandwidth (W) measured this time in S15 is lower than the available bandwidth (W) measured last time in S2 (see FIG. 3) or S15 (S17). If it is not lower (S17: NO), the process proceeds directly to S119. If the available bandwidth (W) measured in the current S15 is lower (S17: YES), the non-reference pictures not yet output to the network are deleted (S118).
  • the difference between the fluctuation amount (A) of the available bandwidth measured last time in S2 (see FIG. 3) and the fluctuation amount (B) of the available bandwidth measured this time in S116 is calculated (S119).
  • the encoded data that has not been deleted is output to the network 8 (S121). It is determined whether the absolute value of the difference in the available-bandwidth fluctuation amount calculated in S119 is equal to or greater than the absolute value of the difference in the changeable value of the output bit rate when the generation condition is updated (S22). If the absolute value of the difference in fluctuation amount is smaller than the absolute value of the difference in the changeable value of the output bit rate (S22: NO), the process returns directly to S11. If it is equal to or greater (S22: YES), the process returns to the main process (see FIG. 3) in order to update the generation condition.
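The bufferless per-picture decision of the second embodiment can be sketched as follows (illustrative Python; the names and the picture representation are assumptions):

```python
def output_or_delete(picture, is_non_reference, prev_bandwidth, curr_bandwidth):
    # Second embodiment, no FIFO buffer: each encoded picture goes
    # straight to the network unless it is a non-reference picture
    # and the available bandwidth measured this time (S15) is lower
    # than last time (S17: YES), in which case it is deleted before
    # output (S118).
    if is_non_reference and curr_bandwidth < prev_bandwidth:
        return None          # deleted; nothing is sent
    return picture           # output to the network 8 (S121)
```

Reference pictures are always sent; only droppable pictures are sacrificed, and only while the bandwidth is falling.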
  • the video conference apparatus 101 determines the non-reference picture generation condition in accordance with the amount of change in the available bandwidth of the network 8, and can delete non-reference pictures when the available bandwidth decreases. Accordingly, the output bit rate to the network 8 can be changed by quickly following changes in the available bandwidth.
  • the present invention can be implemented without using a buffer for buffering encoded data.
  • the available bandwidth (W) of the network 8 and the fluctuation amount (A, B) of the available bandwidth correspond to “network bandwidth information” of the present invention.
  • the CPU 10 that detects band information in S2 of FIG. 3 and S15 and S116 of FIG. 10 functions as a “detection unit”.
  • the CPU 10 that encodes the image data in S11 of FIG. 10 functions as an “encoding unit”.
  • the CPU 10 that deletes the non-reference picture in S118 in FIG. 10 functions as a “deleting unit”.
  • the CPU 10 that outputs the encoded data in S121 of FIG. 10 functions as an “output unit”.
  • the present invention is not limited to the above-described embodiment, and various modifications are possible.
  • the present invention is not limited to a video conference apparatus.
  • the present invention can be applied to any device that outputs encoded data via a network, such as a server that distributes video.
  • In the above embodiment, coding is performed based on the H.264 standard, but other standards may be adopted.
  • the video conference apparatuses 1 and 101 of the above embodiment perform processing such as deletion of non-reference pictures based on the value of the available bandwidth of the network 8.
  • the video conference apparatuses 1 and 101 may measure the bandwidth actually used in the network 8 and perform processing.
  • a condition other than the condition determined in the above embodiment may be determined.
  • the frame rate, resolution, and the like may be determined as generation conditions.
  • the video conference apparatus 101 may delete a non-reference picture when the amount of decrease in available bandwidth exceeds a threshold value.
  • the non-reference picture generation condition is updated when the change in the fluctuation amount of the free capacity of the FIFO buffer 122, or the change in the fluctuation amount of the usable bandwidth, becomes large (see S22 in FIG. 8 and S22 in FIG. 10).
  • the trigger for updating the non-reference picture generation condition can also be changed.
  • the generation condition may be repeatedly updated every predetermined time or every time a predetermined number of pictures are output.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The present invention relates to an encoding device, an encoding method, and encoding software that can react quickly to fluctuations in network bandwidth by changing the bit rate of the data output to the network. A video conference device detects information on the network bandwidth and, based on the detected information, determines the generation condition for non-reference pictures. An input picture (41) is encoded according to the generation condition to generate encoded data (48). When the available bandwidth of the network drops, the video conference device deletes from the generated encoded data the non-reference pictures, i.e., the pictures that are not referenced when decoding any other picture. Only the remaining pictures that have not been deleted are output to the network.
PCT/JP2010/064603 2009-09-08 2010-08-27 Dispositif, procédé, et logiciel de codage WO2011030680A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2009-207117 2009-09-08
JP2009207117A JP2011061362A (ja) 2009-09-08 Encoding device, encoding method, and encoding program

Publications (1)

Publication Number Publication Date
WO2011030680A1 true WO2011030680A1 (fr) 2011-03-17

Family

ID=43732354

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2010/064603 WO2011030680A1 (fr) 2009-09-08 2010-08-27 Dispositif, procédé, et logiciel de codage

Country Status (2)

Country Link
JP (1) JP2011061362A (fr)
WO (1) WO2011030680A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170055001A1 (en) * 2014-05-08 2017-02-23 Mitsubishi Electric Corporation Image encoding apparatus and image decoding apparatus

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2830628A1 (fr) 2011-03-18 2012-09-27 Kagoshima University Composition for the treatment and diagnosis of pancreatic cancer
US8793389B2 (en) * 2011-12-20 2014-07-29 Qualcomm Incorporated Exchanging a compressed version of previously communicated session information in a communications system
JP5853757B2 (ja) * 2012-02-21 2016-02-09 Fujitsu Limited Moving image encoding device and moving image encoding method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06351006A (ja) * 1993-06-10 1994-12-22 Nippon Telegr & Teleph Corp <Ntt> Variable-rate coding device for image signals
JPH06350983A (ja) * 1993-06-10 1994-12-22 Nippon Telegr & Teleph Corp <Ntt> Variable-rate coding device and decoding device for image signals
JPH08191451A (ja) * 1995-01-10 1996-07-23 Canon Inc Moving image transmission device
JP2008301309A (ja) * 2007-06-01 2008-12-11 Panasonic Corp Coding rate control method, transmission device for controlling coding rate, program storage medium, and integrated circuit
JP2009060553A (ja) * 2007-09-04 2009-03-19 Meidensha Corp Method for transmitting and receiving MPEG data

Also Published As

Publication number Publication date
JP2011061362A (ja) 2011-03-24

Similar Documents

Publication Publication Date Title
US8731152B2 (en) Reducing use of periodic key frames in video conferencing
US9071841B2 (en) Video transcoding with dynamically modifiable spatial resolution
JP5166021B2 (ja) Dslシステム用の高速チャネル変化を可能にする方法及びシステム
JP4309098B2 (ja) 階層型ビデオ符号化情報の伝送方法
JP4554927B2 (ja) ビデオトランスコーディングにおけるレート制御方法およびシステム
JP2006087125A (ja) ビデオフレームシーケンスを符号化する方法、符号化ビットストリーム、画像又は画像シーケンスを復号する方法、データの送信又は受信を含む使用、データを送信する方法、符号化及び/又は復号装置、コンピュータプログラム、システム、並びにコンピュータ読み取り可能な記憶媒体
JP2011050117A (ja) トリックモードおよび速度移行
TWI482500B (zh) 處理輸入位元流的方法與訊號處理裝置
JP2004507178A (ja) ビデオ信号符号化方法
JP2008131143A (ja) 符号化処理装置および符号化処理方法
JP3668110B2 (ja) 画像伝送システムおよび画像伝送方法
US8812724B2 (en) Method and device for transmitting variable rate video data
JP2012507892A (ja) ビデオストリームの品質値を判定する方法及びシステム
WO2011030680A1 (fr) Dispositif, procédé, et logiciel de codage
KR102424258B1 (ko) 비디오를 인코딩하기 위한 방법 및 인코더 시스템
JP4861371B2 (ja) 映像品質推定装置、方法、およびプログラム
JP2009124518A (ja) 画像送信装置
JP4447443B2 (ja) 画像圧縮処理装置
JP5212319B2 (ja) 符号化装置、符号化方法、および符号化プログラム
JP2008263443A (ja) 情報処理装置および方法、並びにプログラム
JP3836701B2 (ja) 動画像を符号化する方法及び装置及びプログラム並びに動画像音声多重化の方法及び装置
JP5141656B2 (ja) 通信制御装置、通信制御方法、および通信制御プログラム
JP4718736B2 (ja) 動画像符号化装置
Yunus et al. A rate control model of MPEG-4 encoder for video transmission over Wireless Sensor Network
JP2003023639A (ja) データ伝送装置及び方法、データ伝送プログラム、並びに記録媒体

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 10815277

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 10815277

Country of ref document: EP

Kind code of ref document: A1