KR20150086110A

KR20150086110A - Apparatus and Method for Transmitting Encoded Video Stream

Info

Publication number: KR20150086110A
Application number: KR1020140006293A
Authority: KR
Inventors: 리 벤; 김창곤; 이태욱; 쟈오 징
Original assignee: 엘지디스플레이 주식회사
Priority date: 2014-01-17
Filing date: 2014-01-17
Publication date: 2015-07-27
Also published as: KR102118678B1

Abstract

According to an embodiment of the present invention, a video stream transmitting device can exploit the advantages of both TCP and UDP. The device includes: an encoder that encodes a video stream using a predetermined compression standard; a parser that divides the encoded video stream into multiple sub-streams and parses each sub-stream; a MUX unit that separates each parsed sub-stream into first data, which is a network abstraction layer (NAL) unit having a sequence parameter set (SPS), a picture parameter set (PPS), or a slice header, and second data, which is a NAL unit having slice data; a first packet generation unit that uses the first data of each sub-stream to generate a transmission control protocol (TCP) packet and transmits the generated TCP packet through a TCP tunnel; and a second packet generation unit that uses the second data of each sub-stream to generate a user datagram protocol (UDP) packet and transmits the generated UDP packet through a UDP tunnel.

Description

Apparatus and Method for Transmitting Encoded Video Stream

The present invention relates to data transmission, and more particularly to an apparatus and method for transmitting an encoded video stream.

HD video streaming over WLANs has become feasible, key technologies such as network bandwidth continue to improve, and the use of smartphones, mobile Internet devices, and wireless display devices is increasing.

Notable wireless HD streaming technologies include Apple's AirPlay, Intel's WiDi, and Cavium's WiVu. These technologies operate in ad-hoc mode. H.264, the latest video compression standard, has facilitated wireless video streaming by providing more efficient compression algorithms, which reduce the amount of data that must be transmitted over the network. Moreover, H.264 provides many error-resilient and network-friendly features, such as Data Partitioning (DP), Flexible Macroblock Ordering (FMO), and the Network Abstraction Layer (NAL) structure. However, wireless HD video streaming still faces many challenges. Unlike ordinary data transmission, video streaming requires not only data integrity but also frames that meet strict playout deadlines in the presence of packet delay and loss. Both factors are closely tied to the transport protocol.

TCP (Transmission Control Protocol) and UDP (User Datagram Protocol) are the two basic transport layer protocols used to transmit video data over a network. TCP is reliable, but retransmitting lost packets introduces delay and consumes bandwidth, which further increases the likelihood of packet loss. For example, HTTP-based video streaming runs over TCP. Although much research has been done to hide or reduce the delay caused by TCP, it remains a significant problem for real-time video streaming. Conversely, UDP offers minimal delay but does not guarantee packet delivery. Lost packets cause errors that propagate to subsequent frames.

Although a considerable amount of research has addressed TCP and UDP for improving video streaming, little attention has been paid to exploiting the advantages of both in wireless video streaming. In a paper titled "Hybrid TCP/UDP Video Transport For H.264/AVC Content Delivery In Burst Loss Networks", Porter and Peng proposed a hybrid TCP/UDP streaming method that transmits high-priority data over TCP and low-priority data over UDP. However, Porter and Peng did not implement their method in a real network environment; instead, they simulated UDP packet loss by randomly removing parts of the data from the encoded video. This evaluation lacked rigor and did not provide important results such as the picture quality and buffering time that result from using TCP and UDP together.

The following prior patent documents relate to video streaming. First, U.S. Patent No. 8,356,109 improves the quality of received video by transmitting intra-coded frames and high-priority inter-coded frames over a plurality of TCP channels, and low-priority inter-coded frames over a plurality of UDP channels. That invention focuses on networks having multiple communication channels, and it separates high-priority data from low-priority data at the frame level. However, multiple communication channels are not always available. For example, video streaming in a home environment is performed point-to-point, which means that only one communication channel is available. That invention does not apply in such an environment; moreover, it is limited in that only complete frames can be transmitted over the TCP channels.

Next, U.S. Patent Application Publication No. 2012/0173748 uses both TCP and UDP to stream media. That invention delivers high-priority media data to the client over TCP and low-priority media data over UDP. Although the publication does not specifically define what constitutes high-priority and low-priority data, it states: "When the media data is encoded as a series of pictures, the first part of the media data includes intra-coded pictures with high priority, and the second part of the media data includes one or more inter-coded pictures with low priority." This statement makes clear that the invention also separates data at the frame level, with intra-coded frames defined as high-priority data and inter-coded frames as low-priority data. In addition, that invention does not consider initial buffering and rebuffering, which are key factors in the QoE of video streaming.

Next, U.S. Patent No. 6,771,594 proposes monitoring the QoS of real-time data streaming: when the QoS falls below a threshold, the real-time data is routed through a reliable network service such as TCP, and when the QoS is adequate, the data is routed through an unreliable network service such as UDP. A problem with that invention is that the quality of the received real-time data can fluctuate. By the time the QoS has fallen below the threshold and the remaining data is routed through TCP, the video streamed during that period has already been degraded. The continual switching between TCP and UDP therefore prevents consistent streaming quality. Moreover, that invention applies only to VoIP, which involves far less data than video streaming.

The present invention has been devised to solve the problems of the prior art described above, and one technical feature thereof is to provide an apparatus and method for transmitting an encoded video stream that can exploit the advantages of both TCP and UDP.

Another technical feature of the present invention is to provide an apparatus and method for transmitting an encoded video stream that divide a video stream into a plurality of sub-streams and, for each sub-stream, transmit high-priority data over TCP and low-priority data over UDP.

A further technical feature of the present invention is to provide an apparatus and method for transmitting an encoded video stream that can transmit sub-streams in an overlapping manner, in order to reduce the likelihood of rebuffering for long video streams and to minimize initial buffering.

To achieve the above objects, an apparatus for transmitting an encoded video stream according to one aspect of the present invention includes: an encoder that encodes a video stream using a predetermined compression standard; a parser that divides the encoded video stream into a plurality of sub-streams and parses each sub-stream; a MUX that separates each parsed sub-stream into first data, which is a NAL (Network Abstraction Layer) unit having an SPS (Sequence Parameter Set), a PPS (Picture Parameter Set), or a slice header, and second data, which is a NAL unit having slice data; a first packet generation unit that, for each sub-stream, generates TCP (Transmission Control Protocol) packets from the first data and transmits the generated TCP packets through a TCP tunnel; and a second packet generation unit that, for each sub-stream, generates UDP (User Datagram Protocol) packets from the second data and transmits the generated UDP packets through a UDP tunnel.

To achieve the above objects, a method for transmitting an encoded video stream according to another aspect of the present invention includes: encoding a video stream using a predetermined compression standard; dividing the encoded video stream into a plurality of sub-streams; separating each sub-stream into first data, which is a NAL (Network Abstraction Layer) unit having an SPS (Sequence Parameter Set), a PPS (Picture Parameter Set), or a slice header, and second data, which is a NAL unit having slice data; generating TCP (Transmission Control Protocol) packets from the first data and UDP (User Datagram Protocol) packets from the second data; and transmitting the generated TCP packets through a TCP tunnel and the generated UDP packets through a UDP tunnel.
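The separation step can be illustrated with a minimal Python sketch. It is one possible reading, not the claimed implementation: it assumes each slice NAL unit is cut into fixed-size fragments and that the slice header fits entirely in the first fragment. The function name `mux`, the 1400-byte fragment size, and the choice to treat every non-slice NAL unit as first data are all hypothetical.

```python
SLICE_TYPES = (1, 5)  # H.264 nal_unit_type: 1 = non-IDR slice, 5 = IDR slice

def mux(nal_units, fragment_size=1400):
    """Split NAL units into first data (for TCP) and second data (for UDP).

    Assumption: the slice header fits in the first fragment of a slice,
    so only that fragment is treated as high priority.
    """
    first_data, second_data = [], []
    for unit in nal_units:
        nal_type = unit[0] & 0x1F  # low 5 bits of the NAL header byte
        if nal_type not in SLICE_TYPES:
            # SPS (7), PPS (8) and, in this sketch, any other non-slice unit
            first_data.append(unit)
            continue
        fragments = [unit[i:i + fragment_size]
                     for i in range(0, len(unit), fragment_size)]
        first_data.append(fragments[0])    # carries the slice header
        second_data.extend(fragments[1:])  # remaining slice data
    return first_data, second_data

sps = bytes([0x67, 0x42, 0x00])
idr_slice = bytes([0x65]) + bytes(2999)  # 3000-byte dummy slice
tcp_payloads, udp_payloads = mux([sps, idr_slice])
print(len(tcp_payloads), len(udp_payloads))
```

The two returned lists would then be handed to separate TCP and UDP senders; socket handling is left out so the sketch stays self-contained.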

According to the present invention as described above, the basic unit for distinguishing high-priority data from low-priority data is the NAL unit, so priority can be assigned to any H.264 syntax element, which makes the invention very flexible to apply.

In addition, according to the present invention, the video stream is divided into a plurality of sub-streams and the sub-streams are transmitted in an overlapping manner, which minimizes initial buffering as well as the occurrence and duration of rebuffering.

In addition, according to the present invention, the high-priority data of each sub-stream (for example, the SPS (Sequence Parameter Set), PPS (Picture Parameter Set), and slice headers) is transmitted first over TCP, so even when low-priority data (for example, slice data) fails to arrive because of packet loss, the decoder can still reconstruct the frame by applying EC (Error Concealment).

In addition, because the present invention does not use the data partitioning feature of H.264, the video does not need to be re-encoded and the network does not need to support data partitioning. This increases the flexibility of the system; moreover, any syntax element obtained from the video stream can be separated and prioritized. For example, to improve picture quality, part of the slice data can be separated and prioritized in addition to the SPS, PPS, and slice headers.

In addition, according to the present invention, I slices are transmitted using the slack time between sub-streams, which improves the picture quality of I frames and reduces error propagation.

FIG. 1 is a diagram showing the structure of the H.264 syntax.
FIG. 2 is a diagram showing the effect of packet loss on one frame of an HD video clip.
FIG. 3 is a block diagram schematically showing the configuration of an apparatus for transmitting an encoded video stream according to a first embodiment of the present invention.
FIG. 4 is a diagram showing a method of transmitting sub-streams in an overlapping manner according to an embodiment of the present invention.
FIG. 5 is a diagram showing the general structure of OEFMON.
FIG. 6 is a diagram showing the queue state during streaming.
FIG. 7 is a diagram showing the simulated network scenarios.
FIG. 8 is a diagram comparing the PSNR for the three scenarios shown in FIG. 7.
FIG. 9 is a diagram showing the packet loss for the three scenarios shown in FIG. 7.
FIG. 10 is a diagram showing frame 134 decoded according to the present invention in scenario 2 shown in FIG. 7.
FIG. 11 is a block diagram schematically showing the configuration of an apparatus for transmitting an encoded video stream according to a second embodiment of the present invention.
FIG. 12 is a diagram comparing the PSNR of the present invention without a PBP module, with a PBP module at 50% PBP, and with a PBP module at 90% PBP.
FIG. 13 is a diagram comparing images of frame 1187.

Note that, in assigning reference numerals to the elements of the drawings in this specification, the same elements are given the same numerals wherever possible, even when they appear in different drawings.

Meanwhile, the terms used in this specification should be understood as follows.

Singular expressions should be understood to include plural expressions unless the context clearly indicates otherwise. Terms such as "first" and "second" are used to distinguish one element from another, and the scope of rights should not be limited by these terms.

Terms such as "comprise" or "have" should be understood as not precluding the presence or addition of one or more other features, numbers, steps, operations, elements, parts, or combinations thereof.

Before describing the present invention, the following explains how H.264 video is encoded and streamed, and how packet delay and loss affect the picture quality of H.264-encoded streamed video.

H.264 is the latest video compression standard. Compared with its predecessors, H.264 offers more aggressive compression ratios and has network-friendly features that make it better suited to mobile video streaming.

Several features of H.264 matter for effective video streaming. The two most important are the syntax used to encode video data into a bitstream and the fact that some parts of this information are more important than others.

A video stream encoded with H.264 consists of a sequence of GOPs (Groups of Pictures). Each GOP consists of an intra frame (I frame), predicted frames (P frames), and bidirectionally predicted frames (B frames). An I frame contains all the data needed to reconstruct a complete frame and does not reference other frames. In contrast, P frames and B frames require references to other frames during decoding. If a reference frame contains errors, those errors propagate through the subsequent frames that reference it. Because an I frame does not depend on any other frame, error propagation stops when a new I frame arrives. Consequently, I frames should be given high priority whenever possible.
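The propagation behavior described above can be made concrete with a small Python sketch. It is a simplified model, not a decoder: it assumes that every P frame references the immediately preceding frame, that frames are listed in decoding order, and that B frames are omitted. The function name `corrupted_frames` is hypothetical.

```python
def corrupted_frames(frame_types, lost):
    """Return the indices of frames that are decoded with errors.

    frame_types: string such as "IPPPIPPP", in decoding order.
    lost: set of frame indices that suffered data loss.
    Assumption: each P frame references the previous frame.
    """
    corrupted = []
    error_active = False
    for index, frame_type in enumerate(frame_types):
        if frame_type == "I":
            # An I frame references nothing, so it stops propagation.
            error_active = False
        if index in lost:
            error_active = True
        if error_active:
            corrupted.append(index)
    return corrupted

print(corrupted_frames("IPPPIPPP", {1}))  # [1, 2, 3]
print(corrupted_frames("IPPPIPPP", {0}))  # [0, 1, 2, 3]
```

In both cases the damage ends at index 4, where the next I frame arrives; a loss in the I frame itself affects the whole GOP, which is why I frames deserve high priority.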

The H.264 bitstream syntax structure is shown in FIG. 1. An H.264 bitstream consists of a sequence of NAL (Network Abstraction Layer) units. The three common NAL units are the Sequence Parameter Set (SPS), the Picture Parameter Set (PPS), and the slice. The SPS contains parameters common to the entire video, such as the profile and level to which the encoded video conforms. Therefore, if the SPS is lost, the entire video cannot be decoded. The PPS contains parameters common to a series of consecutive frames, such as the entropy coding mode that was applied. If the PPS for a series of frames is lost, those frames cannot be decoded. A slice is a unit from which a frame is built, and a frame may have a single slice or multiple slices. A slice may be an I slice, a P slice, a B slice, or an IDR (Instantaneous Decoder Refresh) slice. An IDR slice is a special form of I slice that does not reference any earlier slice and is used to clear the contents of the reference frame buffer. A slice consists of a slice header and slice data, which contains a number of macroblocks. The slice header contains information common to all the macroblocks in the slice. Each slice is divided into one or more packets for transmission. Therefore, if the packet containing the slice header is lost, the entire slice cannot be decoded even if the rest of the slice data is received correctly.
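As a minimal sketch of how these NAL units can be told apart, the Python below splits a byte stream on start codes and reads the 5-bit `nal_unit_type` field from the first byte of each unit. It assumes the stream uses the H.264 Annex B byte-stream format; the sample bytes are made-up placeholders rather than a decodable video, and the helper names are hypothetical.

```python
NAL_TYPE_NAMES = {1: "non-IDR slice", 5: "IDR slice", 7: "SPS", 8: "PPS"}
START_CODE = b"\x00\x00\x01"

def split_nal_units(stream: bytes):
    """Split an Annex B byte stream into NAL units (start codes removed)."""
    units = []
    pos = stream.find(START_CODE)
    while pos != -1:
        begin = pos + len(START_CODE)
        nxt = stream.find(START_CODE, begin)
        end = len(stream) if nxt == -1 else nxt
        # Drop zero bytes that belong to a following 4-byte start code
        # (00 00 00 01) or to trailing padding.
        unit = stream[begin:end].rstrip(b"\x00")
        if unit:
            units.append(unit)
        pos = nxt
    return units

def nal_unit_type(unit: bytes) -> int:
    """The low 5 bits of the first byte hold nal_unit_type."""
    return unit[0] & 0x1F

stream = (b"\x00\x00\x00\x01\x67\x42"    # SPS
          b"\x00\x00\x00\x01\x68\xce"    # PPS
          b"\x00\x00\x01\x65\x88\x84"    # IDR slice
          b"\x00\x00\x01\x41\x9a")       # non-IDR slice

for unit in split_nal_units(stream):
    t = nal_unit_type(unit)
    print(t, NAL_TYPE_NAMES.get(t, "other"))
```

A sender that wants to protect the SPS and PPS only needs this one-byte check per unit, which is what makes the NAL unit a convenient granularity for prioritization.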

FIG. 2 shows the effect of packet loss on one frame of an HD video clip called "Battlefield" streamed over UDP using the VLC media player. The stream was analyzed with Wireshark and Elecard StreamEye Studio. FIG. 2A shows the original transmitted frame. FIG. 2B shows the received frame, in which some information is missing because of packet loss. In this example, the slice header for slice 4 was lost, so the entire slice cannot be decoded. In contrast, the slice header for slice 5 was received, so most of that slice can be decoded even though the last two RTP packets, which carry part of the slice data, were lost. Error Concealment (EC) techniques can then be used to recover the lost information with only minor artifacts. SPSs, PPSs, and slice headers are therefore the most important data, and more care should be given to them during video streaming.
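The contrast between slice 4 and slice 5 follows a simple rule, sketched below in Python. The sketch assumes that the first packet of a slice carries the slice header; the packet counts are hypothetical, since the figure description does not state how many packets each slice occupied, and the function name is illustrative.

```python
def slice_status(num_packets, lost):
    """Classify one slice sent as `num_packets` packets.

    Assumption: packet 0 carries the slice header.
    lost: set of packet indices that did not arrive.
    """
    lost = {i for i in lost if 0 <= i < num_packets}
    if 0 in lost:
        return "undecodable"          # header missing, slice is unusable
    if lost:
        return "partially decodable"  # conceal the missing macroblocks
    return "fully decodable"

# Hypothetical packet counts for the two slices discussed above.
print("slice 4:", slice_status(10, {0}))     # header packet lost
print("slice 5:", slice_status(10, {8, 9}))  # last two data packets lost
```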

Data Partitioning (DP) is an error-resilience feature of H.264. The coded data for each slice is placed in three separate data partitions: A, B, and C. Partition A contains the slice header and the header of each macroblock (for example, MB type, quantization parameter, and motion vectors). Partition B contains the coded block patterns (CBPs) and coefficients of intra-coded macroblocks. Partition C contains the CBPs and coefficients of inter-coded macroblocks. Partition A must be present to decode partition B, and partitions A and B must be present to decode partition C. Data partitioning can be combined with Unequal Error Protection (UEP) to improve streaming performance. Although data partitioning is a powerful tool for error resilience, it has not been widely adopted, because it requires the video to be re-encoded and requires an 802.11e network.
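The dependency rule stated above (B needs A; C needs A and B) can be written as a short Python check. This is only an illustration of that rule as described in this text; the function name is hypothetical.

```python
def decodable_partitions(received):
    """Return which of the partitions A, B, C can actually be decoded.

    Rule taken from the description above: B requires A,
    and C requires both A and B.
    """
    received = set(received)
    usable = set()
    if "A" in received:
        usable.add("A")
        if "B" in received:
            usable.add("B")
            if "C" in received:
                usable.add("C")
    return usable

print(sorted(decodable_partitions({"A", "C"})))       # ['A']
print(sorted(decodable_partitions({"B", "C"})))       # []
print(sorted(decodable_partitions({"A", "B", "C"})))  # ['A', 'B', 'C']
```

Losing partition A wastes everything else that arrived for the slice, which is why UEP schemes spend most of their protection on it.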

Existing streaming protocols include RTSP (Real Time Streaming Protocol), HTTP (HyperText Transfer Protocol), MMS (Microsoft Media Server), and RTP (Real-time Transport Protocol). Note that RTSP, HTTP, MMS, and RTP are application layer protocols, so they do not deliver streams by themselves. For example, RTP uses UDP or TCP to carry multimedia data. RTSP, HTTP, and MMS add further control features for streaming, but they too rely on TCP or UDP to carry multimedia data.

RTSP allows a client to control a streaming media server remotely. For example, the client can play, pause, and seek within the video during streaming. RTSP can be used together with the RTP Control Protocol (RTCP) to obtain statistics on QoS. Typically, RTSP uses TCP to carry control signals and RTP/UDP to carry multimedia data.

HTTP also allows the client to control streaming, and it uses TCP to carry both the multimedia data and the control data. Because HTTP uses TCP, packets are never lost. Another advantage of HTTP is that it works through firewalls as long as the HTTP port is open. However, HTTP incurs high end-to-end delay when lost packets are retransmitted.

RTP typically uses UDP to carry multimedia data. The RTP header contains a sequence number and a timestamp. The sequence number is incremented by one for each packet sent and is used to detect packet loss. The timestamp can be used to synchronize multiple streams, such as video and audio. Note that RTP/UDP alone provides no control functionality.
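As a sketch of how the sequence number supports loss detection, the Python below unpacks the 12-byte fixed RTP header (the layout defined in RFC 3550) and lists the sequence numbers missing from a run of received packets. It assumes packets arrive in order, so reordering would be misreported as loss; the function names and sample values are illustrative.

```python
import struct

def parse_rtp_header(packet: bytes):
    """Unpack the 12-byte fixed RTP header (network byte order)."""
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,         # top 2 bits of the first byte
        "payload_type": b1 & 0x7F,  # low 7 bits of the second byte
        "seq": seq,                 # 16-bit sequence number
        "timestamp": timestamp,
        "ssrc": ssrc,
    }

def missing_sequence_numbers(seqs):
    """List sequence numbers skipped between consecutive arrivals.

    Assumption: in-order arrival. The 16-bit wraparound is handled.
    """
    missing = []
    for prev, cur in zip(seqs, seqs[1:]):
        gap = (cur - prev) % 65536
        missing.extend((prev + k) % 65536 for k in range(1, gap))
    return missing

packet = struct.pack("!BBHII", 0x80, 96, 65534, 90000, 0x1234) + b"payload"
print(parse_rtp_header(packet)["seq"])
print(missing_sequence_numbers([65534, 65535, 1, 2]))  # [0]
```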

For the purposes of the present invention, the focus is on direct RTP/UDP and RTP/TCP streaming, because they underlie the other streaming protocols.

UDP is generally regarded as more suitable than TCP for real-time video streaming, because it offers lower end-to-end delay and therefore smoother video playout. Although UDP is prone to data loss, multimedia data, unlike ordinary data, is loss-tolerant to some degree. In addition, decoders use EC techniques to reduce the artifacts caused by data loss, and many EC techniques have been developed to mitigate the effects of packet loss. However, if a lost packet contains important information such as an SPS, a PPS, or a slice header, the decoder simply cannot reconstruct the video, even with the help of EC.

To tolerate the packet loss of UDP streaming, Unequal Error Protection (UEP) is often used. UEP aims to prioritize important data over other data, because some syntax elements are more critical than others. A basic UEP scheme transmits important packets multiple times, which increases the probability that they reach the receiver. More advanced UEP schemes incorporate Forward Error Correction (FEC). By using FEC to encode important packets with redundancy, the receiver can recover lost packets without retransmission. However, FEC introduces additional overhead, which increases the network bandwidth required to transmit the video.
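A minimal Python sketch of the FEC idea uses a single XOR parity packet, one of the simplest erasure codes. It is not a scheme named in this text; it is chosen only to show how redundancy lets the receiver rebuild one lost packet without retransmission. It assumes all packets in a group have equal length and that at most one packet per group is lost; the packet contents are placeholders.

```python
def xor_parity(packets):
    """XOR all packets together byte by byte."""
    size = max(len(p) for p in packets)
    parity = bytearray(size)
    for packet in packets:
        for i, byte in enumerate(packet):
            parity[i] ^= byte
    return bytes(parity)

def recover(received, parity):
    """Rebuild the single missing packet (marked None) from the parity.

    Assumption: exactly one entry is None and all packets share a length.
    """
    present = [p for p in received if p is not None]
    return xor_parity(present + [parity])

group = [b"SPS!", b"PPS!", b"HDR!"]  # placeholder payloads
parity = xor_parity(group)           # sent as a fourth packet
print(recover([b"SPS!", None, b"HDR!"], parity))  # b'PPS!'
```

The cost is visible in the example: one parity packet per three data packets adds a third to the traffic of the protected group, which is the bandwidth overhead mentioned above.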

Despite the common view that TCP is undesirable for streaming, a significant portion of commercial video streaming traffic uses TCP. TCP provides guaranteed delivery, so transmitted packets are always preserved. Nevertheless, TCP's retransmission and rate control mechanisms introduce delay, which causes packets to arrive after their output (playout) deadline. The typical solution is to add a buffer in front of the video decoder. At the start of streaming, the decoder waits until the buffer fills before displaying video, in order to absorb initial throughput variation and inter-packet jitter. This waiting time is called initial buffering. Once the decoder has begun decoding video data from the buffer, a drop in throughput within the TCP session can drain the buffer. When that happens, the decoder pauses the video display until a sufficient number of packets has been received. This waiting time is called rebuffering. Buffering prevents late packets from being dropped. However, long initial buffering can contribute to network congestion, and frequent rebuffering degrades the user experience. Much research has gone into determining a buffer size that reduces the frequency of rebuffering.

Another approach to improving wireless video streaming is to use an IEEE 802.11e network, which defines a set of QoS enhancements through modifications to the Media Access Control (MAC) layer. In an 802.11e network, delay-sensitive data such as video and audio can be assigned to a higher-priority class. If a collision occurs at the MAC layer, a smaller contention window can be used to transmit the higher-priority data, and a lower transmission delay can thereby be achieved. 802.11e is specifically tailored to multimedia, but it has not been widely adopted because it requires hardware changes.

Hereinafter, an apparatus and method for transmitting an encoded video stream according to the present invention will be described in more detail with reference to the accompanying drawings.

FIG. 3 is a block diagram schematically illustrating the configuration of an encoded video stream transmission apparatus according to a first embodiment of the present invention.

As shown in FIG. 3, the encoded video stream transmission apparatus 300 according to the first embodiment of the present invention includes a transmitter 310 and a receiver 320.

The transmitter 310 includes an encoder 311, a parser 312, a MUX 313, a first packet generator 314, and a second packet generator 315.

The encoder 311 encodes the original video stream using a predetermined video compression standard, which may belong to the H.26x family, such as H.264 or H.265. In one embodiment, because the apparatus 300 according to the present invention does not apply the video partitioning technique defined in the H.26x compression standards, the encoder 311 encodes the original video stream so that it does not include the Data Partitioning (DP) profile.

The parser 312 divides the encoded video stream into a plurality of n-second sub-streams, each consisting of a plurality of frames. The parser 312 parses the sub-streams to obtain the SPS, the PPS, the slice headers, and the slice data from each sub-stream. Before streaming, the parser 312 supplies the syntax information of each sub-stream, such as the start address, length, and type of each NAL unit, as input to the MUX 313. In one embodiment of the present invention, the parser 312 may be implemented as an H.264 syntax parser. Also, in one embodiment of the present invention, each NAL unit is encapsulated in an RTP packet during streaming.
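The division into n-second sub-streams can be sketched as a simple partition of frame indices. The sketch below is only illustrative: it assumes a constant frame rate and uses the figures from the simulation described later in this document (1200 frames, 30 frames per second, 10-second sub-streams); the function name is hypothetical.

```python
def split_into_substreams(num_frames, fps, n_seconds):
    """Return (first_frame, end_frame_exclusive) index pairs, one per sub-stream."""
    frames_per_substream = int(fps * n_seconds)
    if frames_per_substream <= 0:
        raise ValueError("a sub-stream must contain at least one frame")
    return [
        (start, min(start + frames_per_substream, num_frames))
        for start in range(0, num_frames, frames_per_substream)
    ]


# 1200 frames at 30 frames/s, cut into 10-second sub-streams.
substreams = split_into_substreams(num_frames=1200, fps=30, n_seconds=10)
```

With these values the stream is cut into four sub-streams of 300 frames each; a final sub-stream is simply shorter when the frame count is not a multiple of the sub-stream size.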

The MUX 313 separates the data obtained from the parsed sub-stream into first data and second data. The first data is high-priority data such as an SPS, a PPS, or a slice header. The second data is low-priority data such as slice data. To do so, the MUX 313 determines whether each RTP packet contains a NAL unit that is first data. An RTP packet containing such an important NAL unit is provided to the first packet generator 314, and an RTP packet containing a NAL unit that is second data is provided to the second packet generator 315.
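The routing decision made by the MUX 313 can be sketched as a small classification function. The `nal_unit_type` values 7 (SPS) and 8 (PPS) are those defined by H.264; the `is_slice_header` flag is a hypothetical stand-in for the syntax information the parser 312 supplies, since this document treats the slice header as a unit separate from the slice data.

```python
# nal_unit_type values defined by the H.264 standard.
NAL_SPS = 7
NAL_PPS = 8


def route_unit(nal_unit_type, is_slice_header=False):
    """Return "tcp" for first (high-priority) data and "udp" for second data.

    is_slice_header is an assumed flag derived from the parser's syntax
    information; it is not a field of the NAL unit itself.
    """
    if nal_unit_type in (NAL_SPS, NAL_PPS) or is_slice_header:
        return "tcp"
    return "udp"
```

For example, `route_unit(7)` and `route_unit(1, is_slice_header=True)` both select the TCP path, while `route_unit(1)` (slice data) selects the UDP path.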

The first packet generator 314 generates TCP packets from the first data and transmits them through a TCP tunnel. The second packet generator 315 generates UDP packets from the second data and transmits them through a UDP tunnel. In one embodiment, the TCP session used to transmit the TCP packets and the UDP session used to transmit the UDP packets both remain active throughout video streaming. In other words, the encoded video stream transmission apparatus 300 uses dual tunneling (TCP + UDP) to transmit the encoded video stream.

In one embodiment of the present invention, the transmitter 310 first transmits the first data over TCP and then transmits the second data over UDP. In other words, the transmitter 310 first sends the TCP packets containing the first data and afterwards sends the UDP packets containing the second data. For example, for a video stream 12 seconds long, the initial buffering (that is, the time needed to transmit the first data over TCP) is less than 2 seconds. When an entire video stream is streamed in this way, however, the initial buffering becomes unacceptably long. Therefore, to reduce the initial buffering requirement, the transmitter 310 according to the present invention transmits the UDP packets of the n-th sub-stream and the TCP packets of the (n+1)-th sub-stream at the same time. That is, the present invention overlaps the transmission of sub-streams.

FIG. 4 illustrates the overlapped transmission of sub-streams according to an embodiment of the present invention. As shown in FIG. 4, the entire video stream is divided into several n-second sub-streams; only the first two are shown. The transmitter 310 first sends the TCP packets containing the first data of sub-stream 1 over TCP and then begins ordinary UDP streaming. Whenever the transmitter 310 is not sending a UDP packet, it sends the TCP packets containing the first data of sub-stream 2 over TCP. If the TCP packets containing the first data of sub-stream 1 are not ready by the output deadline, the transmitter 310 stops the video streaming and performs rebuffering.

In other words, the overlapped transmission of sub-streams can be divided into the following four steps. First, the video stream is divided into a plurality of n-second sub-streams. Second, only the first data of the first n-second sub-stream is transmitted over TCP. Third, ordinary UDP streaming begins. Fourth, if the network is relatively idle, the first data of the next n-second sub-stream is transmitted over TCP.

In the fourth step, whether the network is relatively idle is determined by monitoring the network layer queue (not shown). If the number of packets stored in the network layer queue (hereinafter, the 'queue') is smaller than a threshold, the transmitter 310 transmits the TCP packets of the next sub-stream during the UDP streaming of the previous sub-stream.
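The idleness test in the fourth step reduces to a single comparison against the queue occupancy. The sketch below assumes the queue length is available as an integer; the default of 20 packets is the threshold this document adopts for its simulation, and the function name is hypothetical.

```python
QUEUE_THRESHOLD = 20  # packets; the value adopted later in this document


def may_send_next_tcp(packets_in_queue, threshold=QUEUE_THRESHOLD):
    """True when the network layer queue is short enough to interleave the
    next sub-stream's TCP packets with the current sub-stream's UDP packets."""
    return packets_in_queue < threshold
```

Raising `threshold` gives the transmitter more opportunities to send TCP data, which is the adjustment the document recommends for congested networks.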

Referring again to FIG. 3, the receiver 320 includes a first packet receiving unit 321, a second packet receiving unit 322, a DEMUX 323, and a decoder 324.

The first packet receiving unit 321 receives TCP packets through the TCP tunnel and passes them to the DEMUX 323. The second packet receiving unit 322 receives UDP packets through the UDP tunnel and passes them to the DEMUX 323.

The DEMUX 323 drops late UDP packets and combines the UDP packets that arrive on time with TCP packets. It then passes the combined UDP and TCP packets to the decoder 324. Specifically, when a TCP packet is received, the DEMUX 323 stores it in a storage unit 325. When a UDP packet is received, the DEMUX 323 first parses it to obtain the RTP timestamp. If the RTP timestamp is greater than the output deadline, the DEMUX 323 judges the UDP packet to be late and drops it. If the RTP timestamp is less than or equal to the output deadline, the DEMUX 323 parses the UDP packet to obtain the RTP sequence number. The DEMUX 323 then searches the storage unit 325 to determine whether it holds a TCP packet whose RTP sequence number is smaller than that of the UDP packet. If it does, the DEMUX 323 combines that TCP packet with the current UDP packet and passes the combined packets to the decoder 324.
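The receive-side rule just described can be sketched as follows. The class and field names are hypothetical, and three points are assumptions not stated in the text: stored TCP packets are removed once they have been combined, a UDP packet with no matching TCP packet is forwarded on its own, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass


@dataclass
class RtpPacket:
    seq: int          # RTP sequence number
    timestamp: int    # RTP timestamp
    payload: bytes


class Demux:
    def __init__(self):
        self.tcp_store = []  # stands in for the storage unit 325

    def on_tcp(self, packet):
        """Store every TCP packet until a later UDP packet claims it."""
        self.tcp_store.append(packet)

    def on_udp(self, packet, output_deadline):
        """Return the packets to hand to the decoder, or None if dropped."""
        if packet.timestamp > output_deadline:
            return None  # late packet: drop it
        earlier = [p for p in self.tcp_store if p.seq < packet.seq]
        # Assumption: combined TCP packets are consumed from the store.
        self.tcp_store = [p for p in self.tcp_store if p.seq >= packet.seq]
        return sorted(earlier, key=lambda p: p.seq) + [packet]
```

A caller would feed `on_tcp` from the first packet receiving unit and `on_udp` from the second, passing whatever output deadline applies to the frame being played.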

The decoder 324 decodes the combined UDP and TCP packets to reconstruct the video stream and outputs the reconstructed video stream on a display. The decoder 324 may be implemented with FFmpeg.

The encoded video stream transmission apparatus 300 according to the first embodiment of the present invention may be implemented in the Open Evaluation Framework for Multimedia Over Networks (hereinafter, 'OEFMON'), which integrates the DirectShow multimedia framework and the QualNet network simulator. A schematic diagram of OEFMON is shown in FIG. 5. As shown in FIG. 5, its core components are the QualNet Connector, the Video Source Filter, and the Video Writer Filter. The QualNet Connector performs RTP packetization. The Video Source Filter reads an H.264 file and sends the data it reads to the QualNet Connector. The Video Writer Filter writes the decoded frame data as raw video data. The core components of OEFMON used to implement the encoded video stream transmission apparatus according to the present invention are described in detail below.

OEFMON implements UDP streaming in the QualNet network simulator. Implementing dual tunneling requires changes to the existing UDP streaming code and the implementation of a TCP streaming module. QualNet is a discrete-event simulator in which an event is represented by a data structure called a MESSAGE. The original code already includes the MESSAGEs for UDP, which come as a pair: one for the transmitter 310 and one for the receiver 320. Most of the changes required for UDP concern restoring the code that handles these MESSAGEs. The TCP implementation, however, requires more MESSAGEs because TCP uses a three-way handshake. QualNet APIs such as APP_TcpOpenConnectionWithPriority (a request to open a TCP socket) and MESSAGEs such as MSG_APP_FromTransListenResult (the response to that request) must be handled properly before video data is transmitted. To implement dual tunneling, the functions that handle the UDP and TCP MESSAGEs are implemented in a single QualNet application file called app_fdspvideo.cpp.

The parser 312 was developed on the basis of an open-source library called h264bitstream. It is implemented inside QualNet and linked to app_fdspvideo.cpp. Before streaming, the parser 312 parses the video stream and returns its syntax information (such as the start address, length, and type of each NAL unit) as input to the MUX 313. During streaming, each NAL unit is encapsulated in an RTP packet by OEFMON's QualNet Connector. At the same time, the MUX 313 uses the stored syntax information to determine whether the NAL unit carried by an RTP packet is an SPS, a PPS, or a slice header. If the RTP packet contains such an important NAL unit, the MUX 313 passes it to the TCP tunnel; otherwise, it passes the packet to the UDP tunnel.

When the receiver 320 receives a TCP packet, the DEMUX 323 stores it in a file called "tcpdata.h264" on the disk drive. When the receiver 320 receives a UDP packet, the DEMUX 323 first parses it to obtain the RTP timestamp. If the timestamp is greater than the output deadline, the UDP packet is late and the DEMUX 323 drops it. If the timestamp is less than the output deadline, the DEMUX 323 parses the UDP packet to obtain the RTP sequence number. The DEMUX 323 then examines the "tcpdata.h264" file to check whether it holds a TCP packet whose RTP sequence number is smaller than that of the UDP packet. If so, the DEMUX 323 combines that TCP packet with the current UDP packet and passes them to the decoder 324.

FIG. 6 shows the queue status during video streaming over pure UDP for Scenario 1, which is described below. The X axis represents the frame that the transmission apparatus 300 according to the present invention is transmitting. "Num. of Pkts in Que" is the number of packets in the network layer queue when each frame is transmitted. "Num. of UDP to Be Sent" is the number of UDP packets that make up the current frame; these packets must be moved to the queue and transmitted. For example, for frame 1 the queue must be empty, so "Num. of Pkts in Que" is 0, and because frame 1 comprises 177 UDP packets, "Num. of UDP to Be Sent" is 177.

The appropriate threshold depends on the network conditions and on the sub-stream length. If the network is congested, "Num. of Pkts in Que" decreases slowly; for example, it might not fall to 20 or below until frame 30 rather than frame 18. To ensure that the TCP data of the next sub-stream is ready before the output deadline, the threshold then needs to be raised above 20 so that the transmission apparatus 300 has more time to transmit TCP data. As the sub-stream length increases, the TCP data required for the next sub-stream increases, and so does the time available for transmitting it. If the network is not congested, there is enough time to transmit the TCP data. Conversely, if the network is congested, "Num. of Pkts in Que" decreases slowly, and even the longer time available may not be enough to transmit the TCP data. In that situation the threshold must be increased. In the present invention, the threshold is set to 20 packets and the sub-stream length to 10 seconds.

The simulation of the apparatus and method for transmitting an encoded video stream according to the first embodiment of the present invention, and its results, are described below. The primary video selected for the experiment consists of 1200 frames of original HD YUV video from the "African Cats" trailer. The YUV file was encoded with x264 at an average bit rate of 4 Mbps and with one slice per frame. Using OEFMON, an 802.11g ad hoc network with a bandwidth of 54 Mbps was set up, and three network scenarios were created to evaluate video streaming performance. The node positions for the three scenarios, shown in FIG. 7, model a home environment. In Scenario 1, node pair 1 and 2 streams the primary video over the network. At the same time, two additional node pairs generate constant bit rate (CBR) data, marked CBR1 and CBR2 in FIG. 7, as background traffic. Scenario 2 adds one more CBR flow, marked CBR3. Scenario 3 repeats the network traffic of Scenario 1 and adds node pair 7 and 8, positioned in a classic hidden-node arrangement. The parameters in FIG. 7 are defined as follows: Dist1 = 5 m, Dist2 = 1 m, Dist3 = 50 m, CBR1 = 20 Mbps, CBR2 = 20 Mbps, and CBR3 = 10 Mbps (except in Scenario 3, where CBR3 = 5 Mbps). With these values, the network is saturated in Scenario 2.

Streaming is performed by packetizing each NAL unit. If the size of a NAL unit is less than or equal to the maximum transmission unit (MTU) size, one RTP packet contains exactly one NAL unit. If the NAL unit is larger than the MTU, it is split across multiple RTP packets. After video streaming is complete, the sent and received video files are decoded with FFmpeg, and the PSNR is computed between the two resulting YUV files using Avisynth. Because the PSNR of a lost frame and of two identical frames is not well defined, the present invention uses 0 dB as the PSNR of a lost frame and otherwise follows the method used by Avisynth, in which 111 dB represents a perfect PSNR. In addition to the PSNR, the initial buffering and rebuffering are recorded in order to evaluate the end-to-end delay.
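The PSNR conventions described above can be sketched as follows. This is a minimal illustration using the standard PSNR definition for 8-bit samples; it does not reproduce Avisynth's implementation, and the function name and the flat-list frame representation are assumptions. Only the two special values (0 dB for a lost frame, 111 dB for identical frames) come from the text.

```python
import math

PSNR_LOST_DB = 0.0         # convention for a lost frame
PSNR_IDENTICAL_DB = 111.0  # value reported for two identical frames


def frame_psnr(reference, received, peak=255):
    """PSNR in dB between two frames given as equal-length sample sequences.

    received is None when the frame was lost.
    """
    if received is None:
        return PSNR_LOST_DB
    if len(reference) != len(received):
        raise ValueError("frames must have the same number of samples")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return PSNR_IDENTICAL_DB  # avoid division by zero
    return 10 * math.log10(peak ** 2 / mse)
```

For instance, `frame_psnr([10, 10], [11, 9])` has a mean squared error of 1 and evaluates to about 48.13 dB.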

The main purpose of this experiment is to show the advantages of the present invention over conventional pure-UDP and pure-TCP streaming. With the present invention, all important data (the SPS, the PPSs, and the slice headers) is sent first over TCP, and the remaining data is then sent over UDP. The time needed to transmit the important data is treated as initial buffering. For the pure-TCP method, a buffer is added to simulate initial buffering and rebuffering. To compare the present invention with pure TCP, the buffer size for pure TCP was adjusted so that both methods have the same initial buffering time.

The PSNR comparison for all scenarios is shown in FIG. 8, which also includes the frame size. Because 1200 frames are too many to show in a single graph, the PSNR, the packet loss, and the frame size are each averaged over one second (that is, over 30 frames). In addition, because the pure-TCP method always has a PSNR of 111 dB and a packet loss rate of 0, its PSNR and packet loss are omitted.
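The one-second averaging used for the graphs can be sketched as a windowed mean over per-frame values. The function name is hypothetical; the window of 30 frames follows from the statement that one second corresponds to 30 frames.

```python
def average_per_second(per_frame_values, fps=30):
    """Average a per-frame series over consecutive windows of fps frames."""
    averages = []
    for start in range(0, len(per_frame_values), fps):
        window = per_frame_values[start:start + fps]
        averages.append(sum(window) / len(window))
    return averages
```

Applied to 1200 per-frame PSNR values, this yields 40 points, one for each second of the video.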

For Scenario 1, shown in FIG. 8a, the PSNR of pure UDP is, as expected, worse than that of the present invention: pure UDP averages 93 dB, whereas the present invention reaches 102 dB. For Scenario 2, shown in FIG. 8b, pure UDP has an average PSNR of 51 dB, far below the 83 dB of the present invention. For Scenario 3, shown in FIG. 8c, node pair 7 and 8 introduces a hidden-node effect that degrades PSNR and delay relative to Scenarios 1 and 2. The average PSNR of pure UDP is 52 dB, whereas that of the present invention is 76 dB, which is still better than pure UDP.

FIG. 9 shows the packet loss rate for all three scenarios. Taken together, FIG. 8 and FIG. 9 show a direct correlation between packet loss and PSNR: in these graphs, every drop in PSNR is caused by some packet loss. For example, although it is hard to see because of the averaging over time, pure UDP in FIG. 8a has a packet loss rate of 0.2% at 5 seconds, which lowers the PSNR from 111 dB to 76 dB. As FIGS. 9b and 9c show, when the packet loss is 85% or more, the PSNR is 0 dB, indicating that the frame was lost. In Scenario 1, pure UDP lost 32 of the 1200 frames. With the present invention, by contrast, the slice headers of all 1200 frames were received, so all 1200 frames could be recovered and decoded. In Scenario 2, pure UDP lost 148 frames, whereas the present invention lost none. In Scenario 3, pure UDP lost 103 frames. Once again the present invention lost no frames, because the slice headers were prioritized and sent over TCP.

As explained above, the presence of the slice header is essential for the decoder 324 to recover a frame. Once the slice header has been received correctly, the decoder 324 can apply various EC techniques to conceal the lost macroblocks even if the remaining data is lost. For example, FIG. 10 shows decoded frame 134. Frame 134 is a P frame and was recovered from a single packet containing the slice header and part of the slice. The watermark at the upper left shows frame number 134, which is information obtained from that packet. The watermark at the lower right shows frame number 132, which is information that FFmpeg, using EC, copied from the previous frame 132 in order to reconstruct the current frame 134.

Table 1 below shows the buffering requirements for all three scenarios.

Pure TCP and the present invention have the same initial buffering time, which increases from 2 seconds to 2.56 seconds in response to network saturation. Pure TCP, however, incurs frequent rebuffering: during the 40 seconds of video streaming, rebuffering occurred 6 to 19 times, each lasting 0.95 to 1.46 seconds. Frequent rebuffering is an important factor affecting the user experience. Even though pure TCP delivers perfect picture quality, a high frequency of rebuffering can be very annoying. The present invention, by contrast, incurs no rebuffering at all. This shows once again that the present invention is highly effective in congested networks, where pure UDP and pure TCP tend to be unacceptable in terms of picture quality and delay, respectively.

Table 2 shows the preparation times and playout deadlines of the sub-streams in the present invention.

The preparation time of sub-stream 1 is the initial buffering time shown in Table 1 and is therefore not repeated in Table 2. The playout deadline of each sub-stream is determined by the sub-stream length, for example 10 seconds. As long as the preparation time is less than the playout deadline, no rebuffering is required. The preparation time increases as network congestion increases, but the preparation times of all sub-streams remain earlier than their playout deadlines. Therefore, as shown in Table 1, no rebuffering is required.

As mentioned above, the preparation times of the sub-streams are earlier than their playout deadlines in all three scenarios. For example, in scenario 2, the preparation time of the fourth sub-stream is 11.47 seconds, which is 18.53 seconds earlier than its playout deadline of 30 seconds. This means that the network conditions leave room for more data to be prioritized and sent through the TCP tunnel. If the present invention can send additional data over TCP, image quality will improve much further.
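The deadline check described above can be sketched in a few lines of Python. The function names are hypothetical, and the deadline formula (sub-stream n must be ready (n - 1) × length seconds after playback starts) is an assumption inferred from the example of a 30-second deadline for the fourth sub-stream when sub-streams are 10 seconds long.

```python
SUBSTREAM_LENGTH_S = 10.0  # example sub-stream length used in the text


def playout_deadline(n, length_s=SUBSTREAM_LENGTH_S):
    """Playout deadline of the n-th sub-stream (1-based), assumed to be (n - 1) * length."""
    if n < 1:
        raise ValueError("sub-stream index is 1-based")
    return (n - 1) * length_s


def slack_time(ready_time_s, deadline_s):
    """Time left between the moment a sub-stream is ready and its playout deadline."""
    return deadline_s - ready_time_s


def needs_rebuffering(ready_time_s, deadline_s):
    """Rebuffering is avoided only while the sub-stream is ready before its deadline."""
    return ready_time_s >= deadline_s


deadline = playout_deadline(4)                 # 30.0 seconds
print(round(slack_time(11.47, deadline), 2))   # 18.53
print(needs_rebuffering(11.47, deadline))      # False
```

A positive slack is the margin that could be spent on sending more data over the slower but reliable TCP path.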

To exploit the slack time (for example, the difference between a sub-stream's preparation time and its playout deadline) and further improve image quality, an apparatus and method for transmitting an encoded video stream according to a second embodiment of the present invention are proposed. FIG. 11 is a block diagram of the encoded video stream transmission apparatus according to the second embodiment of the present invention. As shown in FIG. 11, the encoded video stream transmission apparatus 1100 adds a new module, called the PBP (Percentage Based Prioritization) module 316, and thereby prioritizes more bitstream syntax elements in addition to the SPS, PPS, and slice headers.

The PBP module 316 selects syntax elements according to an input parameter called PERCENT. For example, if PERCENT is set to 10%, 1 out of every 10 packets can be transmitted over TCP. Because the PBP module 316 can prioritize any syntax element as desired, it extends the flexibility of the encoded video stream transmission apparatus 1100.

In one embodiment of the present invention, the PBP module 316 is used to prioritize frames larger than 100 kbytes. These large frames are generally I-frames and are therefore more important than B-frames.
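The selection rule of the PBP module 316 can be sketched as follows. The source does not say which packets within the chosen percentage are sent over TCP, so the even spacing used here, the `Packet` type, and the function name are assumptions for illustration. The sketch also takes 100 kbytes to mean 100,000 bytes.

```python
from dataclasses import dataclass

LARGE_FRAME_BYTES = 100 * 1000  # assumption: 100 kbytes = 100,000 bytes
ALWAYS_TCP = ("SPS", "PPS", "SLICE_HEADER")


@dataclass
class Packet:
    kind: str         # "SPS", "PPS", "SLICE_HEADER" or "SLICE_DATA"
    frame_bytes: int  # size of the frame this packet belongs to


def assign_channels(packets, percent):
    """Return "TCP" or "UDP" for each packet, in order."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    channels = []
    credit = 0  # accumulates percent; each 100 points promotes one packet
    for p in packets:
        if p.kind in ALWAYS_TCP:
            channels.append("TCP")
        elif p.frame_bytes > LARGE_FRAME_BYTES:
            credit += percent
            if credit >= 100:
                credit -= 100
                channels.append("TCP")
            else:
                channels.append("UDP")
        else:
            channels.append("UDP")
    return channels


stream = [Packet("SPS", 0), Packet("SLICE_HEADER", 150_000)]
stream += [Packet("SLICE_DATA", 150_000)] * 4
stream += [Packet("SLICE_DATA", 20_000)]
print(assign_channels(stream, 50))
```

With PERCENT at 50, every second slice-data packet of a large frame is promoted to TCP, while packets of small frames stay on UDP.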

The apparatus and method for transmitting an encoded video stream according to the second embodiment of the present invention were simulated using network scenario 2 described above. To show the visual improvement, the input parameter PERCENT was set to two different values. In the first case, PERCENT was set to 50%, so that the encoded video stream transmission apparatus 1100 sends, in addition to the SPS, PPS, and slice headers, 50% of the packets of frames of 100 kbytes or more over TCP. In the second case, PERCENT was set to 90% so that even more data is sent over TCP.

FIG. 12 shows a PSNR comparison among a video stream transmission apparatus without the PBP module 316, a video stream transmission apparatus including the PBP module 316 with 50% PBP, and a video stream transmission apparatus including the PBP module 316 with 90% PBP. The higher the PBP percentage, the better the PSNR. The average PSNR of the apparatus without the PBP module 316 was 83.08 dB, and the average PSNR of the apparatus including the PBP module 316 with 50% PBP was 90.99 dB. Notably, the average PSNR of the apparatus including the PBP module 316 with 90% PBP reached 111 dB, the perfect PSNR. This indicates that a video stream transmission apparatus including the PBP module 316 is effective in improving image quality, because reception of more data, in addition to the SPS, PPS, and slice headers, is guaranteed.
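For reference, the textbook PSNR computation for 8-bit samples can be sketched as below. The source does not state how its PSNR values were measured. Under this definition, identical frames have zero mean squared error and therefore infinite PSNR, so the finite "perfect" value quoted above presumably reflects a cap or convention of the measurement tool, which the source does not specify. The function name and the flat list-of-samples representation are assumptions.

```python
import math


def psnr_db(reference, received, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(reference) != len(received) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, received)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames: no finite value under this definition
    return 10 * math.log10(max_value ** 2 / mse)


print(psnr_db([10, 20, 30], [10, 20, 30]))  # inf
print(psnr_db([255], [0]))                  # 0.0
```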

FIG. 13 shows an image comparison for frame 1187. FIG. 13(a) shows frame 1187 with pure UDP, FIG. 13(b) shows frame 1187 with the video stream transmission apparatus of the present invention without the PBP module 316, FIG. 13(c) shows frame 1187 with the video stream transmission apparatus including the PBP module 316 with 50% PBP, and FIG. 13(d) shows frame 1187 with the video stream transmission apparatus including the PBP module 316 with 90% PBP. The present invention clearly performs better than pure UDP. In addition, image quality improves progressively as the PBP percentage increases.

Table 3 shows the preparation times of the sub-streams for the video stream transmission apparatus without the PBP module 316, the video stream transmission apparatus including the PBP module 316 with 50% PBP, and the video stream transmission apparatus including the PBP module 316 with 90% PBP.

The preparation time of each sub-stream increases as the PBP percentage increases. However, the preparation times of the sub-streams are still earlier than their playout deadlines, so no rebuffering is required.

The video stream transmission apparatus including the PBP module 316 with 90% PBP achieves a perfect PSNR with no rebuffering. By comparison, pure TCP also achieves a perfect PSNR but rebuffers 17 times. Moreover, compared with pure UDP, the video stream transmission apparatus including the PBP module 316 with 90% PBP achieves a PSNR that is 60 dB higher. The results achieved by the video stream transmission apparatus including the PBP module 316 with 90% PBP are clearly better than those of pure TCP and pure UDP, and the present invention therefore has clear advantages over the pure TCP and pure UDP methods.

Those skilled in the art to which the present invention pertains will understand that the present invention may be embodied in other specific forms without changing its technical spirit or essential characteristics.

Therefore, the embodiments described above should be understood as illustrative in all respects and not restrictive. The scope of the present invention is defined by the claims set out below rather than by the detailed description, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

300, 1100: Encoded video stream transmission apparatus 310: Transmitter
311: Encoder 312: Parser
313: MUX 314: First packet generator
315: Second packet generator 316: PBP module
320: Receiver 321: First packet receiver
322: Second packet receiver 323: DEMUX
324: Decoder 325: Storage unit

Claims

An apparatus for transmitting an encoded video stream, comprising:
an encoder that encodes a video stream using a predetermined compression standard;
a parser that divides the encoded video stream into a plurality of sub-streams and parses each sub-stream;
a MUX that separates each parsed sub-stream into first data, which are NAL (Network Adaptation Layer) units having an SPS (Sequence Parameter Set), a PPS (Picture Parameter Set), or a slice header, and second data, which are NAL units having slice data;
a first packet generator that generates a TCP (Transmission Control Protocol) packet using the first data for each sub-stream and transmits the generated TCP packet through a TCP tunnel; and
a second packet generator that generates a UDP (User Datagram Protocol) packet using the second data for each sub-stream and transmits the generated UDP packet through a UDP tunnel.

The apparatus of claim 1,
wherein the predetermined compression standard is H.264 or H.265, and the encoded video stream does not include a data partitioning (DP) profile.

The apparatus of claim 1,
wherein a TCP session for transmitting the TCP packet and a UDP session for transmitting the UDP packet are kept active during transmission of the video stream.

The apparatus of claim 1,
wherein the first packet generator transmits a TCP packet for an (n+1)-th sub-stream at the same time as the second packet generator transmits a UDP packet for an n-th sub-stream.

The apparatus of claim 4,
wherein, when the number of UDP packets stored in a queue is smaller than a predetermined threshold, the first packet generator transmits the TCP packet for the (n+1)-th sub-stream at the same time as the second packet generator transmits the UDP packet for the n-th sub-stream.

The apparatus of claim 1,
further comprising a PBP (Percentage Based Prioritization) module that selects, from the second data and according to a predetermined percentage, priority data to be transmitted through the TCP tunnel.

The apparatus of claim 6,
wherein the priority data is an I-frame.

The apparatus of claim 1, further comprising:
a first packet receiver that receives the TCP packet through the TCP tunnel;
a second packet receiver that receives the UDP packet through the UDP tunnel;
a DEMUX that combines the received UDP packet with the received TCP packet; and
a decoder that decodes the combined TCP packet and UDP packet to restore the video stream.

The apparatus of claim 1,
wherein the DEMUX drops, from among the received UDP packets, a UDP packet having an RTP (Real-time Transport Protocol) stamp greater than a predetermined playout deadline.

The apparatus of claim 8,
wherein the DEMUX combines, with the received UDP packet, a TCP packet that is among the received TCP packets and has an RTP sequence number smaller than the RTP sequence number of the received UDP packet.

A method for transmitting an encoded video stream, comprising:
encoding a video stream using a predetermined compression standard;
dividing the encoded video stream into a plurality of sub-streams;
separating each sub-stream into first data, which are NAL (Network Adaptation Layer) units having an SPS (Sequence Parameter Set), a PPS (Picture Parameter Set), or a slice header, and second data, which are NAL units having slice data;
generating a TCP (Transmission Control Protocol) packet using the first data and generating a UDP (User Datagram Protocol) packet using the second data; and
transmitting the generated TCP packet through a TCP tunnel and transmitting the generated UDP packet through a UDP tunnel.

The method of claim 11,
wherein the predetermined compression standard is H.264 or H.265, and the encoded video stream does not include a data partitioning (DP) profile.

The method of claim 11,
wherein a TCP session for transmitting the TCP packet and a UDP session for transmitting the UDP packet are kept active during transmission of the video stream.

The method of claim 11,
wherein, in the transmitting step, a UDP packet for an n-th sub-stream and a TCP packet for an (n+1)-th sub-stream are transmitted simultaneously.

The method of claim 14,
wherein, in the transmitting step, the UDP packet for the n-th sub-stream and the TCP packet for the (n+1)-th sub-stream are transmitted simultaneously when the number of UDP packets stored in a queue is smaller than a predetermined threshold.

The method of claim 11,
further comprising selecting, from the second data and according to a predetermined percentage, priority data to be transmitted through the TCP tunnel,
wherein the priority data is packetized into the TCP packet and transmitted through the TCP tunnel.

The method of claim 16,
wherein the priority data is an I-frame.

The method of claim 11, further comprising:
receiving the TCP packet through the TCP tunnel and receiving the UDP packet through the UDP tunnel;
combining the received UDP packet with the received TCP packet; and
decoding the combined TCP packet and UDP packet to restore the video stream.

The method of claim 18,
wherein, among the received UDP packets, a UDP packet having an RTP stamp greater than a predetermined playout deadline is dropped.

The method of claim 18,
wherein, in the combining step,
a TCP packet that is among the received TCP packets and has an RTP sequence number smaller than the RTP sequence number of the received UDP packet is combined with the received UDP packet.