JP2009071580A

JP2009071580A - Communication device

Info

Publication number: JP2009071580A
Application number: JP2007237807A
Authority: JP
Inventors: Katsuichi Osakabe; 勝一刑部
Original assignee: Yamaha Corp
Current assignee: Yamaha Corp
Priority date: 2007-09-13
Filing date: 2007-09-13
Publication date: 2009-04-02

Abstract

PROBLEM TO BE SOLVED: To provide a technique for efficiently transmitting necessary image information over a limited communication bandwidth in data transfer between conference terminals.

SOLUTION: The transmitting conference terminal generates a frame image (still image) whose photographing range covers the whole conference room and transmits it in advance to the conference terminal on the other end. When the conference starts, the transmitting terminal generates moving images consisting only of the detailed ranges (ranges B and C in the figure) within the photographing range of the Web camera that change over time, such as those containing participants, and transmits them to the other terminal. The other terminal superimposes the moving images received during the conference on the frame image and displays the result on its display unit. With this processing, participants can view in real time, as moving images, the ranges that change over time, while less important ranges are covered by a still image, so the necessary information can be exchanged without using excessive network bandwidth.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a communication apparatus that transmits images together with sound.

In recent years, conference systems in which a conference is held using a plurality of conference terminals connected via a communication network have come into general use. Patent Document 1 discloses a technique for supporting the running of a remote video conference attended by participants at remote locations. The system disclosed in that document includes a plurality of video conference terminals and a multipoint video conference relay device that mediates the exchange of audio and video information among those terminals. The relay device superimposes conference management information, such as the page of the material currently being referred to and the time remaining until the end of the conference, on the audio and video information that passes through it.
Patent Document 1: Japanese Patent Laid-Open No. 05-145918

Video showing a conference room involves a very large amount of data. To transmit video information within the limited communication bandwidth of a network, the data has conventionally been compressed, or the frame rate or resolution of the video has been reduced. However, applying lossy compression or setting a low frame rate or resolution degrades the quality of the video data.

The present invention has been made in view of the above problem, and its object is to provide a technique for efficiently transmitting necessary information over a limited communication bandwidth in data transfer between conference terminals.

A first embodiment of the communication device according to the present invention comprises: setting means for setting one or more specific areas within a set imaging area; image data generation means for imaging the set imaging area and generating first image data corresponding to the image within the imaging area and second image data corresponding to the image within the specific area set by the setting means; and output means for outputting the first image data and the second image data generated by the image data generation means to another communication device via a communication network, wherein the second image data output from the output means has a larger number of frames per predetermined time than the first image.

In a second embodiment of the communication device according to the present invention, in the first embodiment, the image data generation means, when generating the second image data, generates a larger number of frames per predetermined time than for the first image data.

In a third embodiment of the communication device according to the present invention, in the first embodiment, the output means outputs the second image data at a larger number of frames per predetermined time than the first image data.

In a fourth embodiment of the communication device according to the present invention, in any one of the first to third embodiments, the image data generation means generates the first image as a still image and generates the second image as a moving image.

In a fifth embodiment of the communication device according to the present invention, in any one of the first to fourth embodiments, the image data generation means generates the first image data for the part of the imaging area that does not include the area of the image represented by the second image data.

A sixth embodiment of the communication device according to the present invention, in any one of the first to fifth embodiments, has: measuring means for measuring the communication bandwidth available on the connected communication network; a table that specifies the image quality corresponding to each communication bandwidth; and image quality adjustment means that, prior to communication, refers to the table and sets in the imaging means the image quality corresponding to the available communication bandwidth measured by the measuring means.

A seventh embodiment of the communication device according to the present invention, in any one of the first to fifth embodiments, has: measuring means for measuring the communication bandwidth available on the connected communication network; a table that specifies the compression rate corresponding to each communication bandwidth; and compression rate adjustment means that, prior to communication, refers to the table and sets in the imaging means the compression rate corresponding to the available communication bandwidth measured by the measuring means.

The conference terminal according to the present invention has the effect that necessary information can be transmitted efficiently over a limited communication bandwidth in data transfer between conference terminals.

Hereinafter, a conference terminal according to an embodiment of the present invention will be described with reference to the drawings.
(A: Configuration)
FIG. 1 is a block diagram showing the configuration of a conference system 1 that includes a conference terminal according to an embodiment of the present invention. The conference system 1 consists of a conference terminal 10A, a conference terminal 10B, and a communication network 20, and the conference terminals 10A and 10B are each connected to the communication network 20 by wire. The conference terminals 10A and 10B have the same configuration; in the following, when there is no need to distinguish between them, both are referred to collectively as the conference terminal 10.
Although the case where two conference terminals are connected to the communication network 20 is illustrated here, three or more conference terminals may be connected.

This embodiment uses the following communication protocols. As the application layer protocol, the Real-time Transport Protocol (hereinafter "RTP") is used to transfer audio data and image data. RTP is a communication protocol for providing a communication service that transmits and receives audio data and image data end-to-end in real time; its details are specified in RFC 1889. In RTP, communication terminals exchange data by generating, transmitting, and receiving RTP packets. UDP (User Datagram Protocol) is used as the transport layer protocol, and IP (Internet Protocol) is used as the network layer protocol. The conference terminals 10A and 10B are each assigned an IP address and are uniquely identified on the network.
Since UDP and IP are widely used communication protocols, their description is omitted.

Next, the hardware configuration of the conference terminal 10 will be described with reference to FIG. 2.
The control unit 101 shown in the figure is, for example, a CPU (Central Processing Unit). It controls the operation of each unit of the conference terminal 10 by executing various control programs stored in the storage unit 103 described later.

The Web camera 107 outputs the input from a CMOS (Complementary Metal Oxide Semiconductor) image sensor or a CCD (Charge Coupled Device) image sensor as Motion-JPEG video. Motion-JPEG is a method of generating video data in which the image of each captured frame is compressed with JPEG (Joint Photographic Experts Group) and the compressed frames are recorded consecutively. The Web camera 107 captures images at a predetermined image size and number of frames per unit time (fps; frames per second), applies JPEG image compression, and outputs the result to the RAM 103b. The image size is a value preset in the Web camera 107 (640 pixels × 480 pixels in this embodiment), and the number of frames per unit time is controlled by the control unit 101 as appropriate. The image compression rate can be set under the control of the control unit 101 within the range of JPEG image compression (compression rates of 1/5 to 1/60). A participant can also change the orientation of the Web camera 107 by hand to set its imaging area as desired. In addition, on receiving a predetermined signal, the Web camera 107 captures the image of the frame being generated at that moment and outputs it as a still image.

The storage unit 103 has a ROM (Read Only Memory) 103a and a RAM (Random Access Memory) 103b. The ROM 103a stores the data and control programs that cause the control unit 101 to realize the functions characteristic of the present invention. Examples of this data are test data and a transmission rate management table.

The transmission rate management table is described here. FIG. 3 shows an example of the transmission rate management table. The table specifies, for each available communication bandwidth (Mbps; megabits per second), the number of frames per unit time (fps) at which the Web camera 107 generates video data and the JPEG image compression rate. The test data, for its part, is Motion-JPEG image data generated in advance by the Web camera 107. Its content may be anything.

The RAM 103b is used as a work area by the control unit 101 operating according to the various programs, and also stores the audio data and image data received from the audio input unit 106 and the Web camera 107.

In accordance with the control program, the control unit 101 generates RTP packets from the audio data or image data written in the RAM 103b. As shown in FIG. 4, an RTP packet consists of a payload portion with a header portion attached, in the same way as a packet (the data transfer unit in IP) or a segment (the data transfer unit in TCP (Transmission Control Protocol)).

Five kinds of data are written in the header portion: a time stamp, a payload type, a sequence number, an image type, and section information. The time stamp is data indicating the time at which the RTP packet is transmitted (the time elapsed since the start of voice communication was instructed). The payload type is data that lets the destination of a communication message identify the type of that message. Three message types are used in this embodiment: the audio data transmission message, the image data transmission message, and the reception notification message. For these messages, the payload type is set to "1", "2", and "3", respectively. The sequence number is an identifier that uniquely identifies each packet; for example, when one piece of audio data is divided into a series of RTP packets and transmitted, the packets are given sequence numbers 1, 2, 3, and so on. The image type indicates whether the image data written in the payload portion is a "frame image" or a "detailed image" (both described later), and is set to "1" or "2", respectively. The section information specifies, when the image data written in the payload portion is a "detailed image", the area of the display unit 105 (described later) in which that detailed image is to be displayed; its details are given later.
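The five header fields listed above can be pictured as a small record that is packed in front of the payload. The following is only a minimal sketch: the description names the fields and the values of the payload type and image type, but the field widths, byte order, the use of milliseconds for the time stamp, and the names `Header`, `pack`, and `unpack` are assumptions made for illustration.

```python
import struct
from dataclasses import dataclass

# Payload type values named in the description.
PT_AUDIO, PT_IMAGE, PT_RECEIPT = 1, 2, 3
# Image type values named in the description.
IMG_FRAME, IMG_DETAIL = 1, 2

# Assumed wire layout (not specified in the source): big-endian
# time stamp (uint32), payload type (uint8), sequence number (uint16),
# image type (uint8), then the section rectangle as four uint16 values.
_HEADER = struct.Struct("!IBHB4H")


@dataclass
class Header:
    timestamp_ms: int               # time elapsed since communication start
    payload_type: int               # PT_AUDIO, PT_IMAGE, or PT_RECEIPT
    sequence: int                   # identifies the packet
    image_type: int = 0             # 0 = not an image packet (assumption)
    section: tuple = (0, 0, 0, 0)   # (x1, y1, x2, y2), detailed images only

    def pack(self) -> bytes:
        """Serialize the header into the assumed 16-byte layout."""
        return _HEADER.pack(self.timestamp_ms, self.payload_type,
                            self.sequence, self.image_type, *self.section)

    @classmethod
    def unpack(cls, data: bytes) -> "Header":
        """Parse a header from the first 16 bytes of a packet."""
        ts, pt, seq, it, *rect = _HEADER.unpack(data[:_HEADER.size])
        return cls(ts, pt, seq, it, tuple(rect))
```

A packet carrying a detailed image for region B would then be built as `Header(20, PT_IMAGE, 7, IMG_DETAIL, (50, 240, 300, 400)).pack() + payload`, and the receiver would call `Header.unpack` on the first 16 bytes to decide where to draw the payload.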

In an audio data transmission message or an image data transmission message, the payload portion contains the audio data or image data for a predetermined time (20 milliseconds in this embodiment). In a reception notification message, the payload portion contains the sequence numbers of the received packets.

The audio input unit 106 includes a microphone 106a and an analog/digital (hereinafter abbreviated "A/D") converter 106b. The microphone 106a picks up sound, generates an analog signal representing that sound (hereinafter, the audio signal), and outputs it to the A/D converter 106b. The A/D converter 106b converts the audio signal received from the microphone 106a into a digital signal (hereinafter, audio data) and outputs it to the RAM 103b.

The operation unit 104 is, for example, a keyboard or a mouse. When the operator of the conference terminal 10 performs an input operation on the operation unit 104, data representing that operation is passed to the control unit 101.

The communication IF unit 102 is, for example, a NIC (Network Interface Card) and is connected to the communication network 20. The communication IF unit 102 sends to the communication network 20 the IP packets obtained by successively encapsulating the RTP packets received from the control unit 101 according to the lower layer communication protocols. Encapsulation here means generating a UDP segment whose payload portion contains the RTP packet, and then generating an IP packet whose payload portion contains that UDP segment. The communication IF unit 102 also receives IP packets via the communication network 20, reverses the encapsulation to extract the RTP packet contained in each IP packet, and outputs it to the control unit 101.

The display unit 105 is a monitor 640 pixels wide by 480 pixels high. It displays images based on the various image data received via the communication IF unit 102.

The echo canceller 110 removes, from the audio data received from the control unit 101, the echo component that has travelled from the speaker 108a around to the microphone 106a, and outputs the result. Any existing method may be used to cancel the echo component.
The audio output unit 108 reproduces the sound represented by the audio data received from the echo canceller 110, and includes a speaker 108a and a D/A converter 108b. The D/A converter 108b applies D/A conversion to the audio data received from the control unit 101 to convert it into an audio signal, and outputs the audio signal to the speaker 108a. The speaker 108a reproduces sound according to the audio signal received from the D/A converter 108b.

The conference terminal 10 configured as above is installed in a conference room as follows. As shown in FIG. 5, a desk 3 is placed in the conference room, and the participants 2a, 2b, 2c, and 2d are seated on chairs placed around the desk. The conference terminal 10 is installed beside the desk, with the display unit 105 positioned where all participants can see it. The microphone 106a and the Web camera 107 are located below the display unit 105. The speakers 108a are located at two positions on the conference terminal 10, left and right, on either side of the microphone 106a and the Web camera 107.

(B: Operation)
Next, the operation performed by the conference terminal 10 when participants using the conference terminals 10A and 10B hold a remote conference will be described. In the following description, when it is necessary to distinguish which conference terminal a component of the conference terminal 10 belongs to, a letter is appended; for example, the control unit 101 of the conference terminal 10A is written as the control unit 101A.

Before the remote conference starts, the control unit 101 performs a parameter adjustment process to optimize the settings of the Web camera 107 for data communication. FIG. 6 is a flowchart showing the flow of the parameter adjustment process.

The control unit 101 first performs an available bandwidth measurement process (step SA100). This process measures the maximum communication bandwidth that can be used on the communication network 20 for data communication with the partner conference terminal. It is described in detail with reference to the flowchart shown in FIG. 7.

First, the control unit 101 determines the interval at which packets are to be transmitted (step SB100). The first time the available bandwidth measurement process is performed, a predetermined transmission interval is set. Next, the control unit 101 generates a series of packets from the test data stored in the ROM 103a and transmits them to the partner conference terminal at the interval determined in step SB100 (step SB110). At this time, the control unit 101 writes the sequence number of each transmitted packet to the RAM 103b.

The control unit 101 of the partner conference terminal 10 receives the test data, writes the sequence number of each received packet into a reception notification message, and returns that message to the transmitting conference terminal. The control unit 101 of the transmitting conference terminal 10 receives the reception notification message returned from the partner conference terminal (step SB120). From the sequence numbers written in the reception notification message and the sequence numbers written in the RAM 103b, it calculates the packet loss rate for the transmission of the test data (number of packets not received / number of packets transmitted) and determines whether packet loss has occurred (step SB130).
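The loss rate in step SB130 is a comparison of two lists of sequence numbers. As a rough illustration (the function name `packet_loss_rate` and the use of plain Python lists are assumptions, not part of the description), it could be computed as follows:

```python
def packet_loss_rate(sent_seqs, acked_seqs):
    """Return (packets not received) / (packets transmitted).

    sent_seqs:  sequence numbers recorded by the sender when transmitting.
    acked_seqs: sequence numbers listed in the reception notification.
    """
    sent = set(sent_seqs)
    if not sent:
        raise ValueError("no packets were transmitted")
    # Only packets that were actually sent can count as lost.
    lost = sent - set(acked_seqs)
    return len(lost) / len(sent)


def packet_loss_occurred(sent_seqs, acked_seqs):
    """Step SB130 decision: True if any transmitted packet is missing."""
    return packet_loss_rate(sent_seqs, acked_seqs) > 0
```

For example, if packets 1 to 4 were sent and the notification lists 1, 2, and 4, one packet out of four is missing and the check reports that loss occurred.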

If no packet loss occurred when the test data was transmitted at the predetermined transmission interval (step SB130; "No"), the control unit 101 performs the processing from step SB100 again. In step SB100 this time, it sets a transmission interval that is shorter, by a predetermined ratio, than the interval set in the previous pass through steps SB100 to SB130.

As long as no packet loss occurs, the control unit 101 repeats steps SB100 to SB130 while successively shortening the packet transmission interval. When packet loss occurs in step SB130 (step SB130; "Yes"), the control unit 101 takes the transmission rate of the immediately preceding test data transmission (amount of test data / time taken for transmission) as the bandwidth available at that point (in BPS; bytes per second) (step SB140). The reason is as follows. A shorter transmission interval means more data transmitted per unit time, that is, a higher transmission rate. A packet loss during a test data transmission therefore means that the transmission rate used in that transmission has exceeded the available communication bandwidth for the first time, so the rate used one transmission earlier is the highest measured rate that the network could still carry.
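The probing loop of steps SB100 to SB140 can be sketched as below. This is a hedged illustration, not the patented implementation: `send_test_data` is a hypothetical callback that transmits the test data at the given interval and reports whether loss occurred and how long the transmission took, and the `shrink` ratio, the `min_interval` guard, and the `None` result when the very first attempt already loses packets are assumptions that the description does not specify.

```python
def measure_available_bandwidth(send_test_data, data_bytes,
                                initial_interval, shrink=0.5,
                                min_interval=1e-4):
    """Shorten the send interval until loss appears (steps SB100-SB140).

    send_test_data(interval) -> (lost, elapsed_seconds) is a hypothetical
    callback supplied by the caller.
    Returns the last loss-free rate in bytes per second, or None if even
    the first attempt lost packets.
    """
    interval = initial_interval            # SB100: predetermined first value
    last_good_rate = None
    while interval >= min_interval:        # guard added for this sketch
        lost, elapsed = send_test_data(interval)   # SB110, SB120
        if lost:                                   # SB130: "Yes"
            return last_good_rate                  # SB140: previous rate
        last_good_rate = data_bytes / elapsed      # rate of this attempt
        interval *= shrink                 # SB100: shorter by a fixed ratio
    return last_good_rate
```

Returning the rate of the previous attempt, rather than the one that failed, follows the reasoning in the text: the failing attempt is the first one known to exceed the available bandwidth.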

Returning to FIG. 6, the control unit 101 sets the parameters of the Web camera (step SA110). Specifically, it compares the value measured by the available bandwidth measurement process with the transmission rate management table stored in the ROM 103a (see FIG. 3). Among the table entries whose available bandwidth is smaller than the measured value, it selects the one with the largest bandwidth, reads the number of frames and the JPEG image compression rate associated with that entry, and sets the number of frames captured per unit time by the Web camera 107 and the JPEG image compression rate to the values read. Once this processing is finished and data communication for the conference begins, the Web camera 107 generates image data at the set number of frames per unit time, and the control unit 101 compresses the generated image data at the selected JPEG image compression rate.
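The table lookup in step SA110 amounts to picking the largest bandwidth entry that is still below the measured value. The sketch below illustrates this; the contents of `RATE_TABLE` are invented placeholders, since the actual values in FIG. 3 are not reproduced in the text, and the function name and the `None` result for a measurement below every entry are likewise assumptions.

```python
# Hypothetical table rows: (bandwidth in Mbps, frames per second,
# JPEG compression rate). The real FIG. 3 values are not given in the text.
RATE_TABLE = [
    (0.5, 5, 1 / 60),
    (1.0, 10, 1 / 40),
    (2.0, 15, 1 / 20),
    (4.0, 30, 1 / 10),
]


def select_camera_params(measured_mbps, table=RATE_TABLE):
    """Return (fps, compression_rate) for the measured bandwidth.

    Among the entries whose bandwidth is smaller than the measured value,
    the one with the largest bandwidth is used. Returns None when the
    measurement is below every entry (behaviour assumed for this sketch).
    """
    candidates = [row for row in table if row[0] < measured_mbps]
    if not candidates:
        return None
    best = max(candidates, key=lambda row: row[0])
    return best[1:]
```

With the placeholder table, a measurement of 2.5 Mbps selects the 2.0 Mbps row, so the camera would be set to 15 fps at a compression rate of 1/20.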

In step SA120, the control unit 101 determines whether a fixed time has elapsed since the parameter adjustment process was started. While the result of step SA120 is "No", step SA120 is repeated until the fixed time has elapsed. Once the fixed time has elapsed, the result of step SA120 becomes "Yes" and step SA130 is performed. In step SA130, the control unit 101 determines whether data communication has ended. If the result of step SA130 is "No", the processing from step SA100 is performed again. If the result of step SA130 is "Yes", the control unit 101 ends the parameter adjustment process.
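The overall flow of FIG. 6 (measure, set parameters, wait, check for the end of communication) can be summarized in a short loop. This is only a sketch under assumptions: `measure`, `apply_params`, and `is_finished` are hypothetical callbacks standing in for steps SA100, SA110, and SA130, and a blocking sleep replaces the repeated elapsed-time check of step SA120.

```python
import time


def run_parameter_adjustment(measure, apply_params, is_finished,
                             period_s, sleep=time.sleep):
    """Repeat measurement and camera setup until communication ends.

    Returns the number of adjustment rounds performed.
    """
    rounds = 0
    while True:
        bandwidth = measure()        # SA100: available bandwidth
        apply_params(bandwidth)      # SA110: set fps and compression rate
        rounds += 1
        sleep(period_s)              # SA120: wait for the fixed time
        if is_finished():            # SA130: has data communication ended?
            return rounds
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays, for example with `sleep=lambda s: None`.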

Through the above processing, the control unit 101 performs the available bandwidth measurement process at the start of the remote conference and at fixed intervals thereafter, and the parameters of the Web camera 107 are reset to match the measured available bandwidth. Data can therefore be transmitted in a way that follows the constantly changing available communication bandwidth, efficiently and without disruption.

The following describes the operation performed by the conference terminal 10A when, during a remote conference, a participant on the conference terminal 10A side speaks and a participant on the conference terminal 10B side listens.
FIG. 8 is a flowchart showing the flow of the processing executed by the conference terminal 10 during a conference. Steps SC100 to SC130 are performed immediately after the conference starts. In step SC100, the control unit 101 outputs a predetermined signal to the Web camera 107 and causes it to generate image data representing the whole conference room (hereinafter, whole image data). FIG. 9 is a view of the conference room shown in FIG. 5 as seen from the Web camera 107. For example, when the imaging area of the Web camera 107 is set to the area indicated by frame A, the Web camera 107 generates whole image data representing an image like the one shown in FIG. 10. The control unit 101 displays the generated whole image on the display unit 105 of its own terminal.

Next, in step SC110, the control unit 101 selects image areas from the whole image. The following description uses coordinates on image A shown in FIG. 10, with the upper-left corner as the origin (0, 0) and the lower-right corner as (639, 479). These coordinates correspond to the pixels of the display unit 105 that displays the image data.

While viewing the display unit 105 on which the image data is displayed, a participant operates the operation unit 104 to select one or more areas of the whole image in which participants appear (hereinafter, detailed image areas). In FIG. 10, the areas labeled B and C are selected.
An area is specified by the coordinates of one corner of a rectangle and of the diagonally opposite corner. For example, areas B and C in the figure are expressed as "(50, 240)-(300, 400)" and "(340, 240)-(590, 400)", respectively. Data representing the ranges of the detailed image areas selected in this way is written to the RAM 103b.
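The corner-pair notation for areas B and C can be modeled as a small rectangle type. The following is a minimal sketch, assuming that both corners are inclusive pixel coordinates (the description does not say so explicitly); the class `Region` and its methods are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

WIDTH, HEIGHT = 640, 480  # whole image spans (0, 0) to (639, 479)


@dataclass(frozen=True)
class Region:
    left: int
    top: int
    right: int
    bottom: int

    @classmethod
    def from_corners(cls, corner, opposite):
        """Build a region from any corner and the diagonally opposite corner."""
        (x1, y1), (x2, y2) = corner, opposite
        return cls(min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom

    def pixel_count(self):
        # Assumption: both corners are inclusive pixel coordinates.
        return (self.right - self.left + 1) * (self.bottom - self.top + 1)


REGION_B = Region.from_corners((50, 240), (300, 400))
REGION_C = Region.from_corners((340, 240), (590, 400))

# Share of the whole image covered by the two detailed image areas.
detail_pixels = REGION_B.pixel_count() + REGION_C.pixel_count()
detail_fraction = detail_pixels / (WIDTH * HEIGHT)
```

Under the inclusive-corner assumption, areas B and C together cover roughly a quarter of the pixels of the whole image; this is a pixel count only and says nothing about the compressed data size.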

Next, the control unit 101 generates frame image data representing an image (hereinafter, frame image) consisting of the area obtained by removing the detailed image areas specified in step SC110 from the whole image (the frame image area; the hatched area in FIG. 11) (step SC120). The control unit 101 then transmits the generated frame image data to the conference terminal 10B (step SC130). In the header of each RTP packet carrying frame image data, "2" is written as the payload type and "1" as the image type. The conference terminal 10B receives the frame image data and writes it to the RAM 103bB.
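Step SC120 can be illustrated by blanking the detailed image areas out of a copy of the whole image. The following is a minimal sketch, not the embodiment's implementation: the image is modeled as a list of pixel rows, the rectangles use inclusive corners, and the `MASKED` marker is a hypothetical way of representing "not part of the frame image", since the description does not specify how the excluded area is encoded.

```python
# Rectangles are (left, top, right, bottom) with inclusive corners (assumption).
DETAIL_REGIONS = [(50, 240, 300, 400), (340, 240, 590, 400)]
MASKED = None  # hypothetical marker for "not part of the frame image"


def in_any_region(x, y, regions):
    return any(l <= x <= r and t <= y <= b for l, t, r, b in regions)


def make_frame_image(whole_image, regions):
    """Copy the whole image, blanking every pixel inside a detailed image area."""
    return [
        [MASKED if in_any_region(x, y, regions) else pixel
         for x, pixel in enumerate(row)]
        for y, row in enumerate(whole_image)
    ]


# A 640x480 test image whose pixel value encodes its own coordinates.
whole = [[(x, y) for x in range(640)] for y in range(480)]
frame = make_frame_image(whole, DETAIL_REGIONS)
```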

When steps SC100 to SC130 are complete, the conference terminal 10 starts communicating audio data and image data. As for audio data, the audio input unit 106A picks up sound continuously after the remote conference starts, and the generated audio data is transmitted to the conference terminal 10B.

Meanwhile, the control unit 101 generates detailed image data, which conveys how the participants look, as follows. The Web camera 107 generates whole image data (a Motion-JPEG moving image) covering the entire conference room at the frame rate set in the parameter adjustment process. The control unit 101 refers to the RAM 103b to identify the ranges of the one or more detailed image areas, extracts each area from the whole image data, and generates a Motion-JPEG moving image (step SC140). In step SC150, the control unit 101A transmits the generated detailed image data to the conference terminal 10B. In the header of each RTP packet carrying detailed image data, "2" is written as the payload type and "2" as the image type, and the coordinates of each detailed image within the whole image are written as the partition information.
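Steps SC140 and SC150 can be sketched as cropping each detailed image area out of a whole-image frame and pairing it with the header values named above. This is a minimal sketch under stated assumptions: `ImageHeader` and its field names are hypothetical and do not reproduce the actual RTP header layout, and JPEG encoding and packetization are omitted.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

PAYLOAD_TYPE_IMAGE = 2   # "2" is written for both frame and detailed images
IMAGE_TYPE_FRAME = 1     # frame image (still)
IMAGE_TYPE_DETAIL = 2    # detailed image (moving)


@dataclass(frozen=True)
class ImageHeader:
    payload_type: int
    image_type: int
    # Partition information: (left, top, right, bottom) in whole-image coordinates.
    partition: Optional[Tuple[int, int, int, int]] = None


def crop(whole_image, region):
    """Cut one rectangle (inclusive corners, an assumption) out of a frame."""
    left, top, right, bottom = region
    return [row[left:right + 1] for row in whole_image[top:bottom + 1]]


def detail_packets(whole_image, regions):
    """Yield one (header, pixels) pair per detailed image area for one frame."""
    for region in regions:
        header = ImageHeader(PAYLOAD_TYPE_IMAGE, IMAGE_TYPE_DETAIL, region)
        yield header, crop(whole_image, region)
```

Carrying the rectangle in the header is what lets the receiving side place each detailed image without any other knowledge of the sender's selection.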

The following describes the operation of the conference terminal 10B upon receiving audio data and image data from the conference terminal 10A. When the conference terminal 10B receives audio data, the audio output unit 108B reproduces the sound represented by that audio data.

Image data is processed as follows. First, the conference terminal 10B receives frame image data from the conference terminal 10A and writes it to the RAM 103bB. Once the conference starts, the conference terminal 10B continuously receives detailed image data from the conference terminal 10A. The control unit 101 reads the frame image data written in the RAM 103b and keeps it displayed on the display unit 105, while combining the detailed image data received from the conference terminal 10A with the frame image and reproducing it on the display unit 105. As a result, a still image is displayed in the frame image area shown in FIG. 11, and a moving image is displayed in the detailed image areas. That is, on the display unit 105, the desk and other objects photographed at the start of the conference are displayed as a still image within the whole image, while moving video of the participants is displayed in real time in the detailed image areas.
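The combining step on the receiving side can be sketched as pasting each received detailed image into a copy of the stored frame image at the coordinates carried with it. This is a minimal sketch, assuming images are lists of pixel rows and rectangles use inclusive corners; the function name `composite` is hypothetical.

```python
def composite(frame_image, detail_images):
    """Return a display image: the stored frame image with details pasted in.

    detail_images is a list of ((left, top, right, bottom), pixels) pairs,
    where the rectangle is given in whole-image coordinates.
    """
    display = [row[:] for row in frame_image]  # leave the stored frame intact
    for (left, top, right, bottom), pixels in detail_images:
        width, height = right - left + 1, bottom - top + 1
        if len(pixels) != height or any(len(row) != width for row in pixels):
            raise ValueError("detail image does not match its rectangle")
        for dy, row in enumerate(pixels):
            display[top + dy][left:right + 1] = row
    return display
```

Because the stored frame image is copied rather than modified, the same still image can be reused for every incoming moving-image frame.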

As a result of the above processing, a participant using the conference terminal 10B can adequately grasp the state of areas showing stationary objects such as walls and desks from the frame image received at the start of the conference, because those areas look the same during the conference as they did at its start. Areas showing participants, on the other hand, are displayed as moving images, so their state can be followed in real time.
By communicating image data in this way, detailed information can be exchanged for the parts that participants need, and because the data communicated during the conference is limited to the detailed image areas, network bandwidth is not used excessively.
(C: Modifications)
An embodiment of the present invention has been described above; the present invention can also be implemented in the various forms described below.

(1) In the above embodiment, the available bandwidth measurement process, the parameter adjustment process for the Web camera 107, and the image data generation process are all performed by the conference terminal 10, but the devices to which these functions are given are of course not limited to conference terminals. The functions may be applied to a server device that provides data accumulated in a storage device to client devices, or to a server device that provides data generated by a Web camera to client devices in real time.

(2) In the above embodiment, the functions characteristic of the conference terminal according to the present invention are implemented by software modules, but the conference terminal according to the present invention may instead be configured by combining hardware modules that perform those functions.

(3) In the above embodiment, RTP is used as the application-layer communication protocol for image data and audio data, but other communication protocols may of course be used. In short, any communication protocol may be used as long as it writes image data or audio data, a predetermined time's worth at a time, into the payload portion of a data block having a predetermined header portion and a payload portion, and transmits that block. Likewise, the above embodiment uses UDP as the transport-layer communication protocol, but TCP may be used instead. Similarly, the network-layer communication protocol is not limited to IP.

(4) In the above embodiment, image data and audio data are transmitted and received, but the types of data are not limited to these. Depending on the purpose of the conference, only image data may be transmitted and received, or other data such as document data may be sent together with it.

(5) In the above embodiment, the conference terminal 10A and the conference terminal 10B are connected to the communication network 20 by wire, but the communication network 20 may of course be a wireless packet communication network such as a wireless LAN (Local Area Network), with the conference terminals 10A and 10B connected to that wireless packet communication network. Also, the above embodiment describes the case where the communication network 20 is the Internet, but it may of course be a LAN. In short, any communication network may be used as long as it has the function of mediating communication performed according to a predetermined communication protocol.

(6) In the above embodiment, the control program that causes the control unit 101 to implement the functions characteristic of the communication device according to the present invention is written in the ROM 103a in advance. However, the control program may be recorded on and distributed via a computer-readable recording medium such as a CD-ROM or DVD, or it may of course be distributed by download over a telecommunication line such as the Internet.

(7) In the above embodiment, the detailed image data is a moving image having a predetermined number of frames per unit time, but still images may be transmitted instead, as needed or depending on the line conditions of the communication network. In that case, the detailed image is updated more frequently (that is, more screens are transmitted per predetermined time) than the still image constituting the frame image data.

(8) In the above embodiment, the frame image data and the detailed image data use the same resolution and the same JPEG compression rate, but these parameters may be made to differ between the two depending on the situation.

(9) In the above embodiment, frame image data is transmitted only at the start of the remote conference, but it may also be transmitted as appropriate after the remote conference has started in order to update the frame image.

(10) In the above embodiment, the Web camera 107 generates image data in the Motion-JPEG format. However, the image recording format is not limited to Motion-JPEG; other formats such as MPEG (Moving Picture Experts Group) or JPEG 2000 may be used. The image data may also be transmitted without compression.

(11) In the above embodiment, participants select the detailed image areas freely, but the size of those areas may be limited according to the available bandwidth. Specifically, the narrower the available bandwidth, the more the area that can be set as detailed image area may be restricted, so that a larger part of the image is transmitted as the frame image.
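One way to picture the restriction in modification (11) is a cap on the share of the whole image that may be selected as detailed image area, with the cap shrinking as the available bandwidth falls. The following is a minimal sketch only: the thresholds and shares in `AREA_LIMITS` are hypothetical, and the modification does not prescribe any particular rule.

```python
WHOLE_PIXELS = 640 * 480

# Hypothetical limits: (minimum bandwidth in kbps, maximum share of the image).
AREA_LIMITS = [(0, 0.10), (512, 0.25), (2048, 0.50)]


def max_detail_share(available_kbps):
    """Return the largest allowed share for the measured bandwidth."""
    share = AREA_LIMITS[0][1]
    for threshold, limit in AREA_LIMITS:
        if available_kbps >= threshold:
            share = limit
    return share


def area(region):
    # Rectangles are (left, top, right, bottom) with inclusive corners (assumption).
    left, top, right, bottom = region
    return (right - left + 1) * (bottom - top + 1)


def selection_allowed(regions, available_kbps):
    """True if the selected areas fit the limit (areas assumed not to overlap)."""
    total = sum(area(r) for r in regions)
    return total <= max_detail_share(available_kbps) * WHOLE_PIXELS
```

With these illustrative limits, areas B and C of the embodiment would both be accepted at 3000 kbps, while at 1000 kbps only one of them would fit.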

(12) In the above embodiment, a rectangular area is specified for the detailed image data, but the shape of the area is not limited to a rectangle. Since the aim is to separate areas that change over time from areas that do not, any shape may be used.

(13) In the above embodiment, only one Web camera 107 is installed, and that Web camera 107 generates both the frame image data and the detailed image data. However, the detailed image data may be generated by a separate Web camera 107 for each area. In that case, each Web camera 107 can be configured individually so that its image quality suits its subject.

(14) In the above embodiment, the detailed image areas are set at the start of the conference and that setting is used unchanged. However, the area setting may be revised during the conference. Specifically, periodically or in response to an operation by the administrator of the conference terminal 10, the image currently captured by the Web camera 107 may be displayed on the display unit 105 so that a participant using that terminal can set the areas again.

(15) In the above embodiment, the frame image does not include the detailed image areas, but the frame image may cover the entire area. In that case, the detailed images are combined by overwriting the corresponding areas of the frame image.

(16) In the above embodiment, a participant manually sets the detailed image areas containing people and the like, but areas containing people, for example, may be selected automatically by analyzing the whole image captured by the Web camera 107 using a predetermined method. One example of such a method is as follows. Even when seated at a fixed position in the conference room, a participant's body generally moves from side to side and back and forth. During that time, the Web camera 107 generates, at a predetermined frame rate, a moving image representing the entire conference room including the participant. The control unit 101 analyzes the generated image data, judges an area in which the image differs between frames to be an area containing a participant, and sets that area as a detailed image area. Participants can also move deliberately so that the areas are selected more accurately.
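The frame-difference analysis in modification (16) can be sketched as comparing two frames pixel by pixel and taking the rectangle that encloses every changed pixel. This is a minimal sketch under stated assumptions: frames are modeled as rows of grayscale integers, and the single bounding box and the `threshold` parameter are illustrative choices that the modification itself does not specify.

```python
def changed_bounding_box(previous, current, threshold=0):
    """Return (left, top, right, bottom) enclosing all changed pixels, or None.

    A pixel counts as changed when its grayscale value differs between the
    two frames by more than `threshold`.
    """
    xs, ys = [], []
    for y, (row_prev, row_cur) in enumerate(zip(previous, current)):
        for x, (p, c) in enumerate(zip(row_prev, row_cur)):
            if abs(p - c) > threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # nothing moved between the two frames
    return min(xs), min(ys), max(xs), max(ys)
```

A non-zero threshold would be one way to ignore small differences such as sensor noise; separating several participants into individual areas would need additional grouping that this sketch does not attempt.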

(17) In the above embodiment, one Web camera 107 is installed and the detailed image data is generated by extraction from the whole image data. However, a plurality of Web cameras 107 may be installed, with each Web camera 107 generating a different detailed image. Specifically, the following embodiment is conceivable. Five Web cameras 107 are installed. One of them generates a whole image (still image) covering the entire conference room, which is transmitted to the other party's conference terminal 10 only once, at the start of the remote conference. The other four each generate a detailed image (moving image) covering the area containing participant 2a, 2b, 2c, or 2d in FIG. 9, respectively; this serves as the detailed image data of the above embodiment and is transmitted continuously to the other party's conference terminal during the conference. The display unit 105 of the other party's conference terminal 10 is divided into five areas as shown in FIG. 12, and the image generated by each Web camera 107 is displayed in its own area.
Through the above processing, a participant using the other party's conference terminal 10 can see the overall state of the conference room from the still image and can see each participant in detail from the moving images.

(18) In the above embodiment, the Web camera 107 generates the frame image, which is a still image, by capturing it from the moving image. However, moving image data may be generated first, and the frame image may then be generated by "thinning out" its frames.
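The "thinning out" in modification (18) can be pictured as keeping only every n-th frame of the moving image as a frame image update. This is a minimal sketch; the function name `thin_frames` and the choice of keeping the first frame of each group are assumptions, since the modification does not say which frames are retained.

```python
def thin_frames(frames, keep_every):
    """Keep one frame out of every `keep_every` frames, starting with the first."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    return frames[::keep_every]
```

For a moving image generated at 15 frames per second, `thin_frames(frames, 15)` would leave about one frame image per second, which is still fewer screens per unit time than the detailed image.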

FIG. 1 is a block diagram showing the configuration of a conference system including a conference terminal according to the present invention.
FIG. 2 is a block diagram showing the configuration of the conference terminal 10.
FIG. 3 is a diagram showing an example of a transmission rate management table.
FIG. 4 is a diagram showing the structure of an RTP packet.
FIG. 5 is a diagram showing the positional relationship between the conference terminal and the participants in a conference room.
FIG. 6 is a flowchart showing the flow of the parameter adjustment process.
FIG. 7 is a flowchart showing the flow of the available bandwidth measurement process.
FIG. 8 is a flowchart showing the flow of the image data transmission process.
FIG. 9 is a diagram showing the conference room as seen from the conference terminal 10 side.
FIG. 10 is an example of a whole image.
FIG. 11 is a diagram showing an example of a frame image area.
FIG. 12 is a diagram showing an example of the display mode of the display unit 105 in modification (17).

Explanation of symbols

1 ... conference system, 2a, 2b, 2c, 2d ... participants, 3 ... desk, 10, 10A, 10B ... conference terminal, 20 ... communication network, 101 ... control unit (image data generation means), 102 ... communication IF unit, 103 ... storage unit (103a: ROM, 103b: RAM), 104 ... operation unit, 105 ... display unit, 106 ... audio input unit (106a ... microphone, 106b ... A/D converter), 107 ... Web camera (image data generation means), 108 ... audio output unit (108a ... speaker, 108b ... D/A converter), 109 ... bus, 110 ... echo canceller

Claims

A communication device comprising:
setting means for setting one or more specific areas within a set shooting area;
image data generation means for photographing the set shooting area and generating first image data corresponding to an image within the shooting area and second image data corresponding to an image within the specific area set by the setting means; and
output means for outputting the first image data and the second image data generated by the image data generation means to another communication device via a communication network,
wherein the second image data output from the output means has a larger number of screens per predetermined time than the first image.

The communication device according to claim 1, wherein the image data generation means, in generating the second image data, generates a larger number of screens per predetermined time than for the first image data.

The communication device according to claim 1, wherein the output means outputs the second image data with a larger number of screens per predetermined time than the first image data.

The communication device according to any one of claims 1 to 3, wherein the image data generation means generates the first image as a still image and generates the second image as a moving image.

The communication device according to any one of claims 1 to 4, wherein the image data generation means generates the first image data for an area of the shooting area that does not include the area of the image represented by the second image data.

The communication device according to any one of claims 1 to 5, further comprising:
measurement means for measuring the communication bandwidth available in the connected communication network;
a table specifying image quality corresponding to communication bandwidth; and
image quality adjustment means for setting, in the photographing means, with reference to the table, the image quality corresponding to the available communication bandwidth measured by the measurement means prior to communication.

The communication device according to any one of claims 1 to 5, further comprising:
measurement means for measuring the communication bandwidth available in the connected communication network;
a table specifying a compression rate corresponding to communication bandwidth; and
compression rate adjustment means for setting, in the photographing means, with reference to the table, the compression rate corresponding to the available communication bandwidth measured by the measurement means prior to communication.