WO2012142793A1 - Video communication terminal and video communication method - Google Patents

Video communication terminal and video communication method

Info

Publication number
WO2012142793A1
Authority
WO
WIPO (PCT)
Prior art keywords
module
video
audio
signal
audio signal
Prior art date
Application number
PCT/CN2011/076751
Other languages
English (en)
French (fr)
Inventor
姜韦
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2012142793A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W88/00: Devices specially adapted for wireless communication networks, e.g. terminals, base stations or access point devices
    • H04W88/02: Terminal devices
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/141: Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142: Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M2250/00: Details of telephonic subscriber devices
    • H04M2250/52: Details of telephonic subscriber devices including functional features of a camera
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00: Television systems
    • H04N7/14: Systems for two-way working
    • H04N7/141: Systems for two-way working between two video terminals, e.g. videophone
    • H04N7/142: Constructional details of the terminal equipment, e.g. arrangements of the camera and the display
    • H04N2007/145: Handheld terminals

Definitions

  • the present invention relates to the field of communications, and in particular, to a video communication terminal and a video communication method.
  • with the development of wireless communication technology, wireless communication terminal equipment, commonly known as wireless terminals, has become a necessity in people's lives.
  • the basic function of a wireless terminal is to provide voice calls.
  • Video calling methods are becoming more and more popular among users.
  • the present invention provides a video communication terminal that automatically tracks a user's face.
  • the present invention employs the following technical solutions:
  • the present invention discloses a video communication terminal, including an audio module, a video module, an adjustment module, and a control module, wherein the control module is respectively connected to the audio module, the video module, and the adjustment module, and the adjustment module is connected to the video module;
  • the audio module is configured to collect audio information, convert the collected audio information into an audio signal, and then transmit the audio signal to the control module;
  • the video module is configured to collect video information, convert the video information into a video signal, and then transmit the video signal to the control module;
  • the control module is configured to determine a video module adjustment parameter according to the received audio signal and the video signal, and send the adjustment parameter to the adjustment module;
  • the adjusting module is configured to adjust the video module according to the adjustment parameter.
  • the audio module includes at least three audio collectors
  • the intensity of the audio signal collected by the audio collector is inversely proportional to the distance between the audio collector and the sound source;
  • the audio collector is disposed at different locations of the video communication terminal.
  • the audio module further includes an analysis sub-module configured to perform frequency domain analysis on the pre-collected audio signal to obtain features of the audio signal.
  • the audio module further includes a filtering submodule, configured to filter the noise signal in the audio signal according to the audio signal feature.
  • the control module determines a video module adjustment parameter according to the received audio signal and the video signal, as follows:
  • the control module determines the position of the sound source according to the positions of the at least three audio collectors and the audio signals of different intensities collected by the audio collectors;
  • according to the requirement that the sound source be placed at the imaging point of the video module, the angle that the video module needs to rotate and/or its telescopic distance is determined and used as the video module adjustment parameter.
  • the adjusting parameters include:
  • the rotation angle of the video module, and/or the telescopic distance of the video module.
  • the present invention also discloses a video communication method, including:
  • the audio module collects the audio information, converts the collected audio information into an audio signal, and transmits the audio signal to the control module;
  • the video module collects the video information, converts the video information into a video signal, and transmits the video signal to the control module;
  • the control module determines a video module adjustment parameter according to the received audio signal and the video signal, and sends the adjustment parameter to the adjustment module;
  • the adjustment module adjusts the video module according to the adjustment parameter.
  • before the audio module converts the collected audio information into an audio signal and transmits the audio signal to the control module, the method further includes:
  • the analysis sub-module in the audio module performs frequency domain analysis on the pre-collected audio signal to obtain characteristics of the audio signal.
  • the method further includes:
  • a filter sub-module in the audio module filters the noise signal in the audio signal based on the characteristics of the audio signal.
  • the control module determines the video module adjustment parameter according to the received audio signal and the video signal as follows:
  • the control module determines the position of the sound source according to the positions of the at least three audio collectors and the audio signals of different intensities collected by the audio collectors;
  • according to the requirement that the sound source be placed at the imaging point of the video module, the angle that the video module needs to rotate and/or its telescopic distance is determined and used as the video module adjustment parameter.
  • the invention discloses a video communication terminal, which comprises an audio module, a video module, an adjustment module and a control module.
  • the audio module is used for collecting audio information, converting the collected audio information into an audio signal, and transmitting the audio signal to the control module;
  • the video module is configured to collect video information, convert the video information into a video signal, and then transmit the video signal to the control module;
  • the control module is configured to determine a video module adjustment parameter according to the received audio signal and the video signal, and to send the adjustment parameter to the adjustment module;
  • the adjustment module is configured to adjust the video module according to the adjustment parameter.
  • the video communication terminal of the present invention determines the position of the user's face according to the voice of the user, and then adjusts the video module to automatically track the user's face. This prevents the user's face from falling out of the viewfinder range of the video module because of shaking or similar causes, so that the user's face always stays in the video image and is clearly imaged, which gives the user a better experience.
  • FIG. 1 exemplarily describes a system configuration diagram of a video communication terminal of the present invention
  • FIG. 2 exemplarily depicts an audio collector profile on the video communication terminal of the present invention
  • FIG. 3 exemplarily illustrates a schematic diagram of determining a sound source location of the video communication terminal of the present invention
  • FIG. 4 exemplarily describes a schematic diagram of the adjustment module adjusting the video module in the present invention
  • FIG. 5 exemplarily describes the flow chart of the video communication method of the present invention.
  • the video communication terminal disclosed in the present invention includes an audio module, a video module, an adjustment module, and a control module, wherein the control module is respectively connected to the audio module, the video module, and the adjustment module, and the adjustment module is connected to the video module.
  • Embodiment 1
  • a video communication terminal includes an audio module, a video module, an adjustment module, and a control module, where the control module is respectively connected to the audio module, the video module, and the adjustment module.
  • the adjustment module is connected to the video module.
  • the audio module is configured to collect audio information, convert the collected audio information into an audio signal, and transmit the audio signal to the control module.
  • the video module is configured to collect video information, convert the video information into a video signal, and then transmit the video signal to the control module.
  • the video module is usually a camera.
  • the control module is configured to determine a video module adjustment parameter according to the received audio signal and the video signal, and to send the adjustment parameter to the adjustment module.
  • the adjustment module is configured to control rotation and telescoping of the video module according to the adjustment parameter.
  • the adjustment module is usually a small stepping motor, which can drive the video module to rotate up and down and left and right, and can also drive the video module to advance or retreat in a small range.
  • the audio module includes at least three audio collectors; the intensity of the audio signal collected by the audio collector is inversely proportional to the distance between the audio collector and the sound source, that is, the farther the audio collector is from the sound source, the smaller the intensity of the collected audio signal, and the closer the distance, the greater the intensity of the collected audio signal.
  • the audio collector is disposed at different positions of the video communication terminal.
  • Audio collectors are typically high sensitivity microphones.
  • three high-sensitivity microphones can already determine the direction of video collection, and increasing the number of microphones makes the determined video collection direction more accurate.
  • the three high-sensitivity microphones are usually arranged in a triangle.
  • the audio module further includes an analysis sub-module for performing frequency domain analysis on the pre-collected audio signal to obtain characteristics of the audio signal.
  • the audio module further includes a filtering sub-module for filtering a noise signal in the audio signal according to the audio signal characteristic.
  • the video communication terminal is not always in a relatively quiet environment.
  • the audio collector of this embodiment has high sensitivity, and a noisy environment may cause misjudgment. Therefore, the video communication terminal of this embodiment must first collect the user's voice information in a relatively quiet environment and analyze it in the frequency domain to obtain the characteristics of the user's audio signal.
  • since the fundamental frequency of each person's voice is relatively stable, once the characteristics of the user's audio signal are obtained, the filtering sub-module in the audio module can filter out the noise signal outside the fundamental frequency based on the characteristics of the pre-collected audio signal, which improves the accuracy of the judgment of the video communication terminal.
  • the positions of the three audio collectors A, B, and C are as shown in Fig. 2. After the three audio collectors collect the audio information of the same sound source at the same time, the information is converted into audio signals, and the three audio signals are sent to the control module.
  • the distance between each microphone and the sound source can be known based on the intensity of the audio signal.
  • the method for determining the location of the sound source is:
  • if the distance between the audio collector A and the sound source is a, the audio collector A is taken as the center of the sphere, and the sphere having the radius a is formed
  • if the distance between the audio collector B and the sound source is b, the audio collector B is taken as the center of the sphere, and the sphere having the radius b is formed
  • if the distance between the audio collector C and the sound source is c, the audio collector C is taken as the center of the sphere, and the sphere having the radius c is formed
  • the intersection of the three spheres is the position of the sound source.
  • the image captured by the camera has an imaging interval, and the imaging interval includes an imaging range of the planar image, and an imaging clear distance in the depth direction.
  • the purpose of the camera adjustment is to image the user's full face clearly, with the full face located in the center of the image. Therefore, adjusting the camera requires adjusting two kinds of parameters: the first is the distance between the camera and the sound source (the human face); the second is to image the sound source (the human mouth) in the right position, so that the entire face is in the center of the image.
  • the sound source generally refers to the user's mouth. Referring to the human face, it can be found that the mouth is located on the left-right symmetry line of the face, about two-thirds of the way from top to bottom.
  • the determined imaging direction includes: in a case where the imaging clear distance is ensured, setting the position two-thirds of the way from top to bottom on the left-right symmetric center line of the imaging range of the camera as the imaging point.
  • the control module needs to determine the angle and/or the telescopic distance that the video module needs to rotate, that is, the video module adjustment parameter.
  • after determining the video module adjustment parameters, the control module sends the adjustment parameters to the adjustment module, where the adjustment parameters include the rotation angle of the video module and the telescopic distance of the video module.
  • if the video module does not need to rotate, or does not need to telescope, the corresponding parameter is zero.
  • the adjustment module rotates the video module according to the angle and controls the video module to expand and contract the appropriate distance so that the face is clearly imaged and located at the center of the image.
  • the video communication terminal of the present invention can track the user's face all the time, so that the user's face is always at the center of the video image and the image is clear, which overcomes the drawback of the user leaving the viewfinder range because of shaking or similar causes.
  • Embodiment 2:
  • a video communication method of an embodiment includes the following steps:
  • Step 101: The analysis sub-module in the audio module performs frequency domain analysis on the pre-collected audio signal to obtain features of the audio signal.
  • Step 102: When the video communication terminal is used for communication, the audio module collects audio information of the user during the call and converts the audio information into an audio signal.
  • Step 103: The filtering submodule of the audio module filters the noise signal in the audio signal according to the audio signal feature, and transmits the filtered audio signal to the control module.
  • Step 104: The video module collects video information, converts the video information into a video signal, and transmits the video signal to the control module.
  • Step 105: The control module determines a video module adjustment parameter according to the received audio signal and the video signal, and sends the adjustment parameter to the adjustment module;
  • Step 106: The adjustment module adjusts the video module according to the adjustment parameter, so that the video module can track the orientation of the user's face in real time.
  • the invention discloses a video communication terminal, which comprises an audio module, a video module, an adjustment module and a control module.
  • the audio module is used for collecting audio information, converting the collected audio information into an audio signal, and transmitting the audio signal to the control module;
  • the video module is configured to collect video information, convert the video information into a video signal, and then transmit the video signal to the control module;
  • the control module is configured to determine a video module adjustment parameter according to the received audio signal and the video signal, and to send the adjustment parameter to the adjustment module;
  • the adjustment module is configured to adjust the video module according to the adjustment parameter.
  • the video communication terminal of the present invention determines the position of the user's face according to the voice of the user, and then adjusts the video module to automatically track the face of the user, thereby avoiding the user's face being separated from the viewing range of the video module due to shaking or the like, so that the user's face is always located in the video image and is clearly imaged, giving the user a better experience.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Multimedia (AREA)
  • Studio Devices (AREA)
  • Telephone Function (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

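The present invention discloses a video communication terminal comprising an audio module, a video module, an adjustment module, and a control module. The audio module is configured to collect audio information, convert the collected audio information into an audio signal, and transmit the audio signal to the control module; the video module is configured to collect video information, convert the video information into a video signal, and transmit the video signal to the control module; the control module is configured to determine video module adjustment parameters according to the received audio signal and video signal, and to send the adjustment parameters to the adjustment module; the adjustment module is configured to adjust the video module according to the adjustment parameters. The video communication terminal of the present invention automatically tracks the user's face.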

Description

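Video communication terminal and video communication method
Technical Field
The present invention relates to the field of communications, and in particular to a video communication terminal and a video communication method.
Background
With the development of wireless communication technology, wireless communication terminal equipment, commonly known as wireless terminals, has become a necessity in people's lives. The basic function of a wireless terminal is to provide voice calls. In recent years, 3G technology has gradually matured, video call technology has been gradually introduced into wireless terminals, and video calling is becoming increasingly popular among users.
In most existing wireless terminals, the built-in camera is fixed inside the terminal or can rotate only through a small angle. Because the camera's viewfinder range is limited, a change in the terminal's position during a video call often moves the user's head out of that range, so that the other party to the call cannot see the user's image.
Summary of the Invention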
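The present invention provides a video communication terminal that can automatically track the user's face.
To solve the above technical problem, the present invention adopts the following technical solutions:
In one aspect, the present invention discloses a video communication terminal comprising an audio module, a video module, an adjustment module, and a control module, wherein the control module is connected to the audio module, the video module, and the adjustment module respectively, and the adjustment module is connected to the video module;
the audio module is configured to collect audio information, convert the collected audio information into an audio signal, and transmit the audio signal to the control module;
the video module is configured to collect video information, convert the video information into a video signal, and transmit the video signal to the control module;
the control module is configured to determine video module adjustment parameters according to the received audio signal and video signal, and to send the adjustment parameters to the adjustment module;
the adjustment module is configured to adjust the video module according to the adjustment parameters.
In one embodiment of the above video communication terminal, the audio module includes at least three audio collectors;
the intensity of the audio signal collected by each audio collector is inversely proportional to the distance between that audio collector and the sound source;
the audio collectors are disposed at different positions on the video communication terminal.
In one embodiment of the above video communication terminal, the audio module further includes an analysis sub-module configured to perform frequency-domain analysis on a pre-collected audio signal to obtain the features of the audio signal.
In one embodiment of the above video communication terminal, the audio module further includes a filtering sub-module configured to filter out the noise signal in the audio signal according to the audio signal features.
In one embodiment of the above video communication terminal, the control module determines the video module adjustment parameters according to the received audio signal and video signal as follows:
the control module determines the position of the sound source according to the positions of the at least three audio collectors and the audio signals of different intensities collected by those audio collectors;
according to the requirement that the sound source be placed at the imaging point of the video module, the angle through which the video module needs to rotate and/or its telescopic distance is determined and used as the video module adjustment parameters.
In one embodiment of the above video communication terminal, the adjustment parameters include:
the rotation angle of the video module, and/or the telescopic distance of the video module.
In another aspect, the present invention further discloses a video communication method, comprising:
an audio module collects audio information, converts the collected audio information into an audio signal, and transmits the audio signal to a control module;
a video module collects video information, converts the video information into a video signal, and transmits the video signal to the control module;
the control module determines video module adjustment parameters according to the received audio signal and video signal, and sends the adjustment parameters to the adjustment module;
the adjustment module adjusts the video module according to the adjustment parameters.
In one embodiment of the above video communication method, before the audio module converts the collected audio information into an audio signal and transmits it to the control module, the method further includes:
the analysis sub-module in the audio module performs frequency-domain analysis on a pre-collected audio signal to obtain the features of the audio signal.
In one embodiment of the above video communication method, after the features of the audio signal are obtained, the method further includes:
the filtering sub-module in the audio module filters out the noise signal in the audio signal according to the audio signal features.
In one embodiment of the above video communication method, the control module determines the video module adjustment parameters according to the received audio signal and video signal as follows:
the control module determines the position of the sound source according to the positions of the at least three audio collectors and the audio signals of different intensities collected by those audio collectors;
according to the requirement that the sound source be placed at the imaging point of the video module, the angle through which the video module needs to rotate and/or its telescopic distance is determined and used as the video module adjustment parameters.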
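Compared with the prior art, the beneficial effects of the present invention are as follows:
The present invention discloses a video communication terminal comprising an audio module, a video module, an adjustment module, and a control module. The audio module is configured to collect audio information, convert the collected audio information into an audio signal, and transmit the audio signal to the control module; the video module is configured to collect video information, convert the video information into a video signal, and transmit the video signal to the control module; the control module is configured to determine video module adjustment parameters according to the received audio signal and video signal, and to send the adjustment parameters to the adjustment module; the adjustment module is configured to adjust the video module according to the adjustment parameters. The video communication terminal of the present invention determines the position of the user's face from the user's voice and then adjusts the video module to track the user's face automatically. This prevents the user's face from leaving the viewfinder range of the video module because of shaking or similar causes, keeps the user's face in the video image with clear imaging, and gives the user a better experience.
Brief Description of the Drawings
FIG. 1 is an exemplary system structure diagram of the video communication terminal of the present invention;
FIG. 2 is an exemplary diagram of the distribution of the audio collectors on the video communication terminal of the present invention;
FIG. 3 is an exemplary schematic diagram of how the video communication terminal of the present invention determines the position of the sound source;
FIG. 4 is an exemplary schematic diagram of the adjustment module adjusting the video module in the present invention;
FIG. 5 is an exemplary flowchart of the video communication method of the present invention.
Detailed Description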
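The video communication terminal disclosed by the present invention comprises an audio module, a video module, an adjustment module, and a control module, wherein the control module is connected to the audio module, the video module, and the adjustment module respectively, and the adjustment module is connected to the video module.
The present invention is described in further detail below with reference to the drawings and specific embodiments.
Embodiment 1:
As shown in FIG. 1, a video communication terminal according to one embodiment of the present invention comprises an audio module, a video module, an adjustment module, and a control module, wherein the control module is connected to the audio module, the video module, and the adjustment module respectively, and the adjustment module is connected to the video module.
The audio module is configured to collect audio information, convert the collected audio information into an audio signal, and transmit the audio signal to the control module.
The video module is configured to collect video information, convert the video information into a video signal, and transmit the video signal to the control module.
The video module is usually a camera.
The control module is configured to determine video module adjustment parameters according to the received audio signal and video signal, and to send the adjustment parameters to the adjustment module.
The adjustment module is configured to control the rotation and telescoping of the video module according to the adjustment parameters. The adjustment module is usually a small stepper motor that can rotate the video module up, down, left, and right, and can also move it forward or backward over a small range.
The audio module includes at least three audio collectors. The intensity of the audio signal collected by an audio collector is inversely proportional to the distance between that audio collector and the sound source; that is, the farther an audio collector is from the sound source, the weaker the collected audio signal, and the closer it is, the stronger the collected audio signal.
The audio collectors are disposed at different positions on the video communication terminal.
The audio collectors are usually high-sensitivity microphones.
In general, three high-sensitivity microphones are already enough to determine the video collection direction; increasing the number of microphones makes the determined direction more accurate. The three high-sensitivity microphones are usually arranged in a triangle.
The audio module further includes an analysis sub-module configured to perform frequency-domain analysis on a pre-collected audio signal to obtain the features of the audio signal.
The audio module further includes a filtering sub-module configured to filter out the noise signal in the audio signal according to the audio signal features.
In general, the video communication terminal is not always in a relatively quiet environment. The audio collectors of this embodiment are highly sensitive, and a noisy environment can cause misjudgment. Therefore, the video communication terminal of this embodiment first collects the user's voice in a relatively quiet environment and performs frequency-domain analysis on it to obtain the features of the user's audio signal.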
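The frequency-domain analysis of the pre-collected voice can be illustrated with a minimal Python sketch. It assumes that the feature of interest is the strongest spectral component inside a typical speech pitch range; the function names, the 60 to 400 Hz search range, and the use of a plain discrete Fourier transform are illustrative assumptions and are not specified in the patent.

```python
import cmath
import math


def magnitude_spectrum(samples):
    """Naive O(n^2) DFT magnitudes for bins 0..n//2.

    Adequate for a short calibration clip; a real terminal would
    presumably use an FFT (assumption).
    """
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        acc = 0j
        for t, x in enumerate(samples):
            acc += x * cmath.exp(-2j * math.pi * k * t / n)
        spectrum.append(abs(acc))
    return spectrum


def dominant_frequency(samples, sample_rate, f_min=60.0, f_max=400.0):
    """Return the frequency (Hz) of the strongest bin within [f_min, f_max].

    f_min and f_max are hypothetical bounds for a speaking voice.
    """
    n = len(samples)
    spectrum = magnitude_spectrum(samples)
    bin_hz = sample_rate / n
    candidates = [k for k in range(1, len(spectrum))
                  if f_min <= k * bin_hz <= f_max]
    if not candidates:
        raise ValueError("no DFT bin falls inside the search range")
    best = max(candidates, key=lambda k: spectrum[k])
    return best * bin_hz
```

In real speech a harmonic can be stronger than the fundamental, so a practical analysis sub-module would likely need a more robust pitch estimator than this peak search.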
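Because the fundamental frequency of each person's speaking voice is relatively stable, once the features of the user's audio signal have been obtained, the filtering sub-module in the audio module can, when the terminal is used in a noisier environment, filter out noise signals outside the fundamental frequency according to the features of the pre-collected audio signal, which improves the accuracy of the terminal's judgment.
The following takes three audio collectors arranged in a triangle on the video communication terminal as an example to explain how the control module determines the video collection direction from the received audio signals.
The positions of the three audio collectors A, B, and C are shown in FIG. 2. After the three audio collectors capture audio information from the same sound source at the same moment, they convert it into audio signals and send the three audio signals to the control module.
Because the intensity of an audio signal is inversely proportional to the distance between the microphone and the sound source, the distance between each microphone and the sound source can be derived from the intensity of the audio signal.
The mapping from a given audio signal intensity to a distance can be set at the factory.
As shown in FIG. 3, the position of the sound source is determined as follows:
If the distance between audio collector A and the sound source is a, a sphere of radius a is drawn with audio collector A as its center; if the distance between audio collector B and the sound source is b, a sphere of radius b is drawn with audio collector B as its center; if the distance between audio collector C and the sound source is c, a sphere of radius c is drawn with audio collector C as its center. The intersection of the three spheres is the position of the sound source.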
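The ranging and sphere-intersection steps above can be sketched in Python as follows. The sketch takes the stated inverse-proportional model literally (distance = k / intensity, where k stands in for the factory calibration) and solves the three sphere equations in closed form. Three spheres generally meet at two mirror-image points on either side of the plane through the collectors, so the function returns both; treating the first as the one on the user's side of the terminal is an assumption added here. All names and coordinates are hypothetical.

```python
import math


def distance_from_intensity(intensity, k):
    """Inverse-proportional model from the text: distance = k / intensity.

    k is a hypothetical factory calibration constant.
    """
    if intensity <= 0:
        raise ValueError("intensity must be positive")
    return k / intensity


def _sub(u, v):
    return tuple(p - q for p, q in zip(u, v))


def _add(u, v):
    return tuple(p + q for p, q in zip(u, v))


def _scale(u, s):
    return tuple(p * s for p in u)


def _dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def locate_source(pa, pb, pc, a, b, c):
    """Intersect spheres centred on collectors A, B, C with radii a, b, c.

    Returns the two candidate positions (front, back), where "front" lies
    on the +ez side of the collector plane (ez = ex x ey below).
    """
    ab = _sub(pb, pa)
    ac = _sub(pc, pa)
    d = math.sqrt(_dot(ab, ab))
    if d == 0:
        raise ValueError("collectors A and B coincide")
    ex = _scale(ab, 1.0 / d)                 # unit vector A -> B
    i = _dot(ex, ac)
    ey_raw = _sub(ac, _scale(ex, i))
    j = math.sqrt(_dot(ey_raw, ey_raw))
    if j == 0:
        raise ValueError("collectors must not be collinear")
    ey = _scale(ey_raw, 1.0 / j)             # in-plane, perpendicular to ex
    ez = _cross(ex, ey)                      # normal to the collector plane
    x = (a * a - b * b + d * d) / (2 * d)
    y = (a * a - c * c + i * i + j * j) / (2 * j) - (i / j) * x
    z_sq = a * a - x * x - y * y
    if z_sq < 0:
        raise ValueError("spheres do not intersect (inconsistent distances)")
    z = math.sqrt(z_sq)
    base = _add(pa, _add(_scale(ex, x), _scale(ey, y)))
    return _add(base, _scale(ez, z)), _add(base, _scale(ez, -z))
```

With noisy intensity readings the three spheres may fail to intersect exactly, which is one reason additional collectors could improve accuracy, as the description notes.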
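The image captured by the camera has an imaging interval, which comprises the imaging range of the planar image and the distance in the depth direction at which imaging is clear.
In this embodiment, the purpose of adjusting the camera is to image the user's whole face clearly and to place the whole face at the center of the image. Adjusting the camera therefore involves two kinds of parameters: the first is the distance between the camera and the sound source (the face); the second is imaging the sound source (the person's mouth) at a suitable position so that the whole face is at the center of the image.
The sound source generally means the user's mouth. On a human face, the mouth lies on the line of left-right symmetry, about two-thirds of the way from top to bottom.
Therefore, in this embodiment, the determined imaging direction includes: provided the clear-imaging distance is maintained, setting as the imaging point the position two-thirds of the way from top to bottom on the left-right symmetric center line of the camera's imaging range.
To place the sound source at the imaging point of the video module, the control module needs to determine the angle through which the video module must rotate and/or the distance it must telescope, that is, the video module adjustment parameters.
After determining the video module adjustment parameters, the control module sends them to the adjustment module. The adjustment parameters include:
the rotation angle of the video module and the telescopic distance of the video module.
If the video module does not need to rotate, or does not need to telescope, the corresponding parameter is zero.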
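How the adjustment parameters might be derived from a located sound source can be sketched as below. This is only an illustration under stated assumptions: the source position is given in a hypothetical camera frame (x to the right, y up, z along the resting optical axis, in meters), the camera is modeled as a pinhole with an assumed vertical field of view, rotation is split into pan and tilt, and the clear-imaging distance and tolerances are made-up defaults. The patent itself only states that the parameters are a rotation angle and a telescopic distance, with zero meaning no adjustment.

```python
import math
from dataclasses import dataclass


@dataclass
class AdjustmentParameters:
    pan_deg: float       # left/right rotation, positive toward +x
    tilt_deg: float      # up/down rotation, positive upward
    extension_m: float   # forward (+) or backward (-) travel


def compute_adjustment(source, vertical_fov_deg=50.0, clear_distance_m=0.4,
                       angle_tol_deg=0.5, distance_tol_m=0.01):
    """Compute rotation and telescoping so the mouth lands on the imaging point.

    The imaging point is two-thirds of the way down the vertical center
    line, i.e. one-sixth of the frame height below the image center.
    """
    x, y, z = source
    horizontal = math.hypot(x, z)
    distance = math.hypot(horizontal, y)

    pan = math.degrees(math.atan2(x, z))
    tilt_to_mouth = math.degrees(math.atan2(y, horizontal))

    # Pinhole model: half the frame height subtends vertical_fov/2, so a
    # point one-third of that half-height below center sits at this angle.
    half_fov = math.radians(vertical_fov_deg) / 2.0
    offset = math.degrees(math.atan(math.tan(half_fov) / 3.0))
    tilt = tilt_to_mouth + offset   # aim above the mouth by the offset

    extension = distance - clear_distance_m

    # Per the text, a parameter is zero when no adjustment is needed.
    if abs(pan) < angle_tol_deg:
        pan = 0.0
    if abs(tilt) < angle_tol_deg:
        tilt = 0.0
    if abs(extension) < distance_tol_m:
        extension = 0.0
    return AdjustmentParameters(pan, tilt, extension)
```

For a mouth straight ahead at exactly the clear-imaging distance, the pan and extension come out as zero while the tilt stays slightly positive, because the optical axis has to point above the mouth for the mouth to sit at the two-thirds position.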
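As shown in FIG. 4, after receiving the adjustment parameters, the adjustment module rotates the video module through the specified angle and telescopes it by the appropriate distance, so that the face is imaged clearly and lies at the center of the image.
Adjusting the orientation and distance of the video module in real time according to the position of the user's mouth lets the video communication terminal of the present invention track the user's face continuously, keeping the face at the center of the video image with clear imaging. This overcomes the drawback of the user leaving the viewfinder range because of shaking or similar causes, and meets users' varied needs.
Embodiment 2:
As shown in FIG. 5, a video communication method according to one embodiment comprises the following steps:
Step 101: the analysis sub-module in the audio module performs frequency-domain analysis on a pre-collected audio signal to obtain the features of the audio signal.
Step 102: when the video communication terminal is used for communication, the audio module collects the user's audio information during the call and converts it into an audio signal.
Step 103: the filtering sub-module of the audio module filters out the noise signal in the audio signal according to the audio signal features, and transmits the filtered audio signal to the control module.
Step 104: the video module collects video information, converts the video information into a video signal, and transmits the video signal to the control module.
Step 105: the control module determines video module adjustment parameters according to the received audio signal and video signal, and sends the adjustment parameters to the adjustment module.
Step 106: the adjustment module adjusts the video module according to the adjustment parameters, so that the video module can track the orientation of the user's face in real time.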
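The filtering in Step 103 can be illustrated with a short Python sketch that keeps only spectral components near a previously measured fundamental frequency and discards the rest. The band half-width, the function names, and the DFT-masking approach are assumptions chosen for illustration; the patent does not say how the filtering sub-module is implemented.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (O(n^2), fine for short frames)."""
    n = len(samples)
    return [sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse DFT, returning the real part of each reconstructed sample."""
    n = len(spectrum)
    return [(sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real
            for t in range(n)]


def filter_to_fundamental(samples, sample_rate, fundamental_hz,
                          half_width_hz=30.0):
    """Zero every DFT bin farther than half_width_hz from fundamental_hz.

    fundamental_hz would come from the earlier analysis step (Step 101);
    half_width_hz is a hypothetical tolerance.
    """
    n = len(samples)
    spectrum = dft(samples)
    for k in range(n):
        # Bin k and its mirror n - k carry the same physical frequency.
        freq = min(k, n - k) * sample_rate / n
        if abs(freq - fundamental_hz) > half_width_hz:
            spectrum[k] = 0j
    return idft(spectrum)
```

Keeping only a narrow band around the fundamental follows the description literally; it would remove the harmonics of the voice as well, which is acceptable here because the filtered signal is used for measuring intensity rather than for playback.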
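The present invention discloses a video communication terminal comprising an audio module, a video module, an adjustment module, and a control module. The audio module is configured to collect audio information, convert the collected audio information into an audio signal, and transmit the audio signal to the control module; the video module is configured to collect video information, convert the video information into a video signal, and transmit the video signal to the control module; the control module is configured to determine video module adjustment parameters according to the received audio signal and video signal, and to send the adjustment parameters to the adjustment module; the adjustment module is configured to adjust the video module according to the adjustment parameters. The video communication terminal of the present invention determines the position of the user's face from the user's voice and then adjusts the video module to track the user's face automatically. This prevents the user's face from leaving the viewfinder range of the video module because of shaking or similar causes, keeps the user's face in the video image with clear imaging, and gives the user a better experience.
The above is a further detailed description of the present invention in connection with specific embodiments, and the specific implementation of the present invention shall not be regarded as limited to these descriptions. For a person of ordinary skill in the art to which the present invention belongs, a number of simple deductions or substitutions may be made without departing from the concept of the present invention, and all of them shall be regarded as falling within the scope of protection of the present invention.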

Claims

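Claims
1. A video communication terminal, characterized by comprising an audio module, a video module, an adjustment module, and a control module, wherein the control module is connected to the audio module, the video module, and the adjustment module respectively, and the adjustment module is connected to the video module;
the audio module is configured to collect audio information, convert the collected audio information into an audio signal, and transmit the audio signal to the control module;
the video module is configured to collect video information, convert the video information into a video signal, and transmit the video signal to the control module;
the control module is configured to determine video module adjustment parameters according to the received audio signal and video signal, and to send the adjustment parameters to the adjustment module;
the adjustment module is configured to adjust the video module according to the adjustment parameters.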
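2. The video communication terminal according to claim 1, characterized in that the audio module includes at least three audio collectors;
the intensity of the audio signal collected by each audio collector is inversely proportional to the distance between that audio collector and the sound source;
the audio collectors are disposed at different positions on the video communication terminal.
3. The video communication terminal according to claim 2, characterized in that the audio module further includes an analysis sub-module configured to perform frequency-domain analysis on a pre-collected audio signal to obtain the features of the audio signal.
4. The video communication terminal according to claim 3, characterized in that the audio module further includes a filtering sub-module configured to filter out the noise signal in the audio signal according to the audio signal features.
5. The video communication terminal according to any one of claims 2 to 4, characterized in that the control module determines the video module adjustment parameters according to the received audio signal and video signal as follows:
the control module determines the position of the sound source according to the positions of the at least three audio collectors and the audio signals of different intensities collected by those audio collectors;
according to the requirement that the sound source be placed at the imaging point of the video module, the angle through which the video module needs to rotate and/or its telescopic distance is determined and used as the video module adjustment parameters.
6. The video communication terminal according to claim 5, characterized in that the adjustment parameters include:
the rotation angle of the video module, and/or the telescopic distance of the video module.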
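7. A video communication method, characterized in that the method comprises:
an audio module collects audio information, converts the collected audio information into an audio signal, and transmits the audio signal to a control module;
a video module collects video information, converts the video information into a video signal, and transmits the video signal to the control module;
the control module determines video module adjustment parameters according to the received audio signal and video signal, and sends the adjustment parameters to the adjustment module;
the adjustment module adjusts the video module according to the adjustment parameters.
8. The video communication method according to claim 7, characterized in that, before the audio module converts the collected audio information into an audio signal and transmits it to the control module, the method further comprises:
the analysis sub-module in the audio module performs frequency-domain analysis on a pre-collected audio signal to obtain the features of the audio signal.
9. The video communication method according to claim 8, characterized in that, after the features of the audio signal are obtained, the method further comprises:
the filtering sub-module in the audio module filters out the noise signal in the audio signal according to the audio signal features.
10. The video communication method according to any one of claims 7 to 9, characterized in that the control module determines the video module adjustment parameters according to the received audio signal and video signal as follows:
the control module determines the position of the sound source according to the positions of at least three audio collectors of the audio module and the audio signals of different intensities collected by those audio collectors;
according to the requirement that the sound source be placed at the imaging point of the video module, the angle through which the video module needs to rotate and/or its telescopic distance is determined and used as the video module adjustment parameters.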
PCT/CN2011/076751 2011-04-21 2011-07-01 Video communication terminal and video communication method WO2012142793A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110100686.8A CN102752573B (zh) 2011-04-21 2011-04-21 Video communication terminal and video communication method
CN201110100686.8 2011-04-21

Publications (1)

Publication Number Publication Date
WO2012142793A1 (zh) 2012-10-26

Family

ID=47032452

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/076751 WO2012142793A1 (zh) 2011-04-21 2011-07-01 Video communication terminal and video communication method

Country Status (2)

Country Link
CN (1) CN102752573B (zh)
WO (1) WO2012142793A1 (zh)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104284081B (zh) * 2014-05-14 2016-05-18 深圳警翼数码科技有限公司 一种执法记录仪及其控制方法
CN105120203A (zh) * 2015-10-10 2015-12-02 无锡康斯特科技有限公司 一种视频通讯终端系统及视频通讯方法
CN108683855A (zh) * 2018-07-26 2018-10-19 广东小天才科技有限公司 一种摄像头的控制方法及终端设备
CN109168075B (zh) * 2018-10-30 2021-11-30 重庆辉烨物联科技有限公司 一种视频信息传输方法、系统、服务器

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1713717A (zh) * 2004-06-25 2005-12-28 北京中星微电子有限公司 摄像机拍摄方位数字声控定向方法
CN201213278Y (zh) * 2008-07-02 2009-03-25 希姆通信息技术(上海)有限公司 手机摄像人脸智能追踪装置
CN201426153Y (zh) * 2009-05-27 2010-03-17 中山佳时光电科技有限公司 用于视频会议智能摄像头控制系统
CN101753805A (zh) * 2008-12-01 2010-06-23 厦门市罗普特科技有限公司 声源追踪智能摄像机

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3546784B2 (ja) * 1999-12-14 2004-07-28 日本電気株式会社 携帯端末

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1713717A (zh) * 2004-06-25 2005-12-28 北京中星微电子有限公司 摄像机拍摄方位数字声控定向方法
CN201213278Y (zh) * 2008-07-02 2009-03-25 希姆通信息技术(上海)有限公司 手机摄像人脸智能追踪装置
CN101753805A (zh) * 2008-12-01 2010-06-23 厦门市罗普特科技有限公司 声源追踪智能摄像机
CN201426153Y (zh) * 2009-05-27 2010-03-17 中山佳时光电科技有限公司 用于视频会议智能摄像头控制系统

Also Published As

Publication number Publication date
CN102752573B (zh) 2015-04-01
CN102752573A (zh) 2012-10-24

Similar Documents

Publication Publication Date Title
CN104735598B (zh) 助听***与助听***的语音撷取方法
US9516201B2 (en) Imaging control apparatus, imaging control method, and program
CN103581606B (zh) 一种多媒体采集装置和方法
CN110415695A (zh) 一种语音唤醒方法及电子设备
CN102594990A (zh) 一种智能手机底座与手机及其实现方法
TWI678696B (zh) 語音資訊的接收方法、系統及裝置
JP2009065669A (ja) イア・バイオメトリックスを使用してハンドヘルド・オーディオ・デバイス(handheldaudiodevice)を設定する方法および装置
CN106791699A (zh) 一种远程头戴交互式视频共享系统
WO2012142793A1 (zh) 一种视频通讯终端及视频通讯方法
CN104349040B (zh) 用于视频会议***中的摄像机底座及其方法
CN104378635B (zh) 基于麦克风阵列辅助的视频感兴趣区域的编码方法
CN106887236A (zh) 一种声像联合定位的远距离语音采集装置
WO2014185170A1 (ja) 画像処理装置、画像処理方法およびプログラム
WO2018098626A1 (zh) 一种通信终端
US11856387B2 (en) Video conferencing system and method thereof
CN207266143U (zh) 具有usb3.0接口的语音跟踪ptz摄像机
CN100420298C (zh) 摄像机拍摄方位数字声控定向方法
CN112367473A (zh) 一种基于声纹到达相位的可旋转摄像装置及其控制方法
CN105721645A (zh) 手机语音外设
WO2013170802A1 (zh) 一种提高移动终端通话音质的方法及装置
CN208445660U (zh) 一种拍摄设备
JP2014072835A (ja) 会議装置
JP6835205B2 (ja) 撮影収音装置、収音制御システム、撮影収音装置の制御方法、及び収音制御システムの制御方法
WO2022022200A1 (zh) 变焦视频的音量调节方法及装置、视频拍摄设备
WO2018064883A1 (zh) 一种录音方法、装置、设备及计算机存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11863796

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11863796

Country of ref document: EP

Kind code of ref document: A1