CN114762039A - Conference Data Processing Method and Related Device - Google Patents

Conference Data Processing Method and Related Device

Info

Publication number
CN114762039A
Authority
CN
China
Prior art keywords
audio
conference
voiceprint
additional information
clip
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201980102782.0A
Other languages
English (en)
Inventor
刘智辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd
Publication of CN114762039A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities; audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/40 Support for services or applications
    • H04L65/403 Arrangements for multi-party communication, e.g. for conferences
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/10 Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/41 Electronic components, circuits, software, systems or apparatus used in telephone systems using speaker recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/55 Aspects of automatic or semi-automatic exchanges related to network data storage and management
    • H04M2203/552 Call annotations
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2203/00 Aspects of automatic or semi-automatic exchanges
    • H04M2203/60 Aspects of automatic or semi-automatic exchanges related to security aspects in telephonic communication systems
    • H04M2203/6054 Biometric subscriber identification

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Telephonic Communication Services (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

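Embodiments of the present invention provide a conference data processing method and a related device. The method is applied to a conference system and includes the following: during a conference, a conference terminal collects audio clips from a first conference site according to sound source direction; generates first additional information for each of the collected audio clips; and sends the conference audio recorded during the conference, together with the first additional information for the audio clips, to a conference information processing device. The conference information processing device segments the conference audio into multiple audio clips and attaches second additional information to each one. The second additional information for a clip includes information used to determine the identity of the clip's speaker and identification information for that clip. The conference information processing device uses the first additional information and the second additional information to generate a correspondence between participants and their statements. With the embodiments of this application, the correspondence between statements made during a conference and the people who made them can be determined more accurately.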
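The abstract does not say what the two kinds of additional information contain beyond the speaker-identity information and the clip identifier, nor how they are combined. The following Python sketch is therefore only an illustration under stated assumptions: the field names (`start`, `end`, `direction_deg`, `voiceprint_name`), the time-overlap matching rule, the fallback from a voiceprint match to a direction-to-seat lookup, and the `seat_by_direction` table are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FirstInfo:
    """Terminal-side metadata for one clip (every field is an assumption)."""
    start: float        # seconds from the start of the recording
    end: float
    direction_deg: int  # sound-source direction observed by the terminal


@dataclass(frozen=True)
class SecondInfo:
    """Processing-device metadata for one segmented clip (hypothetical layout)."""
    clip_id: str                    # identification information of the clip
    start: float
    end: float
    voiceprint_name: Optional[str]  # None when no voiceprint match was found


def overlap(a_start: float, a_end: float, b_start: float, b_end: float) -> float:
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def attribute_speakers(first, second, seat_by_direction):
    """Map each clip_id to a participant name, or "unknown"."""
    result = {}
    for seg in second:
        # Assumed rule 1: trust a voiceprint match when one exists.
        if seg.voiceprint_name is not None:
            result[seg.clip_id] = seg.voiceprint_name
            continue
        # Assumed rule 2: otherwise use the direction of the terminal clip
        # that overlaps this segment the most in time.
        best = max(
            first,
            key=lambda f: overlap(f.start, f.end, seg.start, seg.end),
            default=None,
        )
        if best is None or overlap(best.start, best.end, seg.start, seg.end) == 0.0:
            result[seg.clip_id] = "unknown"
        else:
            result[seg.clip_id] = seat_by_direction.get(best.direction_deg, "unknown")
    return result


# Illustrative data only; the names and values are made up.
first = [FirstInfo(0.0, 5.0, 30), FirstInfo(5.0, 9.0, 120)]
second = [
    SecondInfo("c1", 0.0, 4.5, "Alice"),
    SecondInfo("c2", 5.2, 8.8, None),
    SecondInfo("c3", 20.0, 22.0, None),
]
seats = {30: "Alice", 120: "Bob"}
result = attribute_speakers(first, second, seats)
```

Giving the voiceprint match priority and using direction only as a fallback is one possible design; a real system could equally weigh both signals or combine them with face recognition, as the classification codes suggest.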

Description

PCT national-phase application; the description has been published.

Claims (45)

  1. PCT national-phase application; the claims have been published.
CN201980102782.0A 2019-12-31 2019-12-31 Conference Data Processing Method and Related Device Pending CN114762039A (zh)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/130978 WO2021134720A1 (zh) 2019-12-31 2019-12-31 Conference Data Processing Method and Related Device

Publications (1)

Publication Number Publication Date
CN114762039A (zh) 2022-07-15

Family

ID=76686340

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201980102782.0A Pending CN114762039A (zh) 2019-12-31 2019-12-31 Conference Data Processing Method and Related Device

Country Status (4)

Country Link
US (1) US20220335949A1 (zh)
EP (1) EP4068282A4 (zh)
CN (1) CN114762039A (zh)
WO (1) WO2021134720A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
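JP2022062874A (ja) * 2020-10-09 2022-04-21 ヤマハ株式会社 Speaker prediction method, speaker prediction device, and communication system
CN115396627A (zh) * 2022-08-24 2022-11-25 易讯科技股份有限公司 Positioning management method and system for screen-recorded video conferences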

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7117157B1 (en) * 1999-03-26 2006-10-03 Canon Kabushiki Kaisha Processing apparatus for determining which person in a group is speaking
CN102572372B (zh) * 2011-12-28 2018-10-16 中兴通讯股份有限公司 Method and apparatus for extracting meeting minutes
CN102968991B (zh) * 2012-11-29 2015-01-21 华为技术有限公司 Method, device, and system for classifying voice conference minutes
US20190303879A1 (en) * 2018-04-02 2019-10-03 Ca, Inc. Meeting recording software
US10867610B2 (en) * 2018-05-04 2020-12-15 Microsoft Technology Licensing, Llc Computerized intelligent assistant for conferences
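CN108922538B (zh) * 2018-05-29 2023-04-07 平安科技(深圳)有限公司 Conference information recording method and apparatus, computer device, and storage medium
CN109388701A (zh) * 2018-08-17 2019-02-26 深圳壹账通智能科技有限公司 Meeting record generation method, apparatus, device, and computer storage medium
CN110232925A (zh) * 2019-06-28 2019-09-13 百度在线网络技术(北京)有限公司 Method and apparatus for generating meeting records, and conference terminal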

Also Published As

Publication number Publication date
EP4068282A1 (en) 2022-10-05
EP4068282A4 (en) 2022-11-30
US20220335949A1 (en) 2022-10-20
WO2021134720A1 (zh) 2021-07-08

Similar Documents

Publication Publication Date Title
EP3963576B1 (en) Speaker attributed transcript generation
JP2022532313A (ja) Customized output to optimize for user preferences in a distributed system
US9064160B2 (en) Meeting room participant recogniser
US10923139B2 (en) Systems and methods for processing meeting information obtained from multiple sources
US20190215464A1 (en) Systems and methods for decomposing a video stream into face streams
US11138980B2 (en) Processing overlapping speech from distributed devices
KR101636716B1 (ko) Video conferencing apparatus and method for distinguishing speakers
JP6999734B2 (ja) Speaker diarization method and apparatus based on audio-visual data
US10812921B1 (en) Audio stream processing for distributed device meeting
WO2019184650A1 (zh) Subtitle generation method and terminal
JP2007528031A (ja) Techniques for separating and evaluating audio and video source data
US20220335949A1 (en) Conference Data Processing Method and Related Device
CN111883168A (zh) Speech processing method and apparatus
JP2005055668A (ja) Speech processing device
CN110196914B (zh) Method and apparatus for entering face information into a database
US20210174791A1 (en) Systems and methods for processing meeting information obtained from multiple sources
US9165182B2 (en) Method and apparatus for using face detection information to improve speaker segmentation
CN114333853A (zh) Audio data processing method, device, and system
JP5030868B2 (ja) Conference audio recording system
Hung et al. Towards audio-visual on-line diarization of participants in group meetings
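CN113611308A (zh) Speech recognition method, apparatus, system, server, and storage medium
CN114764690A (zh) Method, apparatus, and system for intelligently producing meeting minutes
CN113542604A (zh) Video focusing method and apparatus
RU2821283C2 (ru) Customized output that is optimized for user preferences in a distributed system
CN117392995A (zh) Multimodal speaker separation method, apparatus, device, and storage medium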

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination