CN110751942A - Method and apparatus for recognizing characteristic sounds - Google Patents
Method and apparatus for recognizing characteristic sounds
- Publication number
- CN110751942A
- Authority
- CN
- China
- Prior art keywords
- sound
- characteristic
- sound data
- unit
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 62
- 206010011224 Cough Diseases 0.000 claims description 57
- 238000012549 training Methods 0.000 claims description 49
- 241000282898 Sus scrofa Species 0.000 claims description 45
- 238000001514 detection method Methods 0.000 claims description 31
- 230000006870 function Effects 0.000 claims description 31
- 238000013527 convolutional neural network Methods 0.000 claims description 23
- 230000008569 process Effects 0.000 claims description 14
- 238000012545 processing Methods 0.000 claims description 13
- 238000003860 storage Methods 0.000 claims description 13
- 230000004913 activation Effects 0.000 claims description 10
- 238000004590 computer program Methods 0.000 claims description 9
- 210000002569 neuron Anatomy 0.000 claims description 9
- 238000007781 pre-processing Methods 0.000 claims description 6
- 238000010586 diagram Methods 0.000 description 15
- 238000013528 artificial neural network Methods 0.000 description 8
- 230000008901 benefit Effects 0.000 description 8
- 238000004422 calculation algorithm Methods 0.000 description 8
- 238000012544 monitoring process Methods 0.000 description 8
- 241000282887 Suidae Species 0.000 description 7
- 238000004891 communication Methods 0.000 description 7
- 238000000605 extraction Methods 0.000 description 7
- 230000000694 effects Effects 0.000 description 6
- 230000005236 sound signal Effects 0.000 description 6
- 238000004364 calculation method Methods 0.000 description 5
- 238000005516 engineering process Methods 0.000 description 5
- 239000012634 fragment Substances 0.000 description 4
- 238000009432 framing Methods 0.000 description 4
- 230000003287 optical effect Effects 0.000 description 4
- 230000003595 spectral effect Effects 0.000 description 4
- 238000001228 spectrum Methods 0.000 description 4
- 238000012795 verification Methods 0.000 description 4
- 206010039740 Screaming Diseases 0.000 description 3
- 238000004458 analytical method Methods 0.000 description 3
- 238000013529 biological neural network Methods 0.000 description 3
- 238000013135 deep learning Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 230000010355 oscillation Effects 0.000 description 3
- 238000009395 breeding Methods 0.000 description 2
- 230000001488 breeding effect Effects 0.000 description 2
- 230000005713 exacerbation Effects 0.000 description 2
- 239000000835 fiber Substances 0.000 description 2
- 230000000644 propagated effect Effects 0.000 description 2
- 238000013139 quantization Methods 0.000 description 2
- 208000023504 respiratory system disease Diseases 0.000 description 2
- 239000004065 semiconductor Substances 0.000 description 2
- 208000031361 Hiccup Diseases 0.000 description 1
- 241001465754 Metazoa Species 0.000 description 1
- 238000010276 construction Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 230000007423 decrease Effects 0.000 description 1
- 230000003247 decreasing effect Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 238000003745 diagnosis Methods 0.000 description 1
- 201000010099 disease Diseases 0.000 description 1
- 230000006806 disease prevention Effects 0.000 description 1
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 1
- 230000009429 distress Effects 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 208000017574 dry cough Diseases 0.000 description 1
- 230000037433 frameshift Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 238000011031 large-scale manufacturing process Methods 0.000 description 1
- 239000004973 liquid crystal related substance Substances 0.000 description 1
- 244000144972 livestock Species 0.000 description 1
- 238000004519 manufacturing process Methods 0.000 description 1
- 239000013307 optical fiber Substances 0.000 description 1
- 230000002265 prevention Effects 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 238000011160 research Methods 0.000 description 1
- 238000006467 substitution reaction Methods 0.000 description 1
- 238000012706 support-vector machine Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/04—Segmentation; Word boundary detection
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/063—Training
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/10—Speech classification or search using distance or distortion measures between unknown speech and reference templates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/24—Speech recognition using non-acoustical features
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Image Analysis (AREA)
Claims (18)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810801712.1A CN110751942A (zh) | 2018-07-20 | 2018-07-20 | Method and apparatus for recognizing characteristic sounds |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810801712.1A CN110751942A (zh) | 2018-07-20 | 2018-07-20 | Method and apparatus for recognizing characteristic sounds |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110751942A (zh) | 2020-02-04 |
Family
ID=69274750
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810801712.1A Pending CN110751942A (zh) | Method and apparatus for recognizing characteristic sounds | 2018-07-20 | 2018-07-20 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110751942A (zh) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
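CN111462920A (zh) * | 2020-05-24 | 2020-07-28 | Shaoxing Shengke Technology Co., Ltd. | Sound monitoring method and system for monitoring the prevalence of infectious diseases |
CN113160835A (zh) * | 2021-04-23 | 2021-07-23 | Henan Muyuan Intelligent Technology Co., Ltd. | Pig sound extraction method, apparatus, and device, and readable storage medium |
CN114041779A (zh) * | 2021-11-26 | 2022-02-15 | Henan Muyuan Intelligent Technology Co., Ltd. | Recognition system and computer device for identifying respiratory diseases in livestock |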
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5611002A (en) * | 1991-08-09 | 1997-03-11 | U.S. Philips Corporation | Method and apparatus for manipulating an input signal to form an output signal having a different length |
US6535847B1 (en) * | 1998-09-17 | 2003-03-18 | British Telecommunications Public Limited Company | Audio signal processing |
WO2003061299A1 (en) * | 2002-01-18 | 2003-07-24 | Koninklijke Philips Electronics N.V. | Audio coding |
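CN1669070A (zh) * | 2002-08-08 | 2005-09-14 | Cosmotan Inc. | Method for time-scale modification of audio signals using variable-length synthesis and simplified cross-correlation computation |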
US20110295510A1 (en) * | 2010-03-05 | 2011-12-01 | Vialogy Llc | Active Noise Injection Computations for Improved Predictability in Oil and Gas Reservoir Characterization and Microseismic Event Analysis |
US20120143610A1 (en) * | 2010-12-03 | 2012-06-07 | Industrial Technology Research Institute | Sound Event Detecting Module and Method Thereof |
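CN103207408A (zh) * | 2013-03-01 | 2013-07-17 | Xi'an Research Institute of China Coal Technology & Engineering Group | Passive seismic monitoring data compression method and control system |
CN103280220A (zh) * | 2013-04-25 | 2013-09-04 | Peking University Shenzhen Graduate School | Real-time infant cry recognition method |
CN104934040A (zh) * | 2014-03-17 | 2015-09-23 | Huawei Technologies Co., Ltd. | Method and apparatus for adjusting the duration of an audio signal |
CN105719642A (zh) * | 2016-02-29 | 2016-06-29 | Huang Bo | Continuous long speech recognition method and system, and hardware device |
CN106952644A (zh) * | 2017-02-24 | 2017-07-14 | South China University of Technology | Complex audio segmentation and clustering method based on bottleneck features |
CN107095645A (zh) * | 2016-02-22 | 2017-08-29 | Shanghai Engineering Research Center for Broadband Technologies and Applications | Snore-based diagnostic device for sleep apnea-hypopnea syndrome |
CN108172213A (zh) * | 2017-12-26 | 2018-06-15 | Beijing Baidu Netcom Science and Technology Co., Ltd. | Moaning audio recognition method, apparatus, and device, and computer-readable medium |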
Non-Patent Citations (1)
Title |
---|
Ren Shuangxue: "Sliding-window detection and case analysis of wind power ramp events", Power System and Clean Energy, 31 January 2018 (2018-01-31), pages 109 - 111 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Mukherjee et al. | Line spectral frequency-based features and extreme learning machine for voice activity detection from audio signal | |
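CN103117061B (zh) | Voice-based animal recognition method and apparatus | |
WO2021082420A1 (zh) | Voiceprint authentication method and apparatus, medium, and electronic device | |
CN110767218A (zh) | End-to-end speech recognition method, system, and apparatus, and storage medium thereof | |
CN110751942A (zh) | Method and apparatus for recognizing characteristic sounds | |
CN110930989B (zh) | Speech intent recognition method and apparatus, computer device, and storage medium | |
CN113066499A (zh) | Speaker identification method and apparatus for air-ground communications | |
CN104205215A (zh) | Automatic real-time speech impairment correction | |
CN113077821A (zh) | Audio quality detection method and apparatus, electronic device, and storage medium | |
CN116741159A (zh) | Audio classification and model training method and apparatus, electronic device, and storage medium | |
CN112992190A (zh) | Audio signal processing method and apparatus, electronic device, and storage medium | |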
US11282514B2 (en) | Method and apparatus for recognizing voice | |
Tang et al. | Transound: Hyper-head attention transformer for birds sound recognition | |
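CN113327584B (zh) | Language identification method, apparatus, and device, and storage medium | |
CN112244863A (zh) | Signal recognition method, signal recognition apparatus, electronic device, and readable storage medium | |
CN109634554B (zh) | Method and apparatus for outputting information | |
CN113160823B (zh) | Voice wake-up method and apparatus based on a spiking neural network, and electronic device | |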
Wang et al. | A hierarchical birdsong feature extraction architecture combining static and dynamic modeling | |
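CN114898737A (zh) | Acoustic event detection method and apparatus, electronic device, and storage medium | |
CN115240647A (zh) | Sound event detection method and apparatus, electronic device, and storage medium | |
CN112863548A (zh) | Method for training an audio detection model, audio detection method, and apparatus thereof | |
CN107437414A (zh) | Parallelized tourist recognition method based on an embedded GPU system | |
CN112992175A (zh) | Speech discrimination method and speech recording apparatus thereof | |
CN113327616A (zh) | Voiceprint recognition method and apparatus, electronic device, and storage medium | |
CN112381989A (zh) | Sorting method, apparatus, and system, and electronic device |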
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information |
Address after: 101111 Room 221, 2nd Floor, Block C, 18 Kechuang 11th Street, Beijing Economic and Technological Development Zone. Applicant after: Jingdong Technology Holding Co.,Ltd. Address before: 101111 Room 221, 2nd Floor, Block C, 18 Kechuang 11th Street, Beijing Economic and Technological Development Zone. Applicant before: Jingdong Digital Technology Holding Co.,Ltd.
Address after: 101111 Room 221, 2nd Floor, Block C, 18 Kechuang 11th Street, Beijing Economic and Technological Development Zone. Applicant after: Jingdong Digital Technology Holding Co.,Ltd. Address before: 101111 Room 221, 2nd Floor, Block C, 18 Kechuang 11th Street, Beijing Economic and Technological Development Zone. Applicant before: JINGDONG DIGITAL TECHNOLOGY HOLDINGS Co.,Ltd.
Address after: 101111 Room 221, 2nd Floor, Block C, 18 Kechuang 11th Street, Beijing Economic and Technological Development Zone. Applicant after: JINGDONG DIGITAL TECHNOLOGY HOLDINGS Co.,Ltd. Address before: 101111 Room 221, 2nd Floor, Block C, 18 Kechuang 11th Street, Beijing Economic and Technological Development Zone. Applicant before: BEIJING JINGDONG FINANCIAL TECHNOLOGY HOLDING Co.,Ltd.