GB2486038B - Speech-to-text conversion - Google Patents

Speech-to-text conversion

Info

Publication number
GB2486038B
Authority
GB
United Kingdom
Prior art keywords
speech
transcribing
recorded
recording
standardised
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
GB1110992.3A
Other versions
GB201110992D0 (en)
GB2486038A (en)
Inventor
Andrew Levine
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2011-06-28
Filing date
2011-06-28
Publication date
2013-09-25
Application filed by Individual
Priority to GB1110992.3A (GB2486038B)
Priority to GB1118583.2A (GB2513821A)
Publication of GB201110992D0
Publication of GB2486038A
Priority to PCT/EP2012/062256 (WO2013000868A1)
Priority to EP12734850.6A (EP2766899A1)
Application granted
Publication of GB2486038B
Legal status: Active (current)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/42221 Conversation recording systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003 Changing voice quality, e.g. pitch or formants
    • G10L21/007 Changing voice quality, e.g. pitch or formants characterised by the process used
    • G10L21/013 Adapting to target pitch
    • G10L2021/0135 Voice conversion or morphing
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/40 Electronic components, circuits, software, systems or apparatus used in telephone systems using speech recognition
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2201/00 Electronic components, circuits, software, systems or apparatus used in telephone systems
    • H04M2201/60 Medium conversion

Abstract

A method of automatically transcribing speech comprising the steps of: recording a portion of speech to be transcribed (20); processing the recorded portion of speech (20); and transcribing the processed recording of the portion of speech to a text file (64) using a speech-to-text transcription algorithm, the speech-to-text transcription algorithm utilising a pre-existing user profile (30) to render a substantially accurate text file (64), wherein the step of processing the recorded speech comprises morphing the recorded speech such that the processed recording resembles the same portion of speech as spoken by a standardised voice, and wherein the pre-existing user profile (30) is optimised for transcribing portions of speech spoken in the standardised voice.
GB1110992.3A 2011-06-28 2011-06-28 Speech-to-text conversion Active GB2486038B (en)
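
The three steps named in the abstract (record, morph to a standardised voice, transcribe with a profile tuned to that voice) can be pictured with a toy pipeline. The sketch below is illustrative only: the abstract does not specify any data structures or algorithms, so `Frame`, `UserProfile`, `morph_to_standard`, `transcribe` and `transcribe_speech` are hypothetical names, pitch stands in for all voice characteristics, and each frame carries its word directly instead of real audio.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Frame:
    """Toy stand-in for a slice of recorded speech (an assumption, not from the patent)."""
    pitch_hz: float  # speaker-dependent voice characteristic
    token: str       # the word carried by this slice


@dataclass(frozen=True)
class UserProfile:
    """Toy 'pre-existing user profile' tuned to a single voice."""
    expected_pitch_hz: float
    tolerance_hz: float = 5.0


def morph_to_standard(recording: List[Frame], standard_pitch_hz: float) -> List[Frame]:
    """Processing step: re-pitch every frame so it resembles the standardised voice."""
    return [Frame(standard_pitch_hz, f.token) for f in recording]


def transcribe(recording: List[Frame], profile: UserProfile) -> str:
    """Toy speech-to-text step: a frame is recognised only if it matches the profile's voice."""
    words = []
    for f in recording:
        matches = abs(f.pitch_hz - profile.expected_pitch_hz) <= profile.tolerance_hz
        words.append(f.token if matches else "<?>")
    return " ".join(words)


def transcribe_speech(recording: List[Frame], profile: UserProfile) -> str:
    """Full pipeline: morph the recording first, then transcribe with the standard-voice profile."""
    processed = morph_to_standard(recording, profile.expected_pitch_hz)
    return transcribe(processed, profile)


if __name__ == "__main__":
    # Two speakers whose voices differ from the standardised voice.
    recording = [Frame(210.0, "hello"), Frame(95.0, "world")]
    standard_profile = UserProfile(expected_pitch_hz=120.0)
    print(transcribe(recording, standard_profile))         # without morphing
    print(transcribe_speech(recording, standard_profile))  # with morphing
```

The point of the design is that only one profile is ever needed: speaker variation is removed in the morphing step, so the transcription step never has to adapt to an individual voice.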

Priority Applications (4)

Application Number  Publication          Priority Date  Filing Date  Title
GB1110992.3A        GB2486038B (en)      2011-06-28     2011-06-28   Speech-to-text conversion
GB1118583.2A        GB2513821A (en)      2011-06-28     2011-06-28   Speech-to-text conversion
PCT/EP2012/062256   WO2013000868A1 (en)  2011-06-28     2012-06-25   Speech-to-text conversion
EP12734850.6A       EP2766899A1 (en)     2011-06-28     2012-06-25   Speech-to-text conversion

Applications Claiming Priority (1)

Application Number  Publication      Priority Date  Filing Date  Title
GB1110992.3A        GB2486038B (en)  2011-06-28     2011-06-28   Speech-to-text conversion

Publications (3)

Publication Number  Publication Date
GB201110992D0 (en)  2011-08-10
GB2486038A (en)     2012-06-06
GB2486038B (en)     2013-09-25

Family

ID=44485314

Family Applications (2)

Application Number  Status     Publication      Priority Date  Filing Date  Title
GB1110992.3A        Active     GB2486038B (en)  2011-06-28     2011-06-28   Speech-to-text conversion
GB1118583.2A        Withdrawn  GB2513821A (en)  2011-06-28     2011-06-28   Speech-to-text conversion

Country Status (3)

Country Link
EP (1) EP2766899A1 (en)
GB (2) GB2486038B (en)
WO (1) WO2013000868A1 (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number      Priority date  Publication date  Assignee                       Title
US20180247640A1 (en) *  2013-12-06     2018-08-30        Speech Morphing Systems, Inc.  Method and apparatus for an exemplary automatic speech recognition system
US9300801B1 (en)        2015-01-30     2016-03-29        Mattersight Corporation        Personality analysis of mono-recording system and methods
AU2018237123B2 (en) *   2017-03-22     2022-08-04        Xibin Liao                     Bruton's tyrosine kinase inhibitors
US10891947B1 (en)       2017-08-03     2021-01-12        Wells Fargo Bank, N.A.         Adaptive conversation support bot
EP3477636A1 (en) *      2017-10-30     2019-05-01        Seth, Sagar                    Analysis mechanisms of telephone conversations for contextual information element extraction
CN109493868B (en) *     2018-12-13     2024-04-09        中国平安财产保险股份有限公司   Policy entry method and related device based on voice recognition
CA3147589A1 (en) *      2019-07-15     2021-01-21        Axon Enterprise, Inc.          Methods and systems for transcription of audio data
CN111062729A (en) *     2019-11-28     2020-04-24        中国银行股份有限公司           Information acquisition method, device and equipment
CN114079695A (en) *     2020-08-18     2022-02-22        北京有限元科技有限公司         Method, device and storage medium for recording voice call content
CN112329765A (en) *     2020-10-09     2021-02-05        中保车服科技服务股份有限公司   Text detection method and device, storage medium and computer equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number     Priority date  Publication date  Assignee                                           Title
EP0564166A2 (en) *     1992-04-02     1993-10-06        AT&T Corp.                                         Automatic speech recognizer
WO1996013827A1 (en) *  1994-11-01     1996-05-09        British Telecommunications Public Limited Company  Speech recognition
US5649060A (en) *      1993-10-18     1997-07-15        International Business Machines Corporation        Automatic indexing and aligning of audio and text using speech recognition
US6307576B1 (en) *     1997-10-02     2001-10-23        Maury Rosenfeld                                    Method for automatically animating lip synchronization and facial expression of animated characters
US6438520B1 (en) *     1999-01-20     2002-08-20        Lucent Technologies Inc.                           Apparatus, method and system for cross-speaker speech recognition for telecommunication applications

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number      Priority date  Publication date  Assignee                                     Title
US6535848B1 (en) *      1999-06-08     2003-03-18        International Business Machines Corporation  Method and apparatus for transcribing multiple files into a single document
US6332122B1 (en) *      1999-06-23     2001-12-18        International Business Machines Corporation  Transcription system for multiple speakers, using and establishing identification
US20030050777A1 (en) *  2001-09-07     2003-03-13        Walker William Donald                        System and method for automatic transcription of conversations
US7830408B2 (en) *      2005-12-21     2010-11-09        Cisco Technology, Inc.                       Conference captioning
US8478598B2 (en) *      2007-08-17     2013-07-02        International Business Machines Corporation  Apparatus, system, and method for voice chat transcription
US8972506B2 (en) *      2008-12-15     2015-03-03        Verizon Patent And Licensing Inc.            Conversation mapping
US9871916B2 (en) *      2009-03-05     2018-01-16        International Business Machines Corporation  System and methods for providing voice transcription

Also Published As

Publication number Publication date
GB201110992D0 (en) 2011-08-10
EP2766899A1 (en) 2014-08-20
GB2486038A (en) 2012-06-06
GB2513821A (en) 2014-11-12
WO2013000868A1 (en) 2013-01-03
GB201118583D0 (en) 2011-12-07

Similar Documents

Publication Publication Date Title
GB2486038B (en) Speech-to-text conversion
WO2013192218A3 (en) Dynamic language model
GB201205790D0 (en) Transcription of speech
EP4318463A3 (en) Multi-modal input on an electronic device
MX2015009812A (en) Method and system for recognizing speech commands.
GB2489489B (en) A speech processing system and method
Yan et al. A scalable approach to using DNN-derived features in GMM-HMM based acoustic modeling for LVCSR.
WO2011133766A3 (en) Methods and systems for training dictation-based speech-to-text systems using recorded samples
WO2011074771A3 (en) Apparatus and method for foreign language study
MX2014010795A (en) Device for extracting information from a dialog.
WO2012169737A3 (en) Display apparatus and method for executing link and method for recognizing voice thereof
GB2514943A (en) Voice authentication and speech recognition system and method
DE60211197D1 (en) METHOD AND DEVICE FOR THE CONVERSION OF SPANISHED TEXTS AND CORRECTION OF THE KNOWN TEXTS
WO2009158581A3 (en) System and method for spoken topic or criterion recognition in digital media and contextual advertising
WO2009063445A3 (en) A method and apparatus for fast search in call-center monitoring
MX2013014171A (en) Display apparatus and method for executing link and method for recognizing voice thereof.
EP2494546B8 (en) Method, server and system for transcription of spoken language
GB2506278A (en) Voice transformation with encoded information
JP2012522278A5 (en)
WO2009008055A1 (en) Speech recognizer, speech recognition method, and speech recognition program
WO2012134877A3 (en) Computer-implemented systems and methods evaluating prosodic features of speech
WO2012075476A3 (en) Warped spectral and fine estimate audio encoding
WO2013127825A8 (en) Computer-implemented method and system for generating a report
Kwon et al. Extraction of speech features for emotion recognition
WO2008039755A3 (en) Phonetically enriched labeling in unit selection speech synthesis

Legal Events

Date Code Title Description
REG   Reference to a national code
      Ref country code: HK
      Ref legal event code: DE
      Ref document number: 1172439
      Country of ref document: HK
REG   Reference to a national code
      Ref country code: HK
      Ref legal event code: WD
      Ref document number: 1172439
      Country of ref document: HK
PCNP  Patent ceased through non-payment of renewal fee
      Effective date: 2018-06-28
S28   Restoration of ceased patents (sect. 28/pat. act 1977)
      Free format text: APPLICATION FILED
S28   Restoration of ceased patents (sect. 28/pat. act 1977)
      Free format text: RESTORATION ALLOWED
      Effective date: 2019-05-14