WO1999014741A1

WO1999014741A1 - Method for recognising a keyword in speech

Info

Publication number: WO1999014741A1
Application number: PCT/DE1998/002633
Authority: WO
Inventors: Jochen Junkawitsch; Harald Höge
Original assignee: Siemens Aktiengesellschaft
Priority date: 1997-09-18
Filing date: 1998-09-07
Publication date: 1999-03-25
Also published as: CN1125433C; CN1270688A; JP2001516904A; DE59801227D1; EP1012828B1; EP1012828A1; ES2161550T3; US6505156B1

Abstract

The invention relates to a method for recognising a keyword in speech. A beginning of the keyword is assumed at each sampling instant. The method then attempts to map this keyword onto a sequence of HMM states describing the keyword. The best path is determined in a representation space using a Viterbi algorithm, with a local confidence measure used in place of the emission probability normally used in the Viterbi algorithm. If a global confidence measure made up of local confidence measures falls below a lower threshold for the best Viterbi path, the keyword is recognised and the sampling instant that was assumed as the beginning of the word is confirmed.

Description


Method for recognizing a keyword in spoken language

The invention relates to a method for recognizing a keyword in spoken language.

Until now, recognizing a keyword in spoken language has always required modeling the complete spoken utterance. Essentially two methods are known to the person skilled in the art:

A method for recognizing a keyword that uses a large-vocabulary speech recognizer is known from [1]. It attempts to recognize the spoken language completely and then examines the recognized words for any keywords they may contain. This method is complex and error-prone because of the large vocabulary and because of the problems in modeling spontaneous utterances and noises, that is, parts of the speech signal that cannot be unambiguously assigned to a word.

Another method uses special filler models (also called garbage models) to model parts of the utterance that do not belong to the vocabulary of the keywords (so-called OOV parts, OOV = out of vocabulary). Such a speech recognizer is described in [2] and comprises the keywords as well as one or more filler models. This method yields a sequence of filler and keyword symbols. The difficulty lies in designing or training a suitable filler model that contrasts well with the modeled keywords, that is, one that is highly discriminative with respect to the keyword models.

Hidden Markov models (HMMs) are also known from [3] or [4]. It is likewise known from [3] or [4] to determine a best path by means of the Viterbi algorithm.

Hidden Markov models (HMMs) serve to describe discrete stochastic processes (also called Markov processes). In speech recognition, hidden Markov models are used, among other things, to build a word lexicon that lists the word models constructed from subunits.

A hidden Markov model is formally described by:

$$\lambda = (A, B, \pi) \tag{0-1}$$

with a square state transition matrix $A$ containing the state transition probabilities $A_{ij}$:

$$A = \{A_{ij}\} \quad \text{with } i, j = 1, \ldots, N \tag{0-2}$$

and an emission matrix $B$ comprising the emission probabilities $B_{ik}$:

$$B = \{B_{ik}\} \quad \text{with } i = 1, \ldots, N;\ k = 1, \ldots, M \tag{0-3}$$

An $N$-dimensional vector $\pi$ serves for initialization; it specifies the probability of occurrence of the $N$ states at time $t = 1$:

$$\pi = \{\pi_i\} = P(s(1) = s_i) \tag{0-4}$$

In general,

$$P(s(t) = q_t) \tag{0-5}$$

denotes the probability that the Markov chain

$$S = \{s(1), s(2), s(3), \ldots, s(t), \ldots\} \tag{0-6}$$

is in state $q_t$ at time $t$. The Markov chain $S$ takes its values from

$$s(t) \in \{s_1, s_2, \ldots, s_N\} \tag{0-7}$$

where this range of values is a finite set of $N$ states. The state in which the Markov process is at time $t$ is called $q_t$.

The emission probability $B_{ik}$ results from the occurrence of a particular symbol $\sigma_k$ in state $s_i$ as

$$B_{ik} = P(\sigma_k \mid s_i) \tag{0-8}$$

where a symbol set $\Sigma$ of size $M$, given by

$$\Sigma = \{\sigma_1, \sigma_2, \ldots, \sigma_M\} \tag{0-9}$$

comprises particular symbols $\sigma_k$ ($k = 1, \ldots, M$).

A state space of hidden Markov models arises because each state of the hidden Markov model can have a predetermined set of successor states: itself, the next state, the state after next, and so on. The state space with all possible transitions is called a trellis. For hidden Markov models of order 1, any past lying more than one time step back is irrelevant.

The Viterbi algorithm is based on the idea that a path which is locally optimal in the state space (trellis) is always part of a globally optimal path. Because the hidden Markov models are of order 1, only the best predecessor of a state needs to be considered, since the worse predecessors have already received a worse score. This means that the optimal path can be sought recursively, time step by time step, starting from the first point in time, by determining all possible continuations of the path for each time step and selecting only the best continuation.
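The recursion described in the preceding paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation of the cited references: it works with additive costs (for example negative log probabilities), and the function name `viterbi_min_cost` and its three arguments are hypothetical names introduced here.

```python
import math

def viterbi_min_cost(local_cost, trans_cost, init_cost):
    """Find the cheapest state path through a trellis.

    local_cost[t][j]: cost of being in state j at time t (e.g. -log emission).
    trans_cost[k][j]: cost of moving from state k to state j
                      (math.inf marks a forbidden transition).
    init_cost[j]:     cost of starting in state j.
    Returns (total cost of the best path, list of state indices).
    """
    n_states = len(init_cost)
    acc = [init_cost[j] + local_cost[0][j] for j in range(n_states)]
    backptr = []
    for t in range(1, len(local_cost)):
        new_acc, ptr = [], []
        for j in range(n_states):
            # Order 1: only the best predecessor of state j has to be kept.
            k_best = min(range(n_states), key=lambda k: acc[k] + trans_cost[k][j])
            new_acc.append(acc[k_best] + trans_cost[k_best][j] + local_cost[t][j])
            ptr.append(k_best)
        acc = new_acc
        backptr.append(ptr)
    # Backtrack from the cheapest final state to recover the path.
    state = min(range(n_states), key=lambda j: acc[j])
    best = acc[state]
    path = [state]
    for ptr in reversed(backptr):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return best, path
```

Because each time step keeps only one predecessor per state, the work grows linearly with the number of time steps rather than with the number of possible paths.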

Both methods described in [1] and [2] require modeling of the OOV parts. In the first case [1], the words of the utterance must be explicitly present in the recognizer's vocabulary; in the second case [2], all OOV words and OOV noises are represented by special filler models.

The object of the invention is to specify a method that enables a keyword to be recognized in spoken language while avoiding the disadvantages described above.

This object is achieved in accordance with the features of patent claim 1.

According to the invention, a method for recognizing a keyword in spoken language is specified in which the keyword is represented by a sequence of states $W$. The spoken language is sampled at a predetermined rate, and at each sampling time $t$ a feature vector $O_t$ is created for the speech signal belonging to that sampling time. A sequence $O$ of feature vectors $O_t$ is mapped onto the sequence of states $W$ by means of a Viterbi algorithm, wherein, in each state, a local confidence measure replaces an emission measure, preferably the negative logarithm of an emission probability. The Viterbi algorithm yields a global confidence measure $C$ (also simply: confidence measure $C$). The keyword is recognized in the spoken language if:

$$C(W, O) < T \tag{1}$$

where

$C(\cdot)$ denotes the confidence measure,

$W$ the keyword, represented as a sequence of states,

$O$ the sequence of feature vectors $O_t$,

$T$ a predetermined threshold.

Otherwise the keyword is not recognized in the spoken language.

One advantage of the invention is that a keyword is recognized within the spoken language without the utterance as a whole having to be modeled. This significantly reduces the implementation effort and consequently also yields a more efficient (faster) method. Because the (global) confidence measure $C$ is used as the fundamental decoding principle, the acoustic modeling within the decoding process is restricted to the keywords.

In a further development, a new path through the state space of the hidden Markov models begins in a first state of the sequence of states $W$ at each sampling time $t$. It is thereby assumed at every sampling time that the beginning of a keyword is contained in the spoken language. Using the confidence measure, the feature vectors resulting from subsequent sampling times are mapped onto the states of the keyword represented by hidden Markov models. At the end of the mapping, that is, at the end of the path, a global confidence measure results, on the basis of which a decision is made as to whether the presumed beginning of the keyword really was one. If so, the keyword is recognized; otherwise it is not.

In a further development of the invention, the global confidence measure $C$ is determined by

$$C = -\log P(W \mid O) \tag{2}$$

and the associated local confidence measure $c$ is determined by

$$c = -\log \frac{P(O_t \mid s_j) \cdot P(s_j)}{P(O_t)} \tag{3}$$

where

$s_j$ denotes a state of the sequence of states,

$P(W \mid O)$ a probability for the keyword under the condition of a sequence of feature vectors $O_t$,

$P(O_t \mid s_j)$ the emission probability,

$P(s_j)$ the probability of the state $s_j$,

$P(O_t)$ the probability of the feature vector $O_t$.

A suitable global confidence measure is characterized by the property that it indicates the degree of reliability with which a keyword is detected. In the negative logarithmic domain, a small value of the global confidence measure $C$ expresses high reliability.

In an additional development, the confidence measure $C$ is determined by

$$C = -\log \frac{P(O \mid W)}{P(O \mid \bar{W})} \tag{4}$$

and the associated local confidence measure is determined by

$$c = -\log \frac{P(O_t \mid s_j)}{P(O_t \mid \bar{s}_j)} \tag{5}$$

where

$P(O \mid \bar{W})$ denotes the probability of the sequence of feature vectors $O_t$ under the condition that the keyword $W$ does not occur,

$\bar{s}_j$ the complementary event to the state $s_j$ (that is: not the state $s_j$).

One advantage of the confidence measures presented is that they can be computed, so that no prior training and/or estimation is required.

The definitions of the local confidence measures can each be derived from the definitions of the global confidence measures. The calculation of the confidence measure for a keyword includes the local confidence measures at those points in time that coincide with the utterance of this keyword.

With the relations

$$P(O_t) = \sum_k P(O_t \mid s_k) \cdot P(s_k) \tag{6}$$

and

$$P(O_t \mid \bar{s}_j) = \sum_{k \neq j} P(O_t \mid s_k) \cdot P(s_k) \tag{7}$$

the local confidence measures can be calculated.
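As a rough sketch of how the two local confidence measures might be evaluated for one time step, the following Python code assumes that the local measures have the forms $-\log\big(P(O_t \mid s_j) \cdot P(s_j) / P(O_t)\big)$ and $-\log\big(P(O_t \mid s_j) / P(O_t \mid \bar{s}_j)\big)$, and that $P(O_t \mid \bar{s}_j)$ is the sum of the weighted emissions over all states other than $s_j$. The function name `local_confidences` and its argument names are hypothetical.

```python
import math

def local_confidences(emission, prior, j):
    """Return (c1, c2) for state j at one time step.

    emission[k] is P(O_t | s_k) and prior[k] is P(s_k) for every HMM state k.
    Both argument names are illustrative assumptions, not from the source.
    """
    weighted = [e * p for e, p in zip(emission, prior)]
    # P(O_t): sum of the weighted emissions over all states.
    p_obs = sum(weighted)
    # P(O_t | not s_j): the same sum, leaving out state j (assumed form).
    p_obs_not_j = sum(w for k, w in enumerate(weighted) if k != j)
    c1 = -math.log(weighted[j] / p_obs)
    c2 = -math.log(emission[j] / p_obs_not_j)
    return c1, c2
```

For two states with emissions 0.5 and 0.25 and equal priors of 0.5, the first state obtains c1 = log 1.5 (about 0.41) and c2 = -log 4 (about -1.39). The first measure is never negative, whereas the second can take either sign.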

It is also possible to determine $P(O_t)$ or $P(O_t \mid \bar{s}_j)$ by suitable approximation methods. One example of such an approximation method is averaging the $n$ best emissions $-\log P(O_t \mid s_j)$ at each time $t$.
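The averaging step itself is simple to illustrate. The sketch below only computes the mean of the n smallest (best) negative log emissions at one time step; how that value is then substituted into the confidence measure is not spelled out in the text, so the code makes no assumption about it. The function name `n_best_average` is hypothetical.

```python
def n_best_average(neg_log_emissions, n):
    """Mean of the n smallest (best) -log P(O_t | s_j) values at one time step."""
    if n < 1:
        raise ValueError("n must be at least 1")
    # In the negative log domain, smaller values correspond to better emissions.
    best = sorted(neg_log_emissions)[:n]
    return sum(best) / len(best)
```

With the values 3.0, 1.0, 2.0 and 10.0 and n = 2, the two best emissions are 1.0 and 2.0, so the function returns 1.5.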

The decoding process is usually carried out using the Viterbi algorithm:

$$C_{t,s_j} = \min_k \left( C_{t-1,s_k} + c_{t,s_j} + a_{kj} \right)$$

where

$C_{t,s_j}$ denotes the global accumulated confidence measure at time $t$ in state $s_j$,

$C_{t-1,s_k}$ the global accumulated confidence measure at time $t-1$ in state $s_k$,

$c_{t,s_j}$ the local confidence measure at time $t$ in state $s_j$,

$a_{kj}$ a transition penalty from state $s_k$ to state $s_j$.

Since representing the global confidence measure for a keyword requires no local confidence measures outside the temporal boundaries of the keyword, acoustic modeling of the OOV parts can be dispensed with when searching for the keyword.

By applying the Viterbi algorithm with the possibility of starting a new path in the first state of a keyword at any time $t$, the keyword preferably being divided into individual states of a hidden Markov model (HMM), the global confidence measure for this keyword is optimized and, at the same time, the optimal start time is determined (backtracking of the Viterbi algorithm).
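A minimal Python sketch of this search follows. It is an illustration under assumptions, not the patented implementation: the keyword model is taken to be strictly left-to-right (each state may only repeat or advance by one), the start time is carried along with each path instead of being recovered by a separate backtracking pass, and the names `spot_keyword`, `stay_penalty` and `next_penalty` are hypothetical.

```python
import math

def spot_keyword(local_conf, stay_penalty=0.0, next_penalty=0.0):
    """Accumulate confidence along keyword paths that may start at any time.

    local_conf[t][j]: local confidence measure of keyword state j at time t.
    Returns, for each time step, a tuple (C, start): the accumulated
    confidence of the best path ending in the last keyword state at that
    time, and the time at which that path entered the first state.
    """
    n_states = len(local_conf[0])
    acc = [math.inf] * n_states   # accumulated confidence at time t-1
    start = [None] * n_states     # start time carried along each best path
    results = []
    for t, row in enumerate(local_conf):
        new_acc = [math.inf] * n_states
        new_start = [None] * n_states
        for j in range(n_states):
            # Candidates: stay in state j, advance from state j-1,
            # or, for the first state only, begin a fresh path at time t.
            cands = [(acc[j] + stay_penalty, start[j])]
            if j > 0:
                cands.append((acc[j - 1] + next_penalty, start[j - 1]))
            else:
                cands.append((0.0, t))
            best_cost, best_start = min(cands, key=lambda c: c[0])
            new_acc[j] = best_cost + row[j]
            new_start[j] = best_start
        acc, start = new_acc, new_start
        results.append((acc[-1], start[-1]))
    return results
```

Comparing the first element of each returned tuple with a threshold gives a detection decision per time step, and the second element gives the start time hypothesized for the detected keyword.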

It is furthermore expedient to continue searching for a minimum for a predetermined period of time even after the value has fallen below the threshold $T$. This avoids a keyword being recognized several times within this predetermined period.
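One possible way to realize this minimum search is sketched below. It assumes a fixed window length counted in sampling steps and reports a single detection per window; the function name `detect` and the `window` parameter are hypothetical, and the text does not prescribe this particular scheme.

```python
def detect(scores, threshold, window):
    """Report one detection per window that starts below the threshold.

    scores[t] is the global confidence measure at time t. When a score
    falls below the threshold, the minimum over the next `window` steps
    is reported as (time, score) and the rest of the window is skipped.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    hits = []
    t = 0
    while t < len(scores):
        if scores[t] < threshold:
            segment = scores[t:t + window]
            offset = min(range(len(segment)), key=segment.__getitem__)
            hits.append((t + offset, segment[offset]))
            t += window   # suppress further hits inside this window
        else:
            t += 1
    return hits
```

For the scores 9, 4, 2, 3, 9, 9, 1 with threshold 5 and a window of 3 steps, the three consecutive sub-threshold values 4, 2, 3 produce a single detection at time 2 rather than three, and the final value produces a second detection at time 6.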

If there are keywords that are similar to one another with respect to their form of description, represented by the respective sequence of states, it is useful to employ a mechanism which, upon recognition of one keyword, rules out that another keyword was partially contained in the spoken speech signal during the time span of the recognized keyword.

Further developments of the invention also emerge from the dependent claims.

Exemplary embodiments of the invention are described in more detail with reference to the following figures.

The figures show:

Fig. 1 a block diagram of a method for recognizing a keyword in spoken language,

Fig. 2 a sketch illustrating the determination of a confidence measure,

Fig. 3 a sketch, like Fig. 2, showing the course of an assumed confidence measure over a predetermined period of time.

Fig. 1 shows a block diagram of a method for recognizing a keyword in continuous speech.

In a step 101, the keyword is represented by a sequence of states $W$. Phoneme HMMs with three states each are preferably used for this purpose (see [3]). In a next step 102, the continuous speech is sampled, and at each sampling time $t$ a feature vector $O_t$ is created for the speech signal belonging to that sampling time. The feature vector $O_t$ comprises, as components, a predetermined set of features that characterize the speech signal at the sampling time $t$.

In a step 103, a sequence of feature vectors obtained from the speech signal for different sampling times $t$ is mapped onto the sequence of states $W$. The Viterbi algorithm (see [3]) serves as the mapping rule. The negative log emission probability $-\log P(O_t \mid s_j)$ used in the Viterbi algorithm is replaced by a local confidence measure. In a step 104, the Viterbi algorithm supplies, at each point in time, a global confidence measure $C$ that accumulates the individual local confidence measures for the states found in the sequence of states $W$. In a step 105, the keyword is recognized in the continuous speech if

$$C(W, O) < T \tag{1}$$

where

$C(\cdot)$ denotes the global confidence measure,

$W$ the keyword, represented as a sequence of states,

$O$ the sequence of feature vectors $O_t$,

$T$ a predetermined threshold.

Otherwise the keyword is not recognized in the continuous speech.

Two possible realizations of a global confidence measure, each with an associated local confidence measure, are described below. Further confidence measures are conceivable.

A first confidence measure

The first global confidence measure is defined as the negative logarithm of an a posteriori probability for the keyword, serving as a measure of reliability:

$$C_1 = -\log P(W \mid O)$$

Bayes' rule is then applied in conjunction with the following assumptions:

$$P(O) = \prod_t P(O_t) \tag{8}$$

$$P(W) = \prod_t P(s_{\psi(t)}) \tag{9}$$

$$P(O \mid W) = \prod_t \left[ P(O_t \mid s_{\psi(t)}) \cdot a_{\psi(t-1),\psi(t)} \right] \tag{10}$$

The probability $P(O)$ of a sequence of feature vectors is expressed as a product of the probabilities $P(O_t)$ of the individual feature vectors. In the same way, the probability $P(W)$ of an entire word is calculated by multiplying the individual probabilities $P(s_{\psi(t)})$ of each selected state of an HMM, where the function $\psi(t)$ is a mapping of the feature vectors (that is, of time) onto the states of the keyword. The conditional probability $P(O \mid W)$ corresponds to the ordinary probability of the HMM, which can be calculated by means of the emission probabilities $P(O_t \mid s_{\psi(t)})$ and the transition probabilities $a_{\psi(t-1),\psi(t)}$. The global confidence measure $C_1$ thus becomes:

$$C_1 = \sum_t -\log \frac{P(O_t \mid s_{\psi(t)}) \cdot P(s_{\psi(t)}) \cdot a_{\psi(t-1),\psi(t)}}{P(O_t)} \tag{11}$$

In view of how the Viterbi algorithm operates, it is advisable to define a local confidence measure $c_1(O_t \mid s_j)$ that is used within the search process of the Viterbi algorithm:

$$c_1(O_t \mid s_j) = -\log \frac{P(O_t \mid s_j) \cdot P(s_j)}{P(O_t)} \tag{12}$$

The probability of the feature vector, which appears in the denominator of equation (12), can be calculated by taking all states of the HMMs into account:

$$P(O_t) = \sum_k P(O_t \mid s_k) \cdot P(s_k) \tag{13}$$

(see also equation (6)). The a priori probability $P(s_k)$ of these states has been determined in the preceding training. The local confidence measure $c_1(O_t \mid s_j)$ is thus fully computable.

A second confidence measure

The second global confidence measure is defined as the ratio of the conditional probabilities of a sequence $O$ of feature vectors $O_t$, once under the condition of a sequence of states $W$ characterizing the keyword and once under the inverse model $\bar{W}$. This gives:

$$C_2 = -\log \frac{P(O \mid W)}{P(O \mid \bar{W})} \tag{4}$$

Here $\bar{W}$ merely represents a model that does not exist in reality but whose emission probability can be computed. In contrast to the definition of the first global confidence measure, this definition leads to a symmetric global confidence measure that has a center of symmetry at 0 if

$$P(O \mid \bar{W}) = P(O \mid W) \tag{14}$$

is satisfied. Analogously to the case of the first global confidence measure, inserting equations (8), (9) and (10), taking into account the respective inverse model $\bar{a}_{\psi(t-1),\psi(t)}$ and $\bar{s}_{\psi(t)}$, yields the following equation:

$$C_2 = \sum_t -\log \frac{P(O_t \mid s_{\psi(t)}) \cdot a_{\psi(t-1),\psi(t)}}{P(O_t \mid \bar{s}_{\psi(t)}) \cdot \bar{a}_{\psi(t-1),\psi(t)}} \tag{15}$$

A suitable local confidence measure $c_2(O_t \mid s_j)$ that can be used in the search performed by the Viterbi algorithm is defined as:

$$c_2(O_t \mid s_j) = -\log \frac{P(O_t \mid s_j)}{P(O_t \mid \bar{s}_j)} \tag{16}$$

In this case, too, the local confidence measure $c_2(O_t \mid s_j)$ can be calculated, because the denominator is obtained by summing the weighted emission probabilities of all states except $P(O_t \mid s_j)$ itself:

$$P(O_t \mid \bar{s}_j) = \sum_{k \neq j} P(O_t \mid s_k) \cdot P(s_k) \tag{17}$$

(see also equation (7)).
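The calculation of the local confidence measure $c_2$ described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `local_confidence_c2` and its parameters are hypothetical, the weights are taken to be state priors $P(s_k)$, and the weighted sum over the other states is used without renormalization.

```python
import math


def local_confidence_c2(emission_probs, state_priors, j):
    """Return c2(O_t | s_j) = -log( P(O_t|s_j) / P(O_t|not s_j) ).

    emission_probs[k] is P(O_t | s_k) for the current feature vector O_t,
    state_priors[k] is P(s_k). Both lists are assumed inputs.
    """
    numerator = emission_probs[j]
    # Denominator: weighted emission probabilities of all states except s_j.
    denominator = sum(
        p * w
        for k, (p, w) in enumerate(zip(emission_probs, state_priors))
        if k != j
    )
    return -math.log(numerator / denominator)
```

A state that explains the feature vector better than the competing states yields a negative value, and a state that explains it worse yields a positive value, which matches the symmetry around 0 described for $C_2$.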

Both definitions thus lead to a confidence measure for which a low value (a negative value in the case of the global confidence measure $C_2$) indicates high reliability that a keyword has been recognized correctly.

One advantage stated for these computable confidence measures is that no additional HMMs have to be trained and no elaborate manipulation of other affected parameters is necessary. The confidence measures can be calculated using general phoneme HMMs.

The definition of confidence measures shown above can be combined with a Viterbi search based on hidden Markov models. Each individual state $s_j$ of the HMMs then emits not the negative logarithm of a probability $P(O_t \mid s_j)$ but a local confidence measure $c_1$ or $c_2$ instead. Fig. 2 shows a sketch illustrating the determination of a confidence measure.

In the upper diagram of Fig. 2, discrete times $t_1, t_2, \ldots$ are plotted on the abscissa and the keyword SW, characterized by a sequence of states ZS, on the ordinate. The lower part of Fig. 2 shows a continuous speech signal over a time axis $t$.

The continuous speech signal can contain several keywords, including different ones, with preferably only one keyword being present at any one time.

The continuous speech signal is sampled at discrete times, and the information present at each sampling time is stored in a feature vector $O_t$. According to the invention, it is assumed that a keyword can begin at each of these sampling times. A potential keyword therefore begins at each of the times $t_1$, $t_2$ and $t_3$, and the paths of these potential keywords can recombine in the course of the Viterbi algorithm. For simplicity, a single keyword is assumed here; with several keywords, one instance of the method is needed for each keyword to be recognized.

If the keyword thus begins at time $t_1$, the feature vectors $O_t$ obtained from the continuous speech that follow time $t_1$ are mapped onto the states of the keyword. In each case, the path PF that is best with respect to the accumulated confidence measure is determined. A confidence measure $C$ results for each time $t$. Its value indicates whether or not the keyword was contained in the continuous speech and ended at time $t$.
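The search described above, in which a new path may begin at every sampling time and the Viterbi algorithm accumulates local confidence measures instead of emission scores, can be sketched as follows. This is a minimal sketch under several assumptions that are not taken from the text: a strictly left-to-right keyword model (stay in a state or advance by one), transition terms omitted or already folded into the local confidence values, and the start time carried forward along each path rather than recovered by backtracking afterwards. The name `spot_keyword` and the layout of `local_conf` are hypothetical.

```python
import math


def spot_keyword(local_conf):
    """Viterbi-style keyword search over local confidence measures.

    local_conf[t][j] is the local confidence of frame t in keyword state j
    (lower means more reliable). Returns (scores, starts): for every frame t
    the accumulated confidence of the best path that ends in the last
    keyword state at t, and the frame at which that path started.
    """
    n_states = len(local_conf[0])
    prev_score = [math.inf] * n_states
    prev_start = [None] * n_states
    scores, starts = [], []
    for t, frame in enumerate(local_conf):
        cur_score = [math.inf] * n_states
        cur_start = [None] * n_states
        for j in range(n_states):
            # Recombination: stay in state j or advance from state j - 1.
            best, start = prev_score[j], prev_start[j]
            if j > 0 and prev_score[j - 1] < best:
                best, start = prev_score[j - 1], prev_start[j - 1]
            # A fresh path may enter the first state at every frame.
            if j == 0 and best >= 0.0:
                best, start = 0.0, t
            if best < math.inf:
                cur_score[j] = best + frame[j]
                cur_start[j] = start
        scores.append(cur_score[-1])
        starts.append(cur_start[-1])
        prev_score, prev_start = cur_score, cur_start
    return scores, starts
```

Each entry of `scores` plays the role of the global confidence measure observed at that frame, and the matching entry of `starts` is the assumed keyword beginning that produced it.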

Fig. 2 shows, by way of example, paths that begin at times $t_1$, $t_2$ and $t_3$ and lead, at times $t_4$, $t_5$ and $t_6$, to the global confidence measures $C^{I}$, $C^{II}$ and $C^{III}$. The global confidence measures $C^{I}$ and $C^{II}$ correspond to a possible keyword beginning at $t_1$, whereas the global confidence measure $C^{III}$ is best reached by a path that begins at $t_2$.

Note that a global confidence measure $C$ is observed at every time $t$, and the associated start time is determined by applying the Viterbi algorithm.

If the continuous speech contains something entirely different from the keyword, the confidence measure is correspondingly poor and no recognition takes place. Moreover, owing to the way the Viterbi algorithm works, the paths used to determine the global confidence measure do not all have the same length. This is indicated by the fact that the global confidence measure $C^{I}$ is formed from the local confidence measures of four states, while the global confidence measures $C^{II}$ and $C^{III}$ consist of the local confidence measures of five states. The durations of the corresponding keywords are thus $4\Delta t$ and $5\Delta t$.

Fig. 3 illustrates this relationship. The global confidence measures $C^{I}$, $C^{II}$ and $C^{III}$ determined from Fig. 2 are plotted, by way of example, on the ordinate in Fig. 3. The abscissa again denotes the time $t$.

A separate global confidence measure $C$ results for each time $t$.

Preferably, a minimum MIN of the global confidence measure $C$ is determined, and it is assumed that the keyword is present in the continuous speech at this minimum MIN. This matters because the global confidence measure $C$ already falls below the threshold $T$ at a time $t_a$, that is, the keyword is already recognized then. Owing to the variable dynamic adaptation (different durations for determining the global confidence measure), however, the keyword can be recognized "even better" at immediately following times $t_{a+i}$, as shown by way of example in Fig. 3. To establish when the keyword is recognized optimally, the minimum MIN and the associated time $t_{MIN}$ are determined. Starting from this time $t_{MIN}$, backtracking (see [3]) is used to determine the start time of the keyword in the continuous speech signal. The beginning of the spoken keyword in the continuous speech signal is thus determined.
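The decision logic described above (first undershoot of the threshold at $t_a$, then a search for the minimum MIN at $t_{MIN}$) might look roughly like the following sketch. The function name `detect_keyword` and the `window` parameter, which limits how many frames after $t_a$ are inspected, are assumptions made for illustration; the text only states that the global confidence measure is observed over a period of time. Backtracking to the start time is not shown here.

```python
def detect_keyword(scores, threshold, window):
    """Locate the optimal recognition time of a keyword.

    scores[t] is the global confidence measure at frame t (lower is better).
    Returns (t_a, t_min, c_min) or None if the threshold is never undershot:
    t_a is the first frame whose score falls below the threshold, t_min the
    frame of the minimum within `window` frames starting at t_a.
    """
    for t_a, score in enumerate(scores):
        if score < threshold:  # strict "falls below the threshold"
            segment = scores[t_a:t_a + window]
            offset = min(range(len(segment)), key=segment.__getitem__)
            t_min = t_a + offset
            return t_a, t_min, scores[t_min]
    return None
```

The returned `t_min` is the frame from which the start of the keyword would then be traced back along the best path.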

Note that such a minimum determination can be carried out for each keyword, but that no other keyword can be recognized for the duration of a keyword. If several overlapping keywords are recognized in parallel in the continuous speech, the keyword preferably taken to be the correct one is the one whose confidence measure reflects the highest reliability compared with the other keywords.
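One possible reading of this rule for overlapping keywords is a greedy selection: candidates are considered in order of reliability, and a candidate is accepted only if it does not overlap a keyword that has already been accepted. The sketch below is an assumption about how this could be organized, not a procedure given in the text; the `Detection` record, its fields and the function name `resolve_overlaps` are hypothetical.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    keyword: str
    start: int          # first frame of the keyword
    end: int            # last frame of the keyword (inclusive)
    confidence: float   # global confidence measure, lower = more reliable


def resolve_overlaps(detections):
    """Keep, among overlapping detections, the most reliable one."""
    accepted = []
    # Visit candidates from most to least reliable.
    for cand in sorted(detections, key=lambda d: d.confidence):
        # No other keyword may be recognized during an accepted keyword.
        if all(cand.end < a.start or cand.start > a.end for a in accepted):
            accepted.append(cand)
    return sorted(accepted, key=lambda d: d.start)
```

Detections that do not overlap any accepted keyword are all retained, so several keywords at different places in the same utterance can still be recognized.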

The following publications have been cited in this document:

[1] M. Weintraub: "Keyword-spotting using SRI's DECIPHER large-vocabulary speech-recognition system", Proc. IEEE ICASSP, vol. 2, 1993, pages 463-466.

[2] H. Bourlard, B. D'hoore and J.-M. Boite: "Optimizing recognition and rejection performance in wordspotting systems", Proc. IEEE ICASSP, vol. 1, 1994, pages 373-376.

[3] L.R. Rabiner, B.H. Juang: "An Introduction to Hidden Markov Models", IEEE ASSP Magazine, 1986, pages 4-16.

[4] A. Hauenstein: "Optimierung von Algorithmen und Entwurf eines Prozessors für die automatische Spracherkennung" (Optimization of algorithms and design of a processor for automatic speech recognition), doctoral thesis, Lehrstuhl für Integrierte Schaltungen, Technische Universität München, 19 July 1993, pages 13-35.

Claims

1. A method for recognizing a keyword in spoken language,
a) in which the keyword is represented by a sequence of states $W$ of hidden Markov models,
b) in which the spoken language is sampled at a predetermined rate and, at each sampling time $t$, a feature vector $O_t$ is created for the speech signal of the spoken language belonging to the sampling time $t$,
c) in which a sequence $O$ of feature vectors $O_t$ is mapped onto the sequence of states by means of a Viterbi algorithm, a local confidence measure replacing an emission measure at a state,
d) in which the Viterbi algorithm delivers a global confidence measure $C$,
e) in which the keyword is recognized in the spoken language if

$$C(W, O) < T,$$

where
$C(\cdot)$ denotes the confidence measure,
$W$ the keyword, represented as a sequence of states,
$O$ the sequence of feature vectors $O_t$, and
$T$ a predetermined threshold value,
f) in which the keyword is otherwise not recognized in the spoken language.

2. The method according to claim 1, in which the emission measure is a negative logarithm of an emission probability.

3. The method according to claim 1 or 2, in which a new path begins in a first state of the sequence of states $W$ at each sampling time $t$.

4. The method according to one of the preceding claims, in which the Viterbi algorithm delivers a global confidence measure at each sampling time $t$.

5. The method according to one of the preceding claims, in which the confidence measure $C$ is determined by

$$C = -\log P(W \mid O)$$

and the associated local confidence measure is determined by

$$c = -\log \frac{P(O_t \mid s_j) \cdot P(s_j)}{P(O_t)},$$

where $s_j$ denotes a state of the sequence of states.

6. The method according to one of claims 1 to 4, in which the confidence measure $C$ is determined by

$$C = -\log \frac{P(O \mid W)}{P(O \mid \bar{W})}$$

and the associated local confidence measure is determined by

$$c = -\log \frac{P(O_t \mid s_j)}{P(O_t \mid \bar{s}_j)},$$

where $\bar{W}$ denotes not the keyword $W$ and $\bar{s}_j$ denotes not the state $s_j$.

7. The method according to one of the preceding claims, in which the global confidence measure is determined for a predetermined period of time and a start time of the keyword is deduced from a minimum of the global confidence measure.

8. The method according to claim 7, in which the minimum lies below a predetermined threshold.

9. A method for recognizing a plurality of keywords, in which a method according to one of the preceding claims is used in parallel for each keyword, wherein, as soon as several predetermined threshold values are undershot, the keyword with the better confidence measure is recognized.

10. The method according to claim 9, in which no further keyword is recognized for the period during which a keyword that has been recognized was contained in the spoken language.