TW200818802A - Systems, methods, and apparatus for signal change detection

Publication number
TW200818802A
Authority
TW
Taiwan
Prior art keywords
sequence
frame
spectral tilt
inactive
value
Prior art date
Application number
TW96128125A
Other languages
Chinese (zh)
Other versions
TWI467979B (en
Inventor
Vivek Rajendran
Ananthapadmanabhan A Kandhadai
Original Assignee
Qualcomm Inc
Application filed by Qualcomm Inc
Publication of TW200818802A
Application granted
Publication of TWI467979B

Landscapes

  • Telephone Function (AREA)
  • Telephonic Communication Services (AREA)

Abstract

Disclosed configurations include systems, methods, and apparatus arranged to generate a sequence of spectral tilt values that is based on inactive frames of a speech signal. For each of a plurality of inactive frames of the speech signal, a transmit decision is made according to a change calculated among at least two corresponding values of the sequence. The outcome of the transmit decision determines whether a silence description is transmitted for the corresponding inactive frame.

Description

IX. Description of the Invention

[Technical Field of the Invention]

The present disclosure relates to signal processing.

[Prior Art]

Transmission of voice by digital techniques has become widespread, particularly in long-distance telephony, packet-switched telephony such as Voice over IP (VoIP), and digital radio telephony such as cellular telephony. This growth has created interest in reducing the amount of information used to transfer a voice communication over a transmission channel while maintaining the perceived quality of the reconstructed speech.

Devices that are configured to compress speech by extracting parameters that relate to a model of human speech generation are called "speech coders." A speech coder generally includes an encoder and a decoder. The encoder typically divides the incoming speech signal (a digital signal representing audio information) into segments of time called "frames," analyzes each frame to extract certain relevant parameters, and quantizes the parameters into a binary representation, such as a set of bits or a binary data packet. The data packets are transmitted over a transmission channel (i.e., a wired or wireless network connection) to a receiver that includes a decoder. The decoder receives and processes the data packets, dequantizes them to produce the parameters, and recreates speech frames using the dequantized parameters.

In a typical conversation, each speaker is silent for about sixty percent of the time. Speech encoders are usually configured to distinguish frames of the speech signal that contain speech ("active frames") from frames of the speech signal that contain only silence or background noise ("inactive frames"). Such an encoder may be configured to use different coding modes and/or rates to encode active and inactive frames. For example, speech encoders are typically configured to transmit encoded inactive frames (also called "silence descriptors," "silence descriptions," or SIDs) at a lower bit rate than encoded active frames.

At any time during a full-duplex telephonic communication, it may be expected that the input to at least one of the speech encoders will be an inactive frame. It may be desirable for an encoder to transmit SIDs for fewer than all of the inactive frames. Such operation is also called discontinuous transmission (DTX). In one example, a speech encoder performs DTX by transmitting one SID for each string of 32 consecutive inactive frames. The corresponding decoder applies the information in the SID to update a noise generation model that is used by a comfort noise generation algorithm to synthesize inactive frames.

[Summary of the Invention]

A method of processing a speech signal according to one configuration includes generating a sequence of spectral tilt values that is based on a plurality of inactive frames of the speech signal. The method includes calculating a change among at least two values of the sequence of spectral tilt values and, for an inactive frame among the plurality of inactive frames, deciding whether to transmit a description of the frame. In this method, deciding whether to transmit a description of the frame is based on the calculated change.

A computer program product according to another configuration includes a computer-readable medium. The medium includes code for causing at least one computer to generate a sequence of spectral tilt values that is based on a plurality of inactive frames of a speech signal. The medium includes code for causing at least one computer to calculate a change among at least two values of the sequence of spectral tilt values, and code for causing at least one computer to decide, for an inactive frame among the plurality of inactive frames and based on the calculated change, whether to transmit a description of the frame.

An apparatus for processing a speech signal according to a further configuration includes a sequence generator configured to generate a sequence of spectral tilt values that is based on a plurality of inactive frames of the speech signal. The apparatus includes a calculator configured to calculate a change among at least two values of the sequence of spectral tilt values, and a comparator configured to decide, for an inactive frame among the plurality of inactive frames and based on the calculated change, whether to transmit a description of the frame.

An apparatus for processing a speech signal according to yet another configuration includes means for generating a sequence of spectral tilt values that is based on a plurality of inactive frames of the speech signal. The apparatus includes means for calculating a change among at least two values of the sequence of spectral tilt values, and means for deciding, for an inactive frame among the plurality of inactive frames and based on the calculated change, whether to transmit a description of the frame.

[Embodiments]

Configurations described herein include systems, methods, and apparatus for detecting signal change. For example, several configurations are disclosed for detecting a change during an inactive period of a signal and for initiating an update to a description of the signal based on that detection. These configurations are typically intended for use in packet-switched networks (e.g., wired and/or wireless networks arranged to carry voice transmissions according to protocols such as Voice over IP, or VoIP), although use in circuit-switched networks is also expressly contemplated and disclosed.

Unless expressly limited by its context, the term "calculating" is used herein to indicate any of its ordinary meanings, such as computing, evaluating, smoothing, and selecting from among a plurality of values. Where the term "comprising" is used in the present description and claims, it does not exclude other elements or operations. The term "A is based on B" is used to indicate any of its ordinary meanings, including the cases (i) "A is based on at least B" and (ii) "A is equal to B" (if appropriate in the particular context).

An encoder that implements DTX may be configured to discard (or "blank") most of the inactive frames according to a blanking scheme. One example of a blanking scheme issues an update to the silence description at regular intervals (e.g., once every 16 or 32 consecutive inactive frames). Other blanking schemes (also called "smart blanking" schemes) are configured to issue an update to the silence description upon detecting a fluctuation in energy and/or spectral characteristics that may indicate a change in the background noise.

Blanking schemes that rely only on energy fluctuations may sometimes fail to detect perceptually significant changes in the background noise. In some cases, inactive frames that are perceptually different will have similar energy characteristics (typically encoded as gain values). Although, for example, background noise in a street ("street noise") may have a distribution of energy over time that is similar to that of background noise in a crowded space ("babble noise"), these two types of noise are usually perceived very differently.
A blanking scheme that cannot distinguish among perceptually different types of noise may produce audible artifacts at the decoder. Because active frames also include, for example, the background noise, an audible discontinuity may occur when the decoder switches from decoded active frames to comfort noise generated from an inappropriate SID.

It is desirable for a blanking scheme to detect perceptually significant changes in the background noise. For example, it may be desirable for a blanking scheme to detect a sudden change in one or more spectral characteristics of the background noise (e.g., spectral tilt). A method or apparatus as described herein may be used to implement such a blanking scheme. Alternatively, a method or apparatus as described herein may be used to supplement another blanking scheme. For example, a speech encoder or method of speech encoding may combine a method or apparatus as described herein with a blanking scheme as described in U.S. Patent Application Publication No. 2006/0171419 (Spindola et al., published Aug. 3, 2006), or with another blanking scheme that is configured to detect changes in frame energy and/or changes in spectral characteristics of the speech signal (such as a difference between line spectral pair vectors).
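For contrast with the smart blanking schemes, the fixed-interval blanking scheme described earlier can be sketched as follows. This is only an illustrative sketch: the function name, the string labels, and the choice to send the first SID at the start of each run of inactive frames are assumptions, not details taken from the disclosure.

```python
def fixed_interval_blanking(is_inactive, interval=32):
    """Label each frame as 'active', 'sid', or 'blank'.

    Hypothetical policy: within a run of consecutive inactive frames,
    a SID is sent for the first frame and then once every `interval`
    inactive frames; all other inactive frames are blanked.
    """
    decisions = []
    run = 0  # number of inactive frames already seen in the current run
    for inactive in is_inactive:
        if not inactive:
            run = 0
            decisions.append("active")
        else:
            decisions.append("sid" if run % interval == 0 else "blank")
            run += 1
    return decisions


# One active frame followed by 40 inactive frames, with an interval of 16.
labels = fixed_interval_blanking([False] + [True] * 40, interval=16)
print(labels.count("sid"), labels.count("blank"))
```

A scheme of this kind ignores the content of the inactive frames entirely, which is why it cannot react to a change in the character of the background noise between two scheduled updates.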

FIG. 1A shows a flowchart of a method M100 according to a general configuration.
Based on a plurality of inactive frames of a speech signal, task T200 generates a sequence of spectral tilt values. Task T400 calculates a change within the sequence of spectral tilt values (e.g., a change among at least two values of the sequence). For an inactive frame of the speech signal, task T500 decides whether to transmit a description of the frame, where the decision is based on the calculated change. For example, the decision whether to transmit the description may be based on a relation between (A) a magnitude of the calculated change and (B) a threshold value.

In a typical implementation of method M100, each value of the sequence of spectral tilt values is based on the spectral tilt of a corresponding inactive frame. The spectral tilt of a frame of a speech signal is a value that describes the distribution of energy within the frame over a frequency range. Typically, the spectral tilt indicates the slope of the spectrum of the signal over the corresponding frame, and it may be positive or negative. The act of producing the next value of the sequence of spectral tilt values is also called "updating" the sequence.

The values of the sequence of spectral tilt values are typically arranged in time order, such that successive values of the sequence correspond to segments of the signal that are successive in time. A sequence arranged in this manner may be said to represent a contour that describes changes over time in the spectral slope of the speech signal (i.e., a spectral tilt contour).

Task T200 may be implemented to generate the sequence of spectral tilt values in any of several different ways. For example, task T200 may be configured to receive the sequence from a storage element or array (e.g., a semiconductor memory unit or array), from another task of a larger process (e.g., a method of speech encoding), or from an element of an apparatus such as a speech encoder. Alternatively, task T200 may be configured to calculate the sequence as described herein.
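To make the notion of spectral tilt concrete, here is a minimal sketch of one common tilt indicator, the normalized lag-1 autocorrelation of a frame. Using this particular indicator is an assumption made for illustration; it is closely related to the first reflection coefficient of linear prediction analysis (sign conventions vary between codecs), and the function name is hypothetical.

```python
def tilt_indicator(frame):
    """Normalized lag-1 autocorrelation of one frame, in [-1, 1].

    Values near +1 suggest energy concentrated at low frequencies;
    values near -1 suggest energy concentrated at high frequencies.
    `frame` is a list of samples. Assumption: an all-zero frame maps to 0.0.
    """
    r0 = sum(s * s for s in frame)          # lag-0 autocorrelation (energy)
    if r0 == 0:
        return 0.0
    r1 = sum(a * b for a, b in zip(frame, frame[1:]))  # lag-1 autocorrelation
    return r1 / r0


low_frequency_frame = [1.0] * 160          # constant signal, 160 samples
high_frequency_frame = [1.0, -1.0] * 80    # sign alternates every sample
print(tilt_indicator(low_frequency_frame) > 0)
print(tilt_indicator(high_frequency_frame) < 0)
```

Computing one such value per inactive frame, in time order, yields a sequence of the kind that task T200 generates.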
Task T200 may be configured to output the received or calculated sequence (also denoted herein as x) as the generated sequence of spectral tilt values. Alternatively, task T200 may be configured to generate the sequence of spectral tilt values by performing one or more other operations on sequence x. Such other operations may include selecting another sequence from among the values of sequence x: for example, selecting every n-th value (where n is an integer greater than 1) and/or selecting only those values that correspond to inactive frames. As described herein, such other operations may also include smoothing the received, calculated, or selected sequence.

The duration in time of each segment of the speech signal (also called a "segment" or "frame") is typically selected to be short enough that the spectral envelope of the signal may be expected to remain relatively stationary. For example, one typical frame length is 20 milliseconds, which corresponds to 160 samples at a sampling rate of 8 kilohertz (kHz), although any frame length or sampling rate deemed suitable for the particular application may be used. In some applications the frames are nonoverlapping, while in other applications an overlapping frame scheme is used. For example, speech coders commonly use an overlapping frame scheme at the encoder and a nonoverlapping frame scheme at the decoder.

In a typical application, an array of logic gates is configured to perform one, more than one, or even all of the various tasks of method M100. For example, the task or tasks may be implemented as machine-executable code to be executed by a programmable array such as a processor. The tasks of method M100 may also be performed by more than one such array. In these or other implementations, the tasks may be performed within a device for wireless communications, such as a cellular telephone or other device having such communications capability.
Such a device may be configured to communicate with circuit-switched and/or packet-switched networks (e.g., using one or more protocols such as VoIP). For example, such a device may include RF circuitry configured to transmit encoded active frames and SIDs. Method M100 may also be implemented as machine-readable code embodied in a computer program product (e.g., one or more data storage media such as disks, flash or other nonvolatile memory cards, semiconductor memory chips, etc.).

In a typical application of method M100, task T400 iterates over the sequence of spectral tilt values generated by task T200 to calculate a series of changes based on successive pairs of spectral tilt values, and task T500 iterates over the series of changes to perform a series of transmit decisions. In general, task T200 executes as an ongoing process, and tasks T400 and T500 iterate serially or in parallel, such that a spectral tilt value and a corresponding calculated change and transmit indication are produced for each inactive frame of the speech signal (possibly after an initialization period of one or more inactive frames). It is also possible to implement method M100 such that task T200 produces a spectral tilt value less often than once per inactive frame (e.g., for every second or third frame), such that task T400 executes as often as task T200 or less often (e.g., for every second or third iteration of task T200), and/or such that task T500 executes as often as task T400 or less often (e.g., for every second or third iteration of task T400).

FIG. 1B shows a block diagram of an apparatus A100 according to a general configuration.
Sequence generator 120 is configured to generate a sequence of spectral tilt values that is based on a plurality of inactive frames of a speech signal. For example, sequence generator 120 may be configured to perform an implementation of task T200 as disclosed herein. Calculator 140 is configured to calculate a change among at least two values of the sequence of spectral tilt values. For example, calculator 140 may be configured to perform an implementation of task T400 as disclosed herein. Comparator 150 is configured to decide whether to transmit a description of an inactive segment of the speech signal, where the decision is based on the calculated change (e.g., on a relation between (A) a magnitude of the calculated change and (B) a threshold value). For example, comparator 150 may be configured to perform an implementation of task T500 as disclosed herein. In a typical application, an implementation of apparatus A100 is arranged to process the sequence of spectral tilt values and to generate a series of transmit decisions based on the sequence.

The various elements of apparatus A100 may be implemented in any combination of hardware, software, and/or firmware deemed suitable for the intended application. For example, any of these elements may be implemented as one or more arrays of logic gates. Any two or more, or even all, of these elements may be implemented within the same array or arrays. Such an array or arrays may be implemented within one or more chips (e.g., within a chipset including two or more chips). Any of the various elements of apparatus A100 may also be implemented as one or more computers (e.g., arrays programmed to execute one or more sets or sequences of instructions, also called "processors"), and any two or more, or even all, of these elements may be implemented within the same such computer or computers. The various elements of apparatus A100 may be included within a device for wireless communications, such as a cellular telephone or other device having such communications capability. Such a device may be configured to communicate with circuit-switched and/or packet-switched networks (e.g., using one or more protocols such as VoIP).
For example, such a device may include a speech encoder configured to transmit SIDs according to the results of the corresponding transmit decisions, and/or RF circuitry configured to transmit encoded active frames and SIDs.

One example of a parameter whose value may be used to indicate the spectral tilt of a frame is the first reflection coefficient; other such parameters are described below. Task T200 may be arranged to receive the sequence of spectral tilt values from another task of a larger process (such as a method of speech encoding). Alternatively, task T200 may be implemented to include a task T210 that is configured to calculate these values, as described below. Likewise, sequence generator 120 may be arranged to receive the sequence of spectral tilt values from another element of a larger apparatus, such as a speech encoder or communications device. Alternatively, sequence generator 120 may be implemented to include a calculator 128 that is configured to calculate these values, as described below.

Task T200 may be implemented to include a task T300 that smooths the sequence of spectral tilt values. A typical implementation of task T300 is configured to filter the sequence of spectral tilt values according to an autoregressive model, such as an infinite impulse response (IIR) filter. A particular example of task T300 performs the following first-order IIR filtering operation, which calculates each value of the smoothed sequence y as a weighted average of the current value of the input sequence of spectral tilt values x and the previous value of smoothed sequence y:

$$y[n] = a\,x[n] + (1 - a)\,y[n-1], \tag{1}$$

where n denotes a sequence index. Depending on the desired degree of smoothing, gain factor a may have any value from 0 to 1. In general, gain factor a has a value not greater than 0.6. For example, gain factor a may have a value in the range from 0.1 (or 0.15) to 0.4 (or 0.5).
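The first-order smoothing recursion of expression (1) can be sketched in a few lines. This is an illustrative sketch only: the function name is hypothetical, the default gain of 0.2 is simply one value inside the range quoted above, and seeding the filter state with the first input value is an assumption about initialization that the text does not specify.

```python
def smooth_tilt(x, a=0.2, y_init=None):
    """Apply y[n] = a*x[n] + (1 - a)*y[n-1] to a list of tilt values.

    x      : list of spectral tilt values (one per inactive frame)
    a      : gain factor, 0 <= a <= 1 (smaller means heavier smoothing)
    y_init : initial filter state y[-1]; assumed equal to x[0] if omitted
    """
    y = []
    prev = x[0] if (y_init is None and x) else y_init
    for value in x:
        prev = a * value + (1.0 - a) * prev
        y.append(prev)
    return y


# A step in the tilt values is followed only gradually by the smoothed contour.
print(smooth_tilt([0.0, 1.0, 1.0, 1.0], a=0.2))
```

With a small gain factor, an isolated outlier in the tilt values moves the smoothed contour only slightly, whereas a sustained change in the background noise accumulates over several frames.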
In a particular example, sequence x is a series of values of the first reflection coefficient, and gain factor a has the value 0.2 (zero point two). FIG. 1C shows a flowchart of an implementation M101 of method M100 in which task T200 is implemented as task T300. FIG. 1D shows a block diagram of an implementation A101 of apparatus A100 in which sequence generator 120 is implemented as a smoother 130 that is configured to perform an implementation of task T300.

FIG. 2 shows a block diagram of one example of an implementation 132 of smoother 130. Smoother 132 includes a first multiplier arranged to apply a gain factor G10 to the current value x[n] of the input sequence of spectral tilt values; a second multiplier arranged to apply a gain factor G20 to the previous value y[n-1] of the smoothed sequence of spectral tilt values, as obtained from delay element D; and an adder arranged to output y[n] as the sum of the two products. It may be desirable (e.g., for stability) for gain factor G10 to have the value a as described above with reference to task T300 and for gain factor G20 to have the value (1 - a). In a particular example, sequence x is a series of values of the first reflection coefficient, gain factor G10 has the value 0.2 (zero point two), and gain factor G20 has the value 0.8 (zero point eight). As noted above, smoother 132 may be implemented in any combination of hardware, software, and/or firmware deemed suitable for the intended application.

Alternatively or additionally, task T300 may be configured to calculate the values of the smoothed sequence y by performing one or more other averaging, integration, and/or lowpass filtering operations on the sequence of spectral tilt values x (or on the result of performing a smoothing operation on sequence x). For example, in one alternative implementation of method M100, task T300 is configured to filter sequence x according to a moving average model, such as a finite impulse response (FIR) filter.
In another alternative implementation of method M100, task T300 is configured to filter sequence x according to an autoregressive moving average (ARMA) model. Similarly, smoother 130 may be implemented as an integrator or other lowpass filter (such as an FIR or ARMA filter) configured to produce a smoothed value based on two or more input values.

Method M100 is typically implemented such that each value of the sequence of spectral tilt values x that is smoothed in task T300 corresponds to one of a plurality of successive frames of the speech signal. Similarly, apparatus A100 is typically implemented such that each value of the sequence x that is smoothed by smoother 130 corresponds to one of a plurality of successive frames of the speech signal. Note that these successive frames need not be consecutive, as described in more detail below.

A speech signal will typically contain active frames as well as inactive frames. However, the energy distribution during an active frame is likely to be due primarily to factors other than the background noise, so that energy distribution values from active frames are unlikely to provide reliable information about changes in the background noise. Therefore, it may be desirable for the sequence of spectral tilt values x to include only values that correspond to inactive frames. In such a case, the values of sequence x may correspond to successive (inactive) frames that are not consecutive in the speech signal.

To illustrate this principle, FIG. 3 shows an example in which each circle represents one of a series of consecutive frames of the speech signal over time. Each circle that represents an inactive frame is labeled with the index number of the corresponding value in the sequence of spectral tilt values x. In this example, values 74 and 75 are consecutive in the sequence. Although the inactive frames that correspond to values 74 and 75 are successive in the speech signal, they are separated by a block of active frames and therefore are not consecutive with one another.

Method M100 may be arranged such that task T300 receives only the spectral tilt values of sequence x that correspond to inactive frames.
Alternatively, task T300 may be implemented to select, from among a sequence of spectral tilt values that corresponds to consecutive frames, only those values that correspond to inactive frames. For example, such an implementation of task T300 may be configured to select the spectral tilt values that correspond to inactive frames (and/or to reject the values that correspond to active frames) based on voice activity indications received from a speech encoder, a method of speech encoding, or voice activity detection task T100 as described below.

Likewise, apparatus A100 may be arranged such that smoother 130 receives only those spectral tilt values of sequence x that correspond to inactive frames. Alternatively, smoother 130 may be implemented to select, from among a sequence of spectral tilt values that corresponds to consecutive frames, only those values that correspond to inactive frames. For example, such an implementation of smoother 130 may be configured to select the spectral tilt values that correspond to inactive frames (and/or to reject the values that correspond to active frames) based on voice activity indications received from a speech encoder, a method of speech encoding, or voice activity detector 110 as described below.

Task T400 calculates a change between at least two values of the sequence of spectral tilt values produced by task T200. For example, task T400 may be configured to calculate a difference (also called a "delta") between consecutive values of the smoothed sequence y according to an expression such as

$$z[n] = y[n] - b\,y[n-1], \qquad (2)$$

where z denotes the output and b denotes a gain factor. FIG. 4 shows an implementation 142 of calculator 140 that may be used to perform the particular case of this example of task T400 in which b is equal to one (i.e., according to the first-order FIR highpass filtering operation $z[n] = y[n] - y[n-1]$). Other implementations of calculator 140 and/or task T400 may be configured to apply this filtering operation using a different value of b.
For example, the value of b may be selected according to a desired frequency response. For a case in which task T200 is configured to produce the sequence x, such an implementation of task T400 or calculator 142 may be arranged to calculate the difference according to an expression such as $z[n] = x[n] - x[n-1]$. As noted above, calculator 142 may be implemented in any combination of hardware, software, and/or firmware that is deemed suitable for the intended application.

Additionally or in the alternative, task T400 may be configured to perform one or more other differencing operations on the produced sequence of spectral tilt values, such as a different highpass filtering operation (e.g., applying a first-order IIR highpass filter to the produced sequence), or to otherwise calculate a distance or other change between values of the produced sequence. Similarly, calculator 140 may be implemented as a differentiator, difference calculator, or other highpass IIR or FIR filter that is configured to calculate a difference or other distance or change between two or more input values.

The change calculated by task T400 may be used to indicate a rate of change of the produced sequence of spectral tilt values. For example, the magnitude of z[n] as described above may be used to indicate how much the spectral tilt contour of the background noise has changed from one inactive frame to the next. Task T400 is typically arranged to calculate a series of distances iteratively, the magnitudes of which represent the rate of change of the smoothed contour over the respective frame periods.

Task T500 decides whether to transmit a description of an inactive segment of the speech signal, where the decision is based on the corresponding change calculated by task T400. For example, task T500 may be configured to decide whether to transmit the description by comparing a magnitude of the calculated change to a threshold value T.
Such an implementation of task T500 may be configured to set a binary flag according to the result of this comparison:

$$p[n] = \begin{cases} 1, & |z[n]| > T \\ 0, & \text{otherwise,} \end{cases} \qquad (3)$$

where the value of the flag p[n] indicates the outcome of the transmit decision. In this case, a p[n] value of one or logical TRUE is a positive transmit indication (i.e., a transmit indication having a positive state, a transmit-enable indication, an indication of a decision to transmit), which indicates that an update to the silence description should be transmitted for the current frame; and a p[n] value of zero or logical FALSE is a negative transmit indication (i.e., a transmit indication having a negative state, a transmit-disable indication, an indication of a decision not to transmit), which indicates that an update to the silence description should not be transmitted for the current frame. In one example, the threshold value T has the value 0.2. A lower threshold value may be used to provide greater sensitivity to changes in the produced sequence of spectral tilt values, while a higher threshold value may be used to provide greater rejection of transients in the produced sequence of spectral tilt values.
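The difference of expression (2) and the threshold decision of expression (3) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function names `deltas` and `transmit_flags` and the sample values are hypothetical and do not come from the disclosure; only the gain factor b = 1 and the threshold 0.2 are taken from the text.

```python
def deltas(y, b=1.0):
    """Expression (2): z[n] = y[n] - b*y[n-1], evaluated for n >= 1."""
    return [y[n] - b * y[n - 1] for n in range(1, len(y))]


def transmit_flags(z, threshold=0.2):
    """Expression (3): the flag is 1 when |z[n]| exceeds the threshold, else 0."""
    return [1 if abs(v) > threshold else 0 for v in z]


# Hypothetical smoothed spectral tilt values for six successive inactive frames.
y = [0.50, 0.52, 0.55, 0.90, 0.88, 0.40]
z = deltas(y)

flags = transmit_flags(z)             # threshold T = 0.2, as in the example above
sensitive = transmit_flags(z, 0.01)   # a lower threshold reacts to smaller changes

print(flags)      # only the two large jumps are flagged
print(sensitive)  # every frame-to-frame change is flagged
```

With the lower threshold every small fluctuation triggers an update, which is the sensitivity trade-off described above.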
One of ordinary skill in the art will recognize that in an alternative implementation of method M100, task T400 may calculate the change as a magnitude according to an expression such as

$$z[n] = \left| y[n] - b\,y[n-1] \right|,$$

and task T500 may be configured to set the binary flag according to the result of a comparison such as

$$p[n] = \begin{cases} 1, & z[n] > T \\ 0, & \text{otherwise.} \end{cases}$$

Method M100 may also be implemented to include a different variation of task T500, such as an implementation that compares the threshold value to an average magnitude of two or more of the calculated changes (e.g., an average magnitude of the calculated changes for the current and previous frames).

FIG. 5 shows a block diagram of an implementation 152 of comparator 150 that may be used to perform an implementation of task T500. In this example, comparator 152 is configured to perform the transmit decision by calculating a magnitude of the calculated change and comparing that magnitude to a threshold value T10. In one particular example, threshold value T10 has the value 0.2 (zero point two). FIG. 6 shows a block diagram of another implementation 154 of comparator 150 that may be used to perform an implementation of task T500. In this example, comparator 154 is configured to compare the signed value of the calculated change to a positive threshold value T10 and to a negative threshold value T20, respectively, and to issue a positive transmit indication if the calculated change is greater than (alternatively, not less than) threshold value T10 or less than (alternatively, not greater than) threshold value T20. In one example, threshold value T20 has a value that is the negative of threshold value T10, such that comparators 152 and 154 are configured to produce the same result. However, comparator 154 may also be implemented such that threshold value T20 has a different magnitude than threshold value T10, if desired.
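The relation between the magnitude comparison of comparator 152 and the two-threshold comparison of comparator 154 can be sketched as follows. The function names and test values are hypothetical; the sketch assumes the strict "greater than" and "less than" variants of each comparison.

```python
def comparator_152(z, t10=0.2):
    """Magnitude comparison: positive indication when |z| exceeds T10."""
    return abs(z) > t10


def comparator_154(z, t10=0.2, t20=-0.2):
    """Signed comparison: positive indication when z > T10 or z < T20."""
    return z > t10 or z < t20


# With T20 = -T10 the two comparators agree for every input.
samples = [-0.5, -0.21, -0.2, -0.05, 0.0, 0.1, 0.2, 0.3]
agree = all(comparator_152(v) == comparator_154(v) for v in samples)

# With a different magnitude for T20 the decision becomes asymmetric:
# a fall of 0.15 is reported, while a rise of 0.15 is not.
fall_reported = comparator_154(-0.15, t10=0.2, t20=-0.1)
rise_reported = comparator_154(0.15, t10=0.2, t20=-0.1)
```

An asymmetric pair of thresholds like this is one way to react more readily to changes in one direction of the spectral tilt than in the other.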
Another implementation of comparator 150 is arranged to receive the calculated change from calculator 140 as a magnitude and to compare this magnitude to threshold value T10. As noted above, such implementations of comparator 150 (i.e., including comparators 152 and 154) may be implemented in any combination of hardware, software, and/or firmware that is deemed suitable for the intended application. FIG. 7A shows a block diagram of an implementation A102 of apparatus A100 that is configured to perform various operations as described above on an input signal x[n] to produce a corresponding transmit indication.

FIG. 8A shows an example of a source code listing for a set of instructions that may be executed by an array of programmable logic elements or other state machine (e.g., a computer or processor) to perform an implementation of method M101 that includes implementations of tasks T300, T400, and T500. In this example, the variable k0 holds the spectral tilt value for the current frame, the variable y_current initially holds the most recent value of the smoothed sequence y of spectral tilt values, and the flag p holds the state of the transmit indication. Part 1 performs task T300 by calculating the current value of the smoothed sequence y according to expression (1) above, using the value 0.2 for gain factor a. Part 2 performs task T400 by calculating the change between the current and most recent values of the smoothed sequence y according to expression (2) above, using the value one for gain factor b. Part 3 performs task T500 by setting the flag p according to the result of a comparison between the calculated change and the threshold value, using the threshold value 0.2.
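The three-part routine described for FIG. 8A might look roughly like the Python sketch below. The figure itself is not reproduced in this text, so this is a reconstruction under assumptions rather than the listing from the figure: in particular, the smoothing step assumes that expression (1) has the first-order form y[n] = a·y[n-1] + (1-a)·x[n], and the function name `process_inactive_frame` is hypothetical. The names k0, y_current, and p and the constants 0.2, 1, and 0.2 follow the description.

```python
A = 0.2  # gain factor a of the smoother (value quoted in the text)
B = 1.0  # gain factor b of the difference (value quoted in the text)
T = 0.2  # threshold value (value quoted in the text)


def process_inactive_frame(k0, y_current):
    """Handle one inactive frame; return (updated y_current, flag p)."""
    y_previous = y_current
    # Part 1 (task T300): smooth the tilt value.
    # ASSUMPTION: expression (1) is y[n] = a*y[n-1] + (1-a)*x[n].
    y_current = A * y_previous + (1.0 - A) * k0
    # Part 2 (task T400): change between current and most recent smoothed values.
    z = y_current - B * y_previous
    # Part 3 (task T500): compare the magnitude of the change to the threshold.
    p = 1 if abs(z) > T else 0
    return y_current, p


# Hypothetical tilt values: steady background noise, then an abrupt change.
y_state = 0.5
flags = []
for k0 in [0.5, 0.5, 0.9, 0.9]:
    y_state, p = process_inactive_frame(k0, y_state)
    flags.append(p)
```

The smoothed value is passed in and returned explicitly so that the caller carries the state from one inactive frame to the next.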
In a typical application, the set of instructions is executed iteratively (e.g., for each inactive frame), such that the initial value of the variable y_current for each iteration is the final value of the variable y_current as calculated during the previous iteration.

As noted above, task T300 may be configured to calculate the current value of the smoothed sequence y based on one or more past values of the sequence x of spectral tilt values and/or one or more past values of the smoothed sequence y. For the initial values of the smoothed sequence y, however, past values of sequence x and/or of smoothed sequence y may not exist. If task T300 calculates a value of the smoothed sequence y using an arbitrary or zero value in place of a past value, the result may cause task T400 to output a calculated change that is inappropriately large, which may in turn cause task T500 to output a positive transmit indication even when the spectral tilt contour is actually constant.

It may be desirable to initialize one or more variables (e.g., data storage locations) that are configured to hold past values of sequence x and/or of smoothed sequence y. Such initialization may be performed before the first execution of task T300 and/or may be performed within task T300. For example, one or more such variables may be initialized to the current value of sequence x. In a particular example, a variable configured to store a past value of the smoothed sequence (y[n-1] in expression (1) above) is initialized to the current value of the input sequence (x[n] in expression (1) above).
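The start-up effect described above can be reproduced with a small Python sketch. It again assumes that expression (1) has the form y[n] = a·y[n-1] + (1-a)·x[n] with a = 0.2, and the function name `flags_for` is hypothetical. The input is a constant spectral tilt contour, so any positive transmit indication is an artifact of how the stored past value was initialized.

```python
A = 0.2  # gain factor a (assumed smoothing form, see lead-in)
T = 0.2  # threshold value


def flags_for(tilts, y_init=None):
    """Return the transmit flags for a non-empty list of tilt values.

    y_init=None initializes the stored past value to the first input value;
    otherwise the given (arbitrary) starting value is used.
    """
    y_prev = tilts[0] if y_init is None else y_init
    flags = []
    for k0 in tilts:
        y = A * y_prev + (1.0 - A) * k0           # smoothing (assumed form)
        flags.append(1 if abs(y - y_prev) > T else 0)
        y_prev = y
    return flags


constant_tilt = [0.6, 0.6, 0.6, 0.6]
zero_start = flags_for(constant_tilt, y_init=0.0)   # past value started at zero
initialized = flags_for(constant_tilt)              # past value = current input
```

Starting from zero produces a spurious positive indication on the first frame even though the contour never changes, while initializing to the current input value produces none.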
For a different example in which task T400 is arranged to calculate the change based on the values x[n] and x[n-1], a variable configured to store the past value x[n-1] of the input sequence is initialized to the current value x[n] of the input sequence. Additionally or in the alternative, method M100 may be configured to avoid outputting a positive transmit indication for the first few inactive frames (e.g., by forcing task T500 to output a transmit indication having a negative state for those frames). In such a case, task T200 (possibly including task T300) may be configured to use an arbitrary or zero value for each of one or more of the past values, rather than initializing those variables as described herein.

FIG. 8B shows another example of a source code listing for a set of instructions that may be executed by an array of programmable logic elements or other state machine (e.g., a processor) to perform an implementation of method M101 that includes an implementation T310 of task T300 and implementations of tasks T400 and T500. In this example, task T310 includes an initialization operation that uses the variable Y_VALID to indicate whether the set of instructions has been called previously, and thus whether the value stored in the variable y_current is valid. In this case, the calling routine (e.g., a larger program, such as a method of speech encoding) would be configured to initialize the value of Y_VALID to FALSE before calling the set of instructions. If the set of instructions determines that the value of Y_VALID is FALSE (i.e., that the set of instructions is being executed for the first time), then the variable y_current is initialized to the current value of the variable k0.

A silence description (SID) typically includes a description of the spectral envelope of a frame and/or a description of the energy envelope of a frame. These descriptions may be derived from the current inactive frame and/or from one or more previous inactive frames.
A SID may also be called by other names, such as "silence description update", "silence descriptor", "silence insertion descriptor", "comfort noise descriptor frame", and "comfort noise parameters". In the particular example of the Enhanced Variable Rate Codec (EVRC) as described in the document 3GPP2 C.S0014-C version 1.0, "Enhanced Variable Rate Codec, Speech Service Options 3, 68, and 70 for Wideband Spread Spectrum Digital Systems", SIDs are encoded at eighth rate (16 bits per frame) using a noise-excited linear prediction (NELP) coding mode, while active frames are encoded at full rate (171 bits per frame), half rate (80 bits per frame), or quarter rate (40 bits per frame) using code-excited linear prediction (CELP), prototype pitch period (PPP), or NELP coding modes.

A spectral envelope description typically includes a set of coding parameters such as filter coefficients, reflection coefficients, line spectral frequencies (LSFs), line spectral pairs (LSPs), immittance spectral frequencies (ISFs), immittance spectral pairs (ISPs), cepstral coefficients, or log area ratios. The set of coding parameters, which may be arranged as one or more vectors, is typically quantized as one or more indices into corresponding lookup tables or "codebooks".

Typical lengths of the spectral envelope description within a SID currently range from eight to 28 bits. In the particular example of the EVRC as described in 3GPP2 C.S0014-C version 1.0 cited above, each 16-bit SID includes a four-bit index LSPIDX1 into a codebook for low-frequency information of the spectral envelope and a four-bit index LSPIDX2 into a codebook for high-frequency information of the spectral envelope.
In the particular example of the Adaptive Multi-Rate (AMR) speech codec as described in the document ETSI TS 126 092 V6.0.0 (European Telecommunications Standards Institute (ETSI), Sophia Antipolis Cedex, FR, December 2004), each 35-bit SID includes an eight- or nine-bit index for each of three LSF subvectors. In the particular example of the AMR Wideband speech codec as described in the document ETSI TS 126 192 V6.0.0 (ETSI, December 2004), each 35-bit SID includes a five- or six-bit index for each of five ISF subvectors.

An energy envelope description may include a gain value to be applied to the frame (also called a "gain frame"). Additionally or in the alternative, an energy envelope description may include gain values to be applied to each of several subframes of the frame (collectively called a "gain profile"). Typically the gain frame and/or gain profile is quantized as one or more indices into corresponding codebooks, although in some cases an algorithm may be used to quantize and/or dequantize the gain frame and/or gain profile without using a codebook. Typical lengths of the energy envelope description within a SID currently range from five to eight bits. In the particular example of the EVRC as described in 3GPP2 C.S0014-C v1.0 (version 1.0) cited above, each 16-bit SID includes an eight-bit energy index FGIDX. In the particular examples of the AMR speech codec as described in ETSI TS 126 092 V6.0.0 cited above and the AMR Wideband speech codec as described in ETSI TS 126 192 V6.0.0 cited above, each 35-bit SID includes a six-bit energy index.

Method M100 or apparatus A100 may be used as a blanking mechanism to support DTX. For example, a procedure that includes method M100, or a device that includes apparatus A100, may be configured to perform transmission of a SID only when the state of the transmit indication produced by task T500 is positive.
Other blanking mechanisms may also be used to support DTX. One such example is a method or apparatus that issues a positive SID transmit indication whenever the number of consecutive inactive frames that have occurred since the most recent SID transmission reaches (alternatively, exceeds) a threshold value DTX_MAX. Typical values of DTX_MAX include 16 and 32. Another example of a blanking mechanism issues a positive SID transmit indication whenever the number of consecutive inactive frames that have occurred since the most recent active frame reaches (alternatively, exceeds) a threshold value.

Other blanking mechanisms that may be used to support DTX include mechanisms configured to issue a positive SID transmit indication upon detecting a change in the energy and/or spectral envelope description of the speech signal. For example, such a mechanism may be configured to issue a positive SID transmit indication, which indicates a decision to transmit a description of the current inactive frame, upon detecting that a distance between the spectral envelope description of the frame (e.g., an LSF, LSP, ISF, or ISP vector) and the spectral envelope description of the last transmitted SID exceeds (alternatively, is not less than) a threshold value. It may be desirable to filter (e.g., smooth) the spectral envelope descriptions before calculating the distance. One variation of this mechanism is configured to issue a positive SID transmit indication when it also detects that a distance between the energy envelope description of the current inactive frame and the energy envelope description of the last transmitted SID exceeds (alternatively, is not less than) a threshold value. Another variation is configured to issue a positive SID transmit indication when it detects that either of these conditions is satisfied.
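The counter-based mechanism mentioned at the start of this passage can be sketched in a few lines of Python. The function name `counter_flags` is hypothetical, and the sketch makes two simplifying assumptions: the counter starts immediately after a SID transmission, and only a run of inactive frames is modeled (no intervening active frames and no SID transmissions triggered by other mechanisms). The "reaches" variant of the comparison is used.

```python
def counter_flags(num_inactive_frames, dtx_max=16):
    """Return one flag per inactive frame; 1 means 'transmit a SID now'."""
    flags = []
    since_last_sid = 0  # consecutive inactive frames since the last SID
    for _ in range(num_inactive_frames):
        since_last_sid += 1
        if since_last_sid >= dtx_max:   # count reaches DTX_MAX
            flags.append(1)
            since_last_sid = 0          # a SID is sent, so restart the count
        else:
            flags.append(0)
    return flags


flags = counter_flags(40, dtx_max=16)
positive_frames = [i for i, f in enumerate(flags) if f]
```

With DTX_MAX equal to 16, a run of 40 inactive frames yields a refresh on the 16th and 32nd frames, so the receiver's comfort noise parameters are updated periodically even when the background noise does not change.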
Other blanking mechanisms that may be used include mechanisms configured to issue a positive SID transmit indication according to a comparison between a threshold value and a value (which may be filtered and/or weighted) such as an average absolute value of the frame or an energy value of the frame (e.g., a sum of squared samples).

Another example of a blanking mechanism that may be used to support DTX is configured to issue a positive SID transmit indication upon detecting that an Itakura distance between the last transmitted SID and the current inactive frame exceeds (alternatively, is not less than) a threshold value. One variation of this mechanism is configured to issue a positive SID transmit indication upon detecting that an Itakura distance between (A) the last transmitted SID and (B) an average of the current inactive frame and previous inactive frames exceeds (alternatively, is not less than) a threshold value. The Itakura distance is a measure of spectral change that is based on autocorrelation and residual energy values, and a description of such a mechanism may be found in ITU-T Recommendation G.729 Annex B (International Telecommunication Union, Geneva, CH, October 1996).
An implementation of method M100 or apparatus A100 may be combined with one or more other blanking mechanisms, such as one or more of those described above. For example, an apparatus that includes or performs such an implementation may be configured to transmit a SID for a frame if any of its blanking mechanisms issues a positive SID transmit indication for that frame. FIG. 7B shows one implementation of such an example, in which a logical OR operation is used to combine several different transmit indications into a composite transmit indication.

As noted above, a SID may be derived from one or more inactive frames. For example, it may be desirable for a device that includes apparatus A100, or a procedure that includes method M100, to calculate and transmit a SID that represents an average of several encoded inactive frames, rather than transmitting a SID according to a single encoded inactive frame. Such an average may be calculated using an FIR or IIR filtering operation and/or by using a statistical method such as median filtering, which may include discarding outliers or replacing them with the median value. For example, the device or procedure may be configured to calculate the SID by statistically smoothing the energy and spectral envelope descriptions of the current frame with those of one or more previous inactive frames, such that the resulting SID contains the gain and frequency values that have occurred most often in the recent past.

The number of frames over which the average is calculated may be fixed, or it may vary according to, for example, a measure of stationarity. One example of such a measure is a distance (e.g., an Itakura distance) between spectral averages obtained over two different sets of frames. In one such example as described in Annex B of G.729 as cited above, averages are calculated over six past frames (including the current frame) and over two past frames.
If the distance between these two averages exceeds (alternatively, is not less than) a threshold value, then the SID includes the spectral description averaged over two frames (e.g., on the assumption that the signal is locally nonstationary). Otherwise, the SID includes the spectral description averaged over six frames (e.g., on the assumption that the signal is locally stationary). In the particular example of the AMR Wideband codec as described in ETSI TS 126 192 V6.0.0 cited above, the SID includes a dithering indication whose state is set according to a sum of the spectral distances between the current frame and seven previous frames, or according to a distance between the energy of the current frame and the average energy value of past frames.
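The stationarity-based choice between a six-frame and a two-frame average can be sketched as follows. This is a simplified illustration, not the procedure of G.729 Annex B: a Euclidean distance stands in for the Itakura distance, the spectral descriptions are plain lists of numbers, the two-frame average is assumed to cover the two most recent frames, and the names `mean_vector`, `euclidean`, and `choose_sid_spectrum` are hypothetical.

```python
import math


def mean_vector(frames):
    """Element-wise average of a list of equal-length vectors."""
    return [sum(col) / len(frames) for col in zip(*frames)]


def euclidean(u, v):
    """Stand-in distance measure (the text names the Itakura distance)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def choose_sid_spectrum(history, threshold):
    """history: per-frame spectral vectors, oldest first, current frame last."""
    long_avg = mean_vector(history[-6:])    # six frames, including the current one
    short_avg = mean_vector(history[-2:])   # two most recent frames (assumption)
    if euclidean(long_avg, short_avg) > threshold:
        return short_avg                    # treat the signal as locally nonstationary
    return long_avg                         # treat the signal as locally stationary


steady = [[1.0, 2.0]] * 6
changed = [[1.0, 2.0]] * 4 + [[5.0, 6.0]] * 2

steady_choice = choose_sid_spectrum(steady, threshold=1.0)
changed_choice = choose_sid_spectrum(changed, threshold=1.0)
```

When the recent frames differ markedly from the longer history, the shorter average tracks the new background noise instead of blending it with the old one.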

Method M100 may be implemented such that task T200 receives the sequence of spectral tilt values from another process, such as a speech encoding process. For example, a device or system that is configured to perform an implementation of method M100 will typically also be configured to perform a method of speech encoding on the speech signal. The method of speech encoding may include a linear prediction coding (LPC) analysis, which calculates a set of coefficients that models a sample of the speech signal at time t as a linear combination of samples of the speech signal at times before t. An LPC analysis as performed by a speech encoder of a communications device (e.g., a cellular telephone) typically has an order of four, six, eight, 16, 20, 24, 28, or 32. For a case in which separate LPC analyses are performed on different frequency bands of the speech signal, task T200 may be arranged to receive a sequence of spectral tilt values that is based on an analysis of a low-frequency band (e.g., including frequencies below 1 kHz) or a mid-frequency band (e.g., including at least the frequencies between 1 kHz and 2 kHz).
Task T200 may be arranged to receive the sequence of spectral tilt values as a sequence of reflection coefficients, such as a sequence of first or second reflection coefficients. The range of configurations disclosed herein includes methods that comprise a combination of method M100 with a method of speech encoding (e.g., as shown in FIG. 9), as well as methods of speech encoding that include method M100.

Apparatus A100 may be implemented such that sequence generator 120 receives the sequence of spectral tilt values from another apparatus, such as a speech encoder. For example, a device or system that includes an implementation of apparatus A100 will typically also include a speech encoder, which may be configured to perform an LPC analysis on the speech signal. In such a case, sequence generator 120 may be arranged to receive the sequence of spectral tilt values as a sequence of reflection coefficients. The range of configurations disclosed herein includes apparatus that comprise a combination of apparatus A100 with a speech encoder (e.g., as shown in FIG. 10), as well as speech encoders that include apparatus A100.

Alternatively, task T200 may be implemented to include a task T210 that calculates the sequence of spectral tilt values based on a plurality of inactive frames of the speech signal. Task T210 may be configured to evaluate the spectral tilt of the signal for each of a series of frames according to, for example, one or more of several different techniques as described below. FIG. 11A shows a flowchart of an implementation M200 of method M100 that includes such an implementation T202 of task T200. Task T210 may also be arranged to provide the calculated sequence of spectral tilt values to other tasks of a larger process, such as a method of speech encoding. Method M100 may also be implemented such that task T200 is implemented as task T210.
Figure 11B shows a block diagram of an embodiment A200 of apparatus A100, in which embodiment A200 includes an embodiment 122 of sequence generator 120. Sequence generator 122 includes a calculator 128 that is configured to calculate the sequence of spectral tilt values based on a plurality of inactive frames of the speech signal. For example, calculator 128 may be configured to perform an embodiment of task T210 as disclosed herein. Like the other elements of apparatus A200, calculator 128 may be implemented as any combination of hardware, software, and/or firmware that is suitable for the application. Calculator 128 may also be configured to provide the calculated sequence of spectral tilt values to other elements of a larger apparatus, such as a speech encoder. Apparatus A100 may also be implemented such that sequence generator 120 is implemented as calculator 128.

A typical embodiment of task T210 is configured to calculate the spectral tilt as the first reflection coefficient of the corresponding frame of the speech signal. The first reflection coefficient of a frame (commonly denoted $k_0$) may be calculated as the ratio $R(1)/R(0)$, that is, as the normalized first autocorrelation value of the frame. For sample values in the range -1 to +1, the ratio $R(1)/R(0)$ has a scalar value between -1 and +1. In this expression, $R(1)$ denotes the first autocorrelation coefficient of the frame (i.e., the value of the autocorrelation function of the frame at a lag of one sample), and $R(0)$ denotes the zeroth autocorrelation coefficient of the frame (i.e., the value of the autocorrelation function of the frame at a lag of zero).

In other embodiments, task T210 is configured to calculate the spectral tilt as the second reflection coefficient of the corresponding frame of the speech signal.
The second reflection coefficient of a frame (commonly denoted $k_1$) may be calculated as

$$k_1 = -\frac{R(2) - k_0 R(1)}{\left(1 - k_0^2\right) R(0)} = -\frac{R(0)R(2) - R(1)^2}{R(0)^2 - R(1)^2},$$

where $R(2)$ denotes the second autocorrelation coefficient of the frame (i.e., the value of the autocorrelation function at a lag of two samples). Task T210 may also be implemented to calculate one or more reflection coefficients (e.g., the first and/or second reflection coefficients) of the corresponding frame based on one or more other parameters, such as one or more LPC filter coefficients.

The range of embodiments of task T210 is not limited to those in which the spectral tilt is calculated as a reflection coefficient. Alternatively or additionally, task T210 may be configured to perform one or more other spectral evaluation techniques to calculate the spectral tilt of one or more frames. Such techniques may include calculating the spectral tilt of each frame as the ratio between the energy of a high-frequency band and the energy of a low-frequency band. This calculation may include performing a frequency transform on the frame, such as a discrete Fourier transform (DFT). Such techniques may also include calculating the spectral tilt as the number of zero crossings in each frame, in which case a higher number of zero crossings may be taken to indicate a greater amount of high-frequency energy.

In calculating the sequence of spectral tilt values, task T210 may be configured to perform the calculation based on values of the autocorrelation function, such as by calculating one or more reflection coefficients as described above. The autocorrelation method of calculating LPC model parameters, such as filter or reflection coefficients, involves performing a series of iterations to solve an equation that includes a Toeplitz matrix.
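As a rough illustration of the autocorrelation-based calculation described above, the following Python sketch evaluates $R(0)$, $R(1)$, and $R(2)$ for one frame and derives the first two reflection coefficients from them. The function names and the handling of silent or degenerate frames are assumptions made for illustration only; they are not taken from this disclosure.

```python
def autocorr(frame, lag):
    """R(lag) = sum over n of s[n] * s[n + lag], for n = 0 .. N-1-lag."""
    n_samples = len(frame)
    return sum(frame[n] * frame[n + lag] for n in range(n_samples - lag))


def first_two_reflection_coeffs(frame):
    """Return (k0, k1) with the sign conventions used in the text above."""
    r0, r1, r2 = (autocorr(frame, m) for m in range(3))
    if r0 == 0.0:
        # Silent frame: the tilt is undefined; returning zeros is an assumption.
        return 0.0, 0.0
    k0 = r1 / r0
    denom = (1.0 - k0 * k0) * r0
    if denom == 0.0:
        # Degenerate frame (|k0| == 1); returning k1 = 0 is an assumption.
        return k0, 0.0
    k1 = -(r2 - k0 * r1) / denom
    return k0, k1
```

With this convention a frame dominated by low frequencies yields a positive $k_0$ and a rapidly alternating frame yields a negative $k_0$, which is why a single coefficient can serve as a tilt measure.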
In some embodiments, task T210 is configured to perform the autocorrelation method according to either of the well-known Levinson and/or Durbin recursive algorithms for solving such a Toeplitz equation. Such an algorithm typically calculates the reflection coefficients (also called partial correlation (PARCOR) coefficients, negative PARCOR coefficients, or Schur-Szego parameters) as intermediate values in the process of producing a set of LPC filter coefficients.

In other embodiments, task T210 is configured to perform a series of iterations that calculates one or more reflection coefficients rather than a set of filter coefficients. For example, task T210 may be configured to use an embodiment of such an algorithm to obtain one or more reflection coefficients. Alternatively, task T210 may be configured to use an embodiment of another well-known iterative method to obtain one or more reflection coefficients from autocorrelation values, such as the Schur recursive algorithm (which may be configured for efficient parallel computation) or the Burg recursive algorithm.

Task T210 may be configured to calculate one or more values of the autocorrelation function of the corresponding frame of the speech signal. For example, task T210 may be configured to evaluate the autocorrelation function of a frame for a particular lag value m (where m is an integer not less than zero) according to an expression such as

$$R(m) = \sum_{n=0}^{N-1-m} s[n]\, s[n+m],$$

where N denotes the number of samples in the frame. Alternatively, task T210 may be configured to receive values of the autocorrelation function (e.g., from a speech encoder, or from a speech encoding method or other process).

A speech encoder or speech encoding method may be configured to use values of the autocorrelation function in an encoding operation, such as calculating parameters of an LPC model (e.g., filter and/or reflection coefficients). It may be desirable for such a speech encoder or speech encoding method to perform one or more preprocessing operations on the autocorrelation values. For example, the autocorrelation values $R(m)$ may be spectrally smoothed by performing an operation such as the following:

$R_w(m) = 1.00003\, R(m)$ [...]

In this context, task T210 may be configured to perform spectral smoothing or another preprocessing operation on the autocorrelation values, and/or to use spectrally smoothed or otherwise preprocessed autocorrelation values to calculate the value of the spectral tilt parameter.

Before the autocorrelation function is applied to the speech signal (e.g., by task T210, or by a speech encoder or speech encoding method), it may be desirable to apply a windowing function to the signal. For example, it may be desirable to zero out the speech signal outside the frame to which the autocorrelation function is currently being applied. In some cases, the windowing function $w[n]$ is rectangular or triangular. It may be desirable to use a tapered windowing function that has low sample weights at each end of the window, which may help to reduce the effect of components outside the window.
For example, it may be desirable to use a raised-cosine window, such as the following Hamming window function:

$$w[n] = \begin{cases} 0.54 - 0.46 \cos\left(\dfrac{2\pi n}{N-1}\right), & 0 \le n \le N-1 \\ 0, & \text{otherwise,} \end{cases}$$

where N is the number of samples in the frame.

Other tapered windows that may be used include the Hanning, Blackman, Kaiser, and Bartlett windows. The windowed frame $s_w[n]$ may be calculated according to an expression such as

$$s_w[n] = s[n]\, w[n], \quad 0 \le n \le N-1.$$

The windowing function need not be symmetric, such that one half of the window may be weighted differently than the other half. A hybrid window may also be used, such as a Hamming-cosine window, or a window whose two halves are taken from different windows (e.g., two Hamming windows of different sizes). One or more other preprocessing operations, such as perceptual weighting, may be performed on the sample values and/or the windowed values before they are used to evaluate the autocorrelation function (e.g., by task T210, or by a speech encoder or speech encoding method).

The windowing function may be configured to include samples of the current frame as well as samples from one or more adjacent frames. In some cases, the window includes samples from the current frame and from the adjacent previous and subsequent frames (e.g., a 5-20-5 window that includes the five milliseconds immediately before and after a 20-millisecond frame). In other cases, the window includes samples from only the current frame and the adjacent previous frame (e.g., a 10-20 window that includes the current 20-millisecond frame and the last ten milliseconds of the previous frame).

For a case in which a windowing function is applied to the speech signal (e.g., by task T210, or by a speech encoder or speech encoding method), the autocorrelation function of a frame may be calculated according to an expression such as

$$R(m) = \sum_{n=0}^{N-1-m} s_w[n]\, s_w[n+m].$$

As noted above, it may be desirable for task T300 or smoother 130 to smooth a sequence that includes only values corresponding to inactive frames.
In this case, method M100 or apparatus A100 may be arranged to receive (e.g., from a speech encoder or speech encoding method) an indication of the level of voice activity in a frame. For example, such an indication (also called a "voice activity indication") may take the form of a binary variable or flag whose state indicates whether the corresponding frame is active or inactive.

The voice activity indication may be used to control the operation of smoothing task T300. For example, the voice activity indication may be used to allow smoothed spectral tilt values to be produced from corresponding inactive frames, and/or to prevent smoothed spectral tilt values from being produced from corresponding active frames. In one such example, a computer or processor is configured to control task T300 to smooth a spectral tilt value only when the voice activity indication indicates that the corresponding frame is inactive. Alternatively, task T300 may include deciding, according to the value of the corresponding voice activity indication, whether to produce a smoothed spectral tilt value, or whether to accept or reject a spectral tilt value. Figure 12A shows a flowchart of an embodiment M110 of method M100, in which embodiment M110 includes such an embodiment T320 of task T300.

The voice activity indication may also be used to control the operation of calculation task T210. For example, the voice activity indication may be used to allow the spectral tilt to be calculated for corresponding inactive frames, and/or to prevent the spectral tilt from being calculated for corresponding active frames. In one such example, a processor is configured to control task T210 to calculate the spectral tilt only when the voice activity indication indicates that the current frame is inactive.
Alternatively, task T210 may be configured to include deciding, according to the value of the corresponding voice activity indication, whether to calculate the spectral tilt of a particular frame, or may be configured to control its input (e.g., whether to accept or reject a frame) and/or its output (e.g., whether to issue a spectral tilt value). Figure 12B shows a flowchart of an embodiment M210 of method M200, in which embodiment M210 includes an embodiment T204 of task T202, and task T204 includes such an embodiment T220 of task T210.

As an alternative to receiving a voice activity indication, method M100 may be implemented to include a task T100 that is configured to indicate whether a frame is active or inactive. For example, task T100 may be configured to calculate a voice activity indication (VAI) as described above. Figure 12C shows a flowchart of an embodiment M120 of method M101 that includes task T100, and Figure 12D shows a flowchart of an embodiment M220 of method M200 that includes task T100. Task T100 may be configured to classify a frame as active or inactive based on one or more factors such as full-band energy, low-band energy, high-band energy, spectral parameters (e.g., one or more LSFs and/or reflection coefficients), periodicity, and zero-crossing rate. For example, such classification may include comparing a value of such a characteristic to a fixed or adaptive threshold, and/or calculating the magnitude of a change in the value of such a characteristic (e.g., the magnitude of the difference between two values, or the magnitude of the difference between a value and a running average) and comparing that magnitude to a fixed or adaptive threshold.

Task T100 may be configured to evaluate the energy of the current frame in each of a low band and a high band, and to indicate that the frame is inactive if the energy in each band is less than (alternatively, not greater than) a respective threshold. Such thresholds may be fixed or adaptive.
For example, each threshold may be based on a desired encoding rate. An example of a pair of adaptive thresholds is described in section 4.7 of the document (v1.0) cited above. In that example, the threshold for each band is based on an anchor operating point (as derived from the desired average data rate), an estimate of the background noise level in that band for the previous frame, and the signal-to-noise ratio in that band for the previous frame.

The transition from active speech to inactive speech typically occurs over a period of several frames, and the first few inactive frames after a transition from active speech may include voicing remnants in addition to background noise. Voicing remnants may cause these post-transition inactive frames to have spectral tilts that differ from those of the background noise, and such differences may corrupt the sequence of spectral tilt values produced by task T200 and lead to unnecessary SID transmissions.

As noted above, it may be desirable for task T200 to produce values of sequence x that are based only on inactive frames. Likewise, it may be desirable for task T300 to produce values of smoothed sequence y that are based only on one or more spectral tilt values from inactive frames. It may also be desirable for an embodiment of method M100 to avoid updating the spectral tilt contour with spectral tilt values from one or more post-transition frames. Such a restriction may help to reduce the likelihood that decision task T500 will issue a false positive transmission indication.

Task T200 may be configured to produce one or more values of the sequence of spectral tilt values according to the temporal distance between the corresponding inactive frame and the preceding active frame. For example, such an embodiment of task T200 or task T300 may be configured to delay or suspend the start of spectral tilt contour updating for one or more inactive frames after a transition from active speech.
Figures 13A and 13B illustrate examples of such a transition and of the effect of such a delay or suspension, respectively. Figure 13A shows a sharp change in the amplitude of the smoothed spectral tilt contour that is caused by voicing remnants in the post-transition frames. Such a change may lead to an inappropriate positive SID transmission decision. In this particular example, the spectral tilt parameter is the first reflection coefficient $k_0$, such that the voicing remnants cause a sharp rise in the amplitude of the smoothed spectral tilt contour, although voicing remnants may instead cause a sharp drop in amplitude when another spectral tilt parameter is used. By way of comparison, Figure 13B shows an example in which a delay (also called a "hangover") is applied to disable updating of the smoothed contour during the post-transition frames. In this case, the sharp rise seen in Figure 13A does not occur. In one particular example, a hangover of five frames is used after a transition from active speech to inactive speech.

Figure 14 shows an example of a source code listing for a set of instructions that may be executed by an array of programmable logic elements or other state machine (e.g., a processor) to perform an embodiment of method M100 that includes an embodiment T312 of task T310 and embodiments of tasks T400 and T500. In this example, task T312 reads a variable FRAME_ACTIVE that stores the current state of the voice activity indication. If the value of FRAME_ACTIVE is TRUE (indicating that the current frame is active), then a hangover count is stored to the variable hangover_1 and the set of instructions terminates. In this particular example the hangover count is five, although any other positive integer value may be used. When the value of FRAME_ACTIVE becomes FALSE (indicating that the current frame is inactive), each subsequent iteration of the set of instructions decrements the value of hangover_1 and terminates early, until the value of hangover_1 reaches zero. In this example, tasks T400 and T500 are implemented using instructions as described above with reference to Figure 8B.

Examples of method M100 and apparatus A100 include embodiments that are configured to control updating of the spectral tilt contour according to the state of an update control signal. This signal may be based on a voice activity indication as described above. The variable FRAME_ACTIVE shown in Figure 14 is one example of an update control signal (specifically, an update disable signal). Delay logic circuit 50 may be used to calculate the update control signal by delaying active-to-inactive transitions in the voice activity indication.
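The hangover behaviour described for Figure 14 can be sketched roughly as follows. This is not the listing of Figure 14: the class name, the initial state of the counter, and the 0.8/0.2 first-order smoothing weights are assumptions for illustration, and only the five-frame hangover count is taken from the text.

```python
HANGOVER_FRAMES = 5  # example count from the text; any positive integer works
SMOOTHING = 0.8      # assumed weight on the previous contour value


class TiltContour:
    """Smoothed spectral tilt contour that skips post-transition frames."""

    def __init__(self):
        self.hangover = HANGOVER_FRAMES  # assumed initial state
        self.y = None                    # no contour value until first update

    def step(self, frame_active, tilt):
        """Process one frame; return True if the contour was updated."""
        if frame_active:
            self.hangover = HANGOVER_FRAMES  # re-arm the hangover counter
            return False
        if self.hangover > 0:
            self.hangover -= 1               # post-transition frame: skip
            return False
        if self.y is None:
            self.y = tilt
        else:
            self.y = SMOOTHING * self.y + (1.0 - SMOOTHING) * tilt
        return True
```

Calling `step` once per frame, the first five inactive frames after any active frame leave the contour untouched, so voicing remnants in those frames cannot pull the contour away from the background-noise tilt.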
Figure 15 shows an embodiment 52 of delay logic circuit 50 that is configured to produce an update control signal (specifically, an update enable signal). In this figure, the state of the voice activity indication is low for inactive frames and high for active frames, a subsampled delay line having three delay elements is used to implement a hangover of three frames, and a logical NOR operation is used to combine the current and delayed voice activity indications. In other examples, the state of the voice activity indication may be high for inactive frames and low for active frames, in which case a logical AND operation may be used to combine the current and delayed voice activity indications. As for the subsampled delay line, other examples of this circuit may use any number of delay elements according to the desired hangover duration. Alternatively, delay logic circuit 50 may be implemented to use a counter that counts down (or up) from an active-to-inactive transition, and/or to calculate an update disable signal rather than an update enable signal.

Sequence generator 120 may be configured to produce one or more values of the sequence of spectral tilt values according to the temporal distance between the corresponding inactive frame and the preceding active frame. For example, sequence generator 120 or smoother 130 may be configured to suspend the start of spectral tilt contour updating after an active-to-inactive transition according to a desired hangover. Such an embodiment of sequence generator 120 or smoother 130 may be configured to include an embodiment of delay logic circuit 50 as described above. Figure 16A shows one such embodiment 134 of smoother 132.
In this example, a selector (e.g., a multiplexer) switches the input of the smoother between the current value of the sequence (i.e., $x[n]$) and the previous value of the smoothed spectral tilt contour (i.e., $y[n-1]$) according to the state of the update control signal. Alternatively, an embodiment of the smoother may be configured to store the current value $x[n]$ when the update control signal is high, and to use this stored value as its input when the update control signal is low.

Figure 16B shows another embodiment 136 of smoother 132, which includes an embodiment of delay logic circuit 50 as described above. This example includes two selectors (e.g., multiplexers) that are configured to output different gain factors according to the state of the update control signal. The first selector outputs the gain factor to be applied to $x[n]$: gain factor F10 when the state of the update control signal is high, and gain factor F12 when the state is low. The second selector outputs the gain factor to be applied to $y[n-1]$: gain factor F20 when the state of the update control signal is high, and gain factor F22 when the state is low. In one example, gain factors F10 and F12 have the values 0.2 and 0, respectively, and gain factors F20 and F22 have the values 0.8 and 1.0, respectively.

Another embodiment of smoother 136 may be configured to select among more than two values for each gain factor, such that the transition of the smoother from suspension to normal operation is more gradual. For example, instead of a delay logic circuit that produces a two-state control signal, such a smoother may include an embodiment of delay logic circuit 50 that is configured to produce a control signal having more than two states.
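The two-selector arrangement described for Figure 16B can be sketched in Python as a single smoothing step. The recurrence $y[n] = g_x\,x[n] + g_y\,y[n-1]$ and the function and parameter names are assumptions for illustration; the gain values 0.2/0.8 and 0/1.0 are the example values given above.

```python
def smooth_step(x_n, y_prev, update_enabled,
                gains_enabled=(0.2, 0.8), gains_disabled=(0.0, 1.0)):
    """One step of an assumed first-order smoother with switched gains.

    gains_enabled  ~ (F10, F20): used when the update control signal is high.
    gains_disabled ~ (F12, F22): used when the update control signal is low,
    which simply holds the previous contour value.
    """
    g_x, g_y = gains_enabled if update_enabled else gains_disabled
    return g_x * x_n + g_y * y_prev
```

Because each pair of gains sums to one, switching between them changes how quickly the contour follows the input without changing its scale; a multi-state variant would step through intermediate pairs in the same way.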
An embodiment of delay logic circuit 50 that produces a control signal having more than two states may be configured to produce an update control signal that passes through c states in response to an active-to-inactive transition, where c is an integer greater than two. In this case, the two selectors of smoother 136 may be configured such that, in response to the transition and over a series of c frames, the gain factor applied to $x[n]$ passes through c values from a minimum to a maximum (e.g., from 0.0 to 0.2), while the gain factor applied to $y[n-1]$ passes through c values from a maximum to a minimum (e.g., from 1.0 to 0.8).

A coding gain measure describes the relation between the energy of a signal as received by a speech encoder (or speech encoding method) and the energy of the corresponding coding error. Typically, a speech encoder or speech encoding method encodes active frames more effectively than inactive frames, such that the coding gain measure is higher for active frames than for inactive frames. One example of a coding gain measure for a frame is the ratio of the initial signal energy $E_{in}$ (e.g., the energy of the windowed frame) to the energy $E_{err}$ of the coding residual. In such cases, the energy of each signal is typically calculated as the sum of the magnitudes of the samples. Another common coding gain measure for LPC analysis is the prediction gain, which may be calculated as the reciprocal of the product of $(1 - k_i^2)$ for all i such that $0 \le i < P$ (alternatively, $1 \le i \le P$), where P is the order of the LPC analysis and $k_i$ denotes the i-th reflection coefficient.

The coding gain achieved by a speech encoder or speech encoding method tends to vary from frame to frame as the statistics of the signal change. During a series of inactive frames, however, the signal may be expected to be relatively stationary, such that its statistics do not change significantly. The value $G_c$ of the coding gain measure may therefore be expected to remain relatively constant, even during a perceptually significant change in the background noise.

A large change in the value $G_c$ of the coding gain measure may indicate that the speech signal has changed because of a factor other than a change in the background noise. One factor that may cause such a change in $G_c$ is speech activity that is below the detection threshold of the encoder's voice activity detector. In such a case, a large change may also occur in the spectral tilt values, leading task T500 to make a positive SID transmission decision even though the background noise has not changed significantly.
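As a small worked illustration of the prediction gain described above, the following Python sketch computes the reciprocal of the product of $(1 - k_i^2)$ from a list of reflection coefficients. The function names are hypothetical, and the use of $10\log_{10}$ to express the gain in dB assumes that it is treated as an energy ratio, which the text does not state explicitly.

```python
import math


def prediction_gain(reflection_coeffs):
    """Reciprocal of the product of (1 - k_i^2) over all given coefficients."""
    product = 1.0
    for k in reflection_coeffs:
        product *= (1.0 - k * k)
    if product <= 0.0:
        raise ValueError("requires |k_i| < 1 for every reflection coefficient")
    return 1.0 / product


def to_db(energy_ratio):
    """Express an energy ratio in decibels (assumed 10*log10 convention)."""
    return 10.0 * math.log10(energy_ratio)
```

For a single coefficient of 0.5, for example, the product is 0.75 and the gain is 1/0.75; coefficients closer to plus or minus one indicate a more predictable frame and give a larger gain.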
It may be desirable to implement method M100 to account for changes in spectral tilt that are associated with changes in the value $G_c$ of the coding gain measure. For example, an embodiment T230 of task T200 or an embodiment T330 of task T300 may be configured to enable or disable contour updating based on the magnitude of a change in the value $G_c$.

In some cases, the coding gain measure may be calculated in terms of coding error, as in an expression such as $E_{err}/E_{in}$. Likewise, the prediction gain may be calculated in terms of prediction error, as in an expression such as

$$\prod_i \left(1 - k_i^2\right)$$

for all i such that $0 \le i < P$ (alternatively, $1 \le i \le P$). The coding gain measure may also be calculated according to other expressions, which may for example include such a product, or a ratio between $E_{in}$ and $E_{err}$, as a factor or term. The coding gain measure may be expressed on a linear scale or on another scale, such as a logarithmic scale. Examples of such expressions include the following:

$$\log\frac{E_{in}}{E_{err}}, \quad \log\frac{E_{err}}{E_{in}}, \quad \log\prod_i\left(1 - k_i^2\right), \quad \log\frac{1}{\prod_i\left(1 - k_i^2\right)}.$$

The coding gain measure is typically evaluated for each frame, but it may also be evaluated less frequently (e.g., for every second or third frame) and/or over a longer interval (e.g., over a pair or a trio of frames).

In a typical configuration, task T230 or task T330 is configured to disable updating of the spectral tilt contour when the value $G_c$ changes by more than (alternatively, by not less than) a threshold amount from one inactive frame to the next. In one particular example, task T330 is configured to disable updating of the smoothed contour when the value of the prediction gain changes by more than 0.72 dB from the previous inactive frame to the current inactive frame. An embodiment of task T230 or task T330 may be configured to apply a hangover that extends such disabling to one or more subsequent frames. Another embodiment of task T230 or task T330 may also be configured to apply a hangover after a transition from active speech, as described above (e.g., with reference to Figures 13A to 16B).

It may be desirable to implement apparatus A100 to account for changes in the spectral tilt contour that are associated with changes in the value of a coding gain measure (such as one of the examples described above).
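The gain-based gating of contour updates described above can be sketched as follows. Only the 0.72 dB threshold comes from the text; the class name, the first-order smoothing with weight 0.8, and the choice to accept the very first frame unconditionally are assumptions for illustration.

```python
UPDATE_THRESHOLD_DB = 0.72  # example threshold from the text


class GainGatedContour:
    """Smoothed tilt contour whose update is skipped on large gain changes."""

    def __init__(self, smoothing=0.8):
        self.smoothing = smoothing   # assumed weight on the previous value
        self.prev_gain_db = None     # prediction gain of last inactive frame
        self.y = None                # smoothed contour value

    def step(self, tilt, gain_db):
        """Feed one inactive frame; return True if the contour was updated."""
        prev = self.prev_gain_db
        self.prev_gain_db = gain_db
        if prev is not None and abs(gain_db - prev) > UPDATE_THRESHOLD_DB:
            return False             # gain jumped: leave the contour alone
        if self.y is None:
            self.y = tilt
        else:
            self.y = self.smoothing * self.y + (1.0 - self.smoothing) * tilt
        return True
```

In use, a frame whose prediction gain differs from that of the previous inactive frame by more than the threshold is ignored, so low-level speech that escapes the voice activity detector does not move the contour.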
For example, apparatus A100 may be implemented to include a control signal generator 60 that is configured to generate an update control signal whose state is based on the magnitude of the change in the prediction gain. Figure 17A shows a block diagram of an implementation 62 of control signal generator 60. Control signal generator 60 may also be implemented to apply a delay, as in implementation 64 shown in Figure 17B. In one particular example, the value of threshold T30 is 0.72 dB. An implementation of smoother 134 or 136 may include an implementation of control signal generator 60 instead of, or in addition to, a circuit configured to delay the active-to-inactive transition in the voice activity indication. For example, such an implementation may include control signal generator 66 as shown in Figure 18, which combines the operations of delay logic circuit 52 and control signal generator 64.

An implementation of method M100 may be configured to control generation of the SID transmission indication according to a change in the value of a coding gain measure. For example, an implementation of method M100 may include an implementation of task T400 that is configured to output a distance of zero when the value of the coding gain measure (for example, the prediction gain) changes from one inactive frame to the next by more than (alternatively, by not less than) a threshold amount. Additionally or in the alternative, an implementation of method M100 may include an implementation of task T500 that is configured to enable or disable generation of a positive SID transmission indication according to the magnitude of the change in the prediction gain. One such implementation T510 of task T500 is configured to disable generation of a positive SID transmission indication unless the prediction gain changes by less than (alternatively, by not more than) a threshold from the previous inactive frame to the current inactive frame.
In one such particular example, the threshold is 0.65 dB. Control of the generation of the transmission indication may be performed in addition to, or as an alternative to, control of the update of the spectral tilt contour.

An implementation of apparatus A100 may be configured to control generation of the SID transmission indication according to a change in the value of a coding gain measure. Figure 19A shows a block diagram of an implementation 72 of transmission indication control circuit 70 that is configured to gate a positive SID transmission indication according to a relation between threshold T40 and the magnitude of the change in prediction gain. In one particular example, the value of threshold T40 is 0.65 dB. Figure 19B shows a block diagram of an implementation 156 of comparator 152 that includes transmission indication control circuit 72.

An implementation of apparatus A100 may be configured to control generation of both the update control signal and the SID transmission indication based on a change in the value of a coding gain measure. Figure 20 shows a block diagram of an implementation 82 of control circuit 80 that is configured to perform these operations. Such a circuit may be arranged to receive the SID transmission indication from comparator 150 and to provide the update control signal to smoother 130. Such a circuit may also be implemented within smoother 130 or comparator 150. For example, within smoother 134 or 136, control circuit 82 may be arranged to replace delay logic circuit 52 and to gate the SID transmission indication from comparator 150 according to the prediction gain.
In another example, control circuit 82 may be arranged within comparator 152 to gate the SID transmission indication according to the prediction gain and also to provide the update control signal to smoother 130.

Figure 21 shows an example of a source code listing for a set of instructions that may be executed by an array of programmable logic elements or other state machine (for example, a processor) to perform an implementation of method M100 that includes task T312, implementation T332 of task T330, implementation T510 of task T500, and an implementation of task T400. In this example, the state of the variable FRAME_ACTIVE indicates whether the current frame is active or inactive, the state of the variable Y_VALID indicates whether the set of instructions has been called before (and thus whether the value stored in the variable y_current is valid), and the value of the variable Gc indicates the prediction gain of the current frame.

If the set of instructions determines that the value of Y_VALID is FALSE (that is, if the set is being executed for the first time), then the variable Gc_current is initialized to the current value of Gc. The absolute difference between the current and past values of Gc is stored in the variable Gc_diff, and if this difference is greater than a threshold, a delay of two frames is applied. In part 3, the flag p is set only if the value of Gc_diff is less than a threshold.
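
The routine described for Figure 21 is not reproduced in this text, so the following is only a minimal sketch of how such a routine might be organized, not the listing itself. The thresholds of 0.72 dB and 0.65 dB and the two-frame delay come from the description; the class name, the first-order smoother and its coefficient, the tilt threshold, and the use of a stored reference value for the distance are all hypothetical choices made for illustration. The delay after a transition from active speech is not modeled.

```python
from dataclasses import dataclass

GC_UPDATE_THRESHOLD_DB = 0.72    # update-gating threshold quoted in the text
GC_TRANSMIT_THRESHOLD_DB = 0.65  # transmit-gating threshold quoted in the text
DELAY_FRAMES = 2                 # "a delay of two frames"
SMOOTHING_FACTOR = 0.9           # hypothetical smoother coefficient
TILT_THRESHOLD = 0.1             # hypothetical tilt-change threshold


@dataclass
class SidDecider:
    y_valid: bool = False     # has the routine run on an inactive frame yet?
    y_current: float = 0.0    # smoothed spectral tilt
    y_reference: float = 0.0  # smoothed tilt at the last positive decision
    gc_current: float = 0.0   # prediction gain of the previous inactive frame
    hold: int = 0             # frames left during which the update is skipped

    def process(self, frame_active: bool, tilt: float, gc: float) -> bool:
        """Return True if a SID should be sent for this frame."""
        if frame_active:
            return False
        # Part 1: the first call on an inactive frame initializes the state.
        if not self.y_valid:
            self.y_current = tilt
            self.y_reference = tilt
            self.gc_current = gc
            self.y_valid = True
        # Part 2: gate the contour update on the prediction-gain change.
        gc_diff = abs(gc - self.gc_current)
        self.gc_current = gc
        if gc_diff > GC_UPDATE_THRESHOLD_DB:
            self.hold = DELAY_FRAMES  # skip this frame and the next one
        if self.hold > 0:
            self.hold -= 1
        else:
            self.y_current = (SMOOTHING_FACTOR * self.y_current
                              + (1.0 - SMOOTHING_FACTOR) * tilt)
        # Part 3: the flag p is set only if gc_diff is below the threshold.
        p = gc_diff < GC_TRANSMIT_THRESHOLD_DB
        if p and abs(self.y_current - self.y_reference) > TILT_THRESHOLD:
            self.y_reference = self.y_current
            return True
        return False
```

In this sketch a jump in prediction gain both freezes the smoothed contour for two frames and suppresses a positive indication on the frame where the jump is seen, which mirrors the combined behavior attributed to tasks T332 and T510.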
The specific examples of logic implementations described herein are presented to explain this disclosure and not to limit it, and those skilled in the art will readily understand that alternative logic implementations are included within the scope of this disclosure. For example, selection logic that is implemented in one context as an AND gate, arranged to produce an active-high signal only when all of its inputs are high, may be implemented in another context as an OR gate, arranged to produce an active-low signal only when all of its inputs are low. A count down from a first value to a second value may also be implemented as a count up from the second value to the first value, and vice versa. A positive or TRUE indication may be represented by a binary high value in one context and by a binary low value in another. It is expressly contemplated and hereby disclosed that these and other implementation equivalents are also within the scope of this disclosure.

In the examples above, it is assumed that the sequence of spectral tilt values includes a value for each of a series of consecutive inactive frames. However, it is also contemplated that method M100 and apparatus A100 may be implemented such that the sequence of spectral tilt values includes fewer than one value for each of a series of consecutive inactive frames. For example, the sequence may include a value for every second frame (or every third frame, and so on) in the series. Such a sequence may be obtained by ignoring the intermediate frames or discarding the values from those frames, or by averaging the values of each pair (or triplet, and so on) of frames. Alternatively or additionally, these principles may be applied to other sequences, such as a sequence of coding gain measure values.
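
The two ways of obtaining a reduced-rate sequence mentioned above (discarding intermediate values, or averaging groups of values) could be sketched as follows. The function names are hypothetical, and dropping an incomplete trailing group in the averaging case is an assumption rather than something the text specifies.

```python
def every_nth(values, n=2):
    """Keep one value per group of n frames and discard the rest."""
    return list(values[::n])


def group_means(values, n=2):
    """Average each complete group of n consecutive values.

    A trailing group with fewer than n values is dropped (an assumption).
    """
    usable = len(values) - len(values) % n
    return [sum(values[i:i + n]) / n for i in range(0, usable, n)]
```

Either function could be applied to a per-frame sequence of spectral tilt values or of coding gain measure values before the change calculation.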
Those of skill in the art will understand that information and signals may be represented using any of a variety of different technologies and techniques. For example, data, instructions, commands, information, signals, bits, and symbols that may be referenced throughout the above description may be represented by voltages, currents, electromagnetic waves, magnetic fields or particles, optical fields or particles, or any combination thereof. Although the signal from which the sequence of spectral tilt values is derived is called a "speech signal," it is also contemplated and hereby disclosed that this signal may carry music or other non-speech information content in its active frames.

The elements of the various implementations of apparatus A100 as described herein may be fabricated as electronic and/or optical devices residing, for example, on the same chip or among two or more chips in a chipset. One example of such a device is a fixed or programmable array of logic elements, such as transistors or gates. One or more elements of the various implementations of apparatus A100 may also be implemented in whole or in part as one or more sets of instructions arranged to execute on one or more fixed or programmable arrays of logic elements, such as microprocessors, embedded processors, IP cores, digital signal processors, field-programmable gate arrays (FPGAs), application-specific standard products (ASSPs), and application-specific integrated circuits (ASICs).

One or more elements of an implementation of apparatus A100 may be used to perform tasks or execute other sets of instructions that are not directly related to an operation of the apparatus, such as a task relating to another operation of a device or system in which the apparatus is embedded.
It is also possible for one or more elements of an implementation of apparatus A100 to have structure in common (for example, a processor used to execute portions of code corresponding to different elements at different times, a set of instructions executed to perform tasks corresponding to different elements at different times, or an arrangement of electronic and/or optical devices performing operations for different elements at different times). In one such example, smoother 130, calculator 140, and comparator 150 are implemented as sets of instructions arranged to execute on the same processor. In another such example, sequence generator 120, or even a speech encoder (which may include apparatus A100), is implemented as one or more sets of instructions arranged to execute on that processor.

The foregoing presentation of the described configurations is provided to enable any person skilled in the art to make or use the methods and other structures disclosed herein. The flowcharts and other structures shown and described herein are examples only, and other variants of these structures are also within the scope of this disclosure. Various modifications to these configurations are possible, and the generic principles presented herein may be applied to other configurations as well.

The configurations described herein may be implemented in part or in whole as a hard-wired circuit, as a circuit configuration fabricated into an application-specific integrated circuit, or as a firmware program loaded into non-volatile storage or a software program loaded from or into a data storage medium as machine-readable code, such code being instructions executable by an array of logic elements such as a microprocessor or other digital signal processing unit.
The data storage medium may be an array of storage elements such as semiconductor memory (which may include, without limitation, dynamic or static RAM (random-access memory), ROM (read-only memory), and/or flash RAM), or ferroelectric, magnetoresistive, ovonic, polymeric, or phase-change memory; or a disk medium such as a magnetic or optical disk. The term "software" should be understood to include source code, assembly language code, machine code, binary code, firmware, macrocode, microcode, any one or more sets or sequences of instructions executable by an array of logic elements, and any combination of such examples.

The methods disclosed herein may also be tangibly embodied (for example, in one or more data storage media as listed above) as one or more sets of instructions readable and/or executable by a machine including an array of logic elements (for example, a processor, microprocessor, microcontroller, or other finite state machine). Thus, this disclosure is not intended to be limited to the configurations shown above, but rather is to be accorded the widest scope consistent with the principles and novel features disclosed in any fashion herein, including in the attached claims as filed, which form a part of the original disclosure.

Those of skill in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and operations described in connection with the configurations disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. Such logical blocks, modules, circuits, and operations may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an ASIC, an FPGA or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein.
A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example, a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

The tasks of the methods and algorithms described herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An illustrative storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.

[Brief Description of the Drawings]

Figure 1A shows a flowchart of a method M100 according to a configuration.
Figure 1B shows a block diagram of an apparatus A100 according to a configuration.
Figure 1C shows a flowchart of an implementation M101 of method M100.
Figure 1D shows a block diagram of an implementation A101 of apparatus A100.
Figure 2 shows a block diagram of an implementation 132 of smoother 130.
Figure 3 shows an illustrative example in which each circle represents one of a series of consecutive frames over time in a speech signal.
Figure 4 shows a block diagram of an implementation 142 of calculator 140.
Figure 5 shows a block diagram of an implementation 152 of comparator 150.
Figure 6 shows a block diagram of an implementation 154 of comparator 150.
Figure 7A shows a block diagram of an implementation A102 of apparatus A100.
Figure 7B shows an example of combining several different transmission indications into a composite transmission indication.
Figure 8A shows a source code listing for a set of instructions that may be executed to perform an implementation of method M100.
Figure 8B shows a source code listing for a set of instructions that may be executed to perform another implementation of method M100.
Figure 9 shows a flowchart of a method that includes a combination of method M101 and a speech encoding method.
Figure 10 shows a block diagram of an apparatus that includes a combination of apparatus A101 and a speech encoder.
Figure 11A shows a flowchart of an implementation M200 of method M100.
Figure 11B shows a block diagram of an implementation A200 of apparatus A100.
Figure 12A shows a flowchart of an implementation M110 of method M101.
Figure 12B shows a flowchart of an implementation M210 of method M200.
Figure 12C shows a flowchart of an implementation M120 of method M101.
Figure 12D shows a flowchart of an implementation M220 of method M200.
Figures 13A and 13B show examples of smoothed spectral tilt contours with and without application of a delay, respectively.
Figure 14 shows a source code listing for a set of instructions that may be executed to perform another implementation of method M100.
Figure 15 shows a block diagram of an example of a delay logic circuit.
Figure 16A shows a block diagram of an implementation 134 of smoother 132.
Figure 16B shows a block diagram of an implementation 136 of smoother 132.
Figure 17A shows a block diagram of an implementation 62 of control signal generator 60, in which implementation 62 is configured to generate an update control signal based on the prediction gain.
Figure 17B shows a block diagram of an implementation 64 of control signal generator 62, in which implementation 64 is configured to apply a delay.
Figure 18 shows a block diagram of an implementation 66 of control signal generator 64, in which implementation 66 also includes delay logic circuit 52.
Figure 19A shows a block diagram of an implementation 72 of transmission indication control circuit 70.
Figure 19B shows a block diagram of an implementation 156 of comparator 152.
Figure 20 shows a block diagram of an implementation 82 of control circuit 80, in which implementation 82 is configured to generate an update control signal and to gate the SID transmission indication.
Figure 21 shows a source code listing for a set of instructions that may be executed to perform another implementation of method M100.

[Description of Main Component Symbols]

50, 52: delay logic circuit
62, 64, 66: control signal generator
72: transmission indication control circuit
82: control circuit
120, 122: sequence generator
128, 140, 142: calculator
130, 132, 134, 136: smoother
150, 152, 154, 156: comparator
A100, A101, A102, A200: apparatus
G10, G20, F10, F12, F20, F22: gain factor
T10, T20, T30, T40: threshold

Claims (1)

X. Claims:

1. A method of processing a speech signal, the method comprising:
generating a sequence of spectral tilt values that is based on a plurality of inactive frames of the speech signal;
calculating a change among at least two values of the sequence of spectral tilt values; and
for an inactive frame among the plurality of inactive frames, deciding whether to transmit a description of the frame,
wherein said deciding whether to transmit a description of the frame is based on the calculated change.

2. The method of processing a speech signal according to claim 1, wherein said generating a sequence of spectral tilt values comprises smoothing another sequence of spectral tilt values to produce the sequence of spectral tilt values,
wherein each of the spectral tilt values of the other sequence indicates a spectral tilt of a corresponding one of the plurality of inactive frames.

3. The method of processing a speech signal according to claim 1, wherein each of the spectral tilt values is based on at least one reflection coefficient of a corresponding inactive frame of the speech signal.

4. The method of processing a speech signal according to claim 1, wherein each of a plurality of the spectral tilt values is based on at least one of the other spectral tilt values in the sequence of spectral tilt values.

5. The method of processing a speech signal according to claim 1, wherein each of a plurality of the spectral tilt values is based on (A) a spectral tilt of a corresponding one of the plurality of inactive frames and (B) at least one of the other spectral tilt values in the sequence of spectral tilt values.

6. The method of processing a speech signal according to claim 1, wherein the calculated change is based on a difference between consecutive values in the sequence of spectral tilt values.

7. The method of processing a speech signal according to claim 1, wherein said calculating a change comprises calculating a distance between adjacent values in the sequence of spectral tilt values.

8. The method of processing a speech signal according to claim 1, wherein said deciding whether to transmit a description of the frame comprises comparing the calculated change to a threshold.

9. The method of processing a speech signal according to claim 1, wherein a result of said deciding whether to transmit a description of the frame is based on a relation between (A) a magnitude of the calculated change and (B) a threshold.

10. The method of processing a speech signal according to claim 1, wherein the method comprises, if the result of said deciding whether to transmit a description of the frame is a decision to transmit a description of the frame, transmitting a silence description that includes at least one of a spectral envelope description and an energy envelope description.

11. The method of processing a speech signal according to claim 10, wherein the method comprises calculating the silence description based on at least one among (A) a description of a spectral envelope of each of a plurality of inactive frames and (B) a description of an energy envelope of each of a plurality of inactive frames.

12. The method of processing a speech signal according to claim 1, wherein said deciding whether to transmit a description of the frame is based on at least one among: (A) a vector describing a spectral envelope of the frame, (B) a residual energy of the frame, (C) a temporal distance to the most recent transmission of a description of an inactive frame, (D) a temporal distance to the most recent active frame, (E) a description of an energy envelope of the frame, (F) a mean absolute value of the frame, and (G) an energy value of the frame.

13. The method of processing a speech signal according to claim 12, wherein the method comprises, if the result of said deciding whether to transmit a description of the frame is a decision to transmit a description of the frame, transmitting a silence description that includes at least one of a spectral envelope description and an energy envelope description.

14. The method of processing a speech signal according to claim 1, wherein said deciding whether to transmit a description of the frame comprises deciding not to transmit a description of the frame in response to detecting that a change in a coding gain measure exceeds a threshold.

15. The method of processing a speech signal according to claim 14, wherein each value of the coding gain measure is based on values of a plurality of reflection coefficients of a corresponding inactive frame of the speech signal.

16. The method of processing a speech signal according to claim 1, wherein the method comprises, for each of a plurality of the spectral tilt values in the sequence of spectral tilt values, calculating a change between that spectral tilt value and at least one other spectral tilt value in the sequence, and
wherein the method comprises, for each of another plurality of inactive frames of the speech signal, deciding whether to transmit a description of the frame, and
wherein, for each of the other plurality of inactive frames, a result of said deciding whether to transmit a description of the frame is based on at least one of the calculated changes.

17. The method of processing a speech signal according to claim 16, wherein, for each of at least some of the other plurality of inactive frames, the result of said deciding whether to transmit a description of the frame is a decision not to transmit a description of the frame.

18. The method of processing a speech signal according to claim 16, wherein, for each of the other plurality of inactive frames, said deciding whether to transmit a description of the frame comprises deciding not to transmit a description of the frame in response to detecting that a change in a coding gain measure exceeds a threshold.

19. The method of processing a speech signal according to claim 18, wherein, for each of the other plurality of inactive frames, the change in the coding gain measure is based on (A) a value of the coding gain measure for a first inactive frame that precedes the frame in the speech signal and (B) a value of the coding gain measure for a second inactive frame that precedes the frame in the speech signal and is different from the first inactive frame.
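20. The method of processing a speech signal according to claim 1, wherein said generating a sequence of spectral tilt values comprises, for each of at least some of the plurality of inactive frames, generating a corresponding one of the sequence of spectral tilt values according to a temporal distance between the inactive frame and a previous active frame of the speech signal.

21. The method of processing a speech signal according to claim 20, wherein said generating a corresponding one of the sequence of spectral tilt values comprises setting the spectral tilt value to a previous one of the sequence of spectral tilt values when the temporal distance between the inactive frame and a previous active frame of the speech signal is less than a threshold.

22. The method of processing a speech signal according to claim 1, wherein said generating a sequence of spectral tilt values comprises, for each of at least some of the plurality of inactive frames, calculating a corresponding one of the sequence of spectral tilt values according to a coding gain measure of the inactive frame.

23. The method of processing a speech signal according to claim 1, wherein said generating a sequence of spectral tilt values comprises, for each of at least one among the sequence of spectral tilt values, setting the spectral tilt value to a previous one of the sequence of spectral tilt values in response to detecting that a change in a coding gain measure exceeds a threshold.

24. A computer program product comprising a computer-readable medium, the medium comprising:
code for causing at least one computer to generate a sequence of spectral tilt values that is based on a plurality of inactive frames of a speech signal;
code for causing at least one computer to calculate a change among at least two values of the sequence of spectral tilt values; and
code for causing at least one computer to decide, for an inactive frame among the plurality of inactive frames and based on the calculated change, whether to transmit a description of the frame.

25. The computer program product according to claim 24, wherein the code for causing at least one computer to generate a sequence of spectral tilt values is configured to cause the at least one computer to generate each of a plurality of the spectral tilt values based on at least one of the other spectral tilt values in the sequence.

26. The computer program product according to claim 24, wherein the code for causing at least one computer to calculate a change is configured to cause the at least one computer to calculate the change based on a difference between consecutive values in the sequence of spectral tilt values.

27. The computer program product according to claim 24, wherein the code for causing at least one computer to decide whether to transmit a description of the frame is configured to cause the at least one computer to decide based on a relation between (A) a magnitude of the calculated change and (B) a threshold.

28. The computer program product according to claim 24, wherein the code for causing at least one computer to decide whether to transmit a description of the frame includes code for causing the at least one computer to decide not to transmit a description of the frame in response to a change in a coding gain measure that exceeds a threshold.

29. The computer program product according to claim 24, wherein the code for causing at least one computer to calculate a change is configured to cause the at least one computer to calculate, for each of a plurality of the spectral tilt values in the sequence, a change between that spectral tilt value and at least one other spectral tilt value in the sequence, and
wherein the code for causing at least one computer to decide whether to transmit a description of the frame is configured to cause the at least one computer to decide, for each of another plurality of inactive frames of the speech signal, whether to transmit a description of the frame, and
wherein that code is configured such that, for each of the other plurality of inactive frames, the decision whether to transmit a description of the frame is based on at least one of the calculated changes.

30. The computer program product according to claim 24, wherein the code for causing at least one computer to generate a sequence of spectral tilt values comprises code for causing the at least one computer to generate, for each of at least some of the plurality of inactive frames, a corresponding one of the sequence of spectral tilt values according to a temporal distance between the inactive frame and a previous active frame of the speech signal.

31. The computer program product according to claim 24, wherein the code for causing at least one computer to generate a sequence of spectral tilt values is configured to cause the at least one computer, for each of at least one among the sequence of spectral tilt values, to set the spectral tilt value to a previous one of the sequence in response to detecting that a change in a coding gain measure exceeds a threshold.

32. The computer program product according to claim 24, wherein the code for causing at least one computer to generate a sequence of spectral tilt values is configured to cause the at least one computer to smooth another sequence of spectral tilt values to produce the sequence of spectral tilt values,
wherein each of the spectral tilt values of the other sequence indicates a spectral tilt of a corresponding one of the plurality of inactive frames.

33. An apparatus for processing a speech signal, the apparatus comprising:
a sequence generator configured to generate a sequence of spectral tilt values that is based on a plurality of inactive frames of the speech signal;
a calculator configured to calculate a change among at least two values of the sequence of spectral tilt values; and
a comparator configured to decide, for an inactive frame among the plurality of inactive frames and based on the calculated change, whether to transmit a description of the frame.

34. The apparatus for processing a speech signal according to claim 33, wherein the comparator is configured to decide whether to transmit a description of the frame based on a relation between (A) a magnitude of the calculated change and (B) a threshold.

35. The apparatus for processing a speech signal according to claim 33, wherein the apparatus comprises a wireless communications device that includes the sequence generator, the calculator, and the comparator, and
wherein the device is configured to transmit a silence description in response to a decision by the comparator to transmit a description of the frame, the silence description including at least one of a spectral envelope description and an energy envelope description.

36. The apparatus for processing a speech signal according to claim 33, wherein the comparator is configured to decide not to transmit a description of the frame in response to a change in a coding gain measure that exceeds a threshold.

37. The apparatus for processing a speech signal according to claim 33, wherein the calculator is configured to calculate, for each of a plurality of the spectral tilt values in the sequence, a change between that spectral tilt value and at least one other spectral tilt value in the sequence, and
wherein the comparator is configured to decide, for each of another plurality of inactive frames of the speech signal, whether to transmit a description of the frame, and
wherein the comparator is configured such that, for each of the other plurality of inactive frames, the decision whether to transmit a description of the frame is based on at least one of the calculated changes.

38. The apparatus for processing a speech signal according to claim 33, wherein the sequence generator is configured to generate, for each of at least some of the plurality of inactive frames, a corresponding one of the sequence of spectral tilt values according to a temporal distance between the inactive frame and a previous active frame of the speech signal.

39. The apparatus for processing a speech signal according to claim 33, wherein the sequence generator is configured, for each of at least one among the sequence of spectral tilt values, to set the spectral tilt value to a previous one of the sequence in response to detecting that a change in a coding gain measure exceeds a threshold.

40. The apparatus for processing a speech signal according to claim 33, wherein the sequence generator is configured to produce the sequence of spectral tilt values by smoothing another sequence of spectral tilt values,
wherein each of the spectral tilt values of the other sequence indicates a spectral tilt of a corresponding one of the plurality of inactive frames.

41. An apparatus for processing a speech signal, the apparatus comprising:
means for generating a sequence of spectral tilt values that is based on a plurality of inactive frames of the speech signal;
means for calculating a change among at least two values of the sequence of spectral tilt values; and
means for deciding, for an inactive frame among the plurality of inactive frames and based on the calculated change, whether to transmit a description of the frame.

42. The apparatus for processing a speech signal according to claim 41, wherein the apparatus comprises means for transmitting a silence description in response to a decision by the means for deciding to transmit a description of the frame, the silence description including at least one of a spectral envelope description and an energy envelope description.

43. The apparatus for processing a speech signal according to claim 41, wherein the means for generating a sequence of spectral tilt values is configured to generate, for each of at least some of the plurality of inactive frames, a corresponding one of the sequence of spectral tilt values according to a temporal distance between the inactive frame and a previous active frame of the speech signal.

44. The apparatus for processing a speech signal according to claim 41, wherein the means for generating a sequence of spectral tilt values is configured, for each of at least one among the sequence of spectral tilt values, to set the spectral tilt value to a previous one of the sequence in response to detecting that a change in a coding gain measure exceeds a threshold.

45. The apparatus for processing a speech signal according to claim 41, wherein the means for generating a sequence of spectral tilt values is configured to produce the sequence of spectral tilt values by smoothing another sequence of spectral tilt values,
wherein each of the spectral tilt values of the other sequence indicates a spectral tilt of a corresponding one of the plurality of inactive frames.

46. A method of processing a speech signal, the method comprising:
generating a sequence of spectral tilt values that is based on a plurality of inactive frames of the speech signal;
calculating a change among at least two values of the sequence of spectral tilt values; and
for an inactive frame among the plurality of inactive frames, deciding whether to transmit a description of the frame,
wherein said deciding whether to transmit a description of the frame is based on the calculated change, and
wherein said generating a sequence of spectral tilt values comprises, for each of at least some of the plurality of inactive frames, generating a corresponding one of the sequence of spectral tilt values according to a temporal distance between the inactive frame and a previous active frame of the speech signal.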
The computer program product of claim 24, wherein the at least one computer determines whether to transmit the frame - the code described includes the step of causing the at least one computer to respond to the -coding gain metric - threshold One of the values is changed and the code described by the frame is not transmitted. 29. The computer program product of claim 24, wherein the Chinese poem causes at least one computer to calculate: the modified code is configured such that the at least one computer is tilted for a plurality of the spectra in the sequence of tilt values of the spectrum A change between the spectral tilt value and at least one other spectral tilt value in the sequence of spectral tilt values is calculated for each of the values, and wherein the at least one computer determines whether to transmit the frame - the description The code is configured to cause the at least one computer to determine whether to transmit a description of the frame for each of the other plurality of inactive frames of the voice signal, and wherein the at least one computer determines whether to transmit The code described in one of the frames is configured such that for each of the other plurality of inactive frames, the decision to transmit one of the frames is based on the specially calculated change. At least one of them. 3. The computer program product of claim 24, wherein the code for causing the at least one computer to generate a sequence of spectral tilt values comprises causing the at least one computer to target at least some of the plurality of inactive frames Activity 123345.doc 200818802 Each of the frames generates a code corresponding to one of the sequence of spectral tilt values based on a time distance between the inactive frame and the first uranium activity frame of the voice signal . 31. 
The computer program product of claim 24, wherein the code for causing at least one computer to generate a sequence of spectral tilt values is configured to cause the at least one computer to target at least one of the spectral tilt value sequences One responds to detecting that one of the coding gain metrics changes beyond a threshold to set the spectral tilt value to the previous one of the sequence of spectral tilt values. 32. The computer program product of claim 24, wherein the code for causing at least one computer to generate a sequence of spectral tilt values is configured to cause the at least one computer to smooth another sequence of spectral tilt values to produce the spectral tilt value Each of the spectral tilt values of the sequence, the eighth sequence, indicates that one of the plurality of inactive frames is spectrally tilted. </ RTI> a device for processing a speech signal, the device comprising: a sequence generation state configured to generate a sequence of spectral tilt values of a plurality of inactive frames based on the speech signal; Having been configured to calculate a change between at least two values of the sequence of spectral tilt values, and a comparator configured to inactive for one of the plurality of inactive frames and A determination of whether to transmit one of the frames is determined based on the calculated change. 34. The apparatus for processing a speech signal of claim 33, wherein the comparing 123345.doc 200818802 is configured to be based on (A) the calculated magnitude of the change and (7) a threshold value A relationship between the two determines whether to transmit a description of the frame. 35. 
The apparatus for processing a voice signal of claim 33, wherein the apparatus comprises a wireless communication device, the device comprising the sequence generator, the calculator, and the comparator, and wherein The device is configured to transmit a silence description in response to a decision by the comparator to transmit the frame-description, the silence description including at least one of a spectral envelope description and an energy envelope description. The apparatus of claim 33 for processing a voice message m, wherein the comparator is configured to determine not to transmit a description of the frame in response to a change in a coding gain metric exceeding a threshold. The apparatus for processing a voice signal according to claim 33, wherein the leaf device calculates the spectrum tilt value by using a plurality of the equal frequency of the spectrum tilt value a change between at least one other spectral tilt value in the sequence of spectral tilt values, and wherein the comparator is configured to determine whether to transmit for each of the other plurality of inactive frames of the speech signal One of the frames is described, and wherein the comparator is configured such that the decision to describe whether or not to transmit the frame for the other plurality of inactive frames is calculated by the comparator At least one of the changes. "A device of claim 33 for processing a note of a number, wherein the sequence generator is configured to attenuate at least some of the inactive frames of the child's inactive frames. Each of the boxes goes; 4 days dip and according to the time interval between the inactive frame and the previous activity frame of one of the language signs to generate the spectrum tilt 123345.doc 200818802 oblique value 39. The apparatus of claim 33, wherein the sequence generator is configured to respond to each of at least one of the sequence of spectral tilt values. 
Detecting that one of the coding benefit metrics changes by more than a threshold to set the spectral tilt value to the previous one of the sequence of spectral tilt values. 4. A device for processing a speech signal according to claim 33, Wherein the sequence produces a group of sorrows to generate the sequence of spectral tilt values by smoothing another sequence of spectral tilt values, wherein each of the spectral tilt values of the other sequence indicates the plurality of inactive frames One of the counterparts is spectrally tilted. 41·· A device for processing a speech signal, the device comprising: means for generating a sequence of spectral tilt values of a plurality of inactive frames based on the speech signal; a component of a change between at least two values of the sequence of spectral tilt values; and for determining, for the one of the plurality of inactive frames, an inactive frame and determining whether to transmit based on the change of the metering A component described in one of the frames. 42. The apparatus of claim 41 for processing a voice signal, wherein the apparatus 用于3 is responsive to a decision made by the means for determining - transmitting the description of the message And transmitting a silent description component, the silence description including at least one of a spectral envelope description and an energy envelope description. 123345.doc 200818802 43. The apparatus for processing a voice signal according to claim 41, Wherein the means for generating a sequence of spectral tilt values is configured to target at least one of the plurality of inactive frames based on the inactivity a time interval between the frame and the previous active frame of the speech signal to generate a corresponding one of the sequence of spectral tilt values. 44. 
The device for processing a speech signal according to claim 41, wherein Generating a spectral (quad) value sequence that the component passes (4) in response to at least one of the spectral tilt value sequences, in response to detecting a code gain metric - a change exceeding a threshold value The spectral tilt value is set to be the one of the sequence of spectral tilt values. 45. The apparatus for processing a speech signal of claim 41, wherein the means for generating a sequence of spectral tilt values is configured The sequence of spectral tilt values is generated by smoothing another sequence of spectral tilt values, wherein each of the spectral tilt values of the other sequence indicates a spectral tilt of one of the plurality of inactive frames. a method for processing a speech signal, the method comprising: generating a plurality of inactive frame oblique value sequences based on the speech signal; and calculating a ratio between at least two values of the spectral tilt value sequence Changing; and determining, for one of the plurality of inactive frames, whether to transmit a description of the frame, wherein the determining whether to transmit one of the frames describes a change in the basis, and | The generated-spectral tilt value sequence includes each of at least some of the inactive frames of the plurality of inactive 123345.doc -10- 200818802 motion frames, the active frame and one of the voice signals of the previous active frame The distance between the ones of the sequence of spectral tilt values is generated according to the time interval 123345.doc -11 -123345.doc -11 -
TW96128125A 2006-07-31 2007-07-31 Systems, methods, and apparatus for signal change detection TWI467979B (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US83468906P 2006-07-31 2006-07-31

Publications (2)

Publication Number Publication Date
TW200818802A (en) 2008-04-16
TWI467979B (en) 2015-01-01

Family

ID=40925461

Family Applications (1)

Application Number Title Priority Date Filing Date
TW96128125A TWI467979B (en) 2006-07-31 2007-07-31 Systems, methods, and apparatus for signal change detection

Country Status (2)

Country Link
CN (1) CN101496095B (en)
TW (1) TWI467979B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI501220B (en) * 2009-03-13 2015-09-21 Koninkl Philips Electronics Nv Embedding and extracting ancillary data

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105225668B (en) * 2013-05-30 2017-05-10 华为技术有限公司 Signal encoding method and equipment
JP5981408B2 (en) * 2013-10-29 2016-08-31 株式会社Nttドコモ Audio signal processing apparatus, audio signal processing method, and audio signal processing program
EP2980798A1 (en) * 2014-07-28 2016-02-03 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Harmonicity-dependent controlling of a harmonic filter tool
CN105590629B (en) * 2014-11-18 2018-09-21 华为终端(东莞)有限公司 A kind of method and device of speech processes
CN106847306B (en) * 2016-12-26 2020-01-17 华为技术有限公司 Abnormal sound signal detection method and device
EP3815082B1 (en) * 2018-06-28 2023-08-02 Telefonaktiebolaget Lm Ericsson (Publ) Adaptive comfort noise parameter determination
CN108962275B (en) * 2018-08-01 2021-06-15 电信科学技术研究院有限公司 Music noise suppression method and device
CN112530407B (en) * 2020-11-25 2021-07-23 北京快鱼电子股份公司 Language identification method and system

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5341456A (en) * 1992-12-02 1994-08-23 Qualcomm Incorporated Method for determining speech encoding rate in a variable rate vocoder
US6687668B2 (en) * 1999-12-31 2004-02-03 C & S Technology Co., Ltd. Method for improvement of G.723.1 processing time and speech quality and for reduction of bit rate in CELP vocoder and CELP vococer using the same
US6807525B1 (en) * 2000-10-31 2004-10-19 Telogy Networks, Inc. SID frame detection with human auditory perception compensation
KR20050049103A (en) * 2003-11-21 2005-05-25 삼성전자주식회사 Method and apparatus for enhancing dialog using formant

Also Published As

Publication number Publication date
TWI467979B (en) 2015-01-01
CN101496095B (en) 2012-11-21
CN101496095A (en) 2009-07-29
