CN108417220B

CN108417220B - Speech signal encoding and decoding method based on surrogate-model Volterra modeling

Info

Publication number: CN108417220B
Application number: CN201810142277.6A
Authority: CN
Inventors: 张玉梅; 刘江山; 吴晓军; 吴霞
Original assignee: Shaanxi Normal University
Current assignee: Shaanxi Normal University
Priority date: 2018-02-11
Filing date: 2018-02-11
Publication date: 2019-06-25
Anticipated expiration: 2038-02-11
Also published as: CN108417220A

Abstract

A speech signal encoding and decoding method based on surrogate-model (agent-model) Volterra modeling consists of the following steps: preprocessing the input chaotic speech signal, constructing a prediction model with the Volterra modeling method, determining the chaotic speech signal prediction model and encoding, and decoding. The invention uses an improved version of the existing artificial bee colony algorithm. The input chaotic speech signal is pre-emphasized, windowed, and framed; a chaotic speech signal prediction model is established; the parameters of that model are determined, which completes the encoding of the chaotic speech signal; and the encoded data are decoded by conventional methods. By exploiting the chaotic characteristics of speech signals, the invention encodes and decodes chaotic speech signals quickly and accurately. Its steps are simple, it is easy to implement, and its accuracy is high.

Description

Speech signal encoding and decoding method based on surrogate-model Volterra modeling

Technical field

The invention belongs to the technical field of computing and its applications, and in particular relates to chaotic time series prediction models.

Background art

In recent years, as hardware devices and communication technology have matured, the requirements on speech transmission efficiency have changed. Studies have found that speech signal time series are nonlinear and exhibit clear chaotic characteristics. Building a speech signal prediction model from these chaotic characteristics is considered a feasible and effective approach. Most researchers construct a nonlinear prediction model by directly applying the Volterra modeling method:

With this approach, the phase-space reconstruction process is cumbersome. In addition, the speech signal chaotic time series prediction model has to be built with an evolutionary algorithm on the basis of the chaotic characteristics of the speech signal. Existing evolutionary algorithms are inefficient and are not tailored to the specific problem. The existing artificial bee colony algorithm has low computational efficiency and insufficient solution accuracy; its search equation in the onlooker bee phase is:
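The search equation itself is not reproduced in this text. For reference, the onlooker-bee search step of the standard artificial bee colony algorithm is commonly written as below, where x_kj is a component of a randomly chosen neighbouring source; whether the patent uses exactly this notation is an assumption.

```latex
v_{ij} = x_{ij} + \varphi_{ij}\,\bigl(x_{ij} - x_{kj}\bigr),
\qquad k \neq i, \quad \varphi_{ij} \in [-1, 1]
```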

This search equation cannot make full use of the information obtained in each iteration.

Summary of the invention

The technical problem to be solved by the present invention is to overcome the shortcomings of the above prior art and to provide a speech signal encoding and decoding method based on surrogate-model Volterra modeling that has simple steps, is easy to implement, and is fast and accurate.

The technical solution adopted to solve the above technical problem consists of the following steps:

(1) Preprocess the input chaotic speech signal

In the input chaotic speech signal, a frame with a uniform waveform is selected as the analysis frame, and pre-emphasis, windowing, and framing are performed.

The windowing is performed with the following window function:

where N is a finite positive integer.
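The preprocessing chain can be sketched as follows. This is a minimal illustration, not the patent's implementation: the pre-emphasis coefficient `alpha`, the choice of a Hamming window, and the helper names `pre_emphasis`, `hamming`, and `frames` are all assumptions, because the window function is not reproduced in this text.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """First-order pre-emphasis y[n] = x[n] - alpha * x[n-1] (alpha is an assumed value)."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def hamming(N):
    """Hamming window of length N (assumed window; the patent's own function is not shown)."""
    if N == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * n / (N - 1)) for n in range(N)]


def frames(signal, frame_len, hop):
    """Split the signal into overlapping frames and multiply each by the window."""
    window = hamming(frame_len)
    result = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        result.append([signal[start + n] * window[n] for n in range(frame_len)])
    return result
```

A frame with a uniform waveform would then be picked from the output of `frames` as the analysis frame.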

(2) Construct the prediction model with the Volterra modeling method

Using the information of the analysis frame from step (1), the chaotic speech signal prediction model is established according to formula (2):

where u(n-iτ) is the input analysis-frame signal; m, the memory length of the chaotic time series prediction model, is a finite positive integer; h₁(i) and h₂(i, j) are undetermined coefficients; u(n-iτ) is the (n-iτ)-th sample of the analysis frame, and n-iτ is its sample index in the analysis frame of step (1); u(n-jτ) is the (n-jτ)-th sample of the analysis frame, and n-jτ is its sample index in the analysis frame of step (1); τ, the delay time, is a finite positive integer; and j and n are finite positive integers.
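A second-order truncated Volterra predictor consistent with the symbols defined above can be sketched as follows. Formula (2) is not reproduced in this text, so the index range i, j = 0..m-1, the constant term `h0`, and the function name `volterra_predict` are assumptions made for illustration.

```python
def volterra_predict(u, n, h1, h2, tau, h0=0.0):
    """Predict from the delayed samples u[n - i*tau], i = 0..m-1.

    u   : list of analysis-frame samples
    h1  : list of m first-order coefficients h1(i)
    h2  : m x m nested list of second-order coefficients h2(i, j)
    tau : delay time in samples (positive integer)
    h0  : constant term (assumed; set to 0.0 if the model has none)
    """
    m = len(h1)
    if n - (m - 1) * tau < 0 or n >= len(u):
        raise ValueError("sample index out of range for this m and tau")
    delayed = [u[n - i * tau] for i in range(m)]
    linear = sum(h1[i] * delayed[i] for i in range(m))
    quadratic = sum(
        h2[i][j] * delayed[i] * delayed[j] for i in range(m) for j in range(m)
    )
    return h0 + linear + quadratic
```

With `u = [1, 2, 3, 4, 5]`, `n = 4`, and `tau = 2`, the delayed samples are `u[4]` and `u[2]`, so the linear and quadratic terms combine 5 and 3.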

(3) Determine the chaotic speech signal prediction model and encode

For the chaotic speech signal of the analysis frame from step (1), the adaptive artificial bee colony algorithm is used to determine the delay time τ, the embedding dimension s, and the undetermined coefficients h₁(i) and h₂(i, j). The surrogate model method serves as the approximate fitness function and is used to select the embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j) with high fitness. The mean square error between the predicted and actual values serves as the original fitness function, and greedy selection yields the best embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j), which are substituted into formula (2) to complete the encoding of the chaotic speech signal.
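The interplay between the approximate and the original fitness function can be sketched as follows. This is a simplified illustration under assumptions: fitness is treated here as an error to be minimized, and the helper names `surrogate_screen` and `greedy_select` are hypothetical rather than taken from the patent.

```python
def surrogate_screen(candidates, approx_fitness, keep):
    """Rank candidates by the cheap approximate error and return the `keep` most promising."""
    return sorted(candidates, key=approx_fitness)[:keep]


def greedy_select(current, candidate, true_fitness):
    """Greedy selection: keep the candidate only if its true error is strictly lower."""
    if true_fitness(candidate) < true_fitness(current):
        return candidate
    return current
```

Only the candidates that pass `surrogate_screen` need the more expensive evaluation inside `greedy_select`, which is the usual motivation for a surrogate model.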

(4) Decode

The extracted best embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j) of the chaotic speech signal are substituted into formula (2) to obtain the prediction model of the corresponding signal, and the encoded data are then decoded by conventional methods.

The adaptive artificial bee colony algorithm in step (3) of the present invention is:

where ω is a weight coefficient in (0, 1); c1 and c2 are learning factors, both equal to 2; the random coefficient takes values in [-1, 1]; x_best is the global best nectar source of each iteration; x_ij is the current nectar source position, where i is the index of the nectar source vector and j is the corresponding component; x_neighbor is a nectar source position neighboring the current one, where the vector index neighbor must not equal i; and ω is determined by the following two formulas:

ω=ω_min+ρ(ω_max-ω_min) (4)

where ω_min, the lower bound of ω, is 0.2; ω_max, the upper bound of ω, is 0.9; a is 2; and maxcyle, the maximum number of iterations, is 1000, 1500, or 2000.
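The weight update of formula (4) can be sketched as follows. The search equation and the formula for ρ (which involves a and maxcyle) are not reproduced in this text, so `rho` is taken as an input, and `candidate_component` is only one plausible, hypothetical way of combining the current, neighbouring, and best sources with ω, c1, and c2.

```python
import random


def inertia_weight(rho, w_min=0.2, w_max=0.9):
    """Formula (4): omega = w_min + rho * (w_max - w_min)."""
    return w_min + rho * (w_max - w_min)


def candidate_component(x_ij, x_neighbor_j, x_best_j, omega, c1=2.0, c2=2.0, rng=random):
    """Hypothetical search step (assumed form, not the patent's formula)."""
    phi1 = rng.uniform(-1.0, 1.0)  # random coefficient in [-1, 1]
    phi2 = rng.uniform(-1.0, 1.0)
    return (
        omega * x_ij
        + c1 * phi1 * (x_ij - x_neighbor_j)
        + c2 * phi2 * (x_best_j - x_ij)
    )
```

With ρ between 0 and 1, `inertia_weight` stays between the lower bound 0.2 and the upper bound 0.9.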

The surrogate model (agent model) method in step (3) of the present invention is:

(1) The embedding dimension s and the delay time τ from the phase-space reconstruction of the analysis frame's chaotic speech signal are introduced into the original Volterra model, and m in formula (1) is replaced by s.

(2) Based on the model with s and τ introduced in step (1), the undetermined coefficients h₁(i) and h₂(i, j) are determined by the surrogate model method of the adaptive artificial bee colony algorithm.

A radial basis function neural network is used as the approximate fitness function, and the approximate fitness function is used in combination with the true fitness function. The approximate fitness function is:

where K(||x-c_i||) is the kernel function used, a_i is the value to be estimated, and c_i is a center point of the radial basis function neural network. The true fitness function is:

where y_i is the actual value, ŷ_i is the predicted value, and L is the prediction length.
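The two fitness functions can be sketched as follows, assuming the usual forms: a weighted sum of kernels, sum over i of a_i K(||x-c_i||), for the radial basis function network, and (1/L) times the sum of squared differences for the mean square error. The Gaussian kernel and its `width` parameter are assumptions, since the kernel is not specified in this text.

```python
import math


def gaussian_kernel(r, width=1.0):
    """Assumed kernel K(r) = exp(-(r / width)^2)."""
    return math.exp(-((r / width) ** 2))


def rbf_approx_fitness(x, centers, weights, width=1.0):
    """Approximate fitness: sum of a_i * K(||x - c_i||) over all centers c_i."""
    total = 0.0
    for c, a in zip(centers, weights):
        r = math.sqrt(sum((xi - ci) ** 2 for xi, ci in zip(x, c)))
        total += a * gaussian_kernel(r, width)
    return total


def mse(actual, predicted):
    """True fitness: mean square error over the prediction length L."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")
    L = len(actual)
    return sum((y - p) ** 2 for y, p in zip(actual, predicted)) / L
```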

The best undetermined coefficients h₁(i) and h₂(i, j) are determined, and the mean square error is checked against the requirement; if the error requirement is not met, the iteration is repeated.

The present invention uses an improved version of the existing artificial bee colony algorithm. The input chaotic speech signal is pre-emphasized, windowed, and framed; a chaotic speech signal prediction model is established; the parameters of that model are determined, which completes the encoding of the chaotic speech signal; and the encoded data are decoded by conventional methods. By exploiting the chaotic characteristics of speech signals, the invention encodes and decodes chaotic speech signals quickly and accurately. Its steps are simple, it is easy to implement, and its accuracy is high.

Brief description of the drawings

Fig. 1 is the process flow chart of the present invention.

Fig. 2 is the waveform of the input chaotic speech signal, phonetic symbol [b], in embodiment 1.

Fig. 3 shows the experimental result of determining the chaotic speech signal prediction model and encoding in embodiment 1.

Fig. 4 shows the experimental result of determining the chaotic speech signal prediction model and encoding in embodiment 2.

Fig. 5 shows the experimental result of determining the chaotic speech signal prediction model and encoding in embodiment 3.

Detailed description of the embodiments

The present invention is described in more detail below with reference to the drawings and embodiments, but the present invention is not limited to the following embodiments.

Embodiment 1

Taking the phonetic symbol [b] from the chaotic speech signals selected from a standard phonetic-symbol corpus as an example, the steps of the speech signal encoding and decoding method based on surrogate-model Volterra modeling (shown in Fig. 1) are as follows:

(1) Preprocess the input chaotic speech signal

Fig. 2 is the waveform of the input chaotic speech signal, phonetic symbol [b]. In this signal, a frame with a uniform waveform is selected as the analysis frame, and pre-emphasis, windowing, and framing are performed. Pre-emphasis is a conventional method and is carried out with a transfer function.

N in the window function is a finite positive integer.

(2) Construct the prediction model with the Volterra modeling method

The information of the analysis frame from step (1) is shown in Fig. 3. In this embodiment, a sample segment of length 400 is taken from it, and the chaotic speech signal prediction model is established according to formula (2):

(3) Determine the chaotic speech signal prediction model and encode

For the chaotic speech signal of the analysis frame from step (1), the adaptive artificial bee colony algorithm is used to determine the delay time τ, the embedding dimension s, and the undetermined coefficients h₁(i) and h₂(i, j). The adaptive artificial bee colony algorithm is:

ω=ω_min+ρ(ω_max-ω_min) (4)

where ω_min, the lower bound of ω, is 0.2; ω_max, the upper bound of ω, is 0.9; a is 2; and maxcyle, the maximum number of iterations, is 2000.

The surrogate model method serves as the approximate fitness function and is used to select the embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j) with high fitness. The surrogate model method of this embodiment is as follows:

(2) Based on the model with s and τ introduced in step (1), the undetermined coefficients h₁(i) and h₂(i, j) are determined by the surrogate model method of the adaptive artificial bee colony algorithm:

where y_i is the actual value, ŷ_i is the predicted value, and L is the prediction length.

The mean square error between the predicted and actual values is used as the original fitness function. Greedy selection, which is a conventional method, yields the best embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j), which are substituted into formula (2) to complete the encoding of the chaotic speech signal.

The adaptive artificial bee colony algorithm gives a delay time τ of 8 and an embedding dimension s of 12 for the chaotic speech signal phonetic symbol [b]. The undetermined coefficients h₁(i) and h₂(i, j) of the chaotic speech signal prediction model are given in Table 1, Table 2, and Fig. 3.

Table 1. Best undetermined coefficients h₁(i) in embodiment 1

h₁(i)	1	-0.0020	-0.0531	-0.0898	-0.1363	0.0555	0.6349	-0.0617

Table 2. Best undetermined coefficients h₂(i, j) in embodiment 1

h₂(i, j)	I=1	I=2	I=3	I=4	I=5	I=6	I=7	I=8
J=1	0.8258	-0.4758	0.2718	1	-1	0.1292	-1	0.7767
J=2	0.0449	-0.0179	0.1362	-0.1184	1	0.3567	-0.3045
J=3	0.5248	0.2685	-0.9564	0.7436	-0.3485	0.3652
J=4	-0.9852	0.5326	0.2134	0.3452	0.2741
J=5	0.1245	0.5236	-12354	1
J=6	-0.9654	0.1455	0.2542
J=7	0.6532	0.8541
J=8	0.8745

As Table 1, Table 2, and Fig. 3 show, for the chaotic speech signal phonetic symbol [b], when the optimal embedding dimension s is 12, the optimal delay time is 8, and the undetermined coefficients h₁(i) and h₂(i, j) take the values in the tables, the maximum cumulative sample error is 0.199474. This is within the error range, so the result is output to the specified file. The output file shown in the figure is located and its contents are substituted into formula (2), which completes the encoding of the chaotic speech signal.
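As a rough check on the size of the encoded data, the table entries can be counted. The rows of Table 2 contain 8, 7, ..., 1 values, which suggests that the second-order coefficients are stored in triangular form; that reading, and the idea that the encoded frame consists of the two tables plus s and τ, are assumptions made for this sketch.

```python
def triangular_count(size):
    """Number of entries in a triangular size x size coefficient table."""
    return size * (size + 1) // 2


first_order = 8                       # entries listed in Table 1
second_order = triangular_count(8)    # entries listed in Table 2
total_parameters = first_order + second_order + 2  # plus s and tau
```

Under these assumptions, the analysis segment of 400 samples would be described by `total_parameters` numbers.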

(4) Decode

The extracted best embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j) of the phonetic symbol [b] in the chaotic speech signal are substituted into formula (2) to obtain the prediction model of the corresponding signal, and the encoded data are then decoded by conventional methods.

Embodiment 2

Taking the phonetic symbol [b] from the chaotic speech signals selected from a standard phonetic-symbol corpus as an example, the steps of the speech signal encoding and decoding method based on surrogate-model Volterra modeling are as follows:

(1) Preprocess the input chaotic speech signal

The preprocessing of the input chaotic speech signal is the same as in embodiment 1.

(2) Construct the prediction model with the Volterra modeling method

The construction of the prediction model with the Volterra modeling method is the same as in embodiment 1.

(3) Determine the chaotic speech signal prediction model and encode

ω=ω_min+ρ(ω_max-ω_min) (9)

where ω_min, the lower bound of ω, is 0.2; ω_max, the upper bound of ω, is 0.9; a is 2; and maxcyle, the maximum number of iterations, is 1000.

The surrogate model method serves as the approximate fitness function and is used to select the embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j) with high fitness. The surrogate model method of this embodiment is the same as in embodiment 1. The adaptive artificial bee colony algorithm gives a delay time τ of 8 and an embedding dimension s of 12 for the chaotic speech signal phonetic symbol [b]. The undetermined coefficients h₁(i) and h₂(i, j) of the chaotic speech signal prediction model are given in Table 3, Table 4, and Fig. 4.

Table 3. Best undetermined coefficients h₁(i) in embodiment 2

h₁(i)	1	1.1321	0.0672	-0.4031	0.0203	-0.2818	0.1010	0.2818

Table 4. Best undetermined coefficients h₂(i, j) in embodiment 2

The other steps are the same as in embodiment 1.

This completes the encoding and decoding of the chaotic speech signal phonetic symbol [b].

Embodiment 3

(1) Preprocess the input chaotic speech signal

(2) Construct the prediction model with the Volterra modeling method

(3) Determine the chaotic speech signal prediction model and encode

ω=ω_min+ρ(ω_max-ω_min) (12)

where ω_min, the lower bound of ω, is 0.2; ω_max, the upper bound of ω, is 0.9; a is 2; and maxcyle, the maximum number of iterations, is 1500.

The surrogate model method serves as the approximate fitness function and is used to select the embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j) with high fitness. The surrogate model method of this embodiment is the same as in embodiment 1. The adaptive artificial bee colony algorithm gives a delay time τ of 8 and an embedding dimension s of 12 for the chaotic speech signal phonetic symbol [b]. The undetermined coefficients h₁(i) and h₂(i, j) of the chaotic speech signal prediction model are given in Table 5, Table 6, and Fig. 5.

Table 5. Best undetermined coefficients h₁(i) in embodiment 3

h₁(i)	1	0.2119	-0.4320	-0.0315	0.0995	0.0014	-0.1405	0.0898

Table 6. Best undetermined coefficients h₂(i, j) in embodiment 3

h₂(i, j)	I=1	I=2	I=3	I=4	I=5	I=6	I=7	I=8
J=1	0.2358	-0.9652	0.2148	0.3541	-1	0.7022	-1	0.3354
J=2	0.6249	-0.6931	0.3654	-0.6944	0.2982	0.6367	-0.4508
J=3	0.9852	0.7564	-0.2485	0.4267	0.5130	0.7452
J=4	-0.3498	0.3215	0.3124	0.2347	0.7824
J=5	0.7545	0.1453	-0.1154	1.2647
J=6	-0.5496	0.3265	0.3542
J=7	0.3541	0.4516
J=8	0.1264

The other steps are the same as in embodiment 1.

This completes the encoding and decoding of the chaotic speech signal phonetic symbol [b].

By the same principle, the speech signal encoding and decoding method based on surrogate-model Volterra modeling can be applied to other phonetic symbols in the chaotic speech signals selected from a standard phonetic-symbol corpus, so that different phonetic symbols can be encoded and decoded.

Claims

1. A speech signal encoding and decoding method based on surrogate-model Volterra modeling, characterized in that it consists of the following steps:

(1) Preprocess the input chaotic speech signal

In the input chaotic speech signal, a frame with a uniform waveform is selected as the analysis frame, and pre-emphasis, windowing, and framing are performed;

N in the window function is a finite positive integer;

(2) Construct the prediction model with the Volterra modeling method

where u(n-iτ) is the input analysis-frame signal; m, the memory length of the chaotic time series prediction model, is a finite positive integer; h₁(i) and h₂(i, j) are undetermined coefficients; u(n-iτ) is the (n-iτ)-th sample of the analysis frame, and n-iτ is its sample index in the analysis frame of step (1); u(n-jτ) is the (n-jτ)-th sample of the analysis frame, and n-jτ is its sample index in the analysis frame of step (1); τ, the delay time, is a finite positive integer; and j and n are finite positive integers;

(3) Determine the chaotic speech signal prediction model and encode

For the chaotic speech signal of the analysis frame from step (1), the adaptive artificial bee colony algorithm is used to determine the delay time τ, the embedding dimension s, and the undetermined coefficients h₁(i) and h₂(i, j); the surrogate model method serves as the approximate fitness function and is used to select the embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j) with high fitness; the mean square error between the predicted and actual values serves as the original fitness function, and greedy selection yields the best embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j), which are substituted into formula (2) to complete the encoding of the chaotic speech signal;

The above adaptive artificial bee colony algorithm is:

where ω is a weight coefficient in (0, 1); c1 and c2 are learning factors, both equal to 2; the random coefficient takes values in [-1, 1]; x_best is the global best nectar source of each iteration; x_ij is the current nectar source position, where i is the index of the nectar source vector and j is the corresponding component; x_neighbor is a nectar source position neighboring the current one, where the vector index neighbor must not equal i; and ω is determined by the following two formulas:

ω=ω_min+ρ(ω_max-ω_min) (4)

where ω_min, the lower bound of ω, is 0.2; ω_max, the upper bound of ω, is 0.9; a is 2; and maxcyle, the maximum number of iterations, is 1000, 1500, or 2000;

The above surrogate model method is:

(1) The embedding dimension s and the delay time τ from the phase-space reconstruction of the analysis frame's chaotic speech signal are introduced into the original Volterra model, and m in formula (1) is replaced by s;

(2) Based on the model with s and τ introduced in step (1), the undetermined coefficients h₁(i) and h₂(i, j) are determined by the surrogate model method of the adaptive artificial bee colony algorithm:

A radial basis function neural network is used as the approximate fitness function, and the approximate fitness function is used in combination with the true fitness function; the approximate fitness function is:

where K(||x-c_i||) is the kernel function used, a_i is the value to be estimated, and c_i is a center point of the radial basis function neural network; the true fitness function is:

where y_i is the actual value, ŷ_i is the predicted value, and L is the prediction length;

The best undetermined coefficients h₁(i) and h₂(i, j) are determined, and the mean square error is checked against the requirement; if the error requirement is not met, the iteration is repeated;

(4) Decode

The extracted best embedding dimension s, delay time τ, and undetermined coefficients h₁(i) and h₂(i, j) of the chaotic speech signal are substituted into formula (2) to obtain the prediction model of the corresponding signal, and the encoded data are then decoded by conventional methods.