CN105991102A

CN105991102A - Media playing apparatus possessing voice enhancement function

Info

Publication number: CN105991102A
Application number: CN201510071342.7A
Authority: CN
Inventors: 周雅泛; 李程越
Original assignee: TPV Investment Co Ltd
Current assignee: Top Victory Investments Ltd; TPV Investment Co Ltd
Priority date: 2015-02-11
Filing date: 2015-02-11
Publication date: 2016-10-05

Abstract

A media playing apparatus comprises a first signal superposition module, a first blind source separation module, a second blind source separation module, a second signal superposition module, a voice gain adjustment module, a third signal superposition module and a fourth signal superposition module. The first signal superposition module superposes a first and a second sound channel signal to generate a sound channel superposition signal. The first blind source separation module performs blind source separation on the first sound channel signal and the sound channel superposition signal to generate a first voice signal carrying a small amount of background sound and a first background sound signal carrying a small amount of voice. The second blind source separation module performs blind source separation on the second sound channel signal and the sound channel superposition signal to generate a second voice signal carrying a small amount of background sound and a second background sound signal carrying a small amount of voice. The second signal superposition module superposes the first and second voice signals to generate a voice superposition signal. The voice gain adjustment module applies a gain to the voice superposition signal to generate a voice enhancement signal. The third signal superposition module superposes the first background sound signal and the voice enhancement signal to generate a first sound channel output signal with voice enhancement, and the fourth signal superposition module superposes the second background sound signal and the voice enhancement signal to generate a second sound channel output signal with voice enhancement.

Description

Media playing apparatus with a voice enhancement function

Technical field

The present invention relates to a media playing apparatus, and in particular to a media playing apparatus with a voice enhancement function.

Background art

A media playing apparatus, such as a television set, typically plays programs whose soundtrack contains several sound sources, such as human voice, music, and ambient sound. Sometimes the voice is hard to make out, either because other sounds in the program are too loud or simply because the audio was not mixed during the production of the broadcast sound. To make the voice easier to hear, television sets provide various sound modes, such as a dpi mode and a drama mode. These sound modes are typically implemented with fixed low-pass, high-pass, or band-pass filters, or a combination thereof, which amplify the signal in the 500 to 3500 Hz band where human voice generally lies.

However, amplifying the voice with fixed filters as described above runs into problems. First, because it is not known whether the original voice volume is low or high, it is hard to decide how much to amplify the signal, so the improvement in voice volume is limited. Second, the signal in the 500 to 3500 Hz band contains not only human voice but also non-voice sounds such as noise; when no voice is present, the noise in this band is amplified instead.
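The second problem can be written in one line. As a simplified sketch, assume the fixed filter applies a flat gain G across the 500 to 3500 Hz band, and write the in-band content as voice S plus non-voice sound N (G, S, N and Y are symbols introduced here for illustration only):

```latex
Y = G\,(S + N), \qquad \frac{G\,S}{G\,N} = \frac{S}{N}
```

Under this assumption the in-band ratio of voice to non-voice sound is unchanged by the filter, and when S = 0 the output is simply G N, that is, amplified noise.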

Summary of the invention

An object of the present invention is to provide a media playing apparatus with a voice enhancement function that amplifies mainly the human voice and determines the amount of amplification according to the volume of that voice.

To achieve the above object, the present invention provides a media playing apparatus with a voice enhancement function, which receives a first sound channel signal and a second sound channel signal carrying voice and background sound. The media playing apparatus comprises a first signal superposition (signal addition) module, a first blind source separation (BSS) module, a second blind source separation module, a second signal superposition module, a voice gain adjustment (speech gain adjustment) module, a third signal superposition module and a fourth signal superposition module. The first signal superposition module superposes the first sound channel signal and the second sound channel signal to produce a sound channel superposition signal. The first blind source separation module receives the first sound channel signal and the sound channel superposition signal and performs blind source separation on them to produce a first voice signal carrying a small amount of background sound and a first background sound signal carrying a small amount of voice. The second blind source separation module receives the second sound channel signal and the sound channel superposition signal and performs blind source separation on them to produce a second voice signal carrying a small amount of background sound and a second background sound signal carrying a small amount of voice. The second signal superposition module superposes the first voice signal and the second voice signal to produce a voice superposition signal. The voice gain adjustment module adjusts a gain value and applies it to the voice superposition signal to produce a voice enhancement signal. The third signal superposition module superposes the first background sound signal and the voice enhancement signal to produce a first sound channel output signal with a voice enhancement effect. The fourth signal superposition module superposes the second background sound signal and the voice enhancement signal to produce a second sound channel output signal with a voice enhancement effect.

In one embodiment of the present invention, the first blind source separation module or the second blind source separation module comprises a first input terminal, a second input terminal, a first filter, a second filter, a first adder, a second adder, a third filter, a fourth filter, a first output terminal, a second output terminal and an adjustment unit. The first input terminal receives a first audio signal, the second input terminal receives a second audio signal, the first output terminal outputs a first unmixed signal, and the second output terminal outputs a second unmixed signal. The first input terminal is coupled to the input of the first filter. The second input terminal is coupled to the input of the second filter. The two inputs of the first adder are coupled to the outputs of the first filter and the fourth filter respectively, and the output of the first adder is coupled to the input of the third filter and to the first output terminal. The two inputs of the second adder are coupled to the outputs of the second filter and the third filter respectively, and the output of the second adder is coupled to the input of the fourth filter and to the second output terminal. The adjustment unit receives the first unmixed signal and the second unmixed signal and, based on them, uses a minimum mutual information (MMI) or maximum entropy (ME) algorithm to adjust the transfer functions of the third filter and the fourth filter.

In one embodiment of the present invention, the adjustment unit also adjusts the transfer functions of the first filter and the second filter.

In one embodiment of the present invention, the first sound channel signal and the second sound channel signal are a left channel signal and a right channel signal, respectively.

In one embodiment of the present invention, the media playing apparatus is a television, an audio system, a portable audio player (walkman), a mobile phone, an audio and video disc player, or a computer.

The technical means described in any one of the above embodiments may be applied in another of the above embodiments to obtain a new embodiment, provided that these technical means do not conflict with each other.

Because the present invention uses the first and second blind source separation modules to perform blind source separation on the first and second sound channel signals carrying voice and background sound, it separates out first and second voice signals carrying a small amount of background sound and first and second background sound signals carrying a small amount of voice. The first and second voice signals are then amplified and superposed again with the first and second background sound signals, which produces first and second sound channel output signals with a voice enhancement effect.

In addition, the first and second blind source separation modules can perform feedback control according to the first and second voice signals and the first and second background sound signals they separate, that is, adjust the transfer functions of their third and fourth filters, so that the separated first and second voice signals carry even less background sound and the separated first and second background sound signals carry even less voice. Moreover, the first and second blind source separation modules can also perform feedback control according to the volume of the voice in the separated first and second voice signals, that is, adjust the transfer functions of their first and second filters. Together with the voice gain adjustment module at the back end of the media playing apparatus, this brings the volume of the voice in the separated first and second voice signals to a suitable level, and therefore brings the volume of the voice in the first and second sound channel output signals finally output by the media playing apparatus to a suitable level.

Brief description of the drawings

The present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

Fig. 1 is a block diagram of a media playing apparatus with a voice enhancement function according to an embodiment of the present invention.

Fig. 2 is a block diagram of a blind source separation module according to an embodiment of the present invention.

Description of symbols:

10 media playing apparatus

11 first signal superposition module

12 first blind source separation module

13 second blind source separation module

14 second signal superposition module

15 voice gain adjustment module

16 third signal superposition module

17 fourth signal superposition module

20 blind source separation module

201 first input terminal

202 second input terminal

203 first filter

204 second filter

205 first adder

206 second adder

207 third filter

208 fourth filter

209 first output terminal

210 second output terminal

211 adjustment unit

Lin first sound channel signal

Rin second sound channel signal

Min sound channel superposition signal

Lbg first background sound signal

Lsp first voice signal

Rbg second background sound signal

Rsp second voice signal

Msp voice superposition signal

Msp' voice enhancement signal

Lout first sound channel output signal

Rout second sound channel output signal

X1 first audio signal

X2 second audio signal

U1 first unmixed signal

U2 second unmixed signal

W11 transfer function of the first filter

W22 transfer function of the second filter

W21 transfer function of the third filter

W12 transfer function of the fourth filter

Detailed description of the invention

Fig. 1 is a block diagram of a media playing apparatus with a voice enhancement function according to an embodiment of the present invention. Referring to Fig. 1, the media playing apparatus 10 may be a television, an audio system, a portable audio player (walkman), a mobile phone, an audio and video disc player, or a computer (such as a desktop computer or a tablet computer), but is not limited thereto. The media playing apparatus 10 receives a first sound channel signal Lin and a second sound channel signal Rin, both of which carry voice and background sound, where the background sound comprises non-voice sounds such as music, ambient sound, and noise. The first sound channel signal Lin and the second sound channel signal Rin may be a left channel signal and a right channel signal respectively, but are not limited thereto.

The media playing apparatus 10 comprises a first signal superposition module 11, a first blind source separation module 12, a second blind source separation module 13, a second signal superposition module 14, a voice gain adjustment module 15, a third signal superposition module 16 and a fourth signal superposition module 17.

The first signal superposition module 11 superposes the first sound channel signal Lin and the second sound channel signal Rin to produce a sound channel superposition signal Min.

The first blind source separation module 12 receives the first sound channel signal Lin and the sound channel superposition signal Min and performs blind source separation on them to produce a first voice signal Lsp carrying a small amount of background sound and a first background sound signal Lbg carrying a small amount of voice. Blind source separation is a digital signal processing (DSP) technique that can, to a certain extent, separate independent signals from several mixtures of those signals without knowing the characteristics of the independent signals. Different blind source separation algorithms, such as the minimum mutual information algorithm and the maximum entropy algorithm, differ in computational cost, convergence speed, and separation quality. Therefore, after the first sound channel signal Lin and the sound channel superposition signal Min pass through the first blind source separation module 12, the first voice signal Lsp and the first background sound signal Lbg can only be separated to a certain extent: the first voice signal Lsp mainly comprises voice but also carries a small amount of background sound, and the first background sound signal Lbg mainly comprises background sound but also carries a small amount of voice.

The second blind source separation module 13 receives the second sound channel signal Rin and the sound channel superposition signal Min and performs blind source separation on them to produce a second voice signal Rsp carrying a small amount of background sound and a second background sound signal Rbg carrying a small amount of voice. As with the first blind source separation module 12, after the second sound channel signal Rin and the sound channel superposition signal Min pass through the second blind source separation module 13, the second voice signal Rsp and the second background sound signal Rbg can only be separated to a certain extent: the second voice signal Rsp mainly comprises voice but also carries a small amount of background sound, and the second background sound signal Rbg mainly comprises background sound but also carries a small amount of voice.

The second signal superposition module 14 superposes the first voice signal Lsp and the second voice signal Rsp to produce a voice superposition signal Msp.

The voice gain adjustment module 15 adjusts a gain value and applies it to the voice superposition signal Msp to produce a voice enhancement signal Msp'.
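The text does not say how module 15 chooses its gain value, only that the amount of amplification depends on the voice volume. The following is a minimal sketch of one possible rule, not the patented method: it scales Msp toward a target RMS level. The names `rms` and `adjust_voice_gain` and the parameters `target_rms`, `max_gain` and `silence_floor` are all hypothetical.

```python
import math


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    if not signal:
        return 0.0
    return math.sqrt(sum(s * s for s in signal) / len(signal))


def adjust_voice_gain(msp, target_rms=0.2, max_gain=4.0, silence_floor=1e-4):
    """Return (gain, msp_enhanced) for the voice superposition signal Msp.

    Hypothetical rule (an assumption, not taken from the patent): raise Msp
    toward target_rms, never attenuate below unity gain, never exceed
    max_gain, and leave near-silent input untouched so that residual
    background sound is not boosted when nobody is speaking.
    """
    level = rms(msp)
    if level < silence_floor:
        gain = 1.0
    else:
        gain = min(max(target_rms / level, 1.0), max_gain)
    return gain, [gain * s for s in msp]
```

With these assumed defaults, a quiet voice frame at RMS 0.1 is doubled, a loud frame at RMS 0.5 is passed through unchanged, and a silent frame is left alone.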

The third signal superposition module 16 superposes the first background sound signal Lbg and the voice enhancement signal Msp' to produce a first sound channel output signal Lout with a voice enhancement effect. That is, in the first sound channel output signal Lout the volume of the voice has been amplified relative to the volume of the background sound, so the human voice can be heard more clearly among the other sound sources.

The fourth signal superposition module 17 superposes the second background sound signal Rbg and the voice enhancement signal Msp' to produce a second sound channel output signal Rout with a voice enhancement effect. That is, in the second sound channel output signal Rout the volume of the voice has been amplified relative to the volume of the background sound, so the human voice can be heard more clearly among the other sound sources.

The first sound channel output signal Lout and the second sound channel output signal Rout can each be output to an external speaker (not shown) for playback.
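The signal flow of modules 11 to 17 can be sketched in a few lines. This is an illustrative sketch only: the blind source separation step is passed in as a callable `separate`, and `make_oracle_separator` is a hypothetical stand-in that already knows the voice, used here just to exercise the wiring. The function names are assumptions, not part of the source.

```python
def add(a, b):
    """Sample-wise superposition of two equal-length signals."""
    return [x + y for x, y in zip(a, b)]


def enhance_voice(lin, rin, separate, gain):
    """Signal flow of Fig. 1.

    separate(x1, x2) stands in for a blind source separation module and must
    return (voice_estimate, background_estimate). gain is the value chosen by
    the voice gain adjustment module.
    """
    m_in = add(lin, rin)                 # module 11: Min = Lin + Rin
    lsp, lbg = separate(lin, m_in)       # module 12: Lin, Min -> Lsp, Lbg
    rsp, rbg = separate(rin, m_in)       # module 13: Rin, Min -> Rsp, Rbg
    msp = add(lsp, rsp)                  # module 14: Msp = Lsp + Rsp
    msp_enh = [gain * s for s in msp]    # module 15: Msp' = gain * Msp
    lout = add(lbg, msp_enh)             # module 16: Lout = Lbg + Msp'
    rout = add(rbg, msp_enh)             # module 17: Rout = Rbg + Msp'
    return lout, rout


def make_oracle_separator(voice):
    """Hypothetical ideal separator for testing: it is told the voice."""
    def separate(x1, x2):
        # x2 (the channel sum) is ignored by this oracle.
        return list(voice), [a - v for a, v in zip(x1, voice)]
    return separate
```

Note that Msp sums two voice estimates, so with the oracle above a voice that is identical in both channels appears in each output at 2 × gain times its input level; under that assumption a gain of 0.5 reproduces the input.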

Because the present invention uses the first and second blind source separation modules 12, 13 to perform blind source separation on the first and second sound channel signals Lin, Rin, which carry voice and background sound, it separates out the first and second voice signals Lsp, Rsp, which carry a small amount of background sound, and the first and second background sound signals Lbg, Rbg, which carry a small amount of voice. The first and second voice signals Lsp, Rsp are then amplified and superposed again with the first and second background sound signals Lbg, Rbg, which produces the first and second sound channel output signals Lout, Rout with a voice enhancement effect.

Fig. 2 is a block diagram of a blind source separation module according to an embodiment of the present invention. Referring to Fig. 2, the blind source separation module 20 comprises a first input terminal 201, a second input terminal 202, a first filter 203, a second filter 204, a first adder 205, a second adder 206, a third filter 207, a fourth filter 208, a first output terminal 209, a second output terminal 210 and an adjustment unit 211. The first input terminal 201 and the second input terminal 202 receive a first audio signal X1 and a second audio signal X2 respectively, and the first output terminal 209 and the second output terminal 210 output a first unmixed signal U1 and a second unmixed signal U2 respectively.

The first input terminal 201 is coupled to the input of the first filter 203. The second input terminal 202 is coupled to the input of the second filter 204. The two inputs of the first adder 205 are coupled to the outputs of the first filter 203 and the fourth filter 208 respectively, and the output of the first adder 205 is coupled to the input of the third filter 207 and to the first output terminal 209. The two inputs of the second adder 206 are coupled to the outputs of the second filter 204 and the third filter 207 respectively, and the output of the second adder 206 is coupled to the input of the fourth filter 208 and to the second output terminal 210.
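Read as equations, this wiring gives U1 = W11·X1 + W12·U2 and U2 = W22·X2 + W21·U1. The sketch below solves that loop for the simplest possible case, in which each filter is a single scalar gain (an assumption made here for illustration; the filters in the text have general transfer functions). The function name `feedback_unmix` is hypothetical.

```python
def feedback_unmix(x1, x2, w11, w22, w21, w12):
    """Solve the Fig. 2 network sample by sample for scalar filter gains.

    Network equations: U1 = w11*X1 + w12*U2 and U2 = w22*X2 + w21*U1.
    With scalar gains the loop has no delay, so it is solved in closed form:
        U1 = (w11*X1 + w12*w22*X2) / (1 - w12*w21)
        U2 = (w22*X2 + w21*w11*X1) / (1 - w12*w21)
    which requires w12*w21 != 1.
    """
    det = 1.0 - w12 * w21
    if det == 0.0:
        raise ValueError("w12 * w21 must differ from 1")
    u1 = [(w11 * a + w12 * w22 * b) / det for a, b in zip(x1, x2)]
    u2 = [(w22 * b + w21 * w11 * a) / det for a, b in zip(x1, x2)]
    return u1, u2
```

As a usage example under the same assumption: if X1 = S1 + 0.5·S2 and X2 = S2 + 0.25·S1, then choosing w11 = w22 = 1, w12 = -0.5 and w21 = -0.25 makes the cross filters cancel the leakage, so U1 recovers S1 and U2 recovers S2.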

The adjustment unit 211 is coupled to the first output terminal 209 and the second output terminal 210 to receive the first unmixed signal U1 and the second unmixed signal U2. Based on them, it uses the minimum mutual information or maximum entropy algorithm to adjust the transfer function W21 of the third filter 207 and the transfer function W12 of the fourth filter 208, and/or to adjust the transfer function W11 of the first filter 203 and the transfer function W22 of the second filter 204.
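The text names the algorithms but gives no update rule. To show the general idea of adapting an unmixing system from its own outputs, here is a minimal sketch of natural-gradient Infomax, a common maximum-entropy style rule. It is a swapped-in simplification, not the adjustment unit of Fig. 2: it uses a feedforward 2x2 matrix instead of the feedback filters, assumes instantaneous (delay-free) mixing, and assumes super-Gaussian sources. The function names and the parameters `lr` and `n_iter` are hypothetical.

```python
import math
import random


def infomax_unmix_matrix(x1, x2, lr=0.1, n_iter=150):
    """Natural-gradient Infomax for two instantaneous mixtures.

    Returns a 2x2 matrix w such that [u1, u2] = w @ [x1, x2]. The tanh score
    function assumes super-Gaussian sources (speech is commonly modelled so).
    """
    n = len(x1)
    w = [[1.0, 0.0], [0.0, 1.0]]
    for _ in range(n_iter):
        # c[i][j] accumulates the sample mean of tanh(u_i) * u_j.
        c = [[0.0, 0.0], [0.0, 0.0]]
        for a, b in zip(x1, x2):
            u = (w[0][0] * a + w[0][1] * b, w[1][0] * a + w[1][1] * b)
            t = (math.tanh(u[0]), math.tanh(u[1]))
            for i in range(2):
                for j in range(2):
                    c[i][j] += t[i] * u[j] / n
        # g = I - E[tanh(u) u^T]; natural-gradient step w <- w + lr * g @ w.
        g = [[(1.0 if i == j else 0.0) - c[i][j] for j in range(2)]
             for i in range(2)]
        w = [[w[i][j] + lr * (g[i][0] * w[0][j] + g[i][1] * w[1][j])
              for j in range(2)] for i in range(2)]
    return w


def crosstalk(w, mix):
    """Largest ratio of leaked to wanted source over the rows of w @ mix."""
    p = [[w[i][0] * mix[0][j] + w[i][1] * mix[1][j] for j in range(2)]
         for i in range(2)]
    return max(min(abs(r[0]), abs(r[1])) / max(abs(r[0]), abs(r[1]))
               for r in p)


def demo(n=2000, seed=0):
    """Mix two synthetic Laplacian sources and compare crosstalk."""
    rng = random.Random(seed)
    s1 = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]
    s2 = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(n)]
    mix = [[1.0, 0.6], [0.5, 1.0]]  # hypothetical mixing matrix
    x1 = [mix[0][0] * p + mix[0][1] * q for p, q in zip(s1, s2)]
    x2 = [mix[1][0] * p + mix[1][1] * q for p, q in zip(s1, s2)]
    w = infomax_unmix_matrix(x1, x2)
    before = crosstalk([[1.0, 0.0], [0.0, 1.0]], mix)
    return before, crosstalk(w, mix)
```

Calling `demo()` returns the crosstalk before and after adaptation. The separation is only approximate, which matches the statement in the text that each separated signal still carries a small amount of the other.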

In the present embodiment, the first blind source separation module 12 shown in Fig. 1 may use the blind source separation module 20 shown in Fig. 2. In that case, the first audio signal X1 and the second audio signal X2 of the blind source separation module 20 may be the first sound channel signal Lin and the sound channel superposition signal Min respectively, and the first unmixed signal U1 and the second unmixed signal U2 may be the first voice signal Lsp and the first background sound signal Lbg respectively. Likewise, the second blind source separation module 13 shown in Fig. 1 may use the blind source separation module 20 shown in Fig. 2. In that case, the first audio signal X1 and the second audio signal X2 of the blind source separation module 20 may be the second sound channel signal Rin and the sound channel superposition signal Min respectively, and the first unmixed signal U1 and the second unmixed signal U2 may be the second voice signal Rsp and the second background sound signal Rbg respectively.

The first and second blind source separation modules 12, 13 used by the present invention can perform feedback control according to the first and second voice signals Lsp, Rsp, which carry a small amount of background sound, and the first and second background sound signals Lbg, Rbg, which carry a small amount of voice, that they separate. That is, they adjust the transfer function W21 of the third filter 207 and the transfer function W12 of the fourth filter 208, so that the separated first and second voice signals Lsp, Rsp carry even less background sound and the separated first and second background sound signals Lbg, Rbg carry even less voice. Moreover, the first and second blind source separation modules 12, 13 can also perform feedback control according to the volume of the voice in the separated first and second voice signals Lsp, Rsp, that is, adjust the transfer function W11 of the first filter 203 and the transfer function W22 of the second filter 204. Together with the voice gain adjustment module 15 at the back end of the media playing apparatus 10, this brings the volume of the voice in the separated first and second voice signals Lsp, Rsp to a suitable level, and therefore brings the volume of the voice in the first and second sound channel output signals Lout, Rout finally output by the media playing apparatus 10 to a suitable level.

In addition, it should be noted that the first signal superposition module 11, the first blind source separation module 12, the second blind source separation module 13, the second signal superposition module 14, the voice gain adjustment module 15, the third signal superposition module 16 and the fourth signal superposition module 17 comprised in the media playing apparatus 10, as well as the first filter 203, the second filter 204, the first adder 205, the second adder 206, the third filter 207, the fourth filter 208 and the adjustment unit 211 comprised in the blind source separation module 20, can all be implemented in hardware or in software.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit the present invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A media playing apparatus with a voice enhancement function, which receives a first sound channel signal and a second sound channel signal carrying voice and background sound, characterized in that it comprises:

a first signal superposition module, which superposes the first sound channel signal and the second sound channel signal to produce a sound channel superposition signal;

a first blind source separation module, which receives the first sound channel signal and the sound channel superposition signal and performs blind source separation on them to produce a first voice signal carrying a small amount of background sound and a first background sound signal carrying a small amount of voice;

a second blind source separation module, which receives the second sound channel signal and the sound channel superposition signal and performs blind source separation on them to produce a second voice signal carrying a small amount of background sound and a second background sound signal carrying a small amount of voice;

a second signal superposition module, which superposes the first voice signal and the second voice signal to produce a voice superposition signal;

a voice gain adjustment module, which adjusts a gain value and applies it to the voice superposition signal to produce a voice enhancement signal;

a third signal superposition module, which superposes the first background sound signal and the voice enhancement signal to produce a first sound channel output signal with a voice enhancement effect; and

a fourth signal superposition module, which superposes the second background sound signal and the voice enhancement signal to produce a second sound channel output signal with a voice enhancement effect.

2. The media playing apparatus with a voice enhancement function according to claim 1, wherein the first blind source separation module or the second blind source separation module comprises a first input terminal, a second input terminal, a first filter, a second filter, a first adder, a second adder, a third filter, a fourth filter, a first output terminal, a second output terminal and an adjustment unit, wherein the first input terminal receives a first audio signal, the second input terminal receives a second audio signal, the first output terminal outputs a first unmixed signal, and the second output terminal outputs a second unmixed signal; wherein

the first input terminal is coupled to the input of the first filter;

the second input terminal is coupled to the input of the second filter;

the two inputs of the first adder are coupled to the outputs of the first filter and the fourth filter respectively, and the output of the first adder is coupled to the input of the third filter and to the first output terminal;

the two inputs of the second adder are coupled to the outputs of the second filter and the third filter respectively, and the output of the second adder is coupled to the input of the fourth filter and to the second output terminal;

the adjustment unit receives the first unmixed signal and the second unmixed signal and, based on them, uses a minimum mutual information or maximum entropy algorithm to adjust the transfer functions of the third filter and the fourth filter.

3. The media playing apparatus with a voice enhancement function according to claim 2, wherein the adjustment unit also adjusts the transfer functions of the first filter and the second filter.

4. The media playing apparatus with a voice enhancement function according to claim 1, wherein the first sound channel signal and the second sound channel signal are a left channel signal and a right channel signal, respectively.

5. The media playing apparatus with a voice enhancement function according to claim 1, wherein the media playing apparatus is a television, an audio system, a portable audio player (walkman), a mobile phone, an audio and video disc player, or a computer.