EP3672275A1 - Method and system for extracting source signal, and storage medium - Google Patents

Method and system for extracting source signal, and storage medium

Info

Publication number
EP3672275A1
EP3672275A1 (application EP17921701.3A)
Authority
EP
European Patent Office
Prior art keywords
signal
signals
input signals
unwanted
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
EP17921701.3A
Other languages
German (de)
French (fr)
Other versions
EP3672275A4 (en)
Inventor
Jiangang Zhang
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Incus Co Ltd
Original Assignee
Incus Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Incus Co Ltd filed Critical Incus Co Ltd
Publication of EP3672275A1 publication Critical patent/EP3672275A1/en
Publication of EP3672275A4 publication Critical patent/EP3672275A4/en
Pending legal-status Critical Current

Classifications

    • H04R3/00: Circuits for transducers, loudspeakers or microphones
    • H04R3/005: Circuits for combining the signals of two or more microphones
    • G10L21/0272: Voice signal separating
    • H04R25/407: Deaf-aid sets; circuits for combining signals of a plurality of transducers
    • H04R1/406: Obtaining a desired directional characteristic by combining a number of identical microphone transducers
    • H04R2225/43: Signal processing in hearing aids to enhance the speech intelligibility
    • H04R2430/20: Processing of the output signals of the acoustic transducers of an array for obtaining a desired directivity characteristic
    • H04R2430/23: Direction finding using a sum-delay beam-former

Definitions

  • the asynchronization effect can be reversed or reduced and the source-extraction performance improved, so that the perception of the target signals can be improved through the continuous removal of the unwanted signals even when the sources of the wanted and unwanted signals are moving.
  • FIG. 1 shows a flow chart of a method 1000 for removing a target unwanted signal from sets of input signals according to an embodiment of the disclosure.
  • n receiving devices are prepared to receive signals sent from m signal sources.
  • the set of signals transmitted from each receiving device is referred to as the input signals of that receiving device.
  • Each of the input signals may comprise the signals sent from one or more of the signal sources; these signals are also called wanted signals, and the others are unwanted signals.
  • the receiving device can be a transducer, a cloud platform, or a data-input interface.
  • the data-input interface is connected to a storage unit that gives priority to storing wanted signals, and receives signal data from the storage unit.
  • the input signals may comprise unwanted signals that may be different from each other. However, the unwanted signals in the input signals may also be the same, and the disclosure has no limitation in this aspect.
  • the electronic listening device typically comprises at least two microphones, each of which may receive a mixture of a signal transmitted from a sound source (wanted signal) and an ambient background sound (unwanted signal). Because the microphones are usually placed at different positions, the signal and the unwanted signal are received at mutually distanced locations, and the ambient background sound received by each microphone may differ in time domain and/or amplitude. For example, in the scenario of sound stage recording and/or 360 audio recording, two or more microphones are used to measure the sound.
  • the brain wave device typically comprises at least two electrodes, each of which may receive a mixture of a signal transmitted from a brain wave source and an ambient noise. Because the electrodes are usually placed at different positions, the signal and the noise are received at mutually distanced locations, and the ambient noises received by the electrodes may differ in time domain and/or amplitude.
  • the echo receiving device typically comprises at least two transducers, each of which may receive a mixture of a signal transmitted from a sound source and an ambient noise. Because the transducers are usually placed at different positions, the signal and the noise are received at mutually distanced locations, and the ambient noises received by the transducers may differ in time domain and/or amplitude.
  • suppose there are two different transducers M_i and M_j, and a plurality of different signal sources S_1, S_2, ..., S_n.
  • the signals received at M_i and M_j follow the following formulas, respectively:
  • M_i = a_1i·S_1(t_1 + τ_1i) + a_2i·S_2(t_2 + τ_2i) + ... + a_ni·S_n(t_n + τ_ni)
  • M_j = a_1j·S_1(t_1 + τ_1j) + a_2j·S_2(t_2 + τ_2j) + ... + a_nj·S_n(t_n + τ_nj)
    where a_ki is the attenuation coefficient of source S_k at the given transducer and τ_ki is the corresponding propagation delay.
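As a rough illustration of this mixing model (a sketch, not the patent's implementation), the snippet below builds two delayed, attenuated mixtures from independent sources. The gains and delays are made-up values, and a circular `np.roll` stands in for a true propagation delay:

```python
import numpy as np

def mix(sources, gains, delays):
    """Delayed, attenuated sum M = sum_k a_k * S_k(t + tau_k).

    sources: list of equal-length 1-D arrays; gains/delays: one scalar per
    source (delays in samples). A circular shift approximates the delay.
    """
    out = np.zeros_like(sources[0], dtype=float)
    for s, a, d in zip(sources, gains, delays):
        out += a * np.roll(s, d)
    return out

rng = np.random.default_rng(0)
s1, s2 = rng.standard_normal(1000), rng.standard_normal(1000)
m_i = mix([s1, s2], gains=[1.0, 0.6], delays=[0, 3])   # transducer M_i
m_j = mix([s1, s2], gains=[0.7, 1.0], delays=[2, 0])   # transducer M_j
```

Each output sample is a weighted sum of shifted source samples, matching the formulas above term by term.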
  • FIG. 7 shows the positions of two transducers and two signal resources in a two-dimensional space.
  • FIG. 7 represented in a two-dimensional space is only for simplified description, and all the positions can also be projected into a one-dimensional space, a three-dimensional space or a higher-dimensional space.
  • the disclosure is illustrated by the example of a sound signal.
  • suppose there are two sound sources S_1 and S_2 and two microphones M_1 and M_2, the propagation speed of sound is v, and the sampling rate of the transducers is F_s.
  • the sound energy decreases inversely with increasing distance between the sound sources and the transducers.
  • M_1real and M_2real on the left-hand side of the formula refer to the signals transmitted from microphones M_1 and M_2. Then, at step 200, a decomposition of the coefficient matrix is used to extract the maximum amount of wanted signals from the multiple signals.
  • the coefficient matrix is decomposed to increase, and ultimately maximize, the independence of the multiple signals.
  • the embodiment is based on the premise that the signal sources are independent from each other, and on the central limit theorem (that is, the statistical distribution of the sum of multiple independent variables tends more toward a normal distribution than the distribution of any one of those variables). Therefore, the coefficient matrix is decomposed by driving the statistical distribution of the multiple signals as far as possible from a normal distribution, which increases the independence of the recovered sources. Specifically, with the coefficient-matrix entries designated as the dependent variables, an objective function is selected to estimate how far the variables are from a normal distribution, and an optimal parameter is calculated by converging the objective function to obtain the decomposition parameter matrix.
  • an objective-function value of zero indicates that the probability distribution of y is normal.
  • kurtosis can also be replaced by other measures of distance from normality, and there is no specific limitation on this in this disclosure. The objective function can then be rewritten as J(y) ∝ [E{G(y)} − E{G(v)}]², where G is a non-quadratic function and v is a normally distributed reference variable.
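A minimal sketch of such an objective, assuming the common choice G(u) = log cosh(u) and a standard normal reference v (both are illustrative choices on my part, not mandated by the disclosure):

```python
import numpy as np

def negentropy_approx(y, n_ref=100_000, seed=0):
    """J(y) ~ (E[G(y)] - E[G(v)])**2 with G(u) = log cosh(u), v ~ N(0, 1).

    The further y is from Gaussian, the larger J(y); J is near 0 when y
    is itself Gaussian, matching the zero-value criterion above.
    """
    y = (y - y.mean()) / y.std()                     # standardise first
    v = np.random.default_rng(seed).standard_normal(n_ref)
    G = lambda u: np.log(np.cosh(u))
    return (G(y).mean() - G(v).mean()) ** 2

rng = np.random.default_rng(1)
gaussian = rng.standard_normal(10_000)
laplacian = rng.laplace(size=10_000)                 # super-Gaussian signal
# J comes out larger for the Laplacian signal than for the Gaussian one.
```

Maximising this J over the unmixing coefficients is the essence of the negentropy-based ICA estimation the text describes.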
  • at step 300, the input signals are synchronized in the time domain.
  • step 300 can be implemented by four different methods, which are described in detail with reference to FIGS. 2, 3, 4, and 5 as follows:
  • step 3101 is implemented to intercept two or more discrete segments of the unwanted signals, each of which is n milliseconds in duration.
  • n needs to be greater than 0.98 ms and less than 20.03 ms.
  • when the duration falls within this interval, humans cannot hear an echo while the accuracy of the signal interception is still ensured. Therefore, real-time processing is optimal and the user has the best hearing experience.
  • step 3101 continuously intercepts the discrete segments of each of the multiple signals in real time.
  • the method according to the embodiment can process the time-domain signals in real time.
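The continuous interception described above can be sketched as simple non-overlapping framing (function and variable names here are illustrative):

```python
import numpy as np

def frames(signal, fs, n_ms):
    """Split a signal into consecutive n-millisecond segments.

    Per the embodiment, n should lie roughly between 1 ms and 20 ms so that
    the processing latency stays below the human echo-perception threshold.
    """
    hop = int(fs * n_ms / 1000)                      # samples per segment
    return [signal[i:i + hop] for i in range(0, len(signal) - hop + 1, hop)]

x = np.arange(48_000)                     # one second of samples at 48 kHz
segments = frames(x, fs=48_000, n_ms=10)  # 100 segments of 480 samples each
```

Each returned segment is then handed to the pattern-recognition step below.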
  • pattern recognition determines, for each n-millisecond discrete segment of the multiple signals, whether the segment is an unwanted signal or not, and the unwanted signals are then extracted.
  • there are two sound sources comprising a male and a female, in which the male voice is viewed as the unwanted signal.
  • the pattern recognition automatically recognizes whether each n-millisecond discrete segment of the multiple signals comprises the male voice. If the male voice appears in a segment, that segment is extracted and passed to the next step. Likewise, if the female voice is viewed as the unwanted signal, the segments in which the female voice appears are extracted and passed to the next step.
  • the two sound sources comprise a human voice and a non-human voice.
  • the unwanted signal can be detected by monitoring whether the signal transitions from a low level to a high level within n milliseconds (i.e., a step function). For example, suppose a male voice is viewed as the unwanted signal. When a man speaks, he does not need to utter all the phonemes of a whole word for detection: once the discrete segments in which the male voice appears are detected, the voice is identified as the unwanted signal. This approach largely removes the need for complicated noise-detection processes (such as for sound signals) and thus reduces the computational complexity and cost.
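One hypothetical way to realise this low-to-high transition test is to compare the mean envelope of the two halves of a segment; the half/half split and the threshold below are my own illustrative choices:

```python
import numpy as np

def looks_like_step(segment, threshold=0.5):
    """Flag a segment whose envelope jumps from a low to a high level
    within the segment, i.e. roughly a step function."""
    half = len(segment) // 2
    low = np.mean(np.abs(segment[:half]))    # mean envelope, first half
    high = np.mean(np.abs(segment[half:]))   # mean envelope, second half
    return high - low > threshold

onset = np.concatenate([np.zeros(240), np.ones(240)])   # silence, then voice
steady = 0.2 * np.ones(480)                             # no transition
```

`looks_like_step(onset)` fires while `looks_like_step(steady)` does not, which is the cheap segment-level test the text motivates.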
  • a discrete-time convolution of two detected segments of unwanted signals is calculated to obtain a time delay between them.
  • in the formula below, mx is the average value of x, my is the average value of y, and d is the time delay; the numerator of the formula is the discrete-time convolution:
  • r(d) = Σ_i (x_i − mx)(y_{i−d} − my) / ( √(Σ_i (x_i − mx)²) · √(Σ_i (y_{i−d} − my)²) )
    where the time delay is the value of d at which r(d) attains its maximum.
  • the set of input signals is synchronized based on the obtained time delay d. For example, if the time delay between the detected unwanted-signal segment in a first input signal f_1(t) and the detected unwanted-signal segment in a second input signal f_2(t) is determined to be δ, the first input signal is synchronized to f_1(t − δ). Conversely, if the time delay is determined to be −δ, the first input signal is synchronized to f_1(t + δ). Since the segments of the unwanted signals are continuously monitored in real time, the approach can continuously update the time delay during the iteration and dynamically track changes to the unwanted signal as the signal sources and the transducers move in different directions or with respect to each other.
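A brute-force version of this normalised cross-correlation search, followed by the shift that synchronises the signals, can be sketched as follows (`np.roll` is circular; a real implementation would pad instead):

```python
import numpy as np

def estimate_delay(x, y, max_lag):
    """Return the lag d maximising the normalised cross-correlation r(d)
    between segments x and y, scanning d in [-max_lag, max_lag]."""
    x = x - x.mean()
    y = y - y.mean()
    best_d, best_r = 0, -np.inf
    for d in range(-max_lag, max_lag + 1):
        y_shift = np.roll(y, d)              # y[i - d], circularly shifted
        r = (x @ y_shift) / (np.linalg.norm(x) * np.linalg.norm(y_shift))
        if r > best_r:
            best_d, best_r = d, r
    return best_d

rng = np.random.default_rng(0)
x = rng.standard_normal(512)
y = np.roll(x, 5)                            # y(t) = x(t - 5)
d = estimate_delay(x, y, max_lag=10)         # -5 under the r(d) convention
y_sync = np.roll(y, d)                       # now aligned with x
```

Here y lags x by 5 samples, the search returns d = −5 under the y_{i−d} convention of the formula, and applying that shift realigns the two signals.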
  • the unwanted signal is received at mutually distanced locations.
  • the position of each of the unwanted signals with respect to the transducers is calculated, that is, the relative delay of each of the unwanted signals.
  • the unwanted signals are determined according to the relative delay of each of the unwanted signals.
  • the unwanted signals can also be determined by users in real time.
  • the maximum delay, in samples, is Max = F_s × d / v.
  • for example, the sampling rate (F_s) is 48 kHz and the distance (d) between the two transducers is 2.47 cm. Since the propagation speed of sound in air (v) is 340 m/s, the maximum delay is 3 samples. Therefore, the whole region can be divided into 7 areas with time delays of −3, −2, −1, 0, 1, 2, and 3, respectively. For example, referring to FIG. 8, if a preset unwanted signal comes from the area with a time delay of −3, the delay is assigned a value of −3.
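Plugging the stated numbers into Max = F_s · d / v reproduces the example (truncating the fractional sample, as the text's result of 3 implies):

```python
fs = 48_000     # sampling rate Fs, in Hz
d = 0.0247      # transducer spacing: 2.47 cm in metres
v = 340.0       # propagation speed of sound in air, m/s

max_delay = int(fs * d / v)                       # 3.48... truncated to 3
areas = list(range(-max_delay, max_delay + 1))    # the 7 delay areas
```

This yields the 7 areas with delays −3 through 3 used to label unwanted-signal directions.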
  • a time delay is determined according to an unwanted signal area selected by users in real time, or a preset unwanted signal area.
  • step 3203 the time delays obtained in step 3202 are synchronized according to step 3103.
  • the unwanted signals from all the relative delays are employed in this embodiment.
  • all the time delays are analyzed and calculated based on different signals (such as sound signals), the distance between transducers, and the propagation speed of the signal.
  • at step 3302, all the possible time delays T1, T2, ..., Tn are determined.
  • each of the different delays is synchronized again according to step 3103.
  • the direction of the wanted signal can also be preset, or selected by users in real time.
  • step 3402 the time delays in these directions are calculated.
  • at step 3403, based on the method for obtaining all of the signal directions in FIG. 4, the time delays of these wanted signals are removed from all of the possible directions, and each of the remaining time delays is synchronized again according to step 3103.
  • the synchronized input signals are separated into channels with the unwanted signal and channels without the unwanted signal.
  • step 400 is implemented by a multiplication between the matrix of synchronized signals and the matrix of coefficients resulting from step 200.
  • how to choose between these two channels is explained in detail in step 500.
  • the matrix of the mixed signals synchronized with S2 is multiplied with the coefficient matrix. Then the result is output through a suitable channel.
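In the idealised case where the mixing is known exactly, this multiplication reduces to a single matrix product; below, W stands in for the coefficient matrix estimated in step 200, and the mixing values are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
s1, s2 = rng.standard_normal(1000), rng.standard_normal(1000)

A = np.array([[1.0, 0.6],
              [0.7, 1.0]])             # mixing matrix (illustrative numbers)
M_sync = A @ np.vstack([s1, s2])       # synchronised mixed signals, row-wise
W = np.linalg.inv(A)                   # ideal stand-in for step 200's output
channels = W @ M_sync                  # row 0 / row 1: the separated channels
```

With a perfectly estimated W, each output row recovers one source, i.e. one channel with and one channel without the unwanted signal.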
  • at step 500, of the two channels resulting from step 400, the one with the relatively lower signal energy can be selected as the output channel.
  • the signal energy is calculated as the root-mean-square value of the signals. The selection process is applied to the channels with the unwanted signal and the channels without the unwanted signal resulting from step 400.
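The energy-based selection of step 500 can be sketched as picking the minimum-RMS channel (function names are mine):

```python
import numpy as np

def rms(channel):
    """Root-mean-square energy of one channel."""
    channel = np.asarray(channel, dtype=float)
    return np.sqrt(np.mean(channel ** 2))

def select_output(channels):
    """Choose the channel with the lowest signal energy as the output."""
    return min(channels, key=rms)

clean = 0.1 * np.random.default_rng(0).standard_normal(480)  # low energy
noisy = clean + np.ones(480)                                 # unwanted added
```

`select_output([noisy, clean])` returns the low-energy channel, on the heuristic that the channel stripped of the unwanted signal carries less energy.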
  • the output channel will be generated at different time delays.
  • the output channel is an optimal channel selected based on feature detection (for example, the optimal channel is a channel with the smallest number of unwanted signals in the generated channels).
  • the output channel is an optimal channel selected based on signal energy (for example, the optimal channel is a channel with the minimum amount of energy of unwanted signals in the generated channels).
  • once the unwanted signals are appropriately removed, subsequent processing can be performed on the separated target unwanted signal and the wanted signal.
  • the wanted signal may be selectively amplified and the target unwanted signals may be selectively reduced to improve the perception of the signals.
  • the disclosure provides a device comprising a processor and an interface with human-computer interaction.
  • the device also includes, but is not limited to, a memory, a processor, an input/output module, and an information receiving module.
  • the processor is configured to perform the above steps 100, 200, 3201-3203 (or 3401-3403), 400 and 500, and to enhance the frequency domain (optional). Users select an unwanted signal area in real time through the interface with human-computer interaction.
  • the interface with human-computer interaction includes, but is not limited to, a voice receiving module, a transducer, a video receiving module, a touch screen, a keyboard, buttons, knobs, a projection interface, and a virtual 3D interface.
  • the real-time selection methods available to users through the interface with human-computer interaction include the use of voice instructions, different gestures or actions, and the selection of areas with different labels.
  • the interface with human-computer interaction is a touch screen where the user can click on any area.
  • the disclosure provides a device for removing a target unwanted signal.
  • the device is user-controllable and user-selectable, and can adjust the time delay in real time.
  • steps 100-400 may be performed in a different order than that described in the drawings.
  • for example, steps 100 and 300 in the second embodiment (i.e., steps 3201-3203) can be performed in reverse order.
  • any two steps in steps 100-400 can be performed in parallel or in reverse order according to the functions involved.
  • step 200 is performed prior to step 300, that is, the coefficient matrix is calculated and then the input signal is synchronized in the time domain. This provides the advantage that the coefficient matrix need not be recalculated even though the time delays differ, and thus a large amount of calculation is avoided.
  • the result can be obtained by calculating the coefficient matrix just one time.
  • the disclosure draws the following conclusions from a large number of experiments: The coefficient matrix calculated from the synchronized mixed signal is almost the same as that calculated from the original mixed signal, illustrating that the method omits a large amount of calculation without losing the accuracy of the coefficient matrix.
  • the input signals received by one or more signal receiving devices can be removed according to a given criterion.
  • the input signal is a sound signal
  • the receiving device is a device (such as a microphone) for receiving sound signals.
  • the criterion is F_s × X / V < L / 3 (where L is the length, in samples, of the intercepted discrete segments, X is the distance between any two receiving devices for receiving sound signals, V is the propagation speed of the signal, and F_s is the sampling rate)
  • the sound signals received by one of the receiving devices are removed.
  • the embodiment omits a large number of complex calculations while ensuring the accuracy of the pattern recognition, improves the calculation efficiency, and optimizes the power consumption.
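The criterion can be evaluated directly; the numbers below are my own example, not taken from the disclosure:

```python
def can_drop_receiver(fs, x, v, seg_len):
    """True when Fs * X / V < L / 3, i.e. the inter-receiver delay is small
    relative to the intercepted segment length L (in samples), so one
    receiver's near-duplicate input may be dropped."""
    return fs * x / v < seg_len / 3

# 48 kHz sampling, receivers 1 cm apart, sound in air, 10 ms (480-sample)
# segments: 48000 * 0.01 / 340 = 1.41 < 480 / 3 = 160, so drop is allowed.
drop = can_drop_receiver(fs=48_000, x=0.01, v=340.0, seg_len=480)
```

With widely spaced receivers (say x = 2 m) the left side exceeds L/3 and both inputs must be kept.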
  • the signals in the art may refer to audio signals, image signals, electro-magnetic signals, brain wave signals, electric signals, radio wave signals, or other forms of signals that can be picked up by transducers, and the disclosure has no limitation in this aspect.
  • the perception of the target signals can be improved while reducing the computational cost.
  • the input signals are synchronized in time domain and thus the method according to the disclosure will introduce minimal frequency distortion.
  • referring to FIG. 6, a structural schematic diagram of a computer system 3000 adapted to implement an embodiment of the disclosure is shown.
  • the computer system 3000 includes a central processing unit (CPU) 3001, which may execute various appropriate actions and processes in accordance with a program stored in an electronically programmable read-only memory (EPROM) 3002 or a program loaded into a random access memory (RAM) 3003.
  • the RAM 3003 also stores various programs and data required by operations of the system 3000.
  • the CPU 3001, the EPROM 3002 and the RAM 3003 are connected to each other through a bus 3004.
  • An input/output (I/O) interface 3005 is also connected to the bus 3004.
  • a Direct Memory Access interface 3006 is connected to the bus 3004 to enable fast data exchange.
  • the following components may be connected to the I/O interface 3005: a removable data storage 3007 comprising USB storage, a solid-state drive, a hard drive, etc.; a wireless data link 3008 comprising LAN, Bluetooth, and near-field communication devices; and a signal converter 3009 connected to data input channel(s) 3010 and data output channel(s) 3011.
  • an embedded computer system similar to the computer system 3000 but without a keyboard, mouse, or hard disk may also be used; updates of programs are facilitated via the wireless data link 3008 or the removable data storage 3007.
  • the central processing unit can be a cloud processor, and the memory can be a cloud memory.
  • an embodiment of the disclosure comprises a computer program product, which comprises a computer program that is tangibly embodied in a machine-readable medium.
  • the computer program comprises program codes for executing the method as shown in the flow charts.
  • the computer program may be downloaded and installed from a network via the wireless data link 3008, and/or may be installed from the removable media 3007.
  • each of the blocks in the flow charts and block diagrams may represent a module, a program segment, or a code portion.
  • the module, the program segment, or the code portion comprises one or more executable instructions for implementing the specified logical function.
  • the functions denoted by the blocks may occur in a sequence different from that shown in the figures. For example, in practice, two blocks in succession may be executed substantially in parallel, or in a reverse order, depending on the functionalities involved.
  • each block in the block diagrams and/or the flow charts and/or a combination of the blocks may be implemented by a dedicated hardware-based system executing specific functions or operations, or by a combination of a dedicated hardware and computer instructions.
  • the units or modules involved in the embodiments of the disclosure may be implemented by way of software or hardware.
  • the described units or modules may also be provided in a processor.
  • the names of these units or modules are not considered as a limitation to the units or modules.
  • the disclosure also provides a computer readable storage medium.
  • the computer readable storage medium may be the computer readable storage medium included in the apparatus in the above embodiments, and it may also be a separate computer readable storage medium which has not been assembled into the apparatus.
  • the computer readable storage medium stores one or more programs, which are used by one or more processors to execute the method for separating a target signal from noise described in the disclosure.

Abstract

Provided are a method and system for continuously extracting target interference signals from selected signals, and a storage medium. The method comprises: collecting a two-channel or multi-channel input signal, each channel containing a target interference signal; increasing the independence of the input signals; calculating the resulting coefficient matrix once the independence of the input signals has been increased; synchronizing each pair or group of input signals; separating the synchronized input signals into the target interference signal and a desired signal; and intelligently selecting an output signal.

Description

    TECHNICAL FIELD
  • The disclosure relates to the field of signal processing technology, more particularly to a method, system and storage medium for extracting a target unwanted signal from a mixture of signals.
  • BACKGROUND
  • In the field of signal processing and big data, a major challenge is to increase the signal-to-noise ratio of measured observations, because they are often corrupted by unwanted signals. This problem applies to audio recordings (e.g., sound stage recording, hearing aids, 360 audio), biomedical applications (e.g., brain wave recording, brain imaging), and remote sensing (radar signals, echo location). The most common method to tackle the problem of unwanted signals has been the use of filters, either in analogue or digital form. However, very often the wanted and unwanted signals share the same frequency range, and it is then impossible for a filter to separate them.
  • At present, the separation technology mainly operates the hearing device by selectively adjusting the proportion of the signal, focusing on how to calculate the coefficient matrix more effectively, or uses a combination of a directional microphone and an omnidirectional microphone to enhance the clarity of the voice. However, the traditional independent component analysis (ICA) algorithm cannot achieve the ideal effect: the removal of the interference signal is incomplete, and asynchronization between the input channels degrades the accuracy of the ICA algorithm.
  • Therefore, there exists a need for technologies that can solve the asynchronization effect to effectively separate the unwanted signal and the wanted signal.
  • SUMMARY
  • In view of the problems in the prior art, the disclosure solves the technical problem of incomplete signal separation while simplifying the operations, and achieves the effect of removing interference signals with extremely high precision by means of time-domain synchronization of signals.
  • One aspect of the disclosure discloses a method for removing a target unwanted signal from multiple signals, comprising: providing a set of input signals, each of the input signals comprising the wanted and unwanted signals; maximizing and maintaining the independence of the input signals; estimating the coefficients that maximize the independence; synchronizing the set of input signals; separating the set of synchronized input signals into channels with the unwanted signal and channels without the unwanted signal; and intelligently selecting an optimal channel without the unwanted signal as the output signal.
  • In another aspect of the disclosure, a system for removing a target unwanted signal from multiple signals is provided, comprising: a set of input units for inputting two or more input signals; a processor; and a memory storing computer-readable instructions which, when executed by the processor, cause the processor to: maximize and maintain the independence of the set of input signals; estimate the coefficients that maximize the independence among the input channels; synchronize the set of input signals; separate the set of synchronized input signals into channels with the unwanted signal and channels without the unwanted signal; and intelligently select the optimal channel without the unwanted signal as the output signal.
  • Still another aspect of the disclosure discloses a non-transitory computer storage medium storing computer-readable instructions which, when executed by a processor, cause the processor to perform a method for removing the unwanted signal from multiple signals, the method comprising: providing a set of input signals, each of the input signals comprising the wanted and unwanted signals; maximizing and maintaining the independence of the input signals; estimating the coefficients that maximize the independence; synchronizing the set of input signals; separating the set of synchronized input signals into channels with the unwanted signal and channels without the unwanted signal; and intelligently selecting an optimal channel without the unwanted signal as the output signal.
  • According to the disclosure, the asynchronization effect can be reversed or reduced and the source-extraction performance can be improved, so that the perception of the target signals is improved through the continuous removal of the unwanted signals, even when the sources of the wanted and unwanted signals are moving.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • Exemplary non-limiting embodiments of the present invention are described below with reference to the attached drawings. The drawings are illustrative and generally not to an exact scale. The same or similar elements on different figures are referenced with the same reference numbers.
    • FIG. 1 shows a flow chart of a method for removing a target unwanted signal from multiple signals according to an embodiment of the disclosure;
    • FIG. 2 shows a flow chart of a first operation method for synchronizing a set of input signals;
    • FIG. 3 shows a flow chart of a second operation method for synchronizing a set of input signals;
    • FIG. 4 shows a flow chart of a third operation method for synchronizing a set of input signals;
    • FIG. 5 shows a flow chart of a fourth operation method for synchronizing a set of input signals;
    • FIG. 6 shows a structural diagram of a computer system adapted to implement the method of one embodiment of the disclosure;
    • FIG. 7 shows a schematic diagram of positions of different sound sources to different transducers; and
    • FIG. 8 shows the time delay of two transducers with a certain interval.
    DETAILED DESCRIPTION
  • Hereinafter, the embodiments of the disclosure will be described in detail with reference to the detailed description as well as the drawings.
  • FIG. 1 shows a flow chart of a method 1000 for removing a target unwanted signal from sets of input signals according to an embodiment of the disclosure.
  • At step 100, n receiving devices are prepared to receive signals sent from m signal sources. The set of signals output by each receiving device is referred to as the input signals of that receiving device. Each of the input signals may comprise the signals sent from one or more of the signal sources; these sent signals are also called wanted signals, and the others are unwanted signals. The receiving device can be a transducer, a cloud platform, or a data-input interface. The data-input interface is connected to a storage unit that preferentially stores wanted signals, and receives signal data from the storage unit. In addition, the unwanted signals contained in the input signals may differ from each other; however, they may also be the same, and the disclosure places no limitation on this aspect. For example, in the scenario of an electronic listening device, the device typically comprises at least two microphones, each of which may receive a mixture of a signal transmitted from a sound source (the wanted signal) and an ambient background sound (the unwanted signal). Since the microphones are usually placed at different positions, the signal and the unwanted signal are received at mutually distanced locations, and the ambient background sound received by the microphones may differ from microphone to microphone in time domain and/or amplitude. Likewise, in the scenario of sound stage recording and/or 360 audio recording, two or more microphones are used to measure the sound; because the microphones are placed at different positions, the ambient background sound they receive may again differ in time domain and/or amplitude.
For example, in the scenario of a brain-computer interface device, the brain wave device typically comprises at least two electrodes, each of which may receive a mixture of a signal transmitted from a brain wave source and an ambient noise. Since the electrodes are usually placed at different positions, the signal and the noise are received at mutually distanced locations, and the ambient noises received by the electrodes may differ from each other in time domain and/or amplitude. Similarly, in the scenario of underwater echo detection, the echo receiving device typically comprises at least two transducers, each of which may receive a mixture of a signal transmitted from a sound source and an ambient noise. Since the transducers are usually placed at different positions, the signal and the noise are received at mutually distanced locations, and the ambient noises received by the transducers may differ from each other in time domain and/or amplitude. Suppose that there are two different transducers Mi and Mj, and a plurality of different signal sources S1, S2, ..., Sn. Each of the signal sources propagates to the transducers Mi and Mj with its own amplitude and time delay, so Mi and Mj follow the formulas:

    Mi = a_1i·S1(t_1 + τ_1i) + a_2i·S2(t_2 + τ_2i) + ... + a_ni·Sn(t_n + τ_ni)

    Mj = a_1j·S1(t_1 + τ_1j) + a_2j·S2(t_2 + τ_2j) + ... + a_nj·Sn(t_n + τ_nj)
  • Similarly, the signals received by other transducers can be deduced from the same formula.
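The mixing model above can be sketched numerically as follows. This is a minimal illustration only: the gains a_ki, the integer sample delays τ_ki, and the use of a circular shift for the delay are assumptions of the sketch, not values from the disclosure.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
s1, s2 = rng.standard_normal(n), rng.standard_normal(n)  # two sources S1, S2

def transducer(a1, tau1, a2, tau2):
    # M = a1*S1 delayed by tau1 samples + a2*S2 delayed by tau2 samples
    # (np.roll gives a circular shift, which is fine for a sketch)
    return a1 * np.roll(s1, tau1) + a2 * np.roll(s2, tau2)

m_i = transducer(0.9, 0, 0.5, 3)  # M_i: S1 direct, S2 arrives 3 samples late
m_j = transducer(0.7, 2, 0.6, 0)  # M_j: S1 arrives 2 samples late, S2 direct
```

Each transducer thus sees every source with its own amplitude and its own delay, which is exactly why the mixtures are not synchronized with each other.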
  • To simplify the description, FIG. 7 shows the positions of two transducers and two signal sources in a two-dimensional space. The two-dimensional representation in FIG. 7 is only for simplicity of description; all the positions can equally be projected into a one-dimensional, three-dimensional or higher-dimensional space. To simplify the description, the disclosure is illustrated with the example of a sound signal. Suppose that there are two sound sources S1 and S2 and two microphones M1 and M2, that the propagation speed of sound is v, and that the sampling rate of the transducers is Fs. The propagation time (in samples) from sound source Si to transducer Mj follows the formula:

    t_ij = Fs · dis(Si, Mj) / v
  • In an embodiment of the disclosure, v = 34029 cm/s and Fs = 44.1 kHz.
  • Ideally, the sound energy decreases inversely with increasing distance between the sound sources and the transducers. The signals received by the transducers can therefore be represented as:

    M1 = S1(t_a + τ1)/dis(S1, M1) + S2(t_b)/dis(S2, M1)
    M2 = S1(t_a)/dis(S1, M2) + S2(t_b + τ2)/dis(S2, M2)

  • With the geometry of FIG. 7, and with all remaining constants simplified to 1, the formula above can be written as:

    M1real = S1(t_a + 5)/1.35 + S2(t_b)/1.37
    M2real = S1(t_a)/1.42 + S2(t_b + 20)/1.13
  • In practical applications, S1, S2 and the coefficient matrix in the right-hand side of the formula are unknown. M1real and M2real in the left-hand side of the formula refer to multiple signals transmitted from microphones M1 and M2. Then, at step 200, a decomposition of the coefficient matrix is used to extract the maximum amount of wanted signals from the multiple signals.
  • At step 200, the coefficient matrix is decomposed to increase the independence of the multiple signals; preferably, it is decomposed to maximize that independence. The embodiment is based on the premise that the signal sources are mutually independent, and on the central limit theorem of probability theory: the statistical distribution of a sum of multiple independent variables tends toward a normal distribution more than the distribution of each individual variable does. Therefore, the coefficient matrix is decomposed so as to push the statistical distribution of the multiple signals as far as possible from a normal distribution, which increases the independence of the recovered sources. Specifically, with the coefficient matrix designated as the dependent variable, an objective function is selected to estimate how far the variable is from a normal distribution, and an optimal parameter is calculated by converging the objective function, yielding the decomposition parameter matrix.
  • For example, at step 200 the following kurtosis function is selected as the objective function for estimating whether a variable tends toward a normal distribution:

    kurt(y) = E{y⁴} − 3·(E{y²})²

    where E{} denotes the expected value and y is one of the multiple signals. An objective-function value of zero indicates that the probability distribution of y is normal. Kurtosis can also be replaced by other measures of distance from normality, and the disclosure places no specific limitation on this. The objective function can then be rewritten as the negentropy approximation:

    J(y) ∝ [E{G(y)} − E{G(v)}]²

    where G is a non-quadratic function and v is a standardized Gaussian variable.
  • Therefore, with the coefficient parameter matrix designated as the dependent variable, Newton's method is applied directly to the above objective function to find an optimal parameter that makes the iteration converge, yielding the decomposition parameter matrix.
  • The specific calculation (a fixed-point iteration) is briefly listed below:
    1. Choose an initial (e.g. random) weight vector w.
    2. Let w⁺ = E{x·g(wᵀx)} − E{g′(wᵀx)}·w.
    3. Let w = w⁺ / ∥w⁺∥.
    4. If not converged, go back to step 2.
    where g is the derivative of G.
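The fixed-point iteration above can be sketched as follows. This is a minimal one-unit update assuming already-whitened input data; the function name, the convergence test, and the choice g = tanh (a common choice for G′ in the ICA literature) are assumptions of the sketch, not prescribed by the disclosure.

```python
import numpy as np

def fastica_one_unit(X, max_iter=200, tol=1e-8, seed=0):
    """One-unit fixed-point iteration: w+ = E{x g(w^T x)} - E{g'(w^T x)} w,
    then w = w+/||w+||, repeated until convergence. X is dims x samples."""
    g = np.tanh
    g_prime = lambda u: 1.0 - np.tanh(u) ** 2
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(max_iter):
        wx = w @ X                                # projections w^T x
        w_new = (X * g(wx)).mean(axis=1) - g_prime(wx).mean() * w
        w_new /= np.linalg.norm(w_new)            # renormalize
        if 1.0 - abs(w_new @ w) < tol:            # converged (up to sign)
            return w_new
        w = w_new
    return w
```

The returned unit vector w is one row of the decomposition parameter matrix; further rows would be found the same way with a decorrelation step in between.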
  • At step 300, the input signals are synchronized in the time domain. Step 300 can be implemented by four different methods, which are described in detail with reference to FIGS. 2, 3, 4 and 5 as follows:
  • Referring to FIG. 2, step 3101 intercepts two or more discrete segments of the unwanted signals, each segment being n milliseconds in duration. When the signal is an audio signal, n needs to be greater than 0.98 ms and less than 20.03 ms. When the duration falls within this interval, a human listener cannot hear an echo while the accuracy of the signal interception is still ensured, so the real-time processing is optimal and the user has the best hearing experience.
  • Preferably, step 3101 continuously intercepts the discrete segments of each of the multiple signals in real time. The method according to the embodiment can process the time-domain signals in real time.
  • For each discrete segment of n milliseconds of the multiple signals, pattern recognition determines whether the segment is an unwanted signal, and the unwanted segments are then extracted. For example, in the acoustic case, suppose there are two sound sources, a male voice and a female voice, and the male voice is viewed as the unwanted signal. The pattern recognition automatically recognizes whether each n-millisecond discrete segment of the multiple signals comprises the male voice. If the male voice appears in a segment, that segment is extracted and the method proceeds to the next step. If instead the female voice is viewed as the unwanted signal, the segments in which the female voice appears are extracted and the method proceeds to the next step. As another example, the two sound sources may comprise a human voice and a non-human sound. Those skilled in the art should understand that other appropriate technologies may also be employed at this step.
  • At step 3101, the unwanted signal can also be detected by monitoring whether the signal transitions from a low level to a high level within n milliseconds (i.e., a step function). For example, suppose a male voice is viewed as the unwanted signal. A speaker does not need to run phonemes together to form the sound of a whole word, so detecting the discrete segments in which the male voice appears is sufficient to identify the voice as the unwanted signal. This approach largely removes the need for complicated detection processing of the signal (such as a sound signal) and thus reduces computational complexity and cost.
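The low-to-high level monitoring can be sketched as a comparison of signal levels within a segment. The half/half split and the threshold factor are assumptions made for illustration; the disclosure does not specify them.

```python
import numpy as np

def has_step(segment, threshold=4.0):
    """Flag a segment whose level jumps from low to high, by comparing
    the RMS of the first half against the RMS of the second half."""
    seg = np.asarray(segment, dtype=float)
    half = len(seg) // 2
    rms = lambda a: np.sqrt(np.mean(a ** 2))
    low, high = rms(seg[:half]), rms(seg[half:])
    return high > threshold * max(low, 1e-12)  # guard against silence
```

A segment that starts quiet and ends loud is flagged; a segment with a steady level is not.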
  • At step 3102, a discrete-time convolution of two detected segments of unwanted signals is calculated to obtain the time delay between them. Suppose that there are two mixed signals x and y; the correlation between them is:

    r = Σᵢ (xᵢ − mx)(yᵢ₋d − my) / sqrt( Σᵢ (xᵢ − mx)² · Σᵢ (yᵢ₋d − my)² )

    where mx is the average value of x, my is the average value of y, and d is the time delay. The numerator of the formula is the discrete-time convolution.

  • Letting the time delay d take different values, the formula can be written as a function of d:

    r(d) = Σᵢ (xᵢ − mx)(yᵢ₋d − my) / sqrt( Σᵢ (xᵢ − mx)² · Σᵢ (yᵢ₋d − my)² )

    and the time delay is the value of d that maximizes r(d).
  • At step 3103, the set of input signals is synchronized based on the obtained time delay d. For example, if the time delay between the detected unwanted signal segment in a first input signal f1(t) and the detected unwanted signal segment in a second input signal f2(t) is determined to be δ, the first input signal f1(t) is synchronized to f1(t−δ). For another example, if the time delay between the detected unwanted signal segment in the first input signal f1(t) and the detected unwanted signal segment in the second input signal f2(t) is determined to be −δ, the first input signal f1(t) is synchronized to f1(t+δ). Since the segments of the unwanted signals are continuously monitored in real time in this embodiment, the approach can continuously update the time delay during the iteration and dynamically track the change to the unwanted signal, as the signal sources and the transducers move in different directions or move with respect to each other.
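Steps 3102-3103 can be sketched as a brute-force scan over candidate delays d, keeping the d that maximizes r(d) and then shifting one signal by that amount. The circular shift used for the delay is an assumption of the sketch.

```python
import numpy as np

def estimate_delay(x, y, max_lag):
    """Return the integer lag d in [-max_lag, max_lag] maximizing the
    normalized cross-correlation r(d) between x and y shifted by d."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    mx, my = x.mean(), y.mean()
    best_d, best_r = 0, -np.inf
    for d in range(-max_lag, max_lag + 1):
        ys = np.roll(y, d)  # candidate re-alignment of y
        num = np.sum((x - mx) * (ys - my))
        den = np.sqrt(np.sum((x - mx) ** 2) * np.sum((ys - my) ** 2))
        r = num / den       # normalized correlation at lag d
        if r > best_r:
            best_d, best_r = d, r
    return best_d

# toy check: y lags x by 5 samples, so shifting y by -5 re-aligns it
rng = np.random.default_rng(1)
x = rng.standard_normal(2048)
y = np.roll(x, 5)
d = estimate_delay(x, y, 10)
```

Applying `np.roll(y, d)` with the estimated d then plays the role of replacing f1(t) by f1(t−δ) in the text.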
  • Referring to FIG. 3, at step 3201: since the microphones are usually placed at different positions, the unwanted signal is received at mutually distanced locations. First, in this embodiment, the position of each unwanted signal with respect to the transducers is calculated, that is, the relative delay of each unwanted signal. Second, the unwanted signals are determined according to their relative delays. Alternatively, the unwanted signals can also be designated by users in real time.
  • Preferably, suppose that the distance between the signal source and the first transducer is d1, the distance between the signal source and the second transducer is d2, the sampling rate of the signal is Fs, and the propagation speed of the signal is v. The relative delay dir is calculated as follows:

    dir = Fs · (d1 − d2) / v
  • Suppose that the distance between the transducers is d; the maximum relative delay Max(dir) is calculated as follows:

    Max(dir) = Fs · d / v
  • If the result is not an integer, it should be rounded to an integer, and all possible relative delays are: −Max(dir), ..., −1, 0, 1, ..., Max(dir).
  • Referring to the areas delimited by time delay in FIG. 8: suppose that the sampling rate (Fs) is 48 kHz and the distance (d) between the two transducers (the embodiment is illustrated with the example of a sound signal, so the transducers are microphones) is 2.47 cm. Since the propagation speed of sound in air (v) is 340 m/s, the maximum delay is 3 samples (48000 × 0.0247 / 340 ≈ 3.49, rounded to 3). Therefore, the whole region can be divided into 7 areas with time delays of −3, −2, −1, 0, 1, 2 and 3, respectively. For example, referring to FIG. 8, if a preset unwanted signal comes from an area with a time delay of −3, the delay is assigned a value of −3.
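The area division above follows directly from Max(dir) = Fs·d/v with rounding. A small sketch (the function name is illustrative):

```python
def delay_areas(fs_hz, spacing_m, speed_mps=340.0):
    """All integer relative delays -Max(dir)..Max(dir) for two transducers
    a given distance apart, with Max(dir) = round(Fs * d / v)."""
    max_dir = round(fs_hz * spacing_m / speed_mps)
    return list(range(-max_dir, max_dir + 1))

# the example from the text: Fs = 48 kHz, d = 2.47 cm, v = 340 m/s
areas = delay_areas(48000, 0.0247)
```

For the text's numbers this yields the 7 areas with delays −3 through 3.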
  • Referring to FIG. 3, in step 3202, a time delay is determined according to an unwanted signal area selected by users in real time, or a preset unwanted signal area.
  • Referring to FIG. 3, at step 3203, the input signals are synchronized according to step 3103, using the time delay obtained in step 3202.
  • Referring to FIG. 4, the unwanted signals from all the relative delays are employed in this embodiment. At step 3301, all the time delays are analyzed and calculated based on different signals (such as sound signals), the distance between transducers, and the propagation speed of the signal.
  • Referring to FIG. 4, at step 3302, all the possible time delays T1, T2, ..., Tn are determined.
  • Referring to FIG. 4, at step 3303, the input signals are synchronized for each of the different delays according to step 3103.
  • Referring to FIG. 5, at step 3401, the direction of the wanted signal can also be preset, or selected by users in real time.
  • Referring to FIG. 5, at step 3402, the time delays in these directions are calculated.
  • Referring to FIG. 5, based on the method of FIG. 4 for obtaining all of the signal directions, at step 3403 the time delays of the wanted signals are removed from the set of all possible directions, and the input signals are synchronized for each of the remaining time delays according to step 3103.
  • Referring to FIG. 1 again, at step 400, the synchronized input signals are separated into channels with the unwanted signal and channels without the unwanted signal. Preferably, step 400 is implemented as a multiplication between the matrix of synchronized signals and the coefficient matrix resulting from step 200.
  • For example, referring to the example of step 100, suppose that the mixed signals are:

    M1real = S1(t_a + 5)/1.35 + S2(t_b)/1.37
    M2real = S1(t_a)/1.42 + S2(t_b + 20)/1.13

  • The coefficient matrix resulting from step 200 is multiplied with the matrix of the synchronized signals, and the formula is as follows:

    [IC1synvoice]   [0.396  0.604]   [M1real(t − 5)]   [1.39·S1(t_a) + 0.540·S2(t_b − 5) + 0.682·S2(t_b + 20)]
    [IC2synvoice] = [0.496  0.504] · [M2real(t)    ] = [0.048·S1(t_a) + 0.677·S2(t_b − 5) − 0.569·S2(t_b + 20)]
  • The above formula generates two channels, one of which satisfies:

    [rms(S2(t_b − 5)) + rms(S2(t_b + 20))] / rms(S1(t_a)) = 25.95 for IC2synvoice.

    That is, this channel comprises 4% of S1 and 96% of S2. If S1 is viewed as the unwanted signal, this channel will be selected and output, illustrating that the effect of the synchronized separation reaches 96%.
  • How to choose between these two channels is explained in detail in step 500.
  • Similarly, if S1 is viewed as an unwanted signal, the matrix of the mixed signals synchronized with S2 is multiplied with the coefficient matrix. Then the result is output through a suitable channel.
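Step 400's matrix multiplication can be sketched as follows. For clarity, the mixing gains here are made up and the coefficient matrix is taken as the exact inverse of the mixing; in practice it comes from step 200 and is only approximate.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4000
s1, s2 = rng.laplace(size=n), rng.laplace(size=n)  # two independent sources

A = np.array([[0.8, 0.6],   # made-up mixing gains (not the patent's values)
              [0.5, 0.9]])
M = A @ np.vstack([s1, s2])   # synchronized mixtures, one row per channel

W = np.linalg.inv(A)          # idealized unmixing coefficients
ICs = W @ M                   # separated channels, one row per output

rms = lambda a: np.sqrt(np.mean(a ** 2))
residual = rms(ICs[0] - s1) / rms(s1)  # how far channel 0 is from pure S1
```

With an exact inverse the residual is essentially zero; the estimated coefficient matrix from step 200 would leave a small but nonzero residual, as in the 96% example above.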
  • Referring to FIG. 1, at step 500, of the two channels resulting from step 400, the one with the relatively lower signal energy can be selected as the output channel. The signal energy is calculated as the root-mean-square value of the signals. The selection process is applied to the channels with the unwanted signal and the channels without the unwanted signal resulting from step 400.
  • Referring to FIGS. 4 and 5, output channels are generated at the different time delays. In FIG. 4, the output channel is an optimal channel selected based on feature detection (for example, the channel with the fewest unwanted-signal components among the generated channels). In FIG. 5, the output channel is an optimal channel selected based on signal energy (for example, the channel with the minimum unwanted-signal energy among the generated channels).
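The energy-based selection of step 500 amounts to a root-mean-square comparison across candidate channels. A minimal sketch (the function name is illustrative):

```python
import numpy as np

def select_output(channels):
    """Return the index of the channel with the lowest RMS energy, per the
    heuristic that the channel carrying the unwanted signal has more energy."""
    rms = [np.sqrt(np.mean(np.asarray(c, float) ** 2)) for c in channels]
    return int(np.argmin(rms))
```

The same comparison extends to any number of candidate channels generated at different time delays.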
  • Preferably, once the unwanted signals are appropriately removed, subsequent processing can be performed on the separated target unwanted signal and the wanted signal. For example, in the application of a hearing aid, the wanted signal may be selectively amplified and the target unwanted signals may be selectively reduced to improve the perception of the signals.
  • According to an embodiment, the disclosure provides a device comprising a processor and a human-computer interaction interface. The device further includes, but is not limited to, a memory, an input/output module, and an information receiving module. The processor is configured to perform the above steps 100, 200, 3201-3203 (or 3401-3403), 400 and 500, and optionally to enhance the frequency domain. Users select an unwanted signal area in real time through the human-computer interaction interface, which includes, but is not limited to, a voice receiving module, a transducer, a video receiving module, a touch screen, a keyboard, buttons, knobs, a projection interface, and a virtual 3D interface. The real-time selection methods available to users through the interface include voice instructions, different gestures or actions, and areas with different labels. For example, the interface may be a touch screen on which the user can tap any area. The disclosure thus provides a device for removing a target unwanted signal which is user-controllable and user-selectable and can adjust the time delay in real time.
  • The above steps 100-400 may be performed in a different order from that described in the drawings. For example, step 100 and step 300 of the second embodiment (i.e. steps 3201-3203) can be performed in reverse order. As another example, in practical applications, any two of steps 100-400 can be performed in parallel or in reverse order according to the functions involved.
  • Preferably, step 200 is performed prior to step 300, that is, the coefficient matrix is calculated before the input signals are synchronized in the time domain. This provides the advantage that the coefficient matrix need not be recalculated even though the time delays differ, so a large amount of calculation is avoided. In the embodiments described in FIGS. 4 and 5, the result can be obtained by calculating the coefficient matrix just once. The disclosure draws the following conclusion from a large number of experiments: the coefficient matrix calculated from the synchronized mixed signals is almost the same as that calculated from the original mixed signals, illustrating that the method omits a large amount of calculation without losing the accuracy of the coefficient matrix.
  • Preferably, at step 100, after the input signals are received by the receiving devices, the input signals received by one or more of the receiving devices can be removed according to a given criterion. According to an embodiment of the disclosure, the input signal is a sound signal and the receiving device is a device (such as a microphone) for receiving sound signals. When the criterion Fs·X/V < L/3 is satisfied (where L is the length of the intercepted discrete segments, X is the distance between any two receiving devices, V is the propagation speed of the signal, and Fs is the sampling rate), the sound signals received by one of the receiving devices are removed. The embodiment omits a large amount of complex calculation while ensuring the accuracy of the pattern recognition, improves the calculation efficiency, and optimizes the power consumption.
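The criterion Fs·X/V < L/3 can be checked directly. The function name is illustrative; the units (Hz, metres, m/s, samples) follow the text's example.

```python
def can_drop_receiver(fs_hz, spacing_m, speed_mps, segment_len):
    """True when Fs*X/V < L/3, i.e. the receivers are close enough
    (relative to the segment length) that the signals from one of them
    can be removed to save computation."""
    return fs_hz * spacing_m / speed_mps < segment_len / 3.0
```

For the 48 kHz, 2.47 cm microphone pair used earlier, the inter-receiver delay is about 3.5 samples, far below a third of a typical segment length, so one receiver's signals could be dropped.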
  • The signals referred to in this disclosure may be audio signals, image signals, electromagnetic signals, brain wave signals, electric signals, radio wave signals or other forms of signals that can be picked up by transducers; the disclosure is not limited in this aspect.
  • According to the disclosure, the perception of the target signals can be improved while reducing the computational cost. In addition, the input signals are synchronized in time domain and thus the method according to the disclosure will introduce minimal frequency distortion.
  • Now referring to FIG. 6, a structural schematic diagram of a computer system 3000 adapted to implement an embodiment of the disclosure is shown.
  • As shown in FIG. 6, the computer system 3000 includes a central processing unit (CPU) 3001, which may execute various appropriate actions and processes in accordance with a program stored in an erasable programmable read-only memory (EPROM) 3002 or a program loaded into a random access memory (RAM) 3003. The RAM 3003 also stores various programs and data required for the operation of the system 3000. The CPU 3001, the EPROM 3002 and the RAM 3003 are connected to each other through a bus 3004. An input/output (I/O) interface 3005 is also connected to the bus 3004. A direct memory access (DMA) interface 3006 is connected to the bus 3004 to enable fast data exchange.
  • The following components may be connected to the I/O interface 3005: a removable data storage 3007 comprising USB storage, solid state drives, hard drives, etc.; a wireless data link 3008 comprising LAN, Bluetooth and near-field communication devices; and a signal convertor 3009 which is connected to data input channel(s) 3010 and data output channel(s) 3011. According to an embodiment of the disclosure, the process described above with reference to the flow chart may also be implemented on an embedded computer system similar to the computer system 3000 but without a keyboard, mouse or hard disk. Program updates are facilitated via the wireless data link 3008 or the removable data storage 3007.
  • The central processing unit can be a cloud processor, and the memory can be a cloud memory.
  • According to an embodiment of the disclosure, the process described above with reference to the flow chart may be implemented as a computer software program. For example, an embodiment of the disclosure comprises a computer program product, which comprises a computer program tangibly embodied in a machine-readable medium. The computer program comprises program code for executing the method shown in the flow charts. In such an embodiment, the computer program may be downloaded and installed from a network via the wireless data link 3008, and/or may be installed from the removable data storage 3007.
  • The flow charts and block diagrams in the figures illustrate the architectures, functions and operations that may be implemented according to the system, the method and the computer program product of the various embodiments of the present invention. In this regard, each block in the flow charts and block diagrams may represent a module, a program segment, or a code portion comprising one or more executable instructions for implementing the specified logical function. It should be noted that, in some alternative implementations, the functions denoted by the blocks may occur in a different sequence from that shown in the figures. For example, in practice, two blocks in succession may be executed substantially in parallel, or in a reverse order, depending on the functionalities involved. It should also be noted that each block in the block diagrams and/or flow charts, and any combination of such blocks, may be implemented by a dedicated hardware-based system executing specific functions or operations, or by a combination of dedicated hardware and computer instructions.
  • The units or modules involved in the embodiments of the disclosure may be implemented by way of software or hardware. The described units or modules may also be provided in a processor. The names of these units or modules are not considered as a limitation to the units or modules.
  • In another aspect, the disclosure also provides a computer readable storage medium. The computer readable storage medium may be the computer readable storage medium included in the apparatus in the above embodiments, and it may also be a separate computer readable storage medium which has not been assembled into the apparatus. The computer readable storage medium stores one or more programs, which are used by one or more processors to execute the method for separating a target signal from noise described in the disclosure.
  • The foregoing is only a description of the preferred embodiments of the disclosure and the applied technical principles. It should be understood by those skilled in the art that the inventive scope of the disclosure is not limited to technical solutions formed by the combinations of the above technical features. It also covers other technical solutions formed by any combination of the above technical features or their equivalents without departing from the concept of the invention. For example, a technical solution formed by replacing the features disclosed above with technical features having similar functions is also within the scope of the present invention.

Claims (15)

  1. A method for removing a target unwanted signal from multiple signals, comprising:
    providing a set of input signals, each of the input signals comprising the target unwanted signal;
    maximizing and maintaining the independence of the input signals;
    estimating coefficients to maximize the independence;
    synchronizing the set of input signals;
    separating the set of synchronized input signals into channels with the target unwanted signal and channels without the target unwanted signal; and
    intelligently selecting an optimal channel without the unwanted signal as the output signal.
  2. The method of claim 1, wherein the synchronizing the set of input signals comprises:
    detecting a segment of unwanted signal in each of the input signals;
    performing discrete time convolution between two of the detected unwanted signal segments to obtain their relative time delay;
    synchronizing the set of input signals based on the obtained time delay;
    selecting the preferred direction of signals that will be labelled as unwanted signals;
    calculating the relative time delay of unwanted signals from the preferred direction;
    synchronizing the set of input signals based on the pre-determined time delay;
    selecting all possible directions of signals that will be labelled as unwanted signals;
    estimating a set of time delays T1, T2, ..., Tn;
    synchronizing the set of input signals based on the series of time delays;
    selecting the incoming direction of signals that will be labelled as wanted signals;
    determining the time delays of unwanted signals from the remaining directions; and
    synchronizing the set of input signals based on the determined time delays.
  3. The method of claim 1 or 2, wherein the synchronizing of the input signals can be continuously updated to accommodate movement of the signal sources.
  4. The method of any one of claims 1-3, wherein the set of input signals are obtained at a set of mutually distanced locations.
  5. The method of any one of claims 1-4, wherein the maximizing the independence of the set of input signals comprises: maximizing the non-Gaussianity of the set of input signals by independent component analysis.
  6. The method of claim 2, wherein the detecting the unwanted signal segment in each of the set of input signals comprises: detecting the unwanted signal segment in each of the set of input signals by performing pattern recognition.
  7. The method of claim 1, wherein the input signal is a signal picked up by a transducer.
  8. The method of any one of claims 1-7, wherein the input signal is one of:
    an audio signal;
    an electrical signal;
    an image signal; and
    a radio frequency signal.
  9. A system for removing a target unwanted signal from multiple signals, comprising:
    a set of input units for inputting a set of input signals;
    a processor; and
    a memory storing computer readable instructions which, when executed by the processor, cause the processor to:
    maximize and maintain the independence of the set of input signals;
    estimate the coefficients to maximize the independence as if all signals were synchronized;
    synchronize the set of input signals;
    separate the set of synchronized input signals into channels with the target unwanted signal and channels without the unwanted signal; and
    select the optimal channel without the unwanted signal as an output signal intelligently.
  10. The system of claim 9, wherein the synchronizing the set of input signals comprises:
    detecting a segment of unwanted signal in each of the input signals;
    performing discrete time convolution between two of the detected unwanted signal segments to obtain their relative time delay;
    synchronizing the set of input signals based on the obtained time delay;
    selecting the preferred direction of signals that will be labelled as unwanted signals;
    calculating the relative time delay of unwanted signals from the preferred direction;
    synchronizing the set of input signals based on the pre-determined time delay;
    selecting all possible directions of signals that will be labelled as unwanted signals;
    estimating a set of time delays T1, T2, ..., Tn;
    synchronizing the set of input signals based on the series of time delays;
    selecting the incoming direction of signals that will be labelled as wanted signals;
    determining the time delays of unwanted signals from the remaining directions; and
    synchronizing the set of input signals based on the determined time delays.
  11. The system of claim 9 or 10, wherein the set of input signals are obtained at a set of mutually distanced locations.
  12. The system of claim 9 or 10, wherein the maximizing the independence of the set of input signals comprises: maximizing the non-Gaussianity of the set of input signals by independent component analysis.
  13. The system of claim 10, wherein the detecting the unwanted signal segment in each of the set of input signals comprises: detecting the unwanted signal segment in each of the set of input signals by performing pattern recognition.
  14. The system of claim 9 or 10, wherein the input signal is one of:
    an audio signal;
    an electrical signal;
    an image signal; and
    a radio frequency signal.
  15. A non-transitory computer-readable storage medium, storing instructions which when executed by a processor, perform a method for separating a target unwanted signal from multiple signals, the method comprising:
    providing a set of input signals, each of the input signals comprising the target unwanted signal;
    maximizing and maintaining the independence of the input signals;
    estimating coefficients to maximize the independence;
    synchronizing the set of input signals;
    separating the synchronized input signals into channels with the unwanted signal and channels without the unwanted signal; and
    selecting the optimal channel without the unwanted signal as an output signal intelligently.
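The synchronization recited in claims 2 and 10 rests on estimating relative time delays between detected unwanted-signal segments. As an illustrative sketch only (not the claimed implementation), the delay can be recovered with a discrete cross-correlation, the time-reversed twin of the discrete-time convolution named in the claims; the function names and the circular-shift alignment below are assumptions for demonstration:

```python
import numpy as np

def estimate_delay(ref, sig):
    """Estimate how many samples `sig` lags behind `ref` via cross-correlation."""
    corr = np.correlate(sig, ref, mode="full")
    # index len(ref)-1 in the full correlation corresponds to zero lag
    return int(np.argmax(corr)) - (len(ref) - 1)

def synchronize(signals, ref_idx=0):
    """Align every signal to signals[ref_idx] (circular shift, for simplicity)."""
    ref = signals[ref_idx]
    delays = [estimate_delay(ref, s) for s in signals]
    aligned = [np.roll(s, -d) for s, d in zip(signals, delays)]
    return aligned, delays
```

In a real multi-transducer setup the shift would be a non-circular resampling and, per claim 3, the delay estimates would be refreshed continuously as the sources move.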
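Claims 5 and 12 maximize the independence of the input signals via independent component analysis. A minimal FastICA-style sketch in plain NumPy, assuming a tanh contrast function and deflationary orthogonalization (neither is specified by the claims), shows how whitened mixtures can be separated into independent channels:

```python
import numpy as np

def whiten(X):
    # Center the mixtures and decorrelate them to unit variance (ZCA whitening).
    Xc = X - X.mean(axis=0)
    d, E = np.linalg.eigh(Xc.T @ Xc / len(Xc))
    return Xc @ (E / np.sqrt(d)) @ E.T

def fastica(X, iters=200, tol=1e-10, seed=0):
    # Deflationary FastICA with a tanh contrast: each unit vector w is driven
    # toward a direction of maximal non-Gaussianity of the projection Xw @ w.
    Xw = whiten(X)
    rng = np.random.default_rng(seed)
    n_comp = X.shape[1]
    W = np.zeros((n_comp, n_comp))
    for i in range(n_comp):
        w = rng.standard_normal(n_comp)
        w /= np.linalg.norm(w)
        for _ in range(iters):
            wx = Xw @ w
            g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
            w_new = (Xw * g[:, None]).mean(axis=0) - g_prime.mean() * w
            for j in range(i):  # Gram-Schmidt against already-found rows
                w_new -= (w_new @ W[j]) * W[j]
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < tol
            w = w_new
            if converged:
                break
        W[i] = w
    return Xw @ W.T  # estimated independent channels, one per column
```

The separated channels come out in arbitrary order and sign, which is why the claimed method ends by intelligently selecting the optimal channel rather than assuming a fixed output position.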
EP17921701.3A 2017-08-15 2017-12-21 Method and system for extracting source signal, and storage medium Pending EP3672275A4 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710698651.6A CN109413543B (en) 2017-08-15 2017-08-15 Source signal extraction method, system and storage medium
PCT/CN2017/117813 WO2019033671A1 (en) 2017-08-15 2017-12-21 Method and system for extracting source signal, and storage medium

Publications (2)

Publication Number Publication Date
EP3672275A1 true EP3672275A1 (en) 2020-06-24
EP3672275A4 EP3672275A4 (en) 2023-08-23

Family

ID=65362112

Family Applications (1)

Application Number Title Priority Date Filing Date
EP17921701.3A Pending EP3672275A4 (en) 2017-08-15 2017-12-21 Method and system for extracting source signal, and storage medium

Country Status (3)

Country Link
EP (1) EP3672275A4 (en)
CN (1) CN109413543B (en)
WO (1) WO2019033671A1 (en)

Family Cites Families (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100372277C (en) * 2006-02-20 2008-02-27 东南大学 Space time separation soft inputting and outputting detecting method based on spatial domain prewhitening mergence
CN100495388C (en) * 2006-10-10 2009-06-03 深圳市理邦精密仪器有限公司 Signal processing method using space coordinates convert for realizing signal separation
EP2430975A1 (en) * 2010-09-17 2012-03-21 Stichting IMEC Nederland Principal component analysis or independent component analysis applied to ambulatory electrocardiogram signals
US9100734B2 (en) * 2010-10-22 2015-08-04 Qualcomm Incorporated Systems, methods, apparatus, and computer-readable media for far-field multi-source tracking and separation
CN102571296B (en) * 2010-12-07 2014-09-03 华为技术有限公司 Precoding method and device
JP2012234150A (en) * 2011-04-18 2012-11-29 Sony Corp Sound signal processing device, sound signal processing method and program
US9099096B2 (en) * 2012-05-04 2015-08-04 Sony Computer Entertainment Inc. Source separation by independent component analysis with moving constraint
US8880395B2 (en) * 2012-05-04 2014-11-04 Sony Computer Entertainment Inc. Source separation by independent component analysis in conjunction with source direction information
JP2014045793A (en) * 2012-08-29 2014-03-17 Sony Corp Signal processing system, signal processing apparatus, and program
CN102868433B (en) * 2012-09-10 2015-04-08 西安电子科技大学 Signal transmission method based on antenna selection in multiple-input multiple-output Y channel
CN103083012A (en) * 2012-12-24 2013-05-08 太原理工大学 Atrial fibrillation signal extraction method based on blind source separation
CN103197183B (en) * 2013-01-11 2015-08-19 北京航空航天大学 A kind of method revising Independent component analysis uncertainty in electromagnetic interference (EMI) separation
CN104053107B (en) * 2014-06-06 2018-06-05 重庆大学 One kind is for Sound seperation and localization method under noise circumstance
CN104091356A (en) * 2014-07-04 2014-10-08 南京邮电大学 X-ray medical image objective reconstruction based on independent component analysis
CN108353228B (en) * 2015-11-19 2021-04-16 香港科技大学 Signal separation method, system and storage medium
CN105640500A (en) * 2015-12-21 2016-06-08 安徽大学 Independent component analysis-based saccade signal feature extraction method and recognition method
CN105996993A (en) * 2016-04-29 2016-10-12 南京理工大学 System and method for intelligent video monitoring of vital signs
CN106356075B (en) * 2016-09-29 2019-09-17 合肥美的智能科技有限公司 Blind sound separation method, structure and speech control system and electric appliance assembly
CN107025446A (en) * 2017-04-12 2017-08-08 北京信息科技大学 A kind of vibration signal combines noise-reduction method

Also Published As

Publication number Publication date
CN109413543A (en) 2019-03-01
CN109413543B (en) 2021-01-19
EP3672275A4 (en) 2023-08-23
WO2019033671A1 (en) 2019-02-21

Similar Documents

Publication Publication Date Title
US9668066B1 (en) Blind source separation systems
EP3655949B1 (en) Acoustic source separation systems
Mandel et al. An EM algorithm for localizing multiple sound sources in reverberant environments
Katz et al. A comparative study of interaural time delay estimation methods
US10650841B2 (en) Sound source separation apparatus and method
EP2393463B1 (en) Multiple microphone based directional sound filter
JP6400218B2 (en) Audio source isolation
Douglas et al. Convolutive blind separation of speech mixtures using the natural gradient
GB2548325A (en) Acoustic source seperation systems
US20070100605A1 (en) Method for processing audio-signals
CN110709929B (en) Processing sound data to separate sound sources in a multi-channel signal
CN105580074B (en) Signal processing system and method
EP2437517B1 (en) Sound scene manipulation
US11277210B2 (en) Method, system and storage medium for signal separation
Douglas Blind separation of acoustic signals
Hosseini et al. Time difference of arrival estimation of sound source using cross correlation and modified maximum likelihood weighting function
JP2006227328A (en) Sound processor
Anemüller et al. Adaptive separation of acoustic sources for anechoic conditions: A constrained frequency domain approach
EP3672275A1 (en) Method and system for extracting source signal, and storage medium
US11823698B2 (en) Audio cropping
JP2003078423A (en) Processor for separating blind signal
JP2010217268A (en) Low delay signal processor generating signal for both ears enabling perception of direction of sound source
JP2006072163A (en) Disturbing sound suppressing device
Ishibashi et al. Blind source separation for human speeches based on orthogonalization of joint distribution of observed mixture signals
US20240135948A1 (en) Acoustic echo cancellation

Legal Events

Date Code Title Description
STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE INTERNATIONAL PUBLICATION HAS BEEN MADE

PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20200315

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

DAV Request for validation of the european patent (deleted)
DAX Request for extension of the european patent (deleted)
19U Interruption of proceedings before grant

Effective date: 20200316

19W Proceedings resumed before grant after interruption of proceedings

Effective date: 20210406

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

R17P Request for examination filed (corrected)

Effective date: 20200315

REG Reference to a national code

Ref country code: DE

Ref legal event code: R079

Free format text: PREVIOUS MAIN CLASS: H04R0003000000

Ipc: G10L0021027200

A4 Supplementary search report drawn up and despatched

Effective date: 20230726

RIC1 Information provided on ipc code assigned before grant

Ipc: H04R 1/40 20060101ALN20230720BHEP

Ipc: H04R 25/00 20060101ALI20230720BHEP

Ipc: H04R 3/00 20060101ALI20230720BHEP

Ipc: G10L 21/0272 20130101AFI20230720BHEP