CN113488031B - Method and device for determining electronic equipment, storage medium and electronic device - Google Patents

Method and device for determining electronic equipment, storage medium and electronic device

Info

Publication number
CN113488031B
Authority
CN
China
Prior art keywords
determining
electronic device
reverberation
reverberation energy
energy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110742317.2A
Other languages
Chinese (zh)
Other versions
CN113488031A (en)
Inventor
刘建国
栾天祥
赵培
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Original Assignee
Qingdao Haier Technology Co Ltd
Haier Smart Home Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Qingdao Haier Technology Co Ltd and Haier Smart Home Co Ltd
Priority to CN202110742317.2A
Publication of CN113488031A
Application granted
Publication of CN113488031B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Data Mining & Analysis (AREA)
  • Human Computer Interaction (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Mathematical Optimization (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Algebra (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a method and a device for determining an electronic device, a storage medium, and an electronic apparatus. The method comprises: acquiring voice signals collected by a plurality of electronic devices, each electronic device comprising at least one microphone array; determining, based on the voice signal collected by each electronic device, a reverberation energy ratio corresponding to that voice signal, wherein the reverberation energy ratio characterizes the relationship between the reverberation energy component and the direct energy component in the voice signal collected by the electronic device; and determining a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices. The invention solves the technical problem in the prior art that distributed wake-up methods are computationally expensive, perform poorly, and have low practical value because they suppress the influence of the environment on distance estimation by means of dereverberation and noise reduction.

Description

Method and device for determining electronic equipment, storage medium and electronic device
Technical Field
The invention relates to the field of the Internet of Things, and in particular to a method and a device for determining an electronic device, a storage medium, and an electronic apparatus.
Background
Distributed wake-up addresses the problem that, when multiple AI voice devices are deployed in the same local space, a single voice command tends to act on several devices at once. In a home environment in particular, if several voice devices respond simultaneously when the user speaks the wake-up word, the result is a "one call, a hundred responses" effect, and the user cannot operate the device actually intended.
To solve the problem of multiple AI voice devices waking up simultaneously, distributed wake-up solutions have been introduced. A common solution computes the energy of the voice signal that each device captures during wake-up and compares the energies: the larger the energy, the closer the device is assumed to be to the speaker, so that device is given wake-up priority. This method does not work accurately in highly reverberant spaces, because it ignores the influence of reverberation on the energy calculation, and estimating distance directly from speech energy then produces large errors. Known distributed wake-up schemes still struggle to handle reverberation robustly when estimating speaker distance. Conventional methods usually suppress the influence of the environment on distance estimation through dereverberation and noise reduction. In practice, however, the hardware has limited computing resources and cannot tolerate computationally expensive reverberation estimation and noise reduction, and this processing must also leave the distance cues of the sound source unaffected. These requirements greatly limit the practical value of distributed wake-up methods based on dereverberation and noise reduction.
In view of the above problems, no effective solution has been proposed at present.
Disclosure of Invention
Embodiments of the invention provide a method and a device for determining an electronic device, a storage medium, and an electronic apparatus, which at least solve the technical problem in the prior art that distributed wake-up methods are computationally expensive, perform poorly, and have low practical value because they suppress the influence of the environment on distance estimation by means of dereverberation and noise reduction.
According to an aspect of an embodiment of the present invention, there is provided a method of determining an electronic device, the method including: acquiring voice signals collected by a plurality of electronic devices, each electronic device comprising at least one microphone array; determining, based on the voice signal collected by each electronic device, a reverberation energy ratio corresponding to that voice signal, wherein the reverberation energy ratio characterizes the relationship between the reverberation energy component and the direct energy component in the voice signal collected by the electronic device; and determining a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices.
In one exemplary embodiment, determining a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices comprises: determining the electronic device with the smallest reverberation energy ratio among the plurality of electronic devices as the target device.
In one exemplary embodiment, determining the reverberation energy ratio corresponding to the voice signal collected by each electronic device based on that voice signal includes: determining a frequency domain signal corresponding to the voice signal based on the voice signal collected by the microphones of each electronic device; calculating estimated vectors of the direct energy components and reverberation energy components of the frequency domain signal of each electronic device at a plurality of frequency points, wherein an estimated vector represents the direct energy component and the reverberation energy component after they are concatenated and transposed; acquiring, based on the estimated vectors, a plurality of direct energy components and a plurality of reverberation energy components at a plurality of preset frequency points; and determining the ratio of the sum of the plurality of reverberation energy components to the sum of the plurality of direct energy components as the reverberation energy ratio of the electronic device.
In one exemplary embodiment, calculating estimated vectors of the direct energy components and reverberation energy components of the frequency domain signal of each electronic device at a plurality of frequency points includes: determining, for each electronic device, the cross-correlation parameters between microphone arrays, the audio correlation coefficients between microphone arrays, and the noise correlation coefficients; and determining the estimated vector according to the cross-correlation parameters, the audio correlation coefficients, and the noise correlation coefficients between the microphone arrays.
In one exemplary embodiment, determining the estimated vector according to the cross-correlation parameters, the audio correlation coefficients, and the noise correlation coefficients between the microphone arrays comprises: determining a correlation coefficient matrix according to the audio correlation coefficients and the noise correlation coefficients; acquiring a preset weight matrix; and determining the estimated vector according to the cross-correlation parameters, the weight matrix, and the correlation coefficient matrix.
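The passage above names the inputs (cross-correlation parameters, a correlation coefficient matrix, and a preset weight matrix) but does not state how they are combined. One plausible reading is a weighted least-squares fit, sketched below for a single frequency point. The linear model, the diagonal weights, and the function name `estimate_energy_vector` are all assumptions made for illustration and are not taken from the source.

```python
def estimate_energy_vector(cross_corr, audio_coeffs, noise_coeffs, weights):
    """Return (direct_energy, reverberant_energy) for one frequency point.

    Assumed model (not stated in the source): for each microphone pair p,
    cross_corr[p] ~= audio_coeffs[p] * direct + noise_coeffs[p] * reverberant.
    The two columns (audio_coeffs, noise_coeffs) play the role of the
    correlation coefficient matrix; weights is the diagonal of the weight matrix.
    """
    # Accumulate the 2x2 weighted normal equations G * theta = b.
    g00 = g11 = 0.0
    g01 = 0j
    b0 = b1 = 0j
    for phi, a_s, a_n, w in zip(cross_corr, audio_coeffs, noise_coeffs, weights):
        phi, a_s, a_n = complex(phi), complex(a_s), complex(a_n)
        g00 += w * abs(a_s) ** 2
        g11 += w * abs(a_n) ** 2
        g01 += w * a_s.conjugate() * a_n
        b0 += w * a_s.conjugate() * phi
        b1 += w * a_n.conjugate() * phi

    # Solve the 2x2 Hermitian system in closed form.
    det = g00 * g11 - abs(g01) ** 2
    if abs(det) < 1e-12:
        raise ValueError("coefficient columns are (nearly) collinear")
    direct = (g11 * b0 - g01 * b1) / det
    reverberant = (g00 * b1 - g01.conjugate() * b0) / det

    # Energies are real quantities; keep only the real parts.
    return direct.real, reverberant.real
```

Under this assumed model, the two returned values correspond to the two entries of the estimated vector for that frequency point.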
In one exemplary embodiment, determining the cross-correlation parameters between microphone arrays of each electronic device includes: sampling the frequency domain signal of each microphone at a preset frequency point to obtain sampled signals corresponding to the preset frequency point at a plurality of moments; forming a sampled signal sequence from the sampled signals corresponding to each microphone; and forming the cross-correlation parameter between every two microphones based on the sampled signal sequence corresponding to each microphone and the conjugate of the sampled signal sequence.
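As a minimal sketch of the cross-correlation step, the code below multiplies one microphone's sampled sequence by the conjugate of another's and averages over the sampled moments. The averaging, and the function names `cross_correlation` and `pairwise_cross_correlations`, are assumptions for illustration; the source only says the parameter is formed from a sequence and a conjugated sequence.

```python
def cross_correlation(seq_i, seq_j):
    """Average of seq_i[t] * conj(seq_j[t]) over the sampled moments t."""
    if not seq_i or len(seq_i) != len(seq_j):
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(complex(a) * complex(b).conjugate() for a, b in zip(seq_i, seq_j))
    return total / len(seq_i)


def pairwise_cross_correlations(sequences):
    """Cross-correlation parameter for every pair of microphones (i < j).

    sequences[m] holds the complex samples of microphone m at one preset
    frequency point, one sample per moment.
    """
    pairs = {}
    for i in range(len(sequences)):
        for j in range(i + 1, len(sequences)):
            pairs[(i, j)] = cross_correlation(sequences[i], sequences[j])
    return pairs
```

For a device with 4 microphones this yields 6 pairwise parameters per frequency point.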
In one exemplary embodiment, the above method further comprises: detecting whether the voice information corresponding to the voice signal is preset voice information, wherein the preset voice information is voice information for triggering an alarm; and issuing an alarm signal when the voice information corresponding to the voice signal is determined to be the preset voice information.
In one exemplary embodiment, after determining the target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices, the method further comprises: sending a response instruction to the target device so that the target device responds to the voice signal according to the response instruction.
According to another aspect of the embodiments of the present invention, there is also provided a method for determining an electronic device, including: acquiring voice signals collected by a plurality of microphones; determining, based on the voice signals collected by the plurality of microphones, a reverberation energy ratio corresponding to the voice signals, wherein the reverberation energy ratio characterizes the relationship between the reverberation energy component and the direct energy component in the voice signals collected by the electronic device; and sending the reverberation energy ratio to a server, wherein the server receives the reverberation energy ratios sent by a plurality of electronic devices and determines a target device from the plurality of electronic devices according to those reverberation energy ratios.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for determining an electronic device, including: an acquisition module, configured to acquire voice signals collected by a plurality of electronic devices, each electronic device comprising at least one microphone array; a first determining module, configured to determine, based on the voice signal collected by each electronic device, a reverberation energy ratio corresponding to that voice signal, wherein the reverberation energy ratio characterizes the relationship between the reverberation energy component and the direct energy component in the voice signal collected by the electronic device; and a second determining module, configured to determine a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices.
According to another aspect of the embodiments of the present invention, there is also provided an apparatus for determining an electronic device, including: an acquisition module, configured to acquire voice signals collected by a plurality of microphones; a determining module, configured to determine, based on the voice signals collected by the plurality of microphones, a reverberation energy ratio corresponding to the voice signals, wherein the reverberation energy ratio characterizes the relationship between the reverberation energy component and the direct energy component in the voice signals collected by the electronic device; and a sending module, configured to send the reverberation energy ratio to a server, wherein the server receives the reverberation energy ratios sent by a plurality of electronic devices and determines a target device from the plurality of electronic devices according to those reverberation energy ratios.
According to another aspect of the embodiments of the present invention, there is also provided a computer-readable storage medium having a computer program stored therein, wherein the computer program is configured to perform the above-described method of determining an electronic device when run.
According to another aspect of the embodiment of the present invention, there is also provided an electronic apparatus including a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor executes the method for determining an electronic device described above through the computer program.
In the embodiments of the present application, voice signals collected by a plurality of electronic devices are acquired, each electronic device comprising at least one microphone array; a reverberation energy ratio corresponding to the voice signal collected by each electronic device is determined based on that voice signal, wherein the reverberation energy ratio characterizes the relationship between the reverberation energy component and the direct energy component in the voice signal collected by the electronic device; and a target device is determined from the plurality of electronic devices according to their reverberation energy ratios. With this scheme, when indoor AI voice devices are woken up, the reverberation energy ratio algorithm greatly improves the accuracy of distributed wake-up under reverberant conditions. The method is also computationally light, does not distort the distance cues of the captured voice signal, and is robust to environmental influence. It thereby solves the technical problem in the prior art that distributed wake-up methods are computationally expensive, perform poorly, and have low practical value because they suppress the influence of the environment on distance estimation by means of dereverberation and noise reduction.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a block diagram of the hardware architecture of a computer terminal of a method of determining an electronic device according to an embodiment of the present application;
FIG. 2 is a flow chart of a method of determining an electronic device according to an embodiment of the application;
FIG. 3 is a flow chart of another method of determining an electronic device according to an embodiment of the application;
FIG. 4 is a flow chart of an alternative method of determining an electronic device according to an embodiment of the application;
FIG. 5 is a schematic diagram of the reverberation energy components of an alternative speech signal according to an embodiment of the present application;
FIG. 6 is a schematic diagram of an apparatus for determining an electronic device according to an embodiment of the application;
FIG. 7 is a schematic diagram of another apparatus for determining an electronic device according to an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the present application without inventive effort shall fall within the scope of protection of the present application.
It should be noted that the terms "first," "second," and the like in the description and the claims of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the application described herein may be implemented in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The method embodiments provided by the embodiments of the present application may be performed in a computer terminal or a similar computing device. Taking a computer terminal as an example, FIG. 1 is a block diagram of the hardware structure of a computer terminal for a method of determining an electronic device according to an embodiment of the present application. As shown in FIG. 1, the computer terminal may include one or more processors 102 (only one is shown in FIG. 1; the processor 102 may include, but is not limited to, a processing device such as a microprocessor MCU or a programmable logic device FPGA) and a memory 104 for storing data. In one exemplary embodiment, it may also include a transmission device 106 for communication functions and an input-output device 108. Those skilled in the art will appreciate that the structure shown in FIG. 1 is merely illustrative and does not limit the structure of the computer terminal described above. For example, the computer terminal may include more or fewer components than shown in FIG. 1, or have a different configuration with functions equivalent to or greater than those shown in FIG. 1.
The memory 104 may be used to store computer programs, for example software programs and modules of application software, such as the computer program corresponding to the method for determining an electronic device in the embodiments of the present invention. The processor 102 runs the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the method described above. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means 106 is arranged to receive or transmit data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of a computer terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, simply referred to as NIC) that can connect to other network devices through a base station to communicate with the internet. In one example, the transmission device 106 may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
This embodiment provides a method for determining an electronic device, applied to the computer terminal described above. FIG. 2 is a flowchart of a method for determining an electronic device according to an embodiment of the present application. In the embodiments of the present application, the method is executed by a central control device, which may be one of the indoor smart home devices or an intelligent voice terminal device.
As shown in fig. 2, the method comprises the steps of:
S202, acquiring voice signals collected by a plurality of electronic devices, wherein each electronic device comprises at least one microphone array.
Each of the above electronic devices may be a terminal device, a household appliance, an intelligent voice terminal device, or the like that has a microphone array, for example a mobile phone, a computer, an air purifier, a refrigerator, a television, an AI speaker, or an oven equipped with a microphone array. The microphone array of an electronic device may have a plurality of microphones, each of which can collect the voice signal, and each electronic device can act according to the corresponding voice signal. For example, one of the electronic devices can be selected as the central control device, so that all electronic devices in the room can be controlled through that one device. A terminal device can also serve as the central control device, through which all indoor electronic devices can be controlled.
In an alternative embodiment, the central control device is one of the electronic devices, for example electronic device A, and the voice signal is a wake-up word. When a user speaks the wake-up word to wake the electronic devices, electronic device A can obtain, from each electronic device, multi-channel data spanning the duration of the wake-up word. Each electronic device uses a microphone array; for example, electronic device B has 4 microphones and therefore provides 4-channel data, and electronic device C has 6 microphones and provides 6-channel data. Alternatively, all of the above electronic devices may be controlled by a terminal device acting as the central control device, such as a mobile phone or a PC, in which case the voice signals collected by the microphones of every electronic device can likewise be acquired.
S204, determining a reverberation energy ratio corresponding to the voice signal collected by each electronic device based on the voice signal collected by each electronic device, wherein the reverberation energy ratio represents the relation between a reverberation energy component and a direct energy component in the voice signal collected by the electronic device.
The voice signal may contain a reverberation energy component and a direct energy component. The reverberation energy component is the energy of the voice signal that reaches the device after being emitted by the sound source and reflected off other objects. The direct energy component is the energy carried by the direct sound emitted by the sound source. The direct sound directly reflects the distance between the sound source and the electronic device, whereas the reverberation energy component is caused entirely by environmental factors. The two parts therefore need to be considered separately when estimating distance, and the reverberation energy ratio of the electronic device can be obtained to judge the distance more reliably.
In an optional implementation, the central control device is an intelligent voice terminal device, the sound source is a user, and the electronic device is electronic device A placed in a corner of the room. The intelligent voice terminal device receives the voice signal collected by the microphones of electronic device A, obtains the reverberation energy ratio of device A from that signal, and judges the distance between the user and the device according to the ratio. Because electronic device A is placed in a corner, the energy carried by the user's voice signal strikes the walls around device A and is reflected, so the wall environment increases the reverberation energy component and the reverberation energy accounts for a relatively large share of the voice signal collected by electronic device A.
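To illustrate why the two components are worth separating, the sketch below contrasts the largest-total-energy rule described in the Background with a smallest-ratio rule. All energy values and device names are hypothetical and chosen only to show how wall reflections can inflate total energy at a corner device.

```python
# Hypothetical energy values (arbitrary units); none of these numbers come from the source.
devices = {
    "corner_device": {"direct": 1.0, "reverberant": 3.0},  # strong wall reflections
    "open_device": {"direct": 2.5, "reverberant": 0.5},    # mostly direct sound
}


def total_energy(components):
    """Energy the baseline method compares: direct plus reverberant."""
    return components["direct"] + components["reverberant"]


def reverberation_energy_ratio(components):
    """Reverberant energy relative to direct energy."""
    return components["reverberant"] / components["direct"]


# Baseline rule: the device with the largest total energy wins.
picked_by_energy = max(devices, key=lambda name: total_energy(devices[name]))

# Ratio rule: the device with the smallest reverberation energy ratio wins.
picked_by_ratio = min(devices, key=lambda name: reverberation_energy_ratio(devices[name]))
```

With these made-up numbers the total-energy rule selects the corner device, although less direct sound reaches it, while the ratio rule selects the device that receives mostly direct sound.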
S206, determining a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices.
The target device is the device that the user controls through the central control device to perform the corresponding function. The reverberation energy ratio of the target device is related to the distance of the sound source and to the environment surrounding the target device.
In an optional embodiment, the central control device is an intelligent terminal device and the target device is electronic device A. The intelligent terminal device compares the reverberation energy ratios of all indoor electronic devices and, according to those ratios, determines electronic device A as the target device.
As can be seen from the above, in the embodiments of the present application, voice signals collected by a plurality of electronic devices are acquired, each electronic device comprising at least one microphone array; a reverberation energy ratio corresponding to the voice signal collected by each electronic device is determined based on that voice signal, wherein the reverberation energy ratio characterizes the relationship between the reverberation energy component and the direct energy component in the voice signal collected by the electronic device; and a target device is determined from the plurality of electronic devices according to their reverberation energy ratios. With this scheme, when indoor AI voice devices are woken up, the reverberation energy ratio algorithm greatly improves the accuracy of distributed wake-up under reverberant conditions. The method is also computationally light, does not distort the distance cues of the captured voice signal, and is robust to environmental influence. It thereby solves the technical problem in the prior art that distributed wake-up methods are computationally expensive, perform poorly, and have low practical value because they suppress the influence of the environment on distance estimation by means of dereverberation and noise reduction.
In one exemplary embodiment, determining a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices comprises: determining the electronic device with the smallest reverberation energy ratio among the plurality of electronic devices as the target device.
The reverberation energy ratio is determined by the reverberation energy component and the direct energy component. The reverberation energy component is the energy of the user's voice signal that is reflected to the target device after striking other objects. When the reverberation energy component is too large, it interferes with the voice signal and hampers the determination of the target device through the central control device.
In an alternative embodiment, the plurality of electronic devices in the room are electronic device A, electronic device B, and electronic device C. Based on the voice signal collected by the microphones of each electronic device, the central control device calculates the reverberation energy component and the direct energy component carried by that voice signal. The calculation gives a reverberation energy ratio of twenty-five percent for electronic device A, twenty percent for electronic device B, and ten percent for electronic device C. Electronic device C therefore has the smallest ratio and is the target device: it responds to the voice signal, and the user can control electronic device C through the central control device to perform the corresponding operation.
In one exemplary embodiment, determining the reverberation energy ratio corresponding to the voice signal collected by each electronic device based on that voice signal includes: determining a frequency domain signal corresponding to the voice signal based on the voice signal collected by the microphones of each electronic device; calculating estimated vectors of the direct energy components and reverberation energy components of the frequency domain signal of each electronic device at a plurality of frequency points, wherein an estimated vector represents the direct energy component and the reverberation energy component after they are concatenated and transposed; acquiring, based on the estimated vectors, a plurality of direct energy components and a plurality of reverberation energy components at a plurality of preset frequency points; and determining the ratio of the sum of the plurality of reverberation energy components to the sum of the plurality of direct energy components as the reverberation energy ratio of the electronic device.
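The selection in this example can be sketched in a few lines, using the three ratios given above (25%, 20%, 10%). The function name `select_target_device` and the dictionary layout are assumptions for illustration.

```python
def select_target_device(ratios):
    """Return the name of the device with the smallest reverberation energy ratio."""
    if not ratios:
        raise ValueError("no reverberation energy ratios were provided")
    return min(ratios, key=ratios.get)


# Ratios from the example: device A 25%, device B 20%, device C 10%.
ratios = {"A": 0.25, "B": 0.20, "C": 0.10}
target = select_target_device(ratios)
```

Here `target` is `"C"`, matching the outcome described in the example.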
Each of the electronic devices described above may have a plurality of microphones, each microphone corresponding to one channel of data. The voice signals at the preset frequency points are voice signals collected by the microphones at a certain time. The frequency domain signal corresponding to the voice signal is determined from the voice signal collected by the microphones of each electronic device, and the estimated vectors of the direct energy component and the reverberation energy component of the frequency domain signal of each electronic device are calculated at a plurality of frequency points. Specifically, the time over which the microphones collect the voice signal is divided into a plurality of time points according to the number of microphones of each device, and each time point is one preset frequency point. The reverberation energy component and the direct energy component at each preset frequency point are then obtained. The sum of all obtained reverberation energy components is taken as the reverberation energy component of the voice signal detected by the electronic device, the sum of all obtained direct energy components is taken as the direct energy component of that voice signal, and the ratio of the sum of the reverberation energy components to the sum of the direct energy components is determined as the reverberation energy ratio of the device.
In an alternative embodiment, there are a plurality of electronic devices in the room: electronic device A, electronic device B, and electronic device C. Electronic device A has 4 microphones and 4 preset frequency points. At preset frequency point A1 the reverberation energy component is 10 and the direct energy component is 75; at A2 the reverberation energy component is 8 and the direct energy component is 81; at A3 the reverberation energy component is 5 and the direct energy component is 90; at A4 the reverberation energy component is 17 and the direct energy component is 77. The sum of the reverberation energy components of electronic device A is 10+8+5+17=40, the sum of the direct energy components is 75+81+90+77=323, and the reverberation energy ratio of electronic device A is 40/323 ≈ 12.4%. Electronic device B has 3 microphones and 3 preset frequency points. At preset frequency point B1 the reverberation energy component is 7 and the direct energy component is 65; at B2 the reverberation energy component is 3 and the direct energy component is 70; at B3 the reverberation energy component is 15 and the direct energy component is 44. The sum of the reverberation energy components of electronic device B is 7+3+15=25, the sum of the direct energy components is 65+70+44=179, and the reverberation energy ratio of electronic device B is 25/179 ≈ 14.0%. Electronic device C has 2 microphones and 2 preset frequency points. At preset frequency point C1 the reverberation energy component is 25 and the direct energy component is 77; at C2 the reverberation energy component is 22 and the direct energy component is 80. The sum of the reverberation energy components of electronic device C is 25+22=47, the sum of the direct energy components is 77+80=157, and the reverberation energy ratio of electronic device C is 47/157 ≈ 29.9%.
Comparing the reverberation energy ratios of electronic device A, electronic device B, and electronic device C shows that electronic device A has the smallest reverberation energy ratio, so electronic device A is determined to be the target device.
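The arithmetic in this example can be checked with a short script. It is only a sketch: the data layout and the helper name `reverberation_ratio` are assumptions, the numbers are the per-frequency-point values listed above, and the last line applies the smallest-ratio rule described earlier.

```python
# (reverberation energy, direct energy) at each preset frequency point,
# copied from the example above.
components = {
    "A": [(10, 75), (8, 81), (5, 90), (17, 77)],
    "B": [(7, 65), (3, 70), (15, 44)],
    "C": [(25, 77), (22, 80)],
}


def reverberation_ratio(pairs):
    """Sum of reverberation energies divided by sum of direct energies."""
    reverberant = sum(r for r, _ in pairs)
    direct = sum(d for _, d in pairs)
    if direct <= 0:
        raise ValueError("direct energy must be positive")
    return reverberant / direct


ratios = {name: reverberation_ratio(pairs) for name, pairs in components.items()}
for name, ratio in ratios.items():
    print(f"device {name}: {ratio:.1%}")

# Smallest-ratio rule: the device with the lowest ratio is selected.
smallest = min(ratios, key=ratios.get)
print("smallest ratio:", smallest)
```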
In another alternative embodiment, the estimated vector represents the transpose of the concatenated direct energy component and reverberation energy component, and may accordingly be written as $\hat{\theta}(f)$, where the vector $\theta(f) = [P_D(f), P_R(f)]^T$, and $P_D(f)$ and $P_R(f)$ denote the direct energy component and the reverberation energy component, respectively. The sum of the direct energy components is $\sum_f P_D(f)$ and the sum of the reverberation energy components is $\sum_f P_R(f)$, so the reverberation energy ratio is $R_{est} = 10\log_{10}\left(\sum_f P_R(f) / \sum_f P_D(f)\right)$, where $f$ denotes a frequency band.
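As an illustration of the logarithmic form of the ratio, the following sketch evaluates $R_{est}$ from per-band energies. The function name and the input lists are assumptions made for this example; only the formula itself comes from the text above.

```python
import math


def reverberation_ratio_db(p_direct, p_reverb):
    """R_est = 10*log10(sum_f P_R(f) / sum_f P_D(f)), in decibels.

    p_direct and p_reverb hold one energy value per frequency band f.
    """
    sum_direct = sum(p_direct)
    sum_reverb = sum(p_reverb)
    if sum_direct <= 0 or sum_reverb <= 0:
        raise ValueError("both energy sums must be positive to take the logarithm")
    return 10.0 * math.log10(sum_reverb / sum_direct)


# Hypothetical per-band energies: reverberation carries one tenth of the direct energy.
r_est = reverberation_ratio_db(p_direct=[60.0, 40.0], p_reverb=[7.0, 3.0])
print(f"R_est = {r_est:.2f} dB")
```

Because the logarithm is monotonic, ranking devices by $R_{est}$ in decibels gives the same order as ranking them by the linear ratio.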
In one exemplary embodiment, calculating estimated vectors of direct energy components and reverberant energy components of a frequency domain signal of each electronic device at a plurality of frequency points includes: determining cross-correlation parameters between microphone arrays, audio correlation coefficients between microphone arrays, and noise correlation coefficients of each electronic device; an estimated vector is determined based on the cross-correlation parameters, the audio correlation coefficients and the noise correlation coefficients between the microphone arrays.
The cross-correlation parameters between the microphone arrays of each electronic device may be $d_{11}(f), r_{11}(f); d_{12}(f), r_{12}(f); \ldots; d_{MM}(f), r_{MM}(f)$. Between every two microphones of the same electronic device, the audio correlation coefficient is $d_{ij}(f)$ and the noise correlation coefficient is $r_{ij}(f)$, where $i$ and $j$ denote the $i$-th and $j$-th microphones of the same electronic device. The audio correlation coefficient $d_{ij}(f)$ can be calculated from the parameters and the spatial relationship of the microphones, and the noise correlation coefficient $r_{ij}(f)$, which depends on the spatial noise field, is likewise easy to determine in advance.
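The source does not give closed-form expressions for $d_{ij}(f)$ and $r_{ij}(f)$. The sketch below shows one common way such coefficients could be precomputed, under two assumptions that are not stated in the source: the direct sound is a plane wave arriving at microphone $j$ with a known delay relative to microphone $i$, and the noise field is ideally diffuse, which gives a $\sin(x)/x$ coherence. The function names and the speed-of-sound constant are likewise illustrative.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def direct_coherence(freq_hz, delay_s):
    """Assumed model for d_ij(f): a pure phase shift for a plane wave that
    reaches microphone j `delay_s` seconds after microphone i."""
    return cmath.exp(-2j * math.pi * freq_hz * delay_s)


def diffuse_coherence(freq_hz, spacing_m):
    """Assumed model for r_ij(f): coherence of an ideal diffuse noise field,
    sin(x)/x with x = 2*pi*f*spacing/c."""
    x = 2.0 * math.pi * freq_hz * spacing_m / SPEED_OF_SOUND
    return 1.0 if x == 0.0 else math.sin(x) / x


# Example: two microphones 5 cm apart, 1 kHz band, 0.1 ms inter-microphone delay.
d_12 = direct_coherence(1000.0, 1e-4)
r_12 = diffuse_coherence(1000.0, 0.05)
print(d_12, r_12)
```

For $i = j$ both models return 1, which matches the diagonal entries $d_{ii}(f)$ and $r_{ii}(f)$ of a normalized coefficient set.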
In one exemplary embodiment, determining the estimated vector from the cross-correlation parameter, the audio correlation coefficient and the noise correlation coefficient between the microphone arrays comprises: determining a correlation coefficient matrix according to the audio correlation coefficient and the noise correlation coefficient; acquiring a preset weight matrix; and determining an estimated vector according to the cross-correlation parameter, the weight matrix and the correlation coefficient matrix.
At each preset frequency point, the correlation coefficient matrix between the microphones of the same electronic device is $A(f)$. Calculating the correlation coefficients between the microphones gives the degree of correlation between any one microphone and the other microphones of the same device, and the correlation coefficient matrix is determined from the audio correlation coefficients $d_{ij}(f)$ and the noise correlation coefficients $r_{ij}(f)$. The preset weight matrix is $W$, which can be selected by global optimization over historical data.
In an alternative embodiment, when the estimated vector is determined according to the cross-correlation parameter, the weight matrix and the correlation coefficient matrix, the estimated vector of the direct energy component and the reverberation energy component of the voice signal detected by the electronic device may be $\hat{\theta}(f) = (A^H W A)^{-1} A^H W z$, where $(A^H W A)^{-1}$ is the inverse of the product of the conjugate transpose of the correlation coefficient matrix, the weight matrix, and the correlation coefficient matrix, and $A^H W z$ is the product of the conjugate transpose of the correlation coefficient matrix, the weight matrix, and the cross-correlation parameter.
From the above, the cross-correlation parameters between the microphone arrays of each electronic device may be $d_{11}(f), r_{11}(f); d_{12}(f), r_{12}(f); \ldots; d_{MM}(f), r_{MM}(f)$. Based on these parameters for every two microphones, the correlation coefficient matrix between the microphones is obtained as $A(f) = [d_{11}(f), r_{11}(f); d_{12}(f), r_{12}(f); \ldots; d_{MM}(f), r_{MM}(f)]$, and the cross-correlation parameter between the microphones is $z = [R_{11}(f), \ldots, R_{MM}(f)]^T$.
It should be noted that the estimated vector $\hat{\theta}(f) = (A^H W A)^{-1} A^H W z$ serves as the estimate of the direct energy component $P_D(f)$ and the reverberation energy component $P_R(f)$ of each electronic device.
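The estimate $\hat{\theta}(f) = (A^H W A)^{-1} A^H W z$ is a weighted least-squares solution, and a minimal NumPy sketch of it is given below. Several details are assumptions made for illustration: the function name, the identity matrix standing in for the preset weight matrix $W$, the synthetic coefficients, and the choice to keep only the real part of the result.

```python
import numpy as np


def estimate_energy_vector(A, z, W=None):
    """Weighted least-squares estimate of theta(f) = [P_D(f), P_R(f)]^T.

    A: (K, 2) matrix whose rows are [d_ij(f), r_ij(f)], one row per microphone pair.
    z: (K,) vector of cross-correlation parameters R_ij(f) in the same pair order.
    W: (K, K) weight matrix; the identity is a placeholder assumption.
    """
    A = np.asarray(A, dtype=complex)
    z = np.asarray(z, dtype=complex)
    if W is None:
        W = np.eye(A.shape[0])
    AH = A.conj().T  # conjugate transpose A^H
    # Solve (A^H W A) theta = A^H W z rather than forming the inverse explicitly.
    theta = np.linalg.solve(AH @ W @ A, AH @ W @ z)
    # Energies are real quantities; any imaginary residue is discarded (assumption).
    return theta.real


# Synthetic check with M = 2 microphones (pairs 11, 12, 21, 22); all values are made up.
phase = np.exp(-1j * 0.4)
d = np.array([1.0, phase, np.conj(phase), 1.0])  # assumed audio correlation coefficients
r = np.array([1.0, 0.3, 0.3, 1.0])               # assumed noise correlation coefficients
A = np.column_stack([d, r])
true_theta = np.array([4.0, 1.5])                # [P_D, P_R] used to synthesize z
z = A @ true_theta
theta_hat = estimate_energy_vector(A, z)
print(theta_hat)
```

In this noise-free synthetic case the estimate recovers the values used to build `z`; with measured cross-correlations the result is only a least-squares fit, and negative estimates may need to be clipped before summing over frequency points.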
In one exemplary embodiment, determining cross-correlation parameters between microphone arrays of each electronic device includes: sampling the frequency domain signal of each microphone at a preset frequency point to obtain sampling signals corresponding to the preset frequency point at a plurality of moments; forming a sampling signal sequence based on the sampling signals corresponding to each microphone; the cross-correlation parameter between each two microphones is formed based on the sampled signal sequence corresponding to each microphone and the conjugate of the sampled signal sequence.
After each electronic device wakes up, the central control device acquires multi-channel data corresponding to the length (duration) of the voice information. Each electronic device uses a microphone array and may have a plurality of microphones; for example, 4 microphones yield 4 channels of data. A fast Fourier transform is applied to the voice signal, and the sampled signal sequence formed by the resulting frequency domain signals is written as $x(f,t) = [X^{(1)}(f,t), X^{(2)}(f,t), \ldots, X^{(M)}(f,t)]^T$, where $M$ is the number of channels, $f$ denotes a frequency band, $t$ denotes the observation time, and $t = 0, 1, \ldots, T-1$. For frequency point $f$, the cross-correlation parameters between the microphones are computed from the sampled signal sequence of each microphone and the conjugate of the sampled signal sequence as $R(f) = E[x(f,t)\,x^H(f,t)]$, that is, the mathematical expectation of the product of the sampled signal sequence and its conjugate transpose. Since $x(f,t)$ is a multi-dimensional sampled signal sequence, the cross-correlation parameter between every two microphones can be obtained.
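The computation of $R(f)$ can be sketched with NumPy as follows. This is an illustration under stated assumptions rather than the patented procedure: the expectation is replaced by an average over $T$ frames, the frame length, hop size and Hann window are arbitrary choices, and the random input merely stands in for recorded multi-channel audio.

```python
import numpy as np


def stft_frames(signals, frame_len=256, hop=128):
    """Windowed FFT of each channel.

    signals: (M, N) real samples for M microphones.
    Returns complex spectra of shape (M, F, T) with F = frame_len // 2 + 1.
    """
    signals = np.asarray(signals, dtype=float)
    window = np.hanning(frame_len)
    starts = range(0, signals.shape[1] - frame_len + 1, hop)
    frames = np.stack([signals[:, s:s + frame_len] * window for s in starts], axis=-1)
    return np.fft.rfft(frames, axis=1)


def cross_correlation(x_ft):
    """Sample estimate of R(f) = E[x(f,t) x^H(f,t)] at one frequency point.

    x_ft: (M, T) complex samples x(f, t) for t = 0..T-1.
    """
    x_ft = np.asarray(x_ft, dtype=complex)
    return x_ft @ x_ft.conj().T / x_ft.shape[1]


rng = np.random.default_rng(0)
signals = rng.standard_normal((4, 2048))   # stand-in for 4-channel audio
spectra = stft_frames(signals)
R = cross_correlation(spectra[:, 10, :])   # frequency bin 10, chosen arbitrarily
z = R.reshape(-1)                          # stacked as [R_11, R_12, ..., R_MM]^T
print(R.shape, z.shape)
```

The row-major flattening in the last step produces the pair order 11, 12, ..., MM used for the vector $z$.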
In an exemplary embodiment, the above method further comprises: detecting whether voice information corresponding to the voice signal is preset voice information or not, wherein the preset voice information is voice information for triggering alarm; and sending out an alarm signal under the condition that the voice information corresponding to the voice signal is determined to be the preset voice information.
The voice signal is used to trigger the alarm signal of the device: after receiving the voice signal, the device starts an alarm task and sends out the alarm signal. A deep learning model may be used to learn the characteristics of the alarm sounds made by people or animals in dangerous situations, which improves the accuracy of the alarm signal and avoids false alarms.
In an optional implementation, the preset voice information may be emergency keywords set by the user, such as "fire" or "the gas is on". When the user discovers a fire at home and shouts "fire" toward the microphone of the device, the device collects the voice information corresponding to that voice signal and then sends out an alarm signal: an alarm sound together with loud shouts of "fire" to attract the attention of surrounding residents, so that they can escape from the fire in time and personal safety is ensured.
In another alternative embodiment, the preset voice information may be an emergency keyword set by the user, such as "help". When a thief enters the home and the user shouts "help", the microphone of the device collects the corresponding voice signal and the device sends out an alarm signal: an alarm sound together with loud calls for help, to attract the attention of surrounding residents and frighten the intruder. In this situation, in order not to provoke the intruder, the trigger phrase may instead be set to more discreet words that are not easily noticed, so as to gain time and ensure the user's safety.
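The detection step in these embodiments can be illustrated with a simple keyword check on recognized text. The sketch assumes that a speech recognizer (not shown, and not specified in the source) has already produced a transcript; the keyword set and the function name are hypothetical.

```python
import re

# Hypothetical user-configured trigger phrases.
ALARM_KEYWORDS = {"fire", "help"}


def should_raise_alarm(transcript, keywords=ALARM_KEYWORDS):
    """Return True if the transcript contains any trigger phrase as a whole word."""
    text = transcript.lower()
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", text) is not None
        for keyword in keywords
    )


print(should_raise_alarm("Fire! Fire!"))
print(should_raise_alarm("turn on the fireplace lamp"))
```

Matching on word boundaries keeps "fireplace" from triggering the "fire" keyword, which is one simple way to reduce false alarms; the deep learning approach mentioned above would replace this check.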
In one exemplary embodiment, after determining the target device from the plurality of electronic devices based on the reverberation energy ratios of the plurality of electronic devices, the method further comprises: sending a response instruction to the target device so that the target device responds to the voice signal according to the response instruction.
After the target device responds to the voice signal, the user can send out a response instruction through the terminal device and the like to control the target device to perform corresponding operation, so that the target device can respond to the voice signal according to the response instruction.
In an optional embodiment, the central control device is a terminal device. By comparing the reverberation energy ratios of all indoor electronic devices, the terminal device determines that electronic device A is the target device. The terminal device then sends a response instruction, such as playing music, to electronic device A, and electronic device A starts playing music after receiving the response instruction.
In this embodiment, a method for determining an electronic device is also provided and applied to the above computer terminal. Fig. 3 is a flowchart of a method for determining an electronic device according to an embodiment of the present application. The method provided by this embodiment is executed by a central control device with a server, and the central control device may be one of the indoor smart home devices or an intelligent voice terminal device.
As shown in fig. 3, the method comprises the steps of:
S302, acquiring voice signals collected by a plurality of microphones.
The plurality of microphones are part of or all of the microphones on the electronic device, wherein each microphone can collect the voice signals, and the electronic device can perform corresponding actions according to the voice signals collected by the plurality of microphones.
S304, determining a reverberation energy duty ratio corresponding to the voice signals based on the voice signals collected by the microphones, wherein the reverberation energy duty ratio represents the relation between a reverberation energy component and a direct energy component in the voice signals collected by the electronic equipment.
The voice signal may contain a reverberation energy component and a direct energy component. The reverberation energy component is the energy of the voice signal emitted by the sound source that reaches the device after being reflected by other objects, and the direct energy component is the energy carried by the direct sound emitted by the sound source. The direct sound directly reflects the distance between the sound source and the electronic device, whereas the reverberation energy component is caused entirely by environmental factors. The two parts therefore need to be considered separately when estimating distance information, and the reverberation energy ratio of the electronic device is obtained in order to judge the distance information better.
S306, sending the reverberation energy ratio to a server, wherein the server receives a plurality of reverberation energy ratios sent by a plurality of electronic devices and determines a target device from the plurality of electronic devices according to the plurality of reverberation energy ratios.
The target device is the device that the user controls through the central control device to perform corresponding functions. The reverberation energy ratio of the target device is related to its distance from the sound source and to the environment around the target device.
In an optional embodiment, the target device is an electronic device a, the server receives a plurality of reverberation energy ratios sent by a plurality of electronic devices, and the server determines the electronic device a as the target device according to the reverberation energy ratio by comparing the reverberation energy ratios of all the electronic devices in the room.
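The exchange between devices and server in steps S302 to S306 can be sketched as below. The message type `RatioReport`, the class `WakeServer`, and the reported values are all hypothetical; the source does not specify a message format or transport.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class RatioReport:
    """Message a device sends in S306; the field names are hypothetical."""
    device_id: str
    reverberation_ratio: float


class WakeServer:
    """Collects one ratio per device and picks the device with the smallest one."""

    def __init__(self) -> None:
        self._ratios: Dict[str, float] = {}

    def receive(self, report: RatioReport) -> None:
        # A later report from the same device overwrites the earlier one.
        self._ratios[report.device_id] = report.reverberation_ratio

    def decide_target(self) -> Optional[str]:
        if not self._ratios:
            return None
        return min(self._ratios, key=self._ratios.get)


server = WakeServer()
for report in (RatioReport("A", 0.12), RatioReport("B", 0.14), RatioReport("C", 0.30)):
    server.receive(report)
target = server.decide_target()
print(target)
```

With this split each device sends only a single number per wake-up event, so the server never needs the raw audio.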
Fig. 4 is a flowchart of an alternative method of determining an electronic device according to an embodiment of the invention. As shown in Fig. 4, the method comprises the following steps:
S401, acquiring voice signals collected by a plurality of electronic devices;
S402, determining a frequency domain signal corresponding to the voice signal according to the collected voice signal;
S403, determining cross-correlation parameters among microphone arrays of each electronic device, audio correlation coefficients among microphone arrays and noise correlation coefficients;
S404, determining a correlation coefficient matrix according to the audio correlation coefficient and the noise correlation coefficient;
S405, acquiring a preset weight matrix;
S406, sampling the frequency domain signal of each microphone at a preset frequency point to obtain sampling signals corresponding to the preset frequency point at a plurality of moments;
S407, forming a sampling signal sequence based on the sampling signals corresponding to each microphone;
S408, forming a cross-correlation parameter between every two microphones based on the sampling signal sequence corresponding to each microphone and the conjugate of the sampling signal sequence;
S409, determining an estimated vector according to the cross-correlation parameter, the weight matrix and the correlation coefficient matrix;
S410, acquiring a plurality of direct energy components on a plurality of preset frequency points and a plurality of reverberation energy components on a plurality of preset frequency points based on the estimated vector;
S411, determining the ratio of the sum of the plurality of reverberation energy components to the sum of the plurality of direct energy components as the reverberation energy duty ratio of the electronic device;
S412, determining a reverberation energy ratio corresponding to the voice signal collected by each electronic device;
S413, determining the electronic device with the smallest reverberation energy ratio among the plurality of electronic devices as a target device;
S414, sending a response instruction to the target device so that the target device responds to the voice signal according to the response instruction.
As shown in Fig. 5, the above voice signal consists of direct sound (Direct sound in Fig. 5), early reflected sound (Early reflection in Fig. 5), and reverberant sound (Reverberation in Fig. 5). The direct sound is the part that directly reflects the distance between the speaker (sound source) and the device, whereas the reverberant part is caused by environmental factors. The direct sound and the reverberant sound therefore need to be considered separately in distance estimation, so that distance information can be estimated more robustly.
In Fig. 5, $H(\omega)$ denotes the frequency domain signal received by the target device, $H_D(\omega)$ denotes the frequency domain signal corresponding to the direct sound, and $H_R(\omega)$ denotes the frequency domain signal corresponding to the reverberant sound. When there is only direct sound and no reverberant sound, $H(\omega) = H_D(\omega)$. When reverberant sound is generated by environmental factors while the target device receives the voice signal, $H(\omega) = H_D(\omega) + H_R(\omega)$.
Fig. 6 is a schematic diagram of an apparatus for determining an electronic device according to an embodiment of the present invention. As shown in Fig. 6, the apparatus includes:
An acquisition module 61, configured to acquire voice signals acquired by a plurality of electronic devices, each electronic device including at least one microphone array;
a first determining module 62, configured to determine, based on the voice signal collected by each electronic device, a reverberation energy duty ratio corresponding to the voice signal collected by each electronic device, where the reverberation energy duty ratio characterizes a relationship between a reverberation energy component and a direct energy component in the voice signal collected by the electronic device;
the second determining module 63 is configured to determine a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices.
In an exemplary embodiment, the second determining module includes: a first determining submodule, configured to determine the electronic device with the smallest reverberation energy ratio among the plurality of electronic devices as the target device.
In an exemplary embodiment, the first determining module includes:
the second determining submodule is used for determining a frequency domain signal corresponding to the voice signal based on the voice signal collected by the microphone of each electronic device;
the computing module is used for computing estimated vectors of direct energy components and reverberation energy components of the frequency domain signals of each electronic device at a plurality of frequency points, wherein the estimated vectors are used for representing transposes of the direct energy components and the reverberation energy components after being spliced.
The first acquisition submodule is used for acquiring a plurality of direct energy components on a plurality of preset frequency points and a plurality of reverberation energy components on a plurality of preset frequency points based on the estimated vector;
and a third determining submodule, configured to determine a ratio of a sum of the plurality of reverberation energy components to a sum of the plurality of direct energy components as a reverberation energy duty cycle of the electronic device.
In one exemplary embodiment, the computing module includes:
a fourth determining submodule for determining a cross-correlation parameter between microphone arrays of each electronic device, an audio correlation coefficient between microphone arrays and a noise correlation coefficient;
and a fifth determining sub-module for determining an estimated vector according to the cross-correlation parameter, the audio correlation coefficient and the noise correlation coefficient between the microphone arrays.
In one exemplary embodiment, the fifth determination submodule includes:
a sixth determining submodule, configured to determine a correlation coefficient matrix according to the audio correlation coefficient and the noise correlation coefficient;
the second acquisition module is used for acquiring a preset weight matrix;
and the seventh determination submodule is used for determining an estimated vector according to the cross-correlation parameter, the weight matrix and the correlation coefficient matrix.
In one exemplary embodiment, for determining the cross-correlation parameters between microphone arrays of each electronic device, the fourth determining submodule includes:
The sampling module is used for sampling the frequency domain signal of each microphone at a preset frequency point to obtain sampling signals corresponding to the preset frequency point at a plurality of moments;
a first constructing module, configured to construct a sampling signal sequence based on the sampling signal corresponding to each microphone;
and a second construction module for constructing a cross-correlation parameter between each two microphones based on the sampled signal sequence corresponding to each microphone and the conjugate of the sampled signal sequence.
In an exemplary embodiment, the above apparatus further comprises:
the detection module is used for detecting whether voice information corresponding to the voice signal is preset voice information or not, wherein the preset voice information is voice information for triggering an alarm;
and the alarm module is used for sending out an alarm signal under the condition that the voice information corresponding to the voice signal is determined to be the preset voice information.
In an exemplary embodiment, the apparatus further includes, after the second determining module:
and the sending module is used for sending the response instruction to the target equipment so that the target equipment responds to the voice signal according to the response instruction.
Fig. 7 is a schematic diagram of another apparatus for determining an electronic device according to an embodiment of the present invention. As shown in Fig. 7, the apparatus includes:
An acquisition module 71, configured to acquire voice signals acquired by a plurality of microphones;
a determining module 72, configured to determine a reverberation energy duty ratio corresponding to the voice signal based on the voice signals collected by the plurality of microphones, where the reverberation energy duty ratio characterizes a relationship between a reverberation energy component and a direct energy component in the voice signal collected by the electronic device;
and the sending module 73 is configured to send the reverberation energy duty ratio to the server, where the server receives the plurality of reverberation energy duty ratios sent by the plurality of electronic devices, and determines the target device from the plurality of electronic devices according to the plurality of reverberation energy duty ratios.
An embodiment of the present invention also provides a storage medium including a stored program, wherein the program executes the method of any one of the above.
Alternatively, in the present embodiment, the above-described storage medium may be configured to store program code for performing the steps of:
S1: acquiring voice signals collected by a plurality of electronic devices, each electronic device comprising at least one microphone array;
S2: determining a reverberation energy ratio corresponding to the voice signal collected by each electronic device based on the voice signal collected by each electronic device, wherein the reverberation energy ratio represents the relationship between a reverberation energy component and a direct energy component in the voice signal collected by the electronic device;
S3: determining a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices.
An embodiment of the invention also provides an electronic device comprising a memory having stored therein a computer program and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
Optionally, the electronic apparatus may further include a transmission device and an input/output device, where the transmission device is connected to the processor, and the input/output device is connected to the processor.
Alternatively, in the present embodiment, the above-described processor may be configured to execute the following steps by a computer program:
S1: acquiring voice signals collected by a plurality of electronic devices, each electronic device comprising at least one microphone array;
S2: determining a reverberation energy ratio corresponding to the voice signal collected by each electronic device based on the voice signal collected by each electronic device, wherein the reverberation energy ratio represents the relationship between a reverberation energy component and a direct energy component in the voice signal collected by the electronic device;
S3: determining a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices.
Optionally, in the present embodiment, the storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing program code.
Optionally, for specific examples of this embodiment, refer to the examples described in the foregoing embodiments and optional implementations; details are not repeated here.
It will be appreciated by those skilled in the art that the modules or steps of the invention described above may be implemented on a general-purpose computing device; they may be concentrated on a single computing device or distributed across a network of multiple computing devices. Optionally, they may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that given here, or the modules or steps may each be made into an individual integrated circuit module, or several of them may be made into a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.
The above description covers only the preferred embodiments of the present invention and is not intended to limit the present invention; those skilled in the art may make various modifications and variations to it. Any modification, equivalent replacement, improvement, etc. made within the principle of the present invention should be included in the protection scope of the present invention.

Claims (11)

1. A method of determining an electronic device, the method comprising:
acquiring voice signals collected by a plurality of electronic devices, each electronic device comprising at least one microphone array;
determining, based on the voice signal collected by each electronic device, a reverberation energy ratio corresponding to that voice signal, wherein the reverberation energy ratio represents the relationship between a reverberation energy component and a direct energy component in the voice signal collected by the electronic device;
determining a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices;
wherein determining the reverberation energy ratio corresponding to the voice signal collected by each electronic device comprises: determining a frequency domain signal corresponding to the voice signal based on the voice signal collected by the microphone of each electronic device; calculating an estimated vector of the direct energy components and reverberation energy components of the frequency domain signal of each electronic device at a plurality of frequency points, wherein the estimated vector represents the transpose of the concatenated direct energy components and reverberation energy components; acquiring, based on the estimated vector, a plurality of direct energy components at a plurality of preset frequency points and a plurality of reverberation energy components at the plurality of preset frequency points; and determining the ratio of the sum of the plurality of reverberation energy components to the sum of the plurality of direct energy components as the reverberation energy ratio of the electronic device;
wherein calculating the estimated vector of the direct energy components and reverberation energy components of the frequency domain signal of each electronic device at a plurality of frequency points comprises: determining cross-correlation parameters between the microphone arrays of each electronic device, audio correlation coefficients between the microphone arrays, and noise correlation coefficients; and determining the estimated vector according to the cross-correlation parameters, the audio correlation coefficients and the noise correlation coefficients between the microphone arrays.
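The final step of claim 1 (summing the reverberation components, summing the direct components, and dividing) can be illustrated with a minimal Python sketch. The function name `reverberation_energy_ratio` and the layout of the estimated vector (all direct components first, then all reverberation components) are assumptions made for illustration; the claim only states that the two sets of components are concatenated.

```python
def reverberation_energy_ratio(estimate, num_bins):
    """Return sum(reverberation energies) / sum(direct energies).

    Assumed layout (hypothetical): `estimate` is a flat sequence of length
    2 * num_bins holding the direct energy at each preset frequency point,
    followed by the reverberation energy at each preset frequency point.
    """
    if len(estimate) != 2 * num_bins:
        raise ValueError("estimate must hold 2 * num_bins values")
    direct = estimate[:num_bins]
    reverberant = estimate[num_bins:]
    direct_sum = sum(direct)
    if direct_sum <= 0:
        raise ValueError("total direct energy must be positive")
    return sum(reverberant) / direct_sum


# Two frequency points: direct energies 4.0 and 6.0, reverberant 1.0 and 1.5.
ratio = reverberation_energy_ratio([4.0, 6.0, 1.0, 1.5], num_bins=2)
```

A smaller value means the direct path dominates, which is consistent with a talker who is closer to, or facing, that device.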
2. The method of claim 1, wherein determining a target device from the plurality of electronic devices based on the reverberation energy ratios of the plurality of electronic devices comprises:
determining, among the plurality of electronic devices, the electronic device with the smallest reverberation energy ratio as the target device.
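The selection rule of claim 2 amounts to taking a minimum over the per-device values. A minimal Python sketch follows; the function name and the device identifiers are hypothetical, and the tie-breaking behaviour (first device encountered wins) is a property of this sketch, not something the claim specifies.

```python
def select_target_device(ratios):
    """Pick the device whose reverberation energy ratio is smallest.

    `ratios` maps a device identifier to its ratio. On a tie, the device
    that appears first in the mapping is returned.
    """
    if not ratios:
        raise ValueError("at least one device is required")
    return min(ratios, key=ratios.get)


target = select_target_device({"fridge": 0.8, "speaker": 0.3, "air_conditioner": 1.2})
```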
3. The method of claim 1, wherein determining the estimated vector based on the cross-correlation parameter, the audio correlation coefficient and the noise correlation coefficient between the microphone arrays comprises:
determining a correlation coefficient matrix according to the audio correlation coefficient and the noise correlation coefficient;
acquiring a preset weight matrix;
and determining the estimated vector according to the cross-correlation parameter, the weight matrix and the correlation coefficient matrix.
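Claim 3 does not give the formula that combines the three inputs. One plausible reading, offered here purely as an assumption, is a weighted least-squares fit: at one frequency point, each microphone pair's cross-correlation is modelled as the direct energy times the audio correlation coefficient plus the reverberation energy times the noise correlation coefficient. The Python sketch below solves that fit for a single frequency point with a diagonal weight matrix; the function name, the signal model, and the diagonal weights are all hypothetical.

```python
def estimate_energies(cross_corr, audio_coh, noise_coh, weights):
    """Weighted least-squares estimate of (direct, reverberant) energy.

    Assumed model per microphone pair k (not taken from the claims):
        cross_corr[k] ~= direct * audio_coh[k] + reverberant * noise_coh[k]
    with real-valued energies and a diagonal weight matrix `weights`.
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for phi, gd, gr, w in zip(cross_corr, audio_coh, noise_coh, weights):
        # Accumulate the 2x2 normal equations of the real-valued problem.
        a11 += w * (gd.conjugate() * gd).real
        a12 += w * (gd.conjugate() * gr).real
        a22 += w * (gr.conjugate() * gr).real
        b1 += w * (gd.conjugate() * phi).real
        b2 += w * (gr.conjugate() * phi).real
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("coefficient columns are (nearly) linearly dependent")
    direct = (a22 * b1 - a12 * b2) / det
    reverberant = (a11 * b2 - a12 * b1) / det
    return direct, reverberant
```

Repeating the fit at every preset frequency point and concatenating the results would give a vector with the shape described in claim 1. Non-uniform weights could, for example, down-weight unreliable microphone pairs, though the source does not say how the preset weight matrix is chosen.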
4. The method of claim 1, wherein determining cross-correlation parameters between microphone arrays of each of the electronic devices comprises:
sampling the frequency domain signal of each microphone at the preset frequency point to obtain sampling signals corresponding to the preset frequency point at a plurality of moments;
forming a sampling signal sequence based on the sampling signals corresponding to each microphone;
and forming a cross-correlation parameter between every two microphones based on the sampling signal sequence corresponding to each microphone and the conjugate of the sampling signal sequence.
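The pairwise computation in claim 4 can be sketched in Python as follows. Each microphone contributes a sequence of complex frequency-domain samples taken at one preset frequency point over several moments, and each pair is combined through one sequence and the conjugate of the other. Averaging over time is an assumption of this sketch (the claim only says the parameter is formed from the sequence and the conjugate), and the function names are hypothetical.

```python
from itertools import combinations


def cross_correlation(seq_i, seq_j):
    """Time-averaged product of seq_i with the complex conjugate of seq_j."""
    if len(seq_i) != len(seq_j) or not seq_i:
        raise ValueError("sequences must be non-empty and of equal length")
    total = sum(a * b.conjugate() for a, b in zip(seq_i, seq_j))
    return total / len(seq_i)


def pairwise_cross_correlations(sequences):
    """Cross-correlation parameter for every unordered pair of microphones."""
    return {
        (i, j): cross_correlation(sequences[i], sequences[j])
        for i, j in combinations(range(len(sequences)), 2)
    }


# Three microphones, two time samples each, at one frequency point.
pairs = pairwise_cross_correlations([
    [1 + 1j, 2 + 0j],
    [1 + 1j, 1 + 0j],
    [0 + 1j, 1 + 0j],
])
```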
5. The method according to claim 1, wherein the method further comprises:
detecting whether the voice information corresponding to the voice signal is preset voice information, wherein the preset voice information is voice information for triggering an alarm;
and sending out an alarm signal when the voice information corresponding to the voice signal is determined to be the preset voice information.
6. The method of claim 1, wherein after determining a target device from the plurality of electronic devices based on the reverberation energy ratios of the plurality of electronic devices, the method further comprises:
sending a response instruction to the target device so that the target device responds to the voice signal according to the response instruction.
7. A method of determining an electronic device, the method comprising:
acquiring voice signals collected by a plurality of microphones;
determining, based on the voice signals collected by the plurality of microphones, a reverberation energy ratio corresponding to the voice signals, wherein the reverberation energy ratio represents the relationship between a reverberation energy component and a direct energy component in the voice signals collected by the electronic device;
sending the reverberation energy ratio to a server, wherein the server receives a plurality of reverberation energy ratios sent by a plurality of electronic devices and determines a target device from the plurality of electronic devices according to the plurality of reverberation energy ratios;
wherein determining the reverberation energy ratio corresponding to the voice signals based on the voice signals collected by the plurality of microphones comprises: determining a frequency domain signal corresponding to the voice signals based on the voice signals collected by the plurality of microphones; calculating an estimated vector of the direct energy components and reverberation energy components of the frequency domain signal at a plurality of frequency points, wherein the estimated vector represents the transpose of the concatenated direct energy components and reverberation energy components; acquiring, based on the estimated vector, a plurality of direct energy components at a plurality of preset frequency points and a plurality of reverberation energy components at the plurality of preset frequency points; and determining the ratio of the sum of the plurality of reverberation energy components to the sum of the plurality of direct energy components as the reverberation energy ratio corresponding to the voice signals;
wherein calculating the estimated vector of the direct energy components and reverberation energy components of the frequency domain signal at a plurality of frequency points comprises: determining cross-correlation parameters between the plurality of microphone arrays, audio correlation coefficients between the plurality of microphone arrays, and noise correlation coefficients; and determining the estimated vector based on the cross-correlation parameters, the audio correlation coefficients and the noise correlation coefficients between the plurality of microphone arrays.
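The division of labour in claim 7 (each device reports its own value, and the server chooses among the reports) can be sketched in Python with an in-memory stand-in for the server. Everything named here is hypothetical: the JSON message fields, the `ArbitrationServer` class, and the choice of the smallest value as the selection rule, which is borrowed from claim 2 since claim 7 itself does not fix the rule. No real network transport is shown.

```python
import json


def build_report(device_id, ratio):
    """Device side: serialize the locally computed value for the server."""
    return json.dumps({"device_id": device_id, "reverb_ratio": ratio})


class ArbitrationServer:
    """Server side: collect one report per device, then pick a target."""

    def __init__(self):
        self._ratios = {}

    def receive(self, message):
        report = json.loads(message)
        self._ratios[report["device_id"]] = float(report["reverb_ratio"])

    def target_device(self):
        if not self._ratios:
            raise ValueError("no reports received")
        # Assumed rule: the smallest value wins, as in claim 2.
        return min(self._ratios, key=self._ratios.get)


server = ArbitrationServer()
server.receive(build_report("speaker", 0.3))
server.receive(build_report("tv", 0.9))
chosen = server.target_device()
```

Sending only a scalar per device, rather than the raw audio, keeps the upstream traffic small; that is an observation about this sketch, not a benefit stated in the source.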
8. An apparatus for determining an electronic device, the apparatus comprising:
an acquisition module, configured to acquire voice signals collected by a plurality of electronic devices, wherein each electronic device comprises at least one microphone array;
a first determining module, configured to determine, based on the voice signal collected by each electronic device, a reverberation energy ratio corresponding to that voice signal, wherein the reverberation energy ratio represents the relationship between a reverberation energy component and a direct energy component in the voice signal collected by the electronic device;
a second determining module, configured to determine a target device from the plurality of electronic devices according to the reverberation energy ratios of the plurality of electronic devices;
wherein the first determining module comprises: a second determining submodule, configured to determine a frequency domain signal corresponding to the voice signal based on the voice signal collected by the microphone of each electronic device; a computing module, configured to calculate an estimated vector of the direct energy components and reverberation energy components of the frequency domain signal of each electronic device at a plurality of frequency points, wherein the estimated vector represents the transpose of the concatenated direct energy components and reverberation energy components; a first acquisition submodule, configured to acquire, based on the estimated vector, a plurality of direct energy components at a plurality of preset frequency points and a plurality of reverberation energy components at the plurality of preset frequency points; and a third determining submodule, configured to determine the ratio of the sum of the plurality of reverberation energy components to the sum of the plurality of direct energy components as the reverberation energy ratio of the electronic device;
wherein the computing module comprises: a fourth determining submodule, configured to determine cross-correlation parameters between the microphone arrays of each electronic device, audio correlation coefficients between the microphone arrays, and noise correlation coefficients; and a fifth determining submodule, configured to determine the estimated vector according to the cross-correlation parameters, the audio correlation coefficients and the noise correlation coefficients between the microphone arrays.
9. An apparatus for determining an electronic device, the apparatus comprising:
an acquisition module, configured to acquire voice signals collected by a plurality of microphones;
a determining module, configured to determine, based on the voice signals collected by the plurality of microphones, a reverberation energy ratio corresponding to the voice signals, wherein the reverberation energy ratio represents the relationship between a reverberation energy component and a direct energy component in the voice signals collected by the electronic device;
a sending module, configured to send the reverberation energy ratio to a server, wherein the server receives a plurality of reverberation energy ratios sent by a plurality of electronic devices and determines a target device from the plurality of electronic devices according to the plurality of reverberation energy ratios;
wherein the apparatus is further configured to: determine a frequency domain signal corresponding to the voice signals based on the voice signals collected by the plurality of microphones; calculate an estimated vector of the direct energy components and reverberation energy components of the frequency domain signal at a plurality of frequency points, wherein the estimated vector represents the transpose of the concatenated direct energy components and reverberation energy components; acquire, based on the estimated vector, a plurality of direct energy components at a plurality of preset frequency points and a plurality of reverberation energy components at the plurality of preset frequency points; and determine the ratio of the sum of the plurality of reverberation energy components to the sum of the plurality of direct energy components as the reverberation energy ratio corresponding to the voice signals;
wherein the apparatus is further configured to: determine cross-correlation parameters between the plurality of microphone arrays, audio correlation coefficients between the plurality of microphone arrays, and noise correlation coefficients; and determine the estimated vector based on the cross-correlation parameters, the audio correlation coefficients and the noise correlation coefficients between the plurality of microphone arrays.
10. A computer readable storage medium, characterized in that the computer readable storage medium comprises a stored program, wherein the program when run performs the method of any of the preceding claims 1 to 7.
11. An electronic device comprising a memory and a processor, characterized in that the memory has stored therein a computer program, the processor being arranged to execute the method according to any of the claims 1 to 7 by means of the computer program.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110742317.2A CN113488031B (en) 2021-06-30 2021-06-30 Method and device for determining electronic equipment, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110742317.2A CN113488031B (en) 2021-06-30 2021-06-30 Method and device for determining electronic equipment, storage medium and electronic device

Publications (2)

Publication Number Publication Date
CN113488031A (en) 2021-10-08
CN113488031B (en) 2023-10-24

Family

ID=77937354

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110742317.2A Active CN113488031B (en) 2021-06-30 2021-06-30 Method and device for determining electronic equipment, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN113488031B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113674761B (en) * 2021-07-26 2023-07-21 青岛海尔科技有限公司 Device determination method and device determination system

Also Published As

Publication number Publication date
CN113488031A (en) 2021-10-08

Similar Documents

Publication Publication Date Title
CN110556103B (en) Audio signal processing method, device, system, equipment and storage medium
US10063965B2 (en) Sound source estimation using neural networks
CN102246228B (en) Sound identification systems
Dorfan et al. Tree-based recursive expectation-maximization algorithm for localization of acoustic sources
JP2019204074A (en) Speech dialogue method, apparatus and system
CN109859749A (en) Voice signal recognition method and device
CN112037789A (en) Equipment awakening method and device, storage medium and electronic device
CN105388459A (en) Robustness sound source space positioning method of distributed microphone array network
CN113593548B (en) Method and device for waking up intelligent equipment, storage medium and electronic device
US11924618B2 (en) Auralization for multi-microphone devices
CN113488031B (en) Method and device for determining electronic equipment, storage medium and electronic device
WO2023061258A1 (en) Audio processing method and apparatus, storage medium and computer program
CN112951261A (en) Sound source positioning method and device and voice equipment
CN110169082A (en) Combining audio signals output
JP2020524300A (en) Method and device for obtaining event designations based on audio data
CN114464184B (en) Method, apparatus and storage medium for speech recognition
WO2023051622A1 (en) Method for improving far-field speech interaction performance, and far-field speech interaction system
CN110427801A (en) Intelligent home furnishing control method and device, electronic equipment and non-transient storage media
Feng et al. Soft label coding for end-to-end sound source localization with ad-hoc microphone arrays
CN113035174A (en) Voice recognition processing method, device, equipment and system
CN108360942B (en) Intelligent window, control method thereof and intelligent window management system
CN113744719A (en) Voice extraction method, device and equipment
CN114690113A (en) Method and device for determining position of equipment
CN115910047B (en) Data processing method, model training method, keyword detection method and equipment
CN113611298A (en) Awakening method and device of intelligent equipment, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant