CN113905323A - Perceptual sound source height correction method applicable to service type robot playing audio - Google Patents
- Publication number
- CN113905323A (application CN202111261650.8A)
- Authority
- CN
- China
- Prior art keywords
- related transfer
- transfer function
- height
- head
- hrtf
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
- B25J11/0005—Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
-
- B—PERFORMING OPERATIONS; TRANSPORTING
- B25—HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
- B25J—MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
- B25J11/00—Manipulators not otherwise provided for
- B25J11/008—Manipulators for service tasks
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
Landscapes
- Engineering & Computer Science (AREA)
- Robotics (AREA)
- Mechanical Engineering (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Human Computer Interaction (AREA)
- Stereophonic System (AREA)
- Circuit For Audible Band Transducer (AREA)
Abstract
The invention discloses a method for correcting the perceived sound source height when a service robot plays audio. The method comprises the following steps: a local device of the service robot stores multiple head-related transfer functions (HRTFs) to form an HRTF database, and these HRTFs cover auditory elevation information for different heights; the service robot acquires the height of the listener, who is the subject of the human-computer interaction, through multi-modal sensing, and matches an HRTF according to the listener's physiological height characteristics; the robot then fine-tunes the matched HRTF, convolves it with the local audio data, and outputs the result to its playback device. The invention can effectively correct the height of the service robot's human-computer interaction sound in real time, and solves the problem that the virtual sound image differs when different listeners interact with different service robots.
Description
Technical Field
The invention relates to a method for correcting the perceived sound source height when a service robot plays audio, and belongs to the technical field of robot audio.
Background
Three-dimensional audio has so far been applied mainly in film, games, music and similar fields. It can reconstruct a virtual sound image at any spatial position and create immersive sound scenes, and is commonly used in movie theaters, home theaters and the like.
With the rapid development of the service robot industry, customers' expectations for the comfort of the audio played by service robots have also risen. The audio a robot plays is the final form its human-computer interaction takes and is what consumers notice most directly, so its quality directly affects the user's experience of audio interaction with the robot. How to effectively improve that quality has become an important problem. In particular, the perceived height of the audio played by the robot is an important index for evaluating the quality of the robot's audio communication.
The head-related transfer function (HRTF) reflects the filtering effect of the outer ear, head, torso and so on on sound signals that arrive at the human ear from different directions. It describes how the human ear perceives sound from a point in space. Some spectral changes are observed as the source moves along the elevation axis: there is a notch at 7 kHz whose frequency shifts upward as elevation increases, a shallow peak appears at 12 kHz in the median plane, and this peak flattens in the high-elevation region. The perception of elevation may therefore be associated with local maxima, or hot spots, at certain frequencies. If the spectral difference between a high-elevation sound source and a horizontal loudspeaker is applied to the original loudspeaker, the sound perceived by the human auditory system is almost elevated. HRTFs have been studied in universities and laboratories in China and abroad, and the theory has been applied in aerospace, military, gaming, audio and other fields.
Disclosure of Invention
Purpose of the invention: to solve the problem that the virtual sound image differs when different listeners interact with different service robots, the invention provides a method for correcting the perceived sound source height when a service robot plays audio.
Technical scheme: to achieve this purpose, the invention adopts the following technical scheme:
A method for correcting the perceived sound source height when a service robot plays audio comprises the following steps:
Step 1, multiple head-related transfer functions (HRTFs) are stored in the local device of the service robot to form an HRTF database, and these HRTFs cover auditory elevation information for different heights.
Step 2, the service robot acquires the height information of the listener, who is the subject of the human-computer interaction, through multi-modal sensing, and obtains the listener's physiological height characteristics.
Step 3, an HRTF is matched in the HRTF database according to the listener's physiological height characteristics, and the corresponding HRTF is selected.
Step 4, the HRTF selected in step 3 is called and convolved with the local audio data, and the result is output to the playback device of the service robot.
Step 5, the service robot acquires the listener's physiological height characteristics again through multi-modal sensing.
Step 6, the service robot fine-tunes, in real time, the height of the HRTF matched in step 3 according to the physiological height characteristics obtained in step 5, obtaining the height-fine-tuned HRTF.
Step 7, the service robot performs sound fine-tuning according to the height-fine-tuned HRTF obtained in step 6, obtaining the sound-fine-tuned HRTF.
Step 8, the sound-fine-tuned HRTF obtained in step 7 is called and convolved with the local audio data, and the result is output to the playback device of the service robot.
Step 9, steps 5-8 are repeated until the error between the HRTFs before and after fine-tuning is within a preset threshold range, which yields the optimal HRTF.
Step 10, the optimal HRTF obtained in step 9 is called and convolved with the local audio data, and the result is output to the playback device of the service robot.
Preferably: in step 1, the HRTFs include HRTFs obtained by actual measurement, HRTFs obtained by model simulation and numerical calculation, HRTFs corrected according to user feedback, and shared HRTFs.
Preferably: in step 2, the service robot acquires the listener's height information through multi-modal sensing by using infrared, ultrasound or images.
Preferably: the matching method in step 3 is to call the HRTF of the corresponding angle according to the physiological height information.
Preferably: the sound fine-tuning method in step 7 comprises controlling the time delays of the different playback speakers of the service robot to improve the impression of the virtual sound source's elevation angle, and controlling the service robot to adjust sound-effect equalization and reverberation.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention designs a correction method based on the listener's physiological height characteristics, which the robot acquires during human-computer interaction. The robot adaptively matches the perceived sound source height to the listener, which improves the friendliness of the service robot's human-computer interaction and avoids mechanically playing sound signals with the same perceived height to every listener.
(2) The service robot acquires the listener's height information in real time through multi-modal interaction.
(3) The perceived height of the audio played by the service robot is corrected in real time, so that the robot suits listeners of different heights, which improves the friendliness of its human-computer interaction.
Drawings
Fig. 1 illustrates a method for synthesizing a virtual sound source in a time domain.
FIG. 2 is a flow chart of the present invention.
Detailed Description
The present invention is further illustrated by the following description in conjunction with the accompanying drawings and the specific embodiments, it is to be understood that these examples are given solely for the purpose of illustration and are not intended as a definition of the limits of the invention, since various equivalent modifications will occur to those skilled in the art upon reading the present invention and fall within the limits of the appended claims.
A method for correcting the perceived sound source height when a service robot plays audio, as shown in Fig. 2, includes the following steps:
Step 1, multiple head-related transfer functions (HRTFs) are stored in the local device of the service robot to form an HRTF database, and these HRTFs cover auditory elevation information for different heights.
The HRTFs include HRTFs obtained by actual measurement using professional methods, HRTFs obtained by model simulation and numerical calculation, HRTFs corrected according to user feedback, and HRTFs shared by other institutions or users.
The invention first uses the public HRTF data measured by the CIPIC laboratory at the University of California, Davis, USA, to synthesize the perceived playback height of the service robot.
As described above, the human brain discriminates the direction of a sound source in three-dimensional space from the spectral characteristics of the sound when it reaches the eardrum. The response of the body's structures to sound, in particular that of the pinna, is called the "pinna effect". It indicates that the human auditory system is functionally equivalent to a filter that depends on the spatial direction of the sound and modifies the spectrum of sound arriving from different directions.
In essence, using HRTF data to render a virtual sound source direction means convolving the HRTF data with the sound signal to be processed. The process of synthesizing a virtual sound source in the time domain from a monophonic source signal is shown in Fig. 1. The source signal is convolved with the HRTF data of the left ear and of the right ear respectively, and the virtual sound source carrying the direction information can be heard when the result is reproduced over loudspeakers.
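The time-domain synthesis described here can be illustrated with a minimal Python sketch. It assumes the HRTF data is available as a pair of time-domain impulse responses (one per ear); the function names and the toy impulse responses are hypothetical and are not taken from the patent or from the CIPIC data.

```python
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Direct-form linear convolution; output length is len(signal) + len(kernel) - 1."""
    if not signal or not kernel:
        return []
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def render_binaural(
    mono: Sequence[float],
    hrir_left: Sequence[float],
    hrir_right: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Convolve one mono source with the left-ear and right-ear impulse responses."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy data (assumption): real impulse responses are much longer.
mono = [1.0, 0.5, 0.25]
hrir_left = [1.0, 0.0]    # passes the signal unchanged
hrir_right = [0.0, 0.5]   # delays by one sample and halves the level
left, right = render_binaural(mono, hrir_left, hrir_right)
```

A direct double loop is used only to keep the sketch dependency-free; for impulse responses of realistic length, an FFT-based convolution would normally be preferred.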
Step 2, the service robot acquires the height information of the listener, who is the subject of the human-computer interaction, through multi-modal sensing, and obtains the listener's physiological height characteristics.
The service robot can acquire the listener's height information through multi-modal sensing by using infrared, ultrasound or images.
Step 3, an HRTF is matched in the HRTF database according to the listener's physiological height characteristics, and the corresponding HRTF is selected.
The matching method is to call the HRTF of the corresponding angle according to the physiological height information.
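One possible reading of this matching step is sketched below in Python. The patent does not specify how height maps to an angle, so the geometry here is an assumption: the angle is taken as the elevation of the listener's ears seen from the robot's speaker, and the database entry with the nearest elevation is chosen. The function names, the elevation grid and the example heights are all hypothetical.

```python
import math
from typing import Sequence


def elevation_deg(ear_height_m: float, speaker_height_m: float, distance_m: float) -> float:
    """Elevation of the listener's ears as seen from the speaker, in degrees (assumed geometry)."""
    return math.degrees(math.atan2(ear_height_m - speaker_height_m, distance_m))


def nearest_elevation(target_deg: float, available_deg: Sequence[float]) -> float:
    """Pick the stored elevation closest to the target angle."""
    if not available_deg:
        raise ValueError("the HRTF database has no elevations")
    return min(available_deg, key=lambda e: abs(e - target_deg))


# Hypothetical elevation grid of the local HRTF database, in degrees.
available = [-30.0, -15.0, 0.0, 15.0, 30.0, 45.0]

# Example: ears at 1.6 m, speaker at 0.6 m, listener 1.0 m away.
angle = elevation_deg(ear_height_m=1.6, speaker_height_m=0.6, distance_m=1.0)
chosen = nearest_elevation(angle, available)
```

Nearest-neighbour selection is the simplest choice; interpolating between the two closest stored elevations would be an alternative when the grid is coarse.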
Step 4, the HRTF selected in step 3 is called and convolved with the local audio data, and the result is output to the playback device of the service robot.
In this step the local audio data is convolved with standard (generic) HRTF parameters and is intended for standard playback. The played content carries a virtual sound source height effect based on the current HRTF direction.
Step 5, the service robot acquires the listener's physiological height characteristics again through multi-modal sensing.
Step 6, the service robot fine-tunes, in real time, the height of the HRTF matched in step 3 according to the physiological height characteristics obtained in step 5, obtaining the height-fine-tuned HRTF.
Step 7, the service robot performs sound fine-tuning according to the height-fine-tuned HRTF obtained in step 6, obtaining the sound-fine-tuned HRTF.
The fine-tuning process can be carried out by playing the same sound source. The sound fine-tuning method comprises controlling the time delays of the different playback speakers of the service robot to improve the impression of the virtual sound source's elevation angle, and controlling the service robot to adjust sound-effect equalization, reverberation and the like.
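The per-speaker time delay mentioned here can be illustrated with a small Python sketch. The patent does not state how the delays are chosen, so deriving a delay from a path-length difference is an assumption, as are the function names, the 343 m/s speed of sound and the example values.

```python
from typing import List, Sequence

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at room temperature


def delay_in_samples(extra_path_m: float, sample_rate_hz: int) -> int:
    """Convert a path-length difference into a whole number of samples."""
    return round(extra_path_m / SPEED_OF_SOUND_M_S * sample_rate_hz)


def apply_delay(signal: Sequence[float], delay: int) -> List[float]:
    """Delay one speaker feed by prepending zeros."""
    if delay < 0:
        raise ValueError("delay must be non-negative")
    return [0.0] * delay + list(signal)


# Example: a speaker whose path to the listener is 0.343 m shorter
# is delayed so that both speaker feeds arrive together.
n = delay_in_samples(0.343, 48000)
delayed = apply_delay([1.0, 0.5], 2)
```

Whole-sample delays are the coarsest option; finer control would need fractional-delay filtering, which is outside this sketch.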
Step 8, the sound-fine-tuned HRTF obtained in step 7 is called and convolved with the local audio data, and the result is output to the playback device of the service robot.
Step 9, steps 5-8 are repeated until the error between the HRTFs before and after fine-tuning is within a preset threshold range, which yields the optimal HRTF.
Step 10, the optimal HRTF obtained in step 9 is called and convolved with the local audio data, and the resulting playback data is output to the playback device of the service robot.
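The stopping rule of steps 5 to 9 can be sketched in Python as a loop that refines the HRTF until two successive versions differ by no more than a threshold. The patent does not define the error measure or the refinement itself, so the RMS difference, the iteration cap and the toy refinement step below are assumptions, and all names are hypothetical.

```python
import math
from typing import Callable, List, Sequence


def rms_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square difference between two equally long impulse responses."""
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def refine_until_stable(
    initial: Sequence[float],
    refine_step: Callable[[List[float]], List[float]],
    threshold: float,
    max_iterations: int = 50,
) -> List[float]:
    """Repeat the refinement until successive results differ by at most `threshold`."""
    current = list(initial)
    for _ in range(max_iterations):
        candidate = refine_step(current)
        if rms_difference(candidate, current) <= threshold:
            return candidate
        current = candidate
    return current  # iteration cap reached without meeting the threshold


# Toy refinement (assumption): move halfway toward a fixed target each pass.
target = [1.0, 0.5, 0.0]


def toy_step(h: List[float]) -> List[float]:
    return [x + 0.5 * (t - x) for x, t in zip(h, target)]


result = refine_until_stable([0.0, 0.0, 0.0], toy_step, threshold=1e-3)
```

The iteration cap is a safeguard added for the sketch so that the loop terminates even if the refinement never settles.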
The above description is only of the preferred embodiments of the present invention, and it should be noted that: it will be apparent to those skilled in the art that various modifications and adaptations can be made without departing from the principles of the invention and these are intended to be within the scope of the invention.
Claims (5)
1. A method for correcting the height of a perceived sound source when a service robot plays audio is characterized by comprising the following steps:
step 1, various head-related transfer functions HRTF are stored in local equipment of the service robot to form a head-related transfer function HRTF database, and the various head-related transfer functions HRTF cover different height auditory altitude information;
step 2, the service type robot acquires the height information of the listener of the human-computer interaction subject according to the multi-mode sensing interaction mode to obtain the physiological height characteristic of the listener of the human-computer interaction subject;
step 3, matching head-related transfer function HRTFs in a head-related transfer function HRTF database according to the physiological height characteristics of the listener of the man-machine interaction subject, and selecting corresponding head-related transfer function HRTFs;
step 4, calling the head related transfer function HRTF selected in the step 3, convolving the local audio data, and outputting the local audio data to the service type robot playback equipment;
step 5, the service type robot acquires the physiological height characteristics of the listener of the man-machine interaction subject again according to the multi-mode sensing interaction mode;
step 6, the service type robot performs height fine adjustment on the head related transfer function HRTF obtained by matching in the step 3 in real time according to the physiological height characteristics of the listener of the man-machine interaction subject obtained in the step 5 to obtain the head related transfer function HRTF after the height fine adjustment;
step 7, the service robot carries out sound fine adjustment according to the head related transfer function HRTF after the height fine adjustment obtained in the step 6 to obtain the head related transfer function HRTF after the sound fine adjustment;
step 8, calling the head related transfer function HRTF after the sound fine adjustment obtained in the step 7, convolving the local audio data, and outputting the local audio data to the service type robot playback equipment;
step 9, repeating the steps 5-8 until the front and back errors of the head related transfer function HRTF after fine adjustment are within a preset threshold range, and obtaining an optimal head related transfer function HRTF;
and step 10, calling the optimal head related transfer function HRTF obtained in the step 9, convolving the local audio data and outputting the local audio data to the service type robot playback equipment.
2. The method for correcting the perceived sound source height when the service robot plays the audio according to claim 1, wherein: in step 1, the head related transfer functions HRTFs include a head related transfer function HRTF obtained through actual measurement, a head related transfer function HRTF obtained through model simulation and numerical calculation, a head related transfer function HRTF obtained through feedback information correction according to a user, and a shared head related transfer function HRTF.
3. The method for correcting the perceived sound source height when the service robot plays the audio according to claim 2, wherein: in step 2, the method for the service robot to acquire the height information of the listener of the man-machine interaction subject according to the multi-mode sensing interaction mode comprises the following steps: and acquiring the height information of a listener of the man-machine interaction subject by adopting infrared, ultrasound or images.
4. The method as claimed in claim 3, wherein the method comprises the following steps: the matching method in step 3 is to call a Head Related Transfer Function (HRTF) of a corresponding angle according to the physiological height information.
5. The method for correcting the perceived sound source height when the service robot plays the audio according to claim 4, wherein: the sound fine-tuning method in step 7 comprises the following steps: controlling time delays of different playback speakers of the service robot and improving experience of the elevation angle of the virtual sound source; and controlling the service type robot to adjust sound effect equalization and reverberation control.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111261650.8A CN113905323B (en) | 2021-10-28 | 2021-10-28 | Perception sound source height correction method suitable for service robot in audio playing |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111261650.8A CN113905323B (en) | 2021-10-28 | 2021-10-28 | Perception sound source height correction method suitable for service robot in audio playing |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113905323A (en) | 2022-01-07 |
CN113905323B (en) | 2024-01-23 |
Family
ID=79026680
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111261650.8A Active CN113905323B (en) | 2021-10-28 | 2021-10-28 | Perception sound source height correction method suitable for service robot in audio playing |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113905323B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105979441A (en) * | 2016-05-17 | 2016-09-28 | 南京大学 | Customized optimization method for 3D sound effect headphone reproduction |
CN108540925A (en) * | 2018-04-11 | 2018-09-14 | 北京理工大学 | A kind of fast matching method of personalization head related transfer function |
Non-Patent Citations (2)
Title |
---|
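张宗帅;顾亚平;张俊;杨小平: "Virtual sound source localization based on HRTF", 网络新媒体技术, no. 02 * |
黄婉秋;曾向阳;王蕾: "Head-related transfer function personalization method based on multi-dimensional physiological parameters", 西北工业大学学报, no. 02 * |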
Also Published As
Publication number | Publication date |
---|---|
CN113905323B (en) | 2024-01-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10440496B2 (en) | Spatial audio processing emphasizing sound sources close to a focal distance | |
JP5894634B2 (en) | Determination of HRTF for each individual | |
JP4938015B2 (en) | Method and apparatus for generating three-dimensional speech | |
JP4584416B2 (en) | Multi-channel audio playback apparatus for speaker playback using virtual sound image capable of position adjustment and method thereof | |
JP4921470B2 (en) | Method and apparatus for generating and processing parameters representing head related transfer functions | |
US20050147261A1 (en) | Head relational transfer function virtualizer | |
CN113170271B (en) | Method and apparatus for processing stereo signals | |
EP3652737A1 (en) | Concept for generating an enhanced sound-field description or a modified sound field description using a depth-extended dirac technique or other techniques | |
JP2006081191A (en) | Sound reproducing apparatus and sound reproducing method | |
CN112005559B (en) | Method for improving positioning of surround sound | |
KR20180102596A (en) | Synthesis of signals for immersive audio playback | |
CN112956210B (en) | Audio signal processing method and device based on equalization filter | |
EP3595337A1 (en) | Audio apparatus and method of audio processing | |
Lee et al. | A real-time audio system for adjusting the sweet spot to the listener's position | |
Sunder | Binaural audio engineering | |
EP2822301B1 (en) | Determination of individual HRTFs | |
CN110225445A (en) | A kind of processing voice signal realizes the method and device of three-dimensional sound field auditory effect | |
CN113905323B (en) | Perception sound source height correction method suitable for service robot in audio playing | |
KR20160136716A (en) | A method and an apparatus for processing an audio signal | |
US10999694B2 (en) | Transfer function dataset generation system and method | |
Frank et al. | Simple reduction of front-back confusion in static binaural rendering | |
JP2020156029A (en) | Method of generating head transfer function, apparatus, and program | |
Villegas | Improving perceived elevation accuracy in sound reproduced via a loudspeaker ring by means of equalizing filters and side loudspeaker grouping | |
André | Audiovisual spatial congruence, and applications to 3D sound and stereoscopic video | |
US20230403528A1 (en) | A method and system for real-time implementation of time-varying head-related transfer functions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||