CN113905323A - Perceptual sound source height correction method applicable to service type robot playing audio - Google Patents

Publication number
CN113905323A
Authority
CN
China
Prior art keywords
related transfer
transfer function
height
head
hrtf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111261650.8A
Other languages
Chinese (zh)
Other versions
CN113905323B (en)
Inventor
林志斌
刘晓峻
卢晶
狄敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Province Nanjing University Of Science And Technology Electronic Information Technology Co ltd
Nanjing Nanda Electronic Wisdom Service Robot Research Institute Co ltd
Nanjing University
Original Assignee
Jiangsu Province Nanjing University Of Science And Technology Electronic Information Technology Co ltd
Nanjing Nanda Electronic Wisdom Service Robot Research Institute Co ltd
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Province Nanjing University Of Science And Technology Electronic Information Technology Co ltd, Nanjing Nanda Electronic Wisdom Service Robot Research Institute Co ltd, Nanjing University
Priority to CN202111261650.8A
Publication of CN113905323A
Application granted
Publication of CN113905323B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00 Manipulators not otherwise provided for
    • B25J11/0005 Manipulators having means for high-level communication with users, e.g. speech generator, face recognition means
    • B PERFORMING OPERATIONS; TRANSPORTING
    • B25 HAND TOOLS; PORTABLE POWER-DRIVEN TOOLS; MANIPULATORS
    • B25J MANIPULATORS; CHAMBERS PROVIDED WITH MANIPULATION DEVICES
    • B25J11/00 Manipulators not otherwise provided for
    • B25J11/008 Manipulators for service tasks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S2420/00 Techniques used in stereophonic systems covered by H04S but not provided for in its groups
    • H04S2420/01 Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]

Landscapes

  • Engineering & Computer Science (AREA)
  • Robotics (AREA)
  • Mechanical Engineering (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Stereophonic System (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a method for correcting the perceived sound source height when a service robot plays audio. The robot stores multiple head-related transfer functions (HRTFs) on its local device to form an HRTF database whose entries cover perceived elevation information for different heights. Using multimodal sensing, the robot acquires the height of the listener it is interacting with, matches an HRTF to that physiological height characteristic, fine-tunes the matched HRTF, convolves it with local audio data, and outputs the result to the robot's playback device. The invention can correct the perceived height of the robot's interaction audio in real time, and addresses the differences in virtual sound image that arise when different service robots interact with different listeners.

Description

Perceptual sound source height correction method applicable to service type robot playing audio
Technical Field
The invention relates to a method for correcting the perceived sound source height when a service robot plays audio, and belongs to the technical field of robot audio.
Background
Three-dimensional audio is applied mainly in film, games, and music. It can reconstruct a virtual sound image at any spatial position to create an immersive sound scene, and is common in movie theaters and home theaters.
With the rapid development of the service robot industry, customers expect more comfortable audio playback from service robots. Played audio is the final form that human-computer interaction takes for an interactive robot and is what consumers notice most directly, so its quality directly affects the user's experience of audio interaction with the robot. Improving the quality of robot-played audio has therefore become an important problem. In particular, the perceived height of the played audio is an important index for evaluating the quality of a robot's audio interaction.
The head-related transfer function (HRTF) describes how the outer ear, head, torso, and other body structures filter a sound signal on its way into the ear, as a function of the direction from which the sound arrives. It characterizes how the ear receives sound from a point in space. Certain spectral changes are observed as the source moves in elevation: a notch appears at 7 kHz and shifts upward in frequency as elevation increases; a shallow peak is seen at 12 kHz in the median plane; and this peak flattens at high elevations. The perception of elevation may therefore be associated with local maxima, or hot spots, at particular frequencies. If the spectral difference between a high-elevation source and a loudspeaker in the horizontal plane is applied to the loudspeaker signal, the human auditory system tends to perceive the sound as elevated. HRTFs have been studied at universities and laboratories in China and abroad, and the theory has been applied in aerospace, military, gaming, audio, and other fields.
Disclosure of Invention
Purpose of the invention: to address the differences in virtual sound image that arise when different service robots interact with different listeners, the invention provides a method for correcting the perceived sound source height when a service robot plays audio.
Technical solution: to achieve this purpose, the invention adopts the following technical solution.
A perceived sound source height correction method applicable to a service robot playing audio comprises the following steps:
Step 1, multiple head-related transfer functions (HRTFs) are stored on the service robot's local device to form an HRTF database, and these HRTFs cover perceived elevation information for different heights.
Step 2, the service robot acquires the height of the listener, who is the subject of the human-computer interaction, by multimodal sensing, and thereby obtains the listener's physiological height characteristic.
Step 3, the HRTFs in the HRTF database are matched against the listener's physiological height characteristic, and the corresponding HRTF is selected.
Step 4, the HRTF selected in step 3 is called and convolved with the local audio data, and the result is output to the service robot's playback device.
Step 5, the service robot acquires the listener's physiological height characteristic again by multimodal sensing.
Step 6, the service robot fine-tunes, in real time, the height of the HRTF matched in step 3 according to the physiological height characteristic obtained in step 5, to obtain a height-adjusted HRTF.
Step 7, the service robot performs sound fine-tuning based on the height-adjusted HRTF obtained in step 6, to obtain a sound-adjusted HRTF.
Step 8, the sound-adjusted HRTF obtained in step 7 is called and convolved with the local audio data, and the result is output to the service robot's playback device.
Step 9, steps 5 to 8 are repeated until the difference between the HRTF before and after fine-tuning falls within a preset threshold range, which yields the optimal HRTF.
Step 10, the optimal HRTF obtained in step 9 is called and convolved with the local audio data, and the result is output to the service robot's playback device.
Preferably: in step 1, the HRTFs include HRTFs obtained by actual measurement, HRTFs obtained by model simulation and numerical calculation, HRTFs corrected according to user feedback, and shared HRTFs.
Preferably: in step 2, the service robot acquires the listener's height by multimodal sensing using infrared, ultrasonic, or image-based measurement.
Preferably: the matching method in step 3 is to call the HRTF of the angle that corresponds to the physiological height information.
Preferably: the sound fine-tuning method in step 7 comprises controlling the time delays of the service robot's different playback speakers to improve the perceived elevation of the virtual sound source, and controlling the service robot's sound-effect equalization and reverberation.
Compared with the prior art, the invention has the following beneficial effects:
(1) The invention provides a correction method based on the listener's physiological height characteristic, which the robot acquires during human-computer interaction. The robot adaptively matches the perceived sound source height to the listener, which makes the service robot friendlier to interact with and avoids mechanically playing every signal at the same perceived height.
(2) The service robot acquires the listener's height in real time through multimodal interaction.
(3) The perceived height of the audio played by the service robot is corrected in real time, so the robot suits listeners of different heights and its human-computer interaction becomes friendlier.
Drawings
Fig. 1 illustrates the synthesis of a virtual sound source in the time domain.
Fig. 2 is a flow chart of the method of the invention.
Detailed Description
The invention is further described below with reference to the drawings and specific embodiments. These embodiments are given for illustration only and do not limit the scope of the invention; after reading this disclosure, equivalent modifications made by those skilled in the art fall within the scope defined by the appended claims.
A method for correcting the perceived sound source height when a service robot plays audio, as shown in Fig. 2, includes the following steps:
Step 1, multiple head-related transfer functions (HRTFs) are stored on the service robot's local device to form an HRTF database, and these HRTFs cover perceived elevation information for different heights.
The HRTFs include HRTFs obtained by actual measurement using professional methods, HRTFs obtained by model simulation and numerical calculation, HRTFs corrected according to user feedback, and HRTFs shared by other institutions or users.
The invention initially uses the public HRTF data measured by the CIPIC laboratory at the University of California, Davis, USA, to synthesize the perceived playback height of the service robot.
As described above, the human brain determines the direction of a sound source in three-dimensional space from the spectral characteristics of the sound as it reaches the eardrum. The response of the body's structures to sound, in particular the response of the pinna, is called the "pinna effect". The pinna effect shows that the human auditory system acts as a filter that depends on the spatial direction of the sound and modifies the spectrum of sound arriving from different directions.
In essence, using HRTF data to place a virtual sound source in a given direction means convolving the HRTF data with the sound signal to be processed. Fig. 1 shows the synthesis of a virtual sound source in the time domain from a monophonic source signal. The source signal is convolved with the left-ear and right-ear HRTF data separately, and a virtual sound source carrying direction information is heard when the result is played back through the loudspeakers.
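The time-domain synthesis described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `convolve` and `synthesize_virtual_source` are hypothetical, the impulse responses are toy values rather than measured CIPIC data, and a direct-form convolution is used for clarity where a real system would more likely use an FFT-based routine.

```python
from typing import List, Sequence, Tuple


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Direct-form linear convolution; output length is len(signal) + len(ir) - 1."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def synthesize_virtual_source(
    mono: Sequence[float],
    hrir_left: Sequence[float],
    hrir_right: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Filter one mono signal with the left-ear and right-ear impulse responses."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


# Toy data (assumed for illustration, not measured HRIRs):
# the right channel is a delayed, attenuated copy of the left.
mono = [1.0, 0.5, 0.25]
hrir_left = [1.0, 0.0, 0.0]
hrir_right = [0.0, 0.6, 0.0]
left, right = synthesize_virtual_source(mono, hrir_left, hrir_right)
```

The two returned lists are the signals for the left and right playback channels; the direction cue lives entirely in the difference between the two impulse responses.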
Step 2, the service robot acquires the height of the listener, who is the subject of the human-computer interaction, by multimodal sensing, and thereby obtains the listener's physiological height characteristic.
The service robot can acquire the listener's height by infrared, ultrasonic, or image-based measurement.
Step 3, the HRTFs in the HRTF database are matched against the listener's physiological height characteristic, and the corresponding HRTF is selected.
The matching method is to call the HRTF of the angle that corresponds to the physiological height information.
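The source does not say how a height is converted into an angle, so the sketch below makes an explicit assumption: the target elevation is the angle from the robot's loudspeaker up to the listener's ear, computed from the height difference and the horizontal distance. The function names, the database layout (a dictionary keyed by elevation in degrees), and the sample values are all hypothetical.

```python
import math
from typing import Dict, List


def target_elevation_deg(
    listener_ear_height_m: float,
    speaker_height_m: float,
    distance_m: float,
) -> float:
    """Elevation of the listener's ear as seen from the speaker (assumed geometry)."""
    if distance_m <= 0:
        raise ValueError("distance_m must be positive")
    rise = listener_ear_height_m - speaker_height_m
    return math.degrees(math.atan2(rise, distance_m))


def match_hrtf(database: Dict[float, List[float]], elevation_deg: float) -> float:
    """Return the database elevation key closest to the requested elevation."""
    if not database:
        raise ValueError("HRTF database is empty")
    return min(database, key=lambda stored: abs(stored - elevation_deg))


# Hypothetical database: elevation in degrees -> impulse response (placeholder taps).
hrtf_database = {
    -15.0: [0.9, 0.1],
    0.0: [1.0, 0.0],
    15.0: [0.8, 0.2],
    30.0: [0.7, 0.3],
    45.0: [0.6, 0.4],
}

angle = target_elevation_deg(listener_ear_height_m=1.6, speaker_height_m=0.6, distance_m=1.0)
selected = match_hrtf(hrtf_database, angle)
```

Nearest-neighbor lookup is the simplest matching rule; interpolating between the two closest stored elevations would be a natural refinement, but the source does not describe one.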
Step 4, the HRTF selected in step 3 is called and convolved with the local audio data, and the result is output to the service robot's playback device.
At this stage the local audio data is convolved with standard, generic HRTF parameters and is intended for standard playback. The played content carries the virtual sound source height effect for the current HRTF direction.
Step 5, the service robot acquires the listener's physiological height characteristic again by multimodal sensing.
Step 6, the service robot fine-tunes, in real time, the height of the HRTF matched in step 3 according to the physiological height characteristic obtained in step 5, to obtain a height-adjusted HRTF.
Step 7, the service robot performs sound fine-tuning based on the height-adjusted HRTF obtained in step 6, to obtain a sound-adjusted HRTF.
The fine-tuning process can be carried out while playing the same sound source. The sound fine-tuning method comprises controlling the time delays of the service robot's different playback speakers to improve the perceived elevation of the virtual sound source, and controlling the service robot's sound-effect equalization, reverberation, and similar settings.
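The per-speaker time delay mentioned above can be sketched as a simple integer-sample delay. The source gives no delay values or speaker layout, so the speaker names, the 48 kHz sample rate, and the delay amounts below are placeholders chosen only to make the example run.

```python
from typing import Dict, List, Sequence


def delay_in_samples(delay_s: float, sample_rate_hz: int) -> int:
    """Convert a delay in seconds to the nearest whole number of samples."""
    if delay_s < 0:
        raise ValueError("delay_s must be non-negative")
    return round(delay_s * sample_rate_hz)


def apply_delay(signal: Sequence[float], samples: int) -> List[float]:
    """Delay a signal by prepending `samples` zeros."""
    if samples < 0:
        raise ValueError("samples must be non-negative")
    return [0.0] * samples + list(signal)


SAMPLE_RATE_HZ = 48000  # assumed playback rate

# Hypothetical two-speaker layout with placeholder delays in seconds.
speaker_delays_s: Dict[str, float] = {"upper": 0.0, "lower": 0.0005}

signal = [1.0, -1.0, 0.5]
feeds = {
    name: apply_delay(signal, delay_in_samples(delay, SAMPLE_RATE_HZ))
    for name, delay in speaker_delays_s.items()
}
```

Here `feeds["lower"]` starts 24 samples later than `feeds["upper"]`. Which speaker to delay, and by how much, would have to come from listening tests or from the fine-tuning loop; the sketch only shows the mechanism.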
Step 8, the sound-adjusted HRTF obtained in step 7 is called and convolved with the local audio data, and the result is output to the service robot's playback device.
Step 9, steps 5 to 8 are repeated until the difference between the HRTF before and after fine-tuning falls within a preset threshold range, which yields the optimal HRTF.
Step 10, the optimal HRTF obtained in step 9 is called and convolved with the local audio data, and the result is output to the service robot's playback device as the final playback data.
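The repeat-until-stable loop of steps 5 to 9 can be sketched as below. The source does not define the error measure or the fine-tuning rule, so both are assumptions here: the error is taken as the root-mean-square difference between successive impulse responses, and `fine_tune` is a hypothetical callback standing in for the height and sound adjustments of steps 6 and 7.

```python
import math
from typing import Callable, List, Sequence, Tuple


def rms_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square difference between two equal-length responses."""
    if len(a) != len(b) or not a:
        raise ValueError("responses must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def refine_until_stable(
    hrir: Sequence[float],
    fine_tune: Callable[[Sequence[float]], Sequence[float]],
    threshold: float,
    max_iterations: int = 50,
) -> Tuple[List[float], int]:
    """Apply fine_tune repeatedly until successive results differ by <= threshold."""
    current = list(hrir)
    for iteration in range(1, max_iterations + 1):
        updated = list(fine_tune(current))
        if rms_difference(current, updated) <= threshold:
            return updated, iteration
        current = updated
    return current, max_iterations


# Placeholder adjustment rule: move each tap halfway toward a fixed target response.
target = [1.0, 0.0]


def halfway_to_target(response: Sequence[float]) -> List[float]:
    return [r + 0.5 * (t - r) for r, t in zip(response, target)]


optimal, iterations = refine_until_stable([0.0, 0.0], halfway_to_target, threshold=0.01)
```

The `max_iterations` cap is an addition for safety: the source only states the threshold condition, but a real-time loop on a robot should not run unbounded if the listener keeps moving.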
The above describes only the preferred embodiments of the invention. Those skilled in the art can make modifications and adaptations without departing from the principles of the invention, and these also fall within the scope of protection of the invention.

Claims (5)

1. A method for correcting the height of a perceived sound source when a service robot plays audio is characterized by comprising the following steps:
step 1, multiple head-related transfer functions (HRTFs) are stored on the local device of the service robot to form an HRTF database, and these HRTFs cover perceived elevation information for different heights;
step 2, the service type robot acquires the height information of the listener of the human-computer interaction subject according to the multi-mode sensing interaction mode to obtain the physiological height characteristic of the listener of the human-computer interaction subject;
step 3, matching head-related transfer function HRTFs in a head-related transfer function HRTF database according to the physiological height characteristics of the listener of the man-machine interaction subject, and selecting corresponding head-related transfer function HRTFs;
step 4, calling the head related transfer function HRTF selected in the step 3, convolving the local audio data, and outputting the local audio data to the service type robot playback equipment;
step 5, the service type robot acquires the physiological height characteristics of the listener of the man-machine interaction subject again according to the multi-mode sensing interaction mode;
step 6, the service type robot performs height fine adjustment on the head related transfer function HRTF obtained by matching in the step 3 in real time according to the physiological height characteristics of the listener of the man-machine interaction subject obtained in the step 5 to obtain the head related transfer function HRTF after the height fine adjustment;
step 7, the service robot carries out sound fine adjustment according to the head related transfer function HRTF after the height fine adjustment obtained in the step 6 to obtain the head related transfer function HRTF after the sound fine adjustment;
step 8, calling the head related transfer function HRTF after the sound fine adjustment obtained in the step 7, convolving the local audio data, and outputting the local audio data to the service type robot playback equipment;
step 9, repeating steps 5 to 8 until the difference between the head related transfer function HRTF before and after fine adjustment is within a preset threshold range, and obtaining an optimal head related transfer function HRTF;
and step 10, calling the optimal head related transfer function HRTF obtained in the step 9, convolving the local audio data and outputting the local audio data to the service type robot playback equipment.
2. The method for correcting the perceived sound source height when the service robot plays the audio according to claim 1, wherein: in step 1, the head related transfer functions HRTFs include a head related transfer function HRTF obtained through actual measurement, a head related transfer function HRTF obtained through model simulation and numerical calculation, a head related transfer function HRTF obtained through feedback information correction according to a user, and a shared head related transfer function HRTF.
3. The method for correcting the perceived sound source height when the service robot plays the audio according to claim 2, wherein: in step 2, the service robot acquires the height information of the listener of the man-machine interaction subject according to the multi-mode sensing interaction mode by using infrared, ultrasound or images.
4. The method for correcting the perceived sound source height when the service robot plays the audio according to claim 3, wherein: the matching method in step 3 is to call the head related transfer function (HRTF) of the corresponding angle according to the physiological height information.
5. The method for correcting the perceived sound source height when the service robot plays the audio according to claim 4, wherein: the sound fine-tuning method in step 7 comprises the following steps: controlling time delays of different playback speakers of the service robot and improving experience of the elevation angle of the virtual sound source; and controlling the service type robot to adjust sound effect equalization and reverberation control.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111261650.8A CN113905323B (en) 2021-10-28 2021-10-28 Perception sound source height correction method suitable for service robot in audio playing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111261650.8A CN113905323B (en) 2021-10-28 2021-10-28 Perception sound source height correction method suitable for service robot in audio playing

Publications (2)

Publication Number Publication Date
CN113905323A (en) 2022-01-07
CN113905323B (en) 2024-01-23

Family

Family ID: 79026680

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111261650.8A Active CN113905323B (en) 2021-10-28 2021-10-28 Perception sound source height correction method suitable for service robot in audio playing

Country Status (1)

Country Link
CN (1) CN113905323B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105979441A (en) * 2016-05-17 2016-09-28 南京大学 Customized optimization method for 3D sound effect headphone reproduction
CN108540925A (en) * 2018-04-11 2018-09-14 北京理工大学 A kind of fast matching method of personalization head related transfer function

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
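张宗帅; 顾亚平; 张俊; 杨小平: "Virtual sound source localization based on HRTF", 网络新媒体技术, no. 02 *
黄婉秋; 曾向阳; 王蕾: "Head-related transfer function personalization method based on multi-dimensional physiological parameters", 西北工业大学学报, no. 02 *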

Also Published As

Publication number Publication date
CN113905323B (en) 2024-01-23

Similar Documents

Publication Publication Date Title
US10440496B2 (en) Spatial audio processing emphasizing sound sources close to a focal distance
JP5894634B2 (en) Determination of HRTF for each individual
JP4938015B2 (en) Method and apparatus for generating three-dimensional speech
JP4584416B2 (en) Multi-channel audio playback apparatus for speaker playback using virtual sound image capable of position adjustment and method thereof
JP4921470B2 (en) Method and apparatus for generating and processing parameters representing head related transfer functions
US20050147261A1 (en) Head relational transfer function virtualizer
CN113170271B (en) Method and apparatus for processing stereo signals
EP3652737A1 (en) Concept for generating an enhanced sound-field description or a modified sound field description using a depth-extended dirac technique or other techniques
JP2006081191A (en) Sound reproducing apparatus and sound reproducing method
CN112005559B (en) Method for improving positioning of surround sound
KR20180102596A (en) Synthesis of signals for immersive audio playback
CN112956210B (en) Audio signal processing method and device based on equalization filter
EP3595337A1 (en) Audio apparatus and method of audio processing
Lee et al. A real-time audio system for adjusting the sweet spot to the listener's position
Sunder Binaural audio engineering
EP2822301B1 (en) Determination of individual HRTFs
CN110225445A (en) A kind of processing voice signal realizes the method and device of three-dimensional sound field auditory effect
CN113905323B (en) Perception sound source height correction method suitable for service robot in audio playing
KR20160136716A (en) A method and an apparatus for processing an audio signal
US10999694B2 (en) Transfer function dataset generation system and method
Frank et al. Simple reduction of front-back confusion in static binaural rendering
JP2020156029A (en) Method of generating head transfer function, apparatus, and program
Villegas Improving perceived elevation accuracy in sound reproduced via a loudspeaker ring by means of equalizing filters and side loudspeaker grouping
André Audiovisual spatial congruence, and applications to 3D sound and stereoscopic video
US20230403528A1 (en) A method and system for real-time implementation of time-varying head-related transfer functions

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant