WO2017135063A1 - Audio processing device, audio processing method and program - Google Patents

Audio processing device, audio processing method and program

Info

Publication number
WO2017135063A1
Authority
WO
WIPO (PCT)
Prior art keywords
hrtf data
hrtf
data
predetermined direction
processing apparatus
Prior art date
Application number
PCT/JP2017/001853
Other languages
French (fr)
Japanese (ja)
Inventor
哲 曲谷地
誉 今
Original Assignee
ソニー株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ソニー株式会社
Publication of WO2017135063A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S1/00 Two-channel systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control

Definitions

  • The present disclosure relates to an audio processing device, an audio processing method, and a program, and in particular to an audio processing device, an audio processing method, and a program capable of efficiently compressing a head-related transfer function database while providing high-speed data access.
  • This binaural reproduction technology is generally called a virtual auditory display (VAD) and is realized using a head-related transfer function (HRTF) (see Patent Document 1).
  • VAD Virtual Auditory Display
  • HRTF head-related transfer function
  • the head-related transfer function expresses information about how sound is transmitted from all directions surrounding the human head to the eardrum as a function of frequency and direction of arrival.
  • the auditory display is a system that uses this principle.
  • The present disclosure has been made in view of such circumstances and, in particular, aims to speed up access to a desired HRTF while keeping the density of the HRTFs stored in the database as intended by the creator of the HRTF database.
  • An audio processing device according to an aspect of the present disclosure includes an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions, and an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit, based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  • the predetermined direction may include a direction and / or distance or position with respect to the listener.
  • the predetermined direction can be a direction specified by an elevation angle and a horizontal angle centered on the listener's head.
  • the predetermined direction may be a direction specified by two angles that specify an arbitrary polar coordinate direction centered on the listener's head.
  • the two angles specifying the direction of the arbitrary polar coordinates can be an elevation angle and a horizontal angle with the listener's head as the center.
  • the predetermined direction may not coincide with any of the plurality of directions.
  • the predetermined direction can be an intersection of arbitrary grids in an arbitrary coordinate system.
  • the HRTF data can be defined so as to be spatially approximately equal density or approximately unequal density in a plurality of directions.
  • The HRTF data may be defined for a plurality of directions so as to have substantially equal or substantially unequal density on a substantially spherical surface.
  • the HRTF data can be defined according to a plurality of positional relationships with respect to the listener so as to be spatially substantially non-uniform in a plurality of directions.
  • The relationship information indicates a relationship between the predetermined direction and the address at which the corresponding HRTF data is stored in the HRTF data holding unit, and the HRTF data reading unit can read, based on the relationship information, the HRTF data at the address for the predetermined direction.
  • the HRTF data reading unit reads a plurality of HRTF data corresponding to the predetermined direction from the HRTF data holding unit based on relation information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  • An interpolation unit that interpolates and generates interpolation HRTF data corresponding to the predetermined direction based on the plurality of HRTF data corresponding to the predetermined direction can be further included.
  • the interpolating unit can interpolate and generate HRTF data corresponding to the predetermined direction based on the plurality of HRTF data and an interpolation coefficient corresponding to each of the plurality of HRTF data.
  • The interpolation unit can interpolate and generate HRTF data corresponding to the predetermined direction by calculating a linear sum of the plurality of HRTF data and the spatial-direction interpolation coefficient corresponding to each of the plurality of HRTF data.
  • The interpolation unit can interpolate and generate the HRTF data corresponding to the predetermined direction based on the plurality of HRTF data, time-direction offset information indicating the deviation of the pulse position in each of the plurality of HRTF data, and the spatial-direction and time-direction interpolation coefficients corresponding to each of the plurality of HRTF data.
  • The interpolation unit can align the time-direction offsets of the plurality of HRTF data, calculate a linear sum with the spatial-direction interpolation coefficient corresponding to each of the plurality of HRTF data, and output, as the HRTF data corresponding to the predetermined direction, the result offset by a time calculated as a linear sum of the time-direction offsets of the plurality of HRTF data and the time-direction interpolation coefficients.
  • An audio processing method according to an aspect of the present disclosure includes a step of reading the HRTF data corresponding to a predetermined direction from an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions, based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  • A program according to an aspect of the present disclosure causes a computer to function as an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions, and as an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit, based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  • In one aspect of the present disclosure, HRTF data corresponding to a plurality of directions is held in the HRTF data holding unit, and the HRTF data corresponding to a predetermined direction is read from the HRTF data holding unit based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  • FIG. 11 is a diagram illustrating a configuration example of a general-purpose personal computer.
  • A method using a head-related transfer function is generally used to simulate three-dimensional sound at the ears through headphone presentation.
  • The head-related transfer function (HRTF) H(x, ω) is the transfer characteristic H_1(x, ω) from the sound source position x to the eardrum position when the head is present in free space, normalized by the transfer characteristic H_0(x, ω) from the sound source to the head center O.
  • H(x, ω) = H_1(x, ω) / H_0(x, ω)
  • If the database array is constructed at fixed angular intervals of horizontal angle and elevation angle, accessing the data for a given angle becomes easy.
  • The zenith of the listener G is assumed to have an elevation angle of 0 degrees and the nadir an elevation angle of 180 degrees. The horizontal angle is assumed to be 0 degrees in the front direction of the listener G and to increase clockwise.
  • HRTFs for one full turn of the horizontal angle are arranged as one block of data: the block for an elevation angle of 0 degrees is followed by the block for an elevation angle of 1 degree, and so on.
  • The data array continues in this way up to an elevation angle of 180 degrees.
  • E indicates the number of elevation angles at which data points exist.
  • E = 181 because data exists in increments of 1 degree from elevation angles 0 to 180 degrees.
  • round(x) is a function that rounds x to the nearest integer by rounding off its first decimal place.
  • Data at the same elevation angle is selected at equal intervals in the horizontal direction, placing as many data points as possible on the circle of constant elevation while keeping them at least a certain distance apart.
  • In this case, the data point closest to the direction specified by the elevation angle and horizontal angle to be accessed must be found from the header, and the data is then accessed at the position given by its index multiplied by the data size a of the HRTF for one direction. That is, as the amount of HRTF data increases, the amount of calculation needed to search for data becomes enormous.
  • Therefore, an address table in a polar coordinate arrangement (holding absolute or relative addresses in the HRTF database) is introduced as header information in order to facilitate data access while keeping the data volume from growing.
  • Although the same can be done for an orthogonal coordinate system, a cylindrical coordinate system, and the like, only a polar coordinate system is described in the present embodiment.
  • The data point arrangement of the HRTF database main body is arbitrary: the density may be equal, or it may be intentionally biased. Here, as an example for explanation, the data points of the HRTF database main body are assumed to be arranged at substantially equal density.
  • The HRTF database may be in the form of FIR (Finite Impulse Response), or some transformation, approximation, or editing such as IIR (Infinite Impulse Response) or Fourier transform may be performed.
  • an address table corresponding to the polar coordinate arrangement is set.
  • Each element (each point in the figure) of this polar coordinate address table stores, for example, the address of the HRTF data on the HRTF database in FIG. 4 that is closest to that point.
  • What is described as an address here is not limited to an absolute or relative address, but may be information that enables access to HRTF data.
  • the address of HRTF data may be assigned.
  • An address table element is set for each direction in which HRTF data may be designated, and each element stores the address of the data point holding the corresponding HRTF data.
  • That is, the elevation angle and horizontal angle that specify the direction, and the address of the data point storing the HRTF data, are recorded for each element of the address table.
  • The data points in the HRTF database may be constructed with spatially approximately equal density or with a partial spatial bias in density. In either case, the desired HRTF data can be accessed appropriately by associating it with the direction specified by the elevation angle and horizontal angle.
  • Although the direction of the desired HRTF data is expressed here as an elevation angle and a horizontal angle, any two angles of polar coordinates centered on the head suffice.
  • The size of the address is arbitrary, but it is negligibly small compared with data such as HRTFs. For example, suppose each element of the HRTF address table shown in FIG. 5 holds a 4-byte address and the table has the same arrangement as the polar-coordinate HRTF data arrangement described above. In such a case, the total address size is about 260 kB. Considering that about 200 MB of data has been saved by suppressing the bias in data density, this increase in size due to the addresses can be ignored.
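  • The 260 kB figure can be checked with a short calculation. The sketch below assumes A = 360 horizontal angles (1-degree steps, which the text does not state explicitly) together with the 4-byte address and E = 181 elevation angles given above; the variable names are illustrative only.

```python
# Rough check of the address table size quoted in the text.
ADDRESS_BYTES = 4      # a': bytes per address table element (from the text)
NUM_HORIZONTAL = 360   # A: assumed 1-degree steps over one full turn
NUM_ELEVATION = 181    # E: elevation angles 0..180 in 1-degree steps (from the text)

table_bytes = ADDRESS_BYTES * NUM_HORIZONTAL * NUM_ELEVATION  # a' * A * E

# Fraction of the roughly 200 MB that the text says was saved.
saved_bytes = 200 * 1000 * 1000
overhead_ratio = table_bytes / saved_bytes

print(table_bytes)
print(overhead_ratio)
```

  • Under these assumptions the table occupies 260,640 bytes, which is on the order of a tenth of a percent of the saved data.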
  • the direction selection unit 31 selects information specifying the direction of the sound source for the listener, for example, an elevation angle and a horizontal angle, and supplies the selected direction information to the address information reading unit 32.
  • The address information reading unit 32 uses the information designating the direction of the sound source with respect to the listener, together with the header information stored in the header information storage unit 33, to calculate which pointer point of the polar-coordinate address table of FIG. 5 stores the address in the HRTF database 36 of the desired HRTF. It then reads the address stored as that element (with a data size a′ for one direction, here 4 bytes, for example) and supplies it to the HRTF reading unit 35.
  • The address information storage unit 34 stores, for each pointer point shown in FIG. 5, address information with a data size a′ (for example, 4 bytes) for one direction. That is, the size of the address information stored in the address information storage unit 34 is a′ × A × E.
  • The header information includes the data size a′ for one direction for designating a pointer point shown in FIG. 5, the number A of horizontal angles at which pointer points exist in FIG. 5, the data size b (= A × a′) for one full turn at one elevation angle, the start address c of the file image, and the number E of elevation angles at which data points exist.
  • the HRTF reading unit 35 accesses the HRTF database 36 according to the address information supplied from the address information reading unit 32, reads the corresponding HRTF, and supplies it to the HRTF calculation unit 37.
  • the HRTF calculation unit 37 generates a binaural signal by performing predetermined calculation processing on the audio signal supplied from the sound source output unit 38 using the supplied HRTF, and outputs the audio from the headphones 39.
  • the HRTF may be in the form of FIR (Finite Impulse Response), or some transformation, approximation, or editing such as IIR (Infinite Impulse Response) or Fourier transform may be performed.
  • the conversion, approximation, and editing may be performed by other than the HRTF calculation unit 37, or may be performed by the HRTF calculation unit 37.
  • various methods can be used for the filter processing method, that is, the HRTF calculation method depending on the format after conversion, approximation, and editing.
  • In step S11, the direction selection unit 31 selects information on the direction of a predetermined sound source with respect to the listener, that is, information on a predetermined direction consisting of, for example, an elevation angle and a horizontal angle that specify one item of HRTF data, and supplies it to the address information reading unit 32.
  • In step S12, based on the selected elevation angle and horizontal angle and the header information stored in the header information storage unit 33, the address information reading unit 32 reads from the address information storage unit 34 the address at which the desired HRTF is stored in the HRTF database 36, and supplies it to the HRTF reading unit 35.
  • More specifically, based on the direction information (the elevation angle and horizontal angle selected by the direction selection unit 31) and the HRTF header information stored in the header information storage unit 33, the address information reading unit 32 calculates the address Address of the pointer point for the desired direction from among the pointer points set as the address table in the polar coordinate arrangement of FIG. 5.
  • Address in Expression (2) is the address that specifies the pointer point corresponding to the selected direction among the pointer points that store the addresses constituting the elements of the address table set in the polar coordinate arrangement of FIG. 5.
  • the information stored at this address is a pointer including the address in the HRTF database 36 where the HRTF data corresponding to the selected direction is stored.
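  • Expression (2) itself is not reproduced in this text. The sketch below shows one plausible way to compute the pointer address from the header fields a′, A, E, and c, assuming a row-major table (one block of b = A × a′ bytes per elevation angle) and little-endian 4-byte addresses. The function names, the layout, and the byte order are assumptions for illustration, not the patent's definition.

```python
import math
import struct


def pointer_address(elevation_deg, horizontal_deg, a_prime=4,
                    num_horizontal=360, num_elevation=181, start_address=0):
    """Byte address of the address-table element for a direction (assumed layout)."""
    elev_step = 180.0 / (num_elevation - 1)   # degrees between elevation rows
    horiz_step = 360.0 / num_horizontal       # degrees between horizontal columns
    # Round half up, matching the round(x) described in the text.
    elev_idx = math.floor(elevation_deg / elev_step + 0.5)
    elev_idx = min(max(elev_idx, 0), num_elevation - 1)
    # The horizontal angle wraps around after one full turn.
    horiz_idx = math.floor(horizontal_deg / horiz_step + 0.5) % num_horizontal
    b = num_horizontal * a_prime              # bytes for one full turn at one elevation
    return start_address + elev_idx * b + horiz_idx * a_prime


def read_hrtf_address(address_table, pointer):
    """Read the 4-byte HRTF database address stored at a pointer point."""
    return struct.unpack_from("<I", address_table, pointer)[0]


# Tiny table with three elements holding hypothetical database addresses.
demo_table = struct.pack("<3I", 10, 20, 30)
print(read_hrtf_address(demo_table, 4))
```

  • With this layout the lookup cost is constant regardless of how many HRTF data points the database holds, which is the property the address table is introduced to provide.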
  • Each pointer point in FIG. 5 stores, for example, the address of the closest one of the data points of the HRTF data arranged in the HRTF database 36 at substantially equal intervals in the horizontal direction as shown in FIG.
  • the address information reading unit 32 reads out an address stored as a pointer point in the address table of the polar coordinate arrangement corresponding to FIG. 5 specified by, for example, the elevation angle and the horizontal angle, and supplies the address to the HRTF reading unit 35.
  • In step S13, the HRTF reading unit 35 accesses the address for specifying the data point in the HRTF database 36 supplied from the address information reading unit 32, reads the corresponding HRTF data, and supplies it to the HRTF calculation unit 37.
  • In step S14, the HRTF calculation unit 37 performs HRTF filter processing on the audio signal supplied from the sound source output unit 38 using the supplied HRTF, generates a binaural signal, and outputs the binaural signal to the headphones 39.
  • In step S15, the headphones 39 output sound based on the supplied binaural signal.
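  • The filter processing of step S14 can be illustrated for the FIR case. The sketch below assumes the HRTF is held as a pair of time-domain impulse responses (one per ear) and applies plain convolution; the function names are hypothetical, and a practical implementation would more likely use block-based or FFT-based filtering.

```python
def fir_filter(signal, impulse_response):
    """Convolve a mono signal with one ear's impulse response (direct form)."""
    if not signal or not impulse_response:
        return []
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def render_binaural(signal, hrir_left, hrir_right):
    """Return (left, right) channels for headphone playback."""
    return fir_filter(signal, hrir_left), fir_filter(signal, hrir_right)


left, right = render_binaural([1.0, 2.0], [1.0, 1.0], [0.5])
print(left, right)
```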
  • As described above, the address of the data point storing the HRTF data is stored, for each elevation angle and horizontal angle, as an element of the address table. The element is read according to the direction specified by the elevation angle and horizontal angle, and the data point at the read address is accessed in the conventional manner. Because the desired HRTF data for the specified direction can be identified in this way, the HRTF data can be read out quickly.
  • In the above, the HRTF data is arranged on the spherical surface with equal density.
  • Even when the HRTF data is arranged according to other criteria, assigning the address at which each HRTF data item is stored to an address table made up of pointer points as shown in FIG. 5 allows the data point of the HRTF data to be specified by the elevation angle and horizontal angle of the direction.
  • This makes it possible to identify the desired HRTF data in the HRTF database 36 quickly.
  • the spherical surface may be a plurality of spherical surfaces having different radii. Further, the center of the head and the center of the sphere may be shifted from each other, and may not be a complete sphere.
  • Second embodiment. The above describes an example in which the data points of the HRTF data in the HRTF database 36 are arranged at substantially equal density as shown in FIG. 4. However, when address information does not exist (that is, when the address table assigning the address at which HRTF data is stored to each direction consisting of an elevation angle and a horizontal angle has not been generated), an address table storing the addresses that specify the HRTF data in the HRTF database 36 for each pointer point in FIG. 5 may be generated when the HRTF data is loaded or at the first start-up after the HRTF data is incorporated.
  • The data points of the HRTF data are not limited to an arrangement on a spherical surface or polyhedron as shown in FIG. 4, and may be defined according to various directions and/or distances or positions as viewed from the listener. Accordingly, the data points of the HRTF data may also be arranged in a cylindrical coordinate system or in a three-axis coordinate space.
  • The data points of the HRTF data may be defined with substantially equal density over the space, or with unequal density, for example with a high density near the front of the listener and a low density toward the rear. Therefore, when assigning addresses to the pointer points that are the elements of the address table of FIG. 5, it is necessary to associate each element in advance with the HRTF data point closest to that pointer point.
  • FIG. 8 is a block diagram illustrating a configuration example of an audio processing device 11 that, when there is no information on the addresses at which HRTF data is stored in the HRTF database 36 (that is, when the address table assigning such addresses to each elevation angle and horizontal angle has not been generated), generates an address table storing the addresses that specify the HRTF data in the HRTF database 36 for each element of the address table of FIG. 5, that is, for each direction specified by the elevation angle and horizontal angle.
  • The audio processing device 11 of FIG. 8 differs from the audio processing device 11 of FIG. 6 in that an address calculation unit 51 is further provided and an address information storage unit 52 is provided instead of the address information storage unit 34.
  • When HRTF data is loaded, or at the first start-up after HRTF data is incorporated, the address calculation unit 51 reads the header information stored in the header information storage unit 33 and the HRTF database 36, generates in advance the address information for each element of the address table in FIG. 5, assigns to the address table the address at which the HRTF data corresponding to each direction specified by the elevation angle and horizontal angle is stored, and stores the table in the address information storage unit 52.
  • The address information reading unit 32 reads the address of the HRTF data stored in the address information storage unit 52 in association with the elevation angle and horizontal angle information specifying the HRTF data, based on, for example, the calculation result of Expression (2), and supplies the read address to the HRTF reading unit 35. That is, every time the elevation angle and horizontal angle information specifying the HRTF data is updated, the address information reading unit 32 reads the address at which the HRTF data is stored from the address information storage unit 52, based on the header information and the calculation result of Expression (2).
  • In step S31, the address calculation unit 51 determines whether new unregistered HRTF data has been loaded (or whether the device has just started up), and repeats this determination until that is the case. When it is determined that new unregistered HRTF data has been loaded (or that the device has just started up), the process proceeds to step S32.
  • In step S32, the address calculation unit 51 sets, as the processing target direction, an unprocessed direction among the directions (each formed by an elevation angle and a horizontal angle) that correspond to the elements of the address table in FIG. 5.
  • In step S33, the address calculation unit 51 specifies the HRTF data corresponding to the processing target direction and specifies its address in the HRTF database 36.
  • In step S34, the address calculation unit 51 stores the address of the specified HRTF data in the address information storage unit 52 in association with the elevation angle and the horizontal angle that specify the processing target direction.
  • In step S35, the address calculation unit 51 determines whether an elevation angle and a horizontal angle that specify an unprocessed direction remain. If so, the process returns to step S32. That is, steps S32 to S35 are repeated until no unprocessed direction remains, at which point the process ends.
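  • The loop of steps S32 to S35 can be sketched as follows. The sketch assumes that "the HRTF data corresponding to the processing target direction" means the data point with the smallest angular distance on the unit sphere, found here by brute force with the largest dot product; the data structure (a list of elevation, horizontal angle, and address tuples) and the function names are hypothetical.

```python
import math


def unit_vector(elevation_deg, horizontal_deg):
    """Direction as a unit vector; elevation 0 is the zenith, 180 the nadir."""
    theta = math.radians(elevation_deg)
    phi = math.radians(horizontal_deg)
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def build_address_table(data_points, num_elevation=181, num_horizontal=360):
    """data_points: list of (elevation_deg, horizontal_deg, address) tuples.

    Returns table[elev_idx][horiz_idx] = address of the nearest data point.
    """
    vectors = [(unit_vector(e, h), addr) for e, h, addr in data_points]
    table = []
    for ei in range(num_elevation):                 # S32: next unprocessed direction
        row = []
        for hi in range(num_horizontal):
            target = unit_vector(ei * 180.0 / (num_elevation - 1),
                                 hi * 360.0 / num_horizontal)
            # S33: nearest point on the sphere has the largest dot product.
            nearest = max(vectors,
                          key=lambda va: sum(t * v for t, v in zip(target, va[0])))
            row.append(nearest[1])                  # S34: store its address
        table.append(row)
    return table                                    # S35: all directions processed


points = [(0, 0, 100), (90, 0, 200), (90, 180, 300), (180, 0, 400)]
small_table = build_address_table(points, num_elevation=3, num_horizontal=4)
print(small_table[1][0], small_table[1][2])
```

  • The brute-force search costs one pass over all data points per table element, but it runs only once when the data is loaded, so the per-lookup cost afterward stays constant.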
  • Through the above processing, in step S34 of the flowchart, the address at which the HRTF data for each direction is stored is stored in the address information storage unit 52 as the address table element for the direction given by the corresponding elevation angle and horizontal angle.
  • Thereafter, the address information reading unit 32 accesses the address table in the address information storage unit 52 based on, for example, the calculation result of Expression (2), reads the address stored in association with the elevation angle and horizontal angle information specifying the HRTF data, and supplies it to the HRTF reading unit 35.
  • For example, when the HRTFs of three HRTF data points Pa, Pb, and Pc stored in advance are HRTF_Pa, HRTF_Pb, and HRTF_Pc, the HRTF at the point P, which is the desired data point, may be obtained as HRTF_P by calculating the following equation (3) with the respective interpolation coefficients α, β, and γ.
  • HRTF_P = HRTF_Pa × α + HRTF_Pb × β + HRTF_Pc × γ ... (3)
  • the HRTF database can be made smaller.
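  • The linear sum of equation (3) can be sketched for HRTFs stored as equal-length sample lists. The function name is hypothetical, and the coefficient values in the example are illustrative only; the text does not specify how the coefficients are chosen.

```python
def interpolate_hrtf(hrtfs, coefficients):
    """Weighted sample-by-sample sum of equal-length HRTFs (linear sum)."""
    if len(hrtfs) != len(coefficients):
        raise ValueError("one coefficient is required per HRTF")
    length = len(hrtfs[0])
    if any(len(h) != length for h in hrtfs):
        raise ValueError("all HRTFs must have the same length")
    return [sum(c * h[i] for h, c in zip(hrtfs, coefficients))
            for i in range(length)]


hrtf_pa = [1.0, 0.0]
hrtf_pb = [0.0, 1.0]
hrtf_pc = [1.0, 1.0]
hrtf_p = interpolate_hrtf([hrtf_pa, hrtf_pb, hrtf_pc], [0.5, 0.25, 0.25])
print(hrtf_p)
```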
  • FIG. 11 shows an example of the configuration of a speech processing apparatus that interpolates HRTF data.
  • Components having the same functions as those of the audio processing device 11 described above are given the same names and reference numerals, and their description is omitted as appropriate.
  • the speech processing apparatus 11 of FIG. 11 differs from the speech processing apparatus 11 of FIG. 6 in that an address information interpolation coefficient storage unit 71 is provided instead of the address information storage unit 34, and further, an interpolation calculation unit 72 is provided.
  • The address information interpolation coefficient storage unit 71 has the same basic function as the address information storage unit 34, but differs in that it additionally stores, for each direction specified by the elevation angle and horizontal angle, the interpolation coefficients required for the interpolation calculation. Basically, each direction specified by the elevation angle and horizontal angle requires a combination of a plurality of HRTF data items and their respective interpolation coefficients. Therefore, the address information interpolation coefficient storage unit 71 stores, for each direction, the addresses of the required HRTF data in association with the corresponding interpolation coefficients.
  • The interpolation calculation unit 72 generates the HRTF data for the corresponding direction by interpolation, for example by calculating a linear sum of the three points of HRTF data specified according to the elevation angle and horizontal angle and the corresponding interpolation coefficients, and supplies it to the HRTF calculation unit 37.
  • In step S52, based on the selected elevation angle and horizontal angle, the header information stored in the header information storage unit 33, and equation (2), the address information reading unit 32 reads from the address information interpolation coefficient storage unit 71 the addresses in the HRTF database 36 at which the corresponding three points of HRTF data are stored, and supplies them to the HRTF reading unit 35.
  • In step S53, the HRTF reading unit 35 accesses the HRTF database 36 according to the addresses for the three points supplied from the address information reading unit 32, reads the corresponding HRTF data for the three points, and supplies it to the interpolation calculation unit 72.
  • In step S54, the interpolation calculation unit 72 reads out the interpolation coefficients corresponding to the HRTF data for the three points from the address information interpolation coefficient storage unit 71.
  • In step S55, the interpolation calculation unit 72 generates the HRTF data in the desired direction by interpolation, for example by executing the interpolation calculation represented by equation (3), and supplies the generated HRTF data to the HRTF calculation unit 37.
  • In step S56, the HRTF calculation unit 37 generates a binaural signal by filtering the audio signal supplied from the sound source output unit 38 using the supplied HRTF data generated by interpolation, and outputs the binaural signal to the headphones 39.
  • interpolation may be performed by a method other than linear sum.
  • As in the address information calculation process of the second embodiment, at the timing when a new HRTF database is loaded or at the first start-up, the address information interpolation coefficient storage unit 71 may store the interpolation coefficients in association with the address information at which the HRTF data is stored, for each element of the address table specified by the elevation angle and horizontal angle.
  • If interpolation is performed simply by a linear sum, the pulse positions in the time direction may be misaligned.
  • Therefore, the shift of the pulse position of each HRTF data item in the time direction may be held as offset information and used to align the pulse positions on the time axis in advance, so that the desired interpolation is realized.
  • The HRTF data may be stored with the offset already applied, or stored together with its offset information, or the offsets may be made common at the time of interpolation before the interpolation is performed.
  • FIG. 14 shows an example of the configuration of an audio processing device that interpolates HRTF data after applying an offset in the time direction to the pulse positions.
  • Components having the same functions as those of the audio processing device 11 described above are given the same names and reference numerals, and their description is omitted as appropriate.
  • The audio processing device 11 of FIG. 14 differs from the audio processing device 11 of FIG. 11 in that an address information interpolation coefficient offset information storage unit 101 and an interpolation calculation unit 102 are provided instead of the address information interpolation coefficient storage unit 71 and the interpolation calculation unit 72.
  • the address information interpolation coefficient offset information storage unit 101 further stores offset information in the time direction of the pulse position for each direction of the HRTF data. That is, the address information interpolation coefficient offset information storage unit 101 stores times t1 to t3 as offset information for each of the directions A to C, for example, as shown in the right part of FIG.
  • The interpolation calculation unit 102 is basically the same as the interpolation calculation unit 72, but, based on the address information supplied from the address information reading unit 32, it reads out the spatial-direction interpolation coefficients, such as α, β, and γ in equation (3), together with the time-direction offset information, interpolates the HRTF data from the three HRTF data items, and supplies the result to the HRTF calculation unit 37.
  • In step S92, the address information reading unit 32 reads the corresponding three points of address information from the address information interpolation coefficient offset information storage unit 101 based on the selected elevation angle and horizontal angle and, together with the header information stored in the header information storage unit 33, calculates the addresses in the HRTF database 36 at which the desired HRTFs are stored, and supplies them to the HRTF reading unit 35 and the interpolation calculation unit 102.
  • In step S93, the HRTF reading unit 35 accesses the HRTF database 36 according to the addresses for the three points supplied from the address information reading unit 32, reads the corresponding HRTF data for the three points, and supplies it to the interpolation calculation unit 102.
  • In step S94, the interpolation calculation unit 102 reads out the spatial-direction interpolation coefficients corresponding to the three HRTF data items from the address information interpolation coefficient offset information storage unit 101. The spatial-direction interpolation coefficients here are the coefficients corresponding to α, β, and γ in equation (3).
  • In step S95, the interpolation calculation unit 102 reads the offset information corresponding to the HRTF data for the three points from the address information interpolation coefficient offset information storage unit 101.
  • In step S96, the interpolation calculation unit 102 generates HRTF data in the desired direction by interpolation that takes into account the offset information and the time-direction interpolation coefficients, and supplies the generated HRTF data to the HRTF calculation unit 37.
  • More specifically, the interpolation calculation unit 102 first corrects the three HRTF data so that their pulse positions are aligned in the time direction, and then performs the spatial-direction interpolation by equation (3) above using the interpolation coefficients. Further, as shown in FIG. 16, it calculates time-direction interpolation coefficients p, q, and r set based on the time-direction offset information t1, t2, and t3, and uses p, q, and r to obtain the time-direction shift time T as the linear sum of equation (4): T = p × t1 + q × t2 + r × t3 ... (4)
  • The interpolation calculation unit 102 then applies the shift time T obtained by equation (4) as a time-direction offset to the HRTF data obtained as the interpolation result of equation (3), and supplies the result to the HRTF calculation unit 37.
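The align, interpolate, then re-offset procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the disclosed implementation: HRTF data are treated as equal-length time-domain impulse responses, offsets are whole samples, T is taken as the linear sum of the offsets weighted by the time-direction coefficients (as the linear-sum wording elsewhere in this disclosure suggests), and T is rounded to the nearest sample. The function names are hypothetical.

```python
def shift(ir, n):
    """Delay `ir` by n samples (n >= 0) or advance it (n < 0), keeping its length."""
    length = len(ir)
    if n >= 0:
        return ([0.0] * n + list(ir))[:length]
    return (list(ir)[-n:] + [0.0] * (-n))[:length]


def interpolate_with_time_alignment(irs, offsets, spatial_coeffs, time_coeffs):
    """Interpolate impulse responses whose pulses sit at different time offsets.

    irs            : equal-length impulse responses (lists of floats)
    offsets        : pulse-position offset of each response, in samples (assumed)
    spatial_coeffs : spatial-direction interpolation coefficients
    time_coeffs    : time-direction interpolation coefficients (p, q, r)
    """
    # 1. Align the pulse positions by removing each response's own offset.
    aligned = [shift(ir, -t) for ir, t in zip(irs, offsets)]
    # 2. Spatial-direction interpolation: linear sum of the aligned responses.
    length = len(aligned[0])
    mixed = [
        sum(w * ir[i] for w, ir in zip(spatial_coeffs, aligned))
        for i in range(length)
    ]
    # 3. Shift time T: linear sum of the offsets and time-direction coefficients.
    t_shift = sum(w * t for w, t in zip(time_coeffs, offsets))
    # 4. Re-apply T as a delay (rounded to a whole sample in this sketch).
    return shift(mixed, round(t_shift)), t_shift
```

Summing the unaligned responses directly would smear one pulse into several weaker ones; aligning first keeps a single pulse, which is then placed at the interpolated delay T.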
  • In step S97, the HRTF calculation unit 37 generates a binaural signal by filtering the audio signal supplied from the sound source output unit 38 with the supplied HRTF data generated by interpolation, and outputs the binaural signal.
  • Alternatively, the result of the spatial-direction interpolation in step S96 may be supplied to the HRTF calculation unit 37 together with the time T, without correcting for the shift time T, and the correction for T may be applied after the filter processing in the HRTF calculation unit 37.
  • With the above processing, HRTF data is obtained with high accuracy in the spatial direction while the amount of data in the HRTF database is reduced, and the audio data can be convolved with its timing adjusted for the time-direction shift time T.
  • The time-direction interpolation coefficients p, q, and r may instead be stored in advance in the address information interpolation coefficient offset information storage unit 101 and read out for use in the interpolation calculation. Further, the time-direction interpolation coefficients p, q, and r may be replaced with the spatial-direction interpolation coefficients of equation (3), or may be obtained from them by some calculation formula. Conversely, the spatial-direction interpolation coefficients may be replaced with the time-direction interpolation coefficients p, q, and r, or may be obtained from p, q, and r by some calculation formula.
  • The predetermined direction may be an arbitrary grid intersection in a grid set in an arbitrary coordinate system.
  • FIG. 17 shows a configuration example of a general-purpose personal computer.
  • This personal computer incorporates a CPU (Central Processing Unit) 1001.
  • An input/output interface 1005 is connected to the CPU 1001 via a bus 1004.
  • A ROM (Read Only Memory) 1002 and a RAM (Random Access Memory) 1003 are connected to the bus 1004.
  • Connected to the input/output interface 1005 are an input unit 1006 including input devices such as a keyboard and a mouse with which a user enters operation commands, an output unit 1007 that outputs a processing operation screen and images of processing results to a display device, a storage unit 1008 including a hard disk drive or the like that stores programs and various data, and a communication unit 1009 including a LAN (Local Area Network) adapter or the like that executes communication processing via a network typified by the Internet.
  • Also connected is a drive 1010 that reads and writes data from and to a removable medium 1011 such as a magnetic disk (including a flexible disk), an optical disk (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), a magneto-optical disk (including an MD (Mini Disc)), or a semiconductor memory.
  • The CPU 1001 executes various processes according to a program stored in the ROM 1002, or a program read from a removable medium 1011 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, installed in the storage unit 1008, and loaded from the storage unit 1008 into the RAM 1003.
  • The RAM 1003 also stores, as appropriate, data that the CPU 1001 needs to execute the various processes.
  • In the computer configured as described above, the CPU 1001 loads, for example, the program stored in the storage unit 1008 into the RAM 1003 via the input/output interface 1005 and the bus 1004 and executes it, whereby the series of processes described above is performed.
  • The program executed by the computer (CPU 1001) can be provided, for example, by being recorded on the removable medium 1011 as a packaged medium.
  • The program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
  • In the computer, the program can be installed in the storage unit 1008 via the input/output interface 1005 by loading the removable medium 1011 into the drive 1010. The program can also be received by the communication unit 1009 via a wired or wireless transmission medium and installed in the storage unit 1008. Alternatively, the program can be installed in advance in the ROM 1002 or the storage unit 1008.
  • The program executed by the computer may be a program whose processes are performed in time series in the order described in this specification, or a program whose processes are performed in parallel or at necessary timing, such as when called.
  • In this specification, a system means a set of a plurality of components (devices, modules (parts), and so on), regardless of whether all the components are in the same housing. Accordingly, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
  • The present disclosure can take a cloud computing configuration in which one function is shared and jointly processed by a plurality of devices via a network.
  • Each step described in the above flowcharts can be executed by one device or shared among a plurality of devices.
  • Further, when one step includes a plurality of processes, the plurality of processes included in that step can be executed by one device or shared among a plurality of devices.
  • The present disclosure can also take the following configurations.
  • <1> An audio processing device including: an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions; and an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  • <2> The audio processing device, wherein the predetermined direction includes a direction and/or a distance, or a position, with respect to a listener.
  • <3> The audio processing device according to <1> or <2>, wherein the predetermined direction is a direction specified by two angles that specify an arbitrary polar coordinate direction centered on the listener's head.
  • <4> The audio processing device according to <3>, wherein the two angles that specify the arbitrary polar coordinate direction are an elevation angle and a horizontal angle centered on the listener's head.
  • <5> The audio processing device according to any one of <1> to <4>, wherein the predetermined direction does not coincide with any of the plurality of directions.
  • <6> The audio processing device according to any one of <1> to <5>, wherein the predetermined direction is an arbitrary grid intersection in an arbitrary coordinate system.
  • <7> The audio processing device according to any one of <1> to <6>, wherein the HRTF data is defined so as to be spatially of substantially equal density or substantially unequal density over the plurality of directions.
  • <8> The audio processing device, wherein the HRTF data is defined, corresponding to the plurality of directions, so as to be spatially of substantially equal density or substantially unequal density on a substantially spherical surface.
  • <9> The audio processing device according to any one of <1> to <8>, wherein the HRTF data is defined according to a plurality of positional relationships with respect to the listener so as to be spatially of substantially unequal density over the plurality of directions.
  • <10> The audio processing device according to any one of <1> to <9>, wherein the relationship information is information indicating a relationship between the predetermined direction and the address at which the corresponding HRTF data is stored in the HRTF data holding unit, and the HRTF data reading unit reads the HRTF data at the address for the predetermined direction based on the relationship information.
  • <11> The audio processing device according to any one of <1> to <10>, wherein the HRTF data reading unit reads a plurality of the HRTF data corresponding to the predetermined direction from the HRTF data holding unit based on the relationship information indicating the relationship with the HRTF data corresponding to the predetermined direction, the audio processing device further including an interpolation unit that generates, by interpolation, interpolated HRTF data corresponding to the predetermined direction based on the plurality of HRTF data corresponding to the predetermined direction.
  • <12> The audio processing device, wherein the interpolation unit generates the HRTF data corresponding to the predetermined direction by interpolation based on the plurality of HRTF data and an interpolation coefficient corresponding to each of the plurality of HRTF data.
  • <13> The audio processing device according to <12>, wherein the interpolation unit generates the HRTF data corresponding to the predetermined direction by interpolation by calculating a linear sum of the plurality of HRTF data and spatial-direction interpolation coefficients corresponding to each of the plurality of HRTF data.
  • <14> The audio processing device according to <13>, wherein the interpolation unit generates the HRTF data corresponding to the predetermined direction by interpolation based on the plurality of HRTF data, time-direction offset information indicating shifts in pulse position among the plurality of HRTF data, and spatial-direction and time-direction interpolation coefficients corresponding to each of the plurality of HRTF data.
  • <15> The audio processing device according to <14>, wherein the interpolation unit aligns the time-direction offsets of the plurality of HRTF data, calculates a linear sum of the aligned HRTF data and the spatial-direction interpolation coefficients corresponding to each of them, and outputs, as the HRTF data corresponding to the predetermined direction, the resulting HRTF data offset by a time calculated as a linear sum of the time-direction offsets of the plurality of HRTF data and the time-direction interpolation coefficients.
  • <16> The audio processing device according to any one of <1> to <15>, further including a binaural signal generation unit that convolves the HRTF data read by the HRTF data reading unit with a sound source signal to generate a binaural signal.
  • <17> An audio processing method including a step of reading the HRTF data corresponding to a predetermined direction from an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions, based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  • <18> A program that causes a computer to function as: an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions; and an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  • 11 audio processing device, 31 direction selection unit, 32 pointer calculation unit, 33 header information storage unit, 34 address information storage unit, 35 HRTF reading unit, 36 HRTF database, 37 HRTF calculation unit, 38 sound source output unit, 39 headphones, 51 address calculation unit, 52 address information storage unit, 71 address information interpolation coefficient storage unit, 72 interpolation calculation unit, 101 address information interpolation coefficient offset information storage unit, 102 interpolation calculation unit


Abstract

The present disclosure relates to an audio processing device, an audio processing method, and a program that make the density of the required HRTFs match the intention of the HRTF database creator while speeding up access to a desired HRTF among the HRTFs stored in the database. HRTF data corresponding to a desired direction is read from an HRTF data holding unit (an HRTF database) that holds HRTF data corresponding to a plurality of directions, based on address information indicating the relationship to the HRTF data corresponding to the desired direction. The present disclosure is applicable to audio processing devices.

Description

Audio processing device, audio processing method, and program
 The present disclosure relates to an audio processing device, an audio processing method, and a program, and in particular to an audio processing device, an audio processing method, and a program that enable efficient compression of a head-related transfer function database and faster data access.
 In recent years, systems that record, transmit, and reproduce spatial information from all directions have been developed and have spread in the audio field. For Super Hi-Vision, broadcasting with 22.2-channel three-dimensional multichannel sound is planned. In the field of virtual reality as well, systems that reproduce audio signals surrounding the listener, in addition to video surrounding the viewer, are becoming widespread.
 As such content spreads, binaural reproduction technology, which allows easy reproduction over headphones without installing multichannel loudspeakers, is expected to become increasingly important.
 This binaural reproduction technology is generally called a virtual auditory display (VAD) and is realized using a head-related transfer function (HRTF) (see Patent Document 1).
 A head-related transfer function expresses, as a function of frequency and direction of arrival, how sound travels from every direction surrounding the human head to the eardrums of both ears. When a target sound combined with the HRTF of a given direction is presented over headphones, the listener perceives the sound as arriving from the direction of that HRTF rather than from the headphones. A virtual auditory display (VAD) is a system that uses this principle.
Patent Document 1: JP 2013-110682 A
 However, HRTFs are required for the entire surroundings of the listener, so a correspondingly large database is needed.
 As a result, the required HRTF must be read out at high speed from among the HRTFs stored in the large database.
 The present disclosure has been made in view of such circumstances and, in particular, makes the density of the required HRTFs match what the HRTF database creator intends while speeding up access to the desired HRTF among the HRTFs stored in the database.
 An audio processing device according to one aspect of the present disclosure includes an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions, and an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
 The predetermined direction may include a direction and/or a distance, or a position, with respect to a listener.
 The predetermined direction may be a direction specified by an elevation angle and a horizontal angle centered on the listener's head.
 The predetermined direction may be a direction specified by two angles that specify an arbitrary polar coordinate direction centered on the listener's head.
 The two angles that specify the arbitrary polar coordinate direction may be an elevation angle and a horizontal angle centered on the listener's head.
 The predetermined direction may be a direction that does not coincide with any of the plurality of directions.
 The predetermined direction may be an arbitrary grid intersection in an arbitrary coordinate system.
 The HRTF data may be defined so as to be spatially of substantially equal density or substantially unequal density over the plurality of directions.
 The HRTF data may be defined, corresponding to the plurality of directions, so as to be spatially of substantially equal density or substantially unequal density on a substantially spherical surface.
 The HRTF data may be defined according to a plurality of positional relationships with respect to the listener so as to be spatially of substantially unequal density over the plurality of directions.
 The relationship information may be information indicating a relationship between the predetermined direction and the address at which the corresponding HRTF data is stored in the HRTF data holding unit, and the HRTF data reading unit may read the HRTF data at the address for the predetermined direction based on the relationship information.
 The HRTF data reading unit may read a plurality of the HRTF data corresponding to the predetermined direction from the HRTF data holding unit based on the relationship information indicating the relationship with the HRTF data corresponding to the predetermined direction, and the device may further include an interpolation unit that generates, by interpolation, interpolated HRTF data corresponding to the predetermined direction based on the plurality of HRTF data corresponding to the predetermined direction.
 The interpolation unit may generate the HRTF data corresponding to the predetermined direction by interpolation based on the plurality of HRTF data and an interpolation coefficient corresponding to each of the plurality of HRTF data.
 The interpolation unit may generate the HRTF data corresponding to the predetermined direction by interpolation by calculating a linear sum of the plurality of HRTF data and spatial-direction interpolation coefficients corresponding to each of the plurality of HRTF data.
 The interpolation unit may generate the HRTF data corresponding to the predetermined direction by interpolation based on the plurality of HRTF data, time-direction offset information indicating shifts in pulse position among the plurality of HRTF data, and spatial-direction and time-direction interpolation coefficients corresponding to each of the plurality of HRTF data.
 The interpolation unit may align the time-direction offsets of the plurality of HRTF data, calculate a linear sum of the aligned HRTF data and the spatial-direction interpolation coefficients corresponding to each of them, and output, as the HRTF data corresponding to the predetermined direction, the resulting HRTF data offset by a time calculated as a linear sum of the time-direction offsets of the plurality of HRTF data and the time-direction interpolation coefficients.
 The device may further include a binaural signal generation unit that convolves the HRTF data read by the HRTF data reading unit with a sound source signal to generate a binaural signal.
 An audio processing method according to one aspect of the present disclosure includes a step of reading the HRTF data corresponding to a predetermined direction from an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions, based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
 A program according to one aspect of the present disclosure causes a computer to function as an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions, and an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
 In one aspect of the present disclosure, the HRTF data holding unit holds HRTF data corresponding to a plurality of directions, and the HRTF data corresponding to a predetermined direction is read from the HRTF data holding unit based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
 According to one aspect of the present disclosure, efficient compression of the head-related transfer function database and faster data access can be achieved.
 FIG. 1 is a diagram explaining the definition of direction, relative to the listener, of the HRTF (head-related transfer function) data used in an audio processing device to which the present disclosure is applied.
 FIG. 2 is a diagram explaining an arrangement example of the spatial-direction data points of the HRTF data used in a conventional audio processing device.
 FIG. 3 is a diagram explaining an example in which the HRTF data of FIG. 2 is expanded into a matrix for reading.
 FIG. 4 is a diagram explaining the definition of the spatial-direction data points of the HRTF data used in an audio processing device to which the present disclosure is applied.
 FIG. 5 is a diagram explaining an arrangement example of the address table for reading the HRTF data used in an audio processing device to which the present disclosure is applied.
 FIG. 6 is a diagram explaining a configuration example of a first embodiment of an audio processing device to which the present disclosure is applied.
 FIG. 7 is a flowchart explaining audio processing by the audio processing device of FIG. 6.
 FIG. 8 is a diagram explaining a configuration example of a second embodiment of an audio processing device to which the present disclosure is applied.
 FIG. 9 is a flowchart explaining address information calculation processing by the audio processing device of FIG. 9.
 FIG. 10 is a diagram explaining interpolation of HRTF data in the spatial direction.
 FIG. 11 is a diagram explaining a configuration example of a third embodiment of an audio processing device to which the present disclosure is applied.
 FIG. 12 is a flowchart explaining audio processing by the audio processing device of FIG. 11.
 FIG. 13 is a diagram explaining the time-direction shift of the pulse signal in HRTF data.
 FIG. 14 is a diagram explaining a configuration example of a fourth embodiment of an audio processing device to which the present disclosure is applied.
 FIG. 15 is a flowchart explaining audio processing by the audio processing device of FIG. 14.
 FIG. 16 is a diagram explaining interpolation of HRTF data in the time direction by the audio processing device of FIG. 14.
 FIG. 17 is a diagram explaining a configuration example of a general-purpose personal computer.
 Preferred embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same functional configuration are denoted by the same reference numerals, and redundant description is omitted.
 The description is given in the following order.
  1. First embodiment
  2. Second embodiment
  3. Third embodiment
  4. Fourth embodiment
  5. Application examples
 <<1. First embodiment>>
 <Binaural reproduction technology>
 A common method of simulating stereophonic sound at the ears through headphone presentation uses the head-related transfer function (HRTF). The head-related transfer function H(x,ω) is the transfer characteristic H_1(x,ω) from the sound source position x to the eardrum position with the head present in free space, normalized by the transfer characteristic H_0(x,ω) from the sound source to the head center O with the head absent.
 H(x,ω) = H_1(x,ω) / H_0(x,ω)
 Convolving this transfer function with an arbitrary audio signal and presenting the result over headphones or the like makes the listener perceive the sound as if it came from the direction x of the convolved HRTF. When sound from an arbitrary direction must be presented to the listener, head-related transfer functions for the entire surroundings of the listener are obtained by methods such as measurement and simulation.
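The convolution step can be sketched as below. This is a minimal illustration, assuming the HRTF is available in its time-domain form (a head-related impulse response per ear) and using direct convolution for clarity; the function names are hypothetical, and a practical renderer would more likely use an FFT-based method.

```python
def convolve(signal, impulse_response):
    """Direct (full) convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono source with the left and right responses for a direction x."""
    return convolve(mono, hrir_left), convolve(mono, hrir_right)
```

The two returned lists are the left-ear and right-ear signals to be presented over headphones.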
 <HRTF database in a polar coordinate arrangement>
 When a virtual auditory display (hereinafter simply VAD) controls the sound image dynamically, for example by head tracking, the required HRTFs must be read successively from the obtained HRTF database covering the entire surroundings.
 この際、データベースの配列が水平角および仰角でそれぞれ決まった角度間隔で構築されている場合、その角度へのアクセスは容易となる。この例では、図1の左部で示されるように聴取者Gの天頂を仰角0度として、天底を仰角180度とするものとし、図1の右部で示されるように、水平角を聴取者Gの正面方向を0度として、右回りに進む角度とするものとする。 At this time, if the database array is constructed at an angular interval determined by the horizontal angle and the elevation angle, access to the angle becomes easy. In this example, as shown in the left part of FIG. 1, the zenith of the listener G is assumed to have an elevation angle of 0 degrees and the nadir is assumed to have an elevation angle of 180 degrees, and as shown in the right part of FIG. The front direction of the listener G is assumed to be 0 degree, and the angle that advances clockwise is assumed.
For example, consider a case where each sample is 4 bytes, an HRTF with 1024 samples per channel forms one left/right pair, and, as shown by the points on the sphere in FIG. 2, 360 such pairs are held at each elevation angle in 1-degree steps of horizontal angle, for elevation angles in 1-degree steps.
In this case, as shown in the matrix of FIG. 3, the data are arranged so that the HRTFs for one full circle of horizontal angle (0 to 359 degrees) at an elevation angle of 0 degrees come first, followed by one full circle of data for an elevation angle of 1 degree, and so on up to an elevation angle of 180 degrees. The data size of the HRTF for one direction is then 4 (byte/sample) × 1024 (sample/ch) × 2 ch = 8192 bytes (denoted a).
The data size for one full circle at one elevation angle is A × a bytes (denoted b). Here, A is the number of horizontal angles at which data points exist at one elevation angle; in the above example, A = 360 because data points exist in 1-degree steps of horizontal angle. For example, to access the HRTF at an elevation angle of 178 degrees and a horizontal angle of 357 degrees (elev = 178, azim = 357), shown as the black cell in the matrix of FIG. 3, the desired HRTF can be accessed easily by computing 178 × b + 357 × a, counted from the start address of the file image (denoted c).
Generalizing this, it suffices to access the address Add given by the following formula (1).
Add = round(elev × (E - 1) / 180) × b + round(azim × A / 360) × a + c ... (1)
Here, E is the number of elevation angles at which data points exist; in the above example, E = 181 because data exist in 1-degree steps from an elevation angle of 0 degrees to 180 degrees. round(x) is a function that rounds x to the nearest integer, rounding half up at the first decimal place.
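Formula (1) can be sketched in Python as follows, using the example values from the text (a = 8192, A = 360, E = 181). The names `round_half_up` and `polar_address` are hypothetical, and c = 0 is assumed for the start address. Because Python's built-in `round` rounds halves to the nearest even integer, the sketch uses `floor(x + 0.5)` to obtain the half-up rounding described above.

```python
import math


def round_half_up(x):
    """Round a non-negative value to the nearest integer, halves upward."""
    return math.floor(x + 0.5)


def polar_address(elev, azim, a=8192, A=360, E=181, c=0):
    """Byte address of the HRTF for (elev, azim) in the regular polar layout."""
    b = A * a  # data size of one full circle at one elevation angle
    row = round_half_up(elev * (E - 1) / 180)
    col = round_half_up(azim * A / 360)
    return row * b + col * a + c


# Example from the text: elev = 178, azim = 357.
addr = polar_address(178, 357)
```

With these values the result equals 178 × b + 357 × a, the expression given for the black cell of FIG. 3.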
In this example, however, the density of the stored data is non-uniform. That is, referring to FIG. 2, the data density near the zenith and the nadir is higher than the data density on the horizontal plane (the plane with elevation angle = 90 degrees). In other words, the amount of data at the zenith and nadir is excessive relative to the amount of data on the horizontal plane.
<HRTF database with uniform-density arrangement>
There are methods that arrange the HRTFs at as uniform a density as possible so as to eliminate regions where the data density is excessive, as in FIG. 2. Some of these methods extend a regular polyhedron, but a simple method is described in this specification as an example.
In this method, for example, as shown in FIG. 4, the data at each elevation angle are chosen at equal intervals in the horizontal direction: as many data points as possible are placed on the circle of constant elevation, equally spaced with a spacing of at least a certain fixed length.
Compared with the arrangement of FIG. 2, this prevents the data at the zenith and nadir from being overly dense relative to the data density on the horizontal plane.
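One possible reading of this placement rule is sketched below in Python. The rule used here (arc spacing on each ring no smaller than the spacing on the horizontal plane, with at least one point per ring) is an assumption, as are the names `ring_count` and `ring_azimuths`; the text does not give the exact rule, so the sketch is not claimed to reproduce the point counts of FIG. 4.

```python
import math


def ring_count(elev_deg, points_on_equator=360):
    """Number of equally spaced points on the ring at one elevation.

    Elevation runs from the zenith (0) to the nadir (180), so the ring
    circumference scales with sin(elev). Assumed rule: keep the arc
    spacing no smaller than the spacing on the horizontal plane.
    """
    scaled = points_on_equator * math.sin(math.radians(elev_deg))
    # Small tolerance guards against results just below an integer.
    return max(1, math.floor(scaled + 1e-9))


def ring_azimuths(elev_deg):
    """Horizontal angles (degrees) of the points on one ring."""
    n = ring_count(elev_deg)
    return [i * 360.0 / n for i in range(n)]


counts = [ring_count(e) for e in range(181)]
total_points = sum(counts)
```

Under this rule the horizontal plane keeps all 360 points while the rings near the zenith and nadir shrink to a single point, which is the qualitative behavior described above.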
In the case of FIG. 2, the total data size is 181 × 360 × 8192 bytes = 533 MB. In the case of FIG. 4, under the condition that the HRTF data points placed on the horizontal plane (the plane with elevation angle = 90 degrees) are the same as in FIG. 2, the total number of data points can be, for example, 41343 by using a predetermined horizontal spacing. The total data size is then 4 × 41343 × 1024 × 2 bytes = 338 MB, which is a reduction of about 37% (= (533 - 338) / 533 × 100) relative to 533 MB.
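The arithmetic above can be checked with a few lines of Python. All figures come from the text; the variable names are illustrative only.

```python
# Bytes for one direction: 4 byte/sample x 1024 sample/ch x 2 ch.
BYTES_PER_DIRECTION = 4 * 1024 * 2

# Regular polar layout of FIG. 2: 181 elevations x 360 horizontal angles.
grid_bytes = 181 * 360 * BYTES_PER_DIRECTION

# Near-uniform layout of FIG. 4: 41343 data points (figure from the text).
uniform_bytes = 41343 * BYTES_PER_DIRECTION

# Fraction of data saved by the near-uniform layout.
saving = (grid_bytes - uniform_bytes) / grid_bytes
saving_percent = round(saving * 100)
```

The byte counts correspond to the 533 MB and 338 MB quoted above when 1 MB is taken as 10^6 bytes and the result is truncated.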
With such an arrangement, however, the ease of data access described above is lost. That is, in the case of FIG. 4 the number of data points differs from one elevation angle to another, so one conceivable approach is, for example, to list the elevation angle and horizontal angle of each data point in data order as header information.
In that case, however, the data point closest to the direction specified by the desired elevation angle and horizontal angle must be found in the header, and the data are accessed by multiplying its index by the data size a of the HRTF for one direction. In other words, the more HRTF data there are, the larger the amount of computation needed to search for the data becomes.
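The header search described above might look like the following Python sketch. The header is assumed to be a list of (elevation, horizontal angle) pairs in data order, and closeness is assumed to mean the great-circle angle between directions; the names `angular_distance` and `find_offset` and the five-point toy header are hypothetical.

```python
import math


def angular_distance(elev1, azim1, elev2, azim2):
    """Great-circle angle (radians) between two directions.

    Elevation is measured from the zenith, as in FIG. 1.
    """
    t1, t2 = math.radians(elev1), math.radians(elev2)
    dphi = math.radians(azim1 - azim2)
    cos_d = (math.cos(t1) * math.cos(t2)
             + math.sin(t1) * math.sin(t2) * math.cos(dphi))
    return math.acos(max(-1.0, min(1.0, cos_d)))


def find_offset(header, elev, azim, a=8192):
    """Linear search for the nearest data point; returns its byte offset."""
    index = min(range(len(header)),
                key=lambda i: angular_distance(elev, azim, *header[i]))
    return index * a


# Toy header: (elev, azim) of each data point, in data order.
header = [(0, 0), (90, 0), (90, 120), (90, 240), (180, 0)]
offset = find_offset(header, 85, 130)
```

Every access scans the whole header, so the cost of each lookup grows in proportion to the number of data points, which is the drawback noted above.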
<First embodiment of the audio processing device>
Therefore, in an audio processing device to which the present disclosure is applied, addresses in a polar coordinate arrangement (absolute or relative addresses into the HRTF database) are introduced as header information in order to make data access easy while preventing excessive data density. The same idea can also be applied to an orthogonal coordinate system, a cylindrical coordinate system, and the like, but only the polar coordinate system is described in the present embodiment.
The data point arrangement of the HRTF database itself is arbitrary: it may be of uniform density, or the density may be intentionally biased. As an example for explanation, however, the data points of the HRTF database are assumed here to be arranged at substantially uniform density as shown in FIG. 4. The HRTF database may be in the form of FIR (Finite Impulse Response) filters, or may have undergone some transformation, approximation, or editing such as conversion to IIR (Infinite Impulse Response) form or a Fourier transform.
Here, an address table corresponding to a polar coordinate arrangement is set up, as shown by the points in FIG. 5. Each element of this polar-coordinate address table (each point in the figure) stores, for example, the address of the HRTF data in the HRTF database of FIG. 4 that is closest to that point. What is called an address here is not limited to an absolute or relative address; it may be any information that enables access to the HRTF data.
Because the number of elements in the address table corresponding to the polar coordinate arrangement of FIG. 5 is set larger than the number of HRTF data addresses in the HRTF database 36 of FIG. 4, the address of the same HRTF data may be assigned to several different elements of the address table.
In this way, elements of an address table are set up for designating the HRTF data of a desired direction, and each element stores the address of the data point that holds the corresponding HRTF data. In other words, HRTF data are assigned to each element of the address table and the address designating the data point of the assigned HRTF data is stored there, so that the relationship between the elevation and horizontal angle information specifying a direction and the address of the data point holding the HRTF data is recorded for each element of the address table. As a result, the HRTF data of a desired direction can be accessed by elevation angle and horizontal angle as before, and accessed appropriately without being affected by density bias in the HRTF database. That is, the data points in the HRTF database may be constructed with substantially uniform spatial density or with a partial spatial density bias; in either case the desired HRTF data can be accessed appropriately by designating them through the direction specified by the elevation angle and horizontal angle. Although the terms elevation angle and horizontal angle are used here, any two angles of a polar coordinate system centered on the head may be used.
The size of an address is arbitrary, but it is considered negligibly small compared with data such as an HRTF. In the example of the HRTF address table shown in FIG. 5, each element holds a 4-byte address and the elements have the same arrangement as the HRTF data arrangement of FIG. 2. In this case, the total size of the addresses is 260 kB (4 × 360 × 181 bytes); given that about 200 MB of data is saved by suppressing the density bias, an increase of this size is negligible.
<Configuration example of the first embodiment>
Next, a configuration example of an audio processing device having the functions described above is described with reference to the block diagram of FIG. 6.
The audio processing device 11 of FIG. 6 includes a direction selection unit 31, an address information reading unit 32, a header information storage unit 33, an address information storage unit 34, an HRTF reading unit 35, an HRTF database 36, an HRTF calculation unit 37, a sound source output unit 38, and headphones 39.
The direction selection unit 31 selects information designating the direction of the sound source relative to the listener, consisting of, for example, an elevation angle and a horizontal angle, and supplies the selected direction information to the address information reading unit 32.
Using the information designating the selected direction of the sound source relative to the listener together with the header information stored in the header information storage unit 33, the address information reading unit 32 computes which element of the address table of FIG. 5 (a polar-coordinate arrangement of pointer points) is the pointer point that stores the address, within the HRTF database 36, of the desired HRTF. It then supplies the address stored in that element (with a data size a' for one direction, here for example 4 bytes) to the HRTF reading unit 35.
The address information storage unit 34 stores, for example, entries of data size a' for one direction (here, for example, 4 bytes) designating the pointer points shown in FIG. 5, one for each combination of the A horizontal angles and E elevation angles. That is, the size of the address information stored in the address information storage unit 34 is a' × A × E.
Here, the header information includes the data size a' for one direction for designating a pointer point shown in FIG. 5, the number A of horizontal angles at which the pointer points of FIG. 5 exist, the data size b (= A × a') for one full circle at one elevation angle, the start address c of the file image, and the number E of elevation angles at which data points exist. The data size b (= A × a') for one full circle at one elevation angle is set on the basis of the data size a' for one direction of the addresses designating the pointer points.
The HRTF reading unit 35 accesses the HRTF database 36 according to the address information supplied from the address information reading unit 32, reads the corresponding HRTF, and supplies it to the HRTF calculation unit 37.
Using the supplied HRTF, the HRTF calculation unit 37 applies predetermined arithmetic processing to the audio signal supplied from the sound source output unit 38 to generate a binaural signal, and causes the headphones 39 to output sound. Here, the HRTF may be in the form of an FIR (Finite Impulse Response) filter, or may have undergone some transformation, approximation, or editing such as conversion to IIR (Infinite Impulse Response) form or a Fourier transform. These transformations, approximations, and edits may be performed outside the HRTF calculation unit 37 or by the HRTF calculation unit 37 itself. Various filter processing methods, that is, HRTF calculation methods, can be used depending on the format after transformation, approximation, or editing.
<Audio processing by the audio processing device of FIG. 6>
Next, the audio processing performed by the audio processing device 11 of FIG. 6 is described with reference to the flowchart of FIG. 7.
In step S11, the direction selection unit 31 selects information on the direction of a predetermined sound source relative to the listener, that is, information on a predetermined direction, consisting of, for example, an elevation angle and a horizontal angle, that identifies one of the HRTF data, and supplies it to the address information reading unit 32.
In step S12, based on the selected elevation angle and horizontal angle and on the header information stored in the header information storage unit 33, the address information reading unit 32 reads from the address information storage unit 34 the address in the HRTF database 36 at which the desired HRTF is stored, and supplies it to the HRTF reading unit 35.
More specifically, based on the direction information identifying the HRTF, consisting of the elevation angle and horizontal angle selected by the direction selection unit 31, and on the HRTF header information stored in the header information storage unit 33, the address information reading unit 32 computes, for example using the following formula (2), the address Address of the pointer point for the desired direction among the pointer points set up as the address table in the polar coordinate arrangement of FIG. 5.
Address = round(elev × E / 180) × b + round(azim × A / 360) × a' + c ... (2)
Here, Address in formula (2) is the address that identifies, among the pointer points holding the addresses that make up the elements of the address table set up in the polar coordinate arrangement of FIG. 5, the pointer point corresponding to the selected direction. The information stored at this address is a pointer containing the address in the HRTF database 36 at which the HRTF data corresponding to the selected direction are stored. Each pointer point of FIG. 5 stores the address of, for example, the closest of the HRTF data points that are arranged in the HRTF database 36 at substantially equal horizontal intervals as shown in FIG. 4.
The address information reading unit 32 therefore reads the address stored as the pointer point in the polar-coordinate address table corresponding to FIG. 5, identified by, for example, the elevation angle and horizontal angle, and supplies it to the HRTF reading unit 35.
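The two-stage lookup of step S12 can be sketched in Python as follows. The address table is assumed to be a byte string of little-endian 4-byte entries (a' = 4); the names `table_offset` and `lookup_hrtf_address` and the 3 × 4 toy table are hypothetical. As assumptions of the sketch, elevation is scaled by (E - 1) / 180 as in formula (1), so that an elevation angle of 180 degrees maps to the last row, and the horizontal index wraps around modulo A.

```python
import math
import struct


def round_half_up(x):
    """Round a non-negative value to the nearest integer, halves upward."""
    return math.floor(x + 0.5)


def table_offset(elev, azim, A, E, a_prime=4, c=0):
    """Byte offset of the address-table element for (elev, azim)."""
    b = A * a_prime                               # one full circle of entries
    row = round_half_up(elev * (E - 1) / 180)     # assumed scaling, see lead-in
    col = round_half_up(azim * A / 360) % A       # assumed wrap-around
    return row * b + col * a_prime + c


def lookup_hrtf_address(table, elev, azim, A, E):
    """Read the HRTF database address stored in the selected element."""
    (address,) = struct.unpack_from("<I", table, table_offset(elev, azim, A, E))
    return address


# Toy table: E = 3 elevations x A = 4 horizontal angles.
A, E = 4, 3
entries = [0, 0, 0, 0,
           8192, 16384, 24576, 32768,
           40960, 40960, 40960, 40960]
table = struct.pack("<%dI" % len(entries), *entries)
addr = lookup_hrtf_address(table, 90, 100, A, E)
```

In the toy table every element of the first and last rows holds the same address, which mirrors the remark that several table elements may point to the same HRTF data near the zenith and nadir.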
In step S13, the HRTF reading unit 35 accesses the address, supplied from the address information reading unit 32, that identifies a data point in the HRTF database 36, reads the corresponding HRTF data, and supplies them to the HRTF calculation unit 37.
In step S14, the HRTF calculation unit 37 uses the supplied HRTF to apply HRTF filter processing to the audio signal supplied from the sound source output unit 38, generates a binaural signal, and outputs it to the headphones 39.
In step S15, the headphones 39 output sound based on the supplied binaural signal.
Through the above processing, even when the data points of the HRTF data are not arranged according to elevation angle and horizontal angle, the address of the data point holding each HRTF data item is stored as an element of an address table set up for each elevation angle and horizontal angle. The address is read according to the direction designated by the elevation angle and horizontal angle, and the HRTF data point at that address is accessed. The desired HRTF data can therefore be identified by the direction designated by elevation angle and horizontal angle as before, which enables the HRTF data to be read quickly.
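The read of step S13 might be sketched in Python as follows. The record layout assumed here (left channel followed by right channel, each stored as little-endian 32-bit floats) is not specified in the text; it is chosen only to match the 4 bytes per sample of the example. The name `read_hrtf` and the two-sample toy database are hypothetical.

```python
import struct


def read_hrtf(database, address, samples=1024):
    """Read one left/right HRTF pair starting at a byte address.

    Assumed layout: `samples` float32 values for the left channel,
    followed by `samples` float32 values for the right channel.
    """
    fmt = "<%df" % samples
    left = struct.unpack_from(fmt, database, address)
    right = struct.unpack_from(fmt, database, address + 4 * samples)
    return left, right


# Toy database: two records of 2 samples per channel (16 bytes each).
samples = 2
database = struct.pack("<8f", 1.0, 0.5, 0.25, 0.125, 2.0, 4.0, 8.0, 16.0)
left, right = read_hrtf(database, 16, samples)
```

Since the address comes directly from the address table, the read itself requires no search regardless of how the data points are distributed.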
The above describes an example in which the HRTF data are arranged on a sphere at uniform density. Even when the HRTF data are arranged according to some other criterion, the addresses at which the individual HRTF data are stored can be assigned, in the same way as described above, to an address table of pointer points such as that shown in FIG. 5. For example, even if the data points of the HRTF data are distributed densely in front of and behind the listener and sparsely to the left, right, above, and below, the data point of the HRTF data can be identified by specifying a direction consisting of an elevation angle and a horizontal angle, and the desired HRTF data can be read quickly from the HRTF database 36.
As for the arrangement of the HRTF data, besides the above, the density may be made high only toward the front and low in other directions. The sphere may be a plurality of spheres with different radii. Furthermore, the center of the sphere may be offset from the center of the head, and the surface need not be a perfect sphere.
<< 2. Second embodiment >>
The above describes an example in which the data points of the HRTF data in the HRTF database 36 are arranged at substantially uniform density as shown in FIG. 4. When no address information exists (that is, when no address table has been generated in which the addresses storing the HRTF data are assigned to each direction consisting of an elevation angle and a horizontal angle), an address table storing, for each pointer point of FIG. 5, the address identifying the corresponding HRTF data in the HRTF database 36 may be generated at the time the HRTF data are loaded, at the first start-up after the HRTF data are installed, or at a similar time.
That is, the data points of the HRTF data need not be arranged on a sphere or polyhedron as shown in FIG. 4; they may correspond to various directions and/or distances, or positions, as seen from the listener. The data points of the HRTF data may therefore be arranged not only on the sphere described above but also in a cylindrical coordinate system or in a three-axis coordinate space. The data points may be defined at substantially uniform density in space or at non-uniform density; for example, they may be defined at high density near the front of the listener and at low density toward the rear. For this reason, when assigning the addresses that serve as the pointer points, that is, the elements of the address table of FIG. 5, the HRTF data point closest to each pointer point needs to be associated with it in advance.
FIG. 8 is a block diagram illustrating a configuration example of an audio processing device 11 that, when no information on the addresses at which HRTF data are stored in the HRTF database 36 exists (that is, when no address table has been generated in which such addresses are assigned to each elevation angle and horizontal angle), generates an address table storing, for each element of the address table of FIG. 5, that is, for each direction specified by an elevation angle and a horizontal angle, the address identifying the corresponding HRTF data in the HRTF database 36. In the audio processing device 11 of FIG. 8, components having the same functions as those of the audio processing device 11 of FIG. 6 are given the same names and reference numerals, and their description is omitted as appropriate.
That is, the audio processing device 11 of FIG. 8 differs from the audio processing device 11 of FIG. 6 in that an address calculation unit 51 is additionally provided and an address information storage unit 52 is provided in place of the address information storage unit 34.
When no address table exists in which the addresses storing the HRTF data are assigned to each direction, the address calculation unit 51 reads the header information stored in the header information storage unit 33 and the HRTF database 36 at the time the HRTF data are loaded, at the first start-up after the HRTF data are installed, or at a similar time. It then generates in advance the address information for each element of the address table of FIG. 5, assigns to the address table the address at which the HRTF data corresponding to each direction specified by an elevation angle and a horizontal angle are stored, and stores the table in the address information storage unit 52.
In this case, the address information reading unit 32 reads the address at which the HRTF data are stored, held in the address information storage unit 52 in association with the elevation angle and horizontal angle information designating the HRTF data, based on, for example, the calculation result of formula (2), and supplies the read address to the HRTF reading unit 35. In other words, each time the elevation angle and horizontal angle information designating the HRTF data is updated, the address information reading unit 32 reads the address at which the HRTF data are stored from the address information storage unit 52, based on the header information and the calculation result of formula (2).
<Address information calculation processing by the audio processing device of FIG. 8>
Next, the address information calculation processing performed by the audio processing device 11 of FIG. 8 is described with reference to the flowchart of FIG. 9.
In step S31, the address calculation unit 51 determines whether new, unregistered HRTF data have been loaded (or whether the device has just started up), and repeats the same determination until it determines that new, unregistered HRTF data have been loaded (or that the device has just started up). When it is determined in step S31 that new, unregistered HRTF data have been loaded (or that the device has just started up), the processing proceeds to step S32.
In step S32, the address calculation unit 51 sets, as the direction to be processed, an unprocessed direction among the directions consisting of an elevation angle and a horizontal angle that correspond to the address points of the HRTF data in the address table of FIG. 5.
In step S33, the address calculation unit 51 identifies the HRTF data corresponding to the direction to be processed and identifies the corresponding address in the HRTF database 36.
In step S34, the address calculation unit 51 stores the address of the identified HRTF data in the address information storage unit 52 in association with the elevation angle and horizontal angle that specify the direction to be processed.
In step S35, the address calculation unit 51 determines whether any elevation angle and horizontal angle specifying an unprocessed direction remain; if so, the processing returns to step S32. That is, steps S32 to S35 are repeated until no unprocessed direction remains, at which point the processing ends.
Through the above processing, in step S34 of the flowchart of FIG. 9, the address at which the HRTF data for the direction corresponding to the selected elevation angle and horizontal angle are stored is stored in the address information storage unit 52 as the element of the address table for that direction.
Accordingly, in the audio processing of the flowchart of FIG. 7, the address information reading unit 32 accesses the address table in the address information storage unit 52 based on, for example, the calculation result of formula (2), reads the address stored for the direction identifying the HRTF data, and supplies it to the HRTF reading unit 35.
 結果として、HRTFデータを指定する仰角および水平角の情報に対応付けて記憶されているHRTFデータのアドレスの位置を読み出して、HRTF読込部35に供給することとなる。 As a result, the address position of the HRTF data stored in association with the elevation angle and horizontal angle information specifying the HRTF data is read and supplied to the HRTF reading unit 35.
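The loop of steps S32 to S35 can be pictured with a minimal Python sketch. Everything named here is hypothetical: the text does not say how the address calculation unit 51 matches a direction to stored HRTF data, so the sketch assumes a nearest-direction search, and it uses a dictionary keyed by (elevation, horizontal angle) in place of the index computed by equation (2), which is not reproduced in this section.

```python
import math

def angular_distance(elev1, azim1, elev2, azim2):
    """Great-circle angle in degrees between two directions given in degrees."""
    e1, a1, e2, a2 = map(math.radians, (elev1, azim1, elev2, azim2))
    cos_d = (math.sin(e1) * math.sin(e2)
             + math.cos(e1) * math.cos(e2) * math.cos(a1 - a2))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))

def build_address_table(elevations, azimuths, database):
    """Visit every grid direction once and record an address for it.

    `database` is a list of (elevation, azimuth, address) records. Picking the
    nearest stored direction is an assumption made for this sketch.
    """
    table = {}
    for elev in elevations:
        for azim in azimuths:
            nearest = min(
                database,
                key=lambda rec: angular_distance(elev, azim, rec[0], rec[1]))
            table[(elev, azim)] = nearest[2]  # corresponds to step S34
    return table

# Hypothetical sparse database: four directions on the horizontal plane, one overhead.
database = [(0, 0, 0x000), (0, 90, 0x100), (0, 180, 0x200),
            (0, 270, 0x300), (90, 0, 0x400)]
table = build_address_table(range(0, 91, 30), range(0, 360, 90), database)
```

Once the table is built, playback only needs a single lookup such as `table[(60, 90)]`, so no search is required at playback time.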
<< 3. Third Embodiment >>
The description so far has covered an example of efficiently reducing the spatial density at which HRTF data is defined. To reduce the density further, HRTF data for a direction in a spatially sparse region may be generated by interpolation from the surrounding HRTF data.
That is, as shown in FIG. 10 for example, suppose that HRTF_P at a point P, which is the desired HRTF data point, is to be obtained, and that the HRTFs of three HRTF data points Pa, Pb, and Pc stored in advance are HRTF_Pa, HRTF_Pb, and HRTF_Pc. HRTF_P at the point P may then be obtained by computing, for example, the following equation (3) using the respective interpolation coefficients α, β, and γ.
HRTF_P = HRTF_Pa × α + HRTF_Pb × β + HRTF_Pc × γ ... (3)
With this configuration, the HRTF database can be made even smaller.
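As a minimal sketch of equation (3), the following Python function forms the sample-wise weighted sum of three stored HRTF arrays. The function name, the sample values, and the choice of coefficients that sum to 1 are illustrative assumptions. The text does not state how α, β, and γ are chosen.

```python
def interpolate_hrtf(hrtf_a, hrtf_b, hrtf_c, alpha, beta, gamma):
    """Equation (3): sample-wise weighted sum of three equally long HRTF arrays."""
    if not (len(hrtf_a) == len(hrtf_b) == len(hrtf_c)):
        raise ValueError("HRTF data must have the same length")
    return [a * alpha + b * beta + c * gamma
            for a, b, c in zip(hrtf_a, hrtf_b, hrtf_c)]

# Illustrative data for points Pa, Pb, and Pc (not taken from the document).
hrtf_pa = [1.0, 0.0, 0.0, 0.0]
hrtf_pb = [0.0, 1.0, 0.0, 0.0]
hrtf_pc = [0.0, 0.0, 1.0, 0.0]
hrtf_p = interpolate_hrtf(hrtf_pa, hrtf_pb, hrtf_pc, 0.5, 0.25, 0.25)
```

Because only the three stored arrays and three scalars are needed, a direction without stored data costs three coefficients rather than a full HRTF data item.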
<Configuration Example of Third Embodiment>
The block diagram of FIG. 11 shows a configuration example of an audio processing device that interpolates HRTF data. In the audio processing device 11 of FIG. 11, components having the same functions as those of the audio processing device 11 of FIG. 6 are given the same names and reference numerals, and their description is omitted as appropriate.
That is, the audio processing device 11 of FIG. 11 differs from the audio processing device 11 of FIG. 6 in that an address information interpolation coefficient storage unit 71 is provided in place of the address information storage unit 34, and an interpolation calculation unit 72 is additionally provided.
The address information interpolation coefficient storage unit 71 has basically the same function as the address information storage unit 34, but additionally stores the interpolation coefficients needed for the interpolation calculation for each direction specified by an elevation angle and a horizontal angle. Basically, each direction specified by an elevation angle and a horizontal angle requires a combination of several HRTF data items and an interpolation coefficient for each of them. The address information interpolation coefficient storage unit 71 therefore stores, for each direction, the combination of addresses of the required HRTF data in association with the corresponding interpolation coefficients.
Note that this example describes interpolation using three HRTF data points, but any other plural number of HRTF data items may be used for interpolation. Accordingly, in this example, the address information reading unit 32 and the HRTF reading unit 35 always read three HRTF data points for each direction and supply them to the interpolation calculation unit 72.
The interpolation calculation unit 72 interpolates the HRTF data for the corresponding direction by computing, for example, a linear sum of the three HRTF data points specified for the direction given by the elevation angle and horizontal angle and the corresponding interpolation coefficients, and supplies the result to the HRTF calculation unit 37.
<Audio Processing by the Audio Processing Device of FIG. 11>
Next, audio processing by the audio processing device 11 of FIG. 11 will be described with reference to the flowchart of FIG. 12. The processing of steps S51, S56, and S57 in the flowchart of FIG. 12 is the same as that of steps S11, S14, and S15 described with reference to the flowchart of FIG. 7, so its description is omitted.
That is, in step S52, the address information reading unit 32 reads, from the address information interpolation coefficient storage unit 71, the addresses in the HRTF database 36 at which the three corresponding HRTF data points are stored, based on the selected elevation angle and horizontal angle, the header information stored in the header information storage unit 33, and equation (2), and supplies the addresses to the HRTF reading unit 35.
In step S53, the HRTF reading unit 35 accesses the HRTF database 36 according to the three addresses supplied from the address information reading unit 32, reads the three corresponding HRTF data points, and supplies them to the interpolation calculation unit 72.
In step S54, the interpolation calculation unit 72 reads the interpolation coefficients corresponding to the three HRTF data points from the address information interpolation coefficient storage unit 71.
In step S55, the interpolation calculation unit 72 interpolates the HRTF data for the desired direction by executing, for example, the interpolation calculation of equation (3) above, and supplies the result to the HRTF calculation unit 37.
In step S56, the HRTF calculation unit 37 generates a binaural signal by filtering the audio signal supplied from the sound source output unit 38 with the supplied interpolated HRTF data, and outputs the binaural signal to the headphones 39.
Through the above processing, highly accurate HRTF data can be obtained for each direction specified by an elevation angle and a horizontal angle while the data volume of the HRTF database is kept small, and the data can be convolved with the audio data.
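Steps S52 to S55 can be sketched in Python as a table lookup followed by a linear sum. The two dictionaries below are hypothetical stand-ins for the HRTF database 36 and the address information interpolation coefficient storage unit 71. The addresses, coefficients, and sample values are invented for illustration, and a dictionary key again replaces the index of equation (2).

```python
# Hypothetical stand-in for the HRTF database 36: address -> impulse response.
hrtf_database = {
    0x000: [1.0, 0.5, 0.0],
    0x100: [0.0, 1.0, 0.5],
    0x200: [0.5, 0.0, 1.0],
}

# Hypothetical stand-in for storage unit 71:
# direction -> (three addresses, three interpolation coefficients).
address_coeff_table = {
    (30, 45): ((0x000, 0x100, 0x200), (0.5, 0.25, 0.25)),
}

def hrtf_for_direction(elevation, azimuth):
    """Look up three stored responses and blend them as in equation (3)."""
    addresses, coeffs = address_coeff_table[(elevation, azimuth)]  # S52, S54
    stored = [hrtf_database[addr] for addr in addresses]           # S53
    length = len(stored[0])
    # S55: linear sum of the three stored responses.
    return [sum(c * h[i] for c, h in zip(coeffs, stored))
            for i in range(length)]

result = hrtf_for_direction(30, 45)
```

Storing the addresses and coefficients together keeps the per-direction work to one table access, three reads, and one weighted sum.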
The above describes an example in which interpolation is performed by multiplying three HRTFs by their respective interpolation coefficients and taking the linear sum, but a method other than a linear sum may be used. In addition, when a new HRTF database is constructed, the interpolation coefficients may be stored in the address information interpolation coefficient storage unit 71 together with the addresses at which the HRTF data is stored, for each element of the address table specified by an elevation angle and a horizontal angle. As with the address information calculation processing of the second embodiment, this may be done when the new HRTF database is loaded or at the first subsequent startup.
<< 4. Fourth Embodiment >>
The above describes an example in which HRTF data is interpolated by taking three HRTF data points, multiplying each by an interpolation coefficient, and computing the linear sum. However, when the HRTF data to be interpolated is held in the time domain, interpolating directly with the interpolation coefficients alone (for example, by a linear sum) may fail to produce the desired result because the pulse positions do not line up.
That is, as shown in the left part of FIG. 13, when HRTF data is defined for each of directions A to C and the pulse positions for directions A to C are offset by times t1 to t3, respectively, simple linear-sum interpolation may leave the pulse positions misaligned in the time direction.
Therefore, as shown in the right part of FIG. 13, the time-direction shift of the pulse position of each HRTF data item may be held as offset information and used to align the pulse positions on the time axis beforehand, so that the desired interpolation can be achieved. To align the pulse positions, the HRTF data may, for example, be offset in advance and stored in that form. Alternatively, the offset information may be held alongside the HRTF data so that the data is offset to a common position at interpolation time as needed.
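The effect of misaligned pulses can be seen in a small Python sketch. The single-pulse responses, the whole-sample offsets, and the helper names are illustrative assumptions. Real HRTF data would have longer responses and possibly fractional offsets.

```python
def linear_sum(responses, weights):
    """Sample-wise weighted sum of equally long impulse responses."""
    return [sum(w * r[i] for w, r in zip(weights, responses))
            for i in range(len(responses[0]))]

def advance(response, offset):
    """Shift a response earlier by `offset` samples, zero-padding the tail."""
    return response[offset:] + [0.0] * offset

# Three single-pulse responses whose pulses sit at samples 1, 2, and 3.
responses = [[0.0, 1.0, 0.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.0, 0.0],
             [0.0, 0.0, 0.0, 1.0, 0.0]]
offsets = [1, 2, 3]          # plays the role of t1, t2, t3
weights = [0.5, 0.25, 0.25]  # plays the role of the interpolation coefficients

# Without alignment the pulse is smeared over three samples.
naive = linear_sum(responses, weights)
# With alignment the three pulses coincide and add up to one clean pulse.
aligned = linear_sum([advance(r, t) for r, t in zip(responses, offsets)],
                     weights)
```

Here `naive` spreads the energy across samples 1 to 3, while `aligned` keeps a single pulse at sample 0, which is the behavior the offset information is meant to preserve.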
<Configuration Example of Fourth Embodiment>
The block diagram of FIG. 14 shows a configuration example of an audio processing device that interpolates HRTF data after offsetting the pulse positions in the time direction. In the audio processing device 11 of FIG. 14, components having the same functions as those of the audio processing device 11 of FIG. 11 are given the same names and reference numerals, and their description is omitted as appropriate.
That is, the audio processing device 11 of FIG. 14 differs from the audio processing device 11 of FIG. 11 in that an address information interpolation coefficient offset information storage unit 101 and an interpolation calculation unit 102 are provided in place of the address information interpolation coefficient storage unit 71 and the interpolation calculation unit 72.
In addition to the functions of the address information interpolation coefficient storage unit 71, the address information interpolation coefficient offset information storage unit 101 stores time-direction offset information on the pulse position for each direction of the HRTF data. That is, as shown in the right part of FIG. 13 for example, the address information interpolation coefficient offset information storage unit 101 stores times t1 to t3 as the offset information for directions A to C, respectively.
The interpolation calculation unit 102 is basically the same as the interpolation calculation unit 72. In addition, based on the address information supplied from the address information reading unit 32, it reads the spatial-direction interpolation coefficients, such as α, β, and γ in equation (3) above, and the time-direction offset information, interpolates HRTF data from the three HRTF data points, and supplies the result to the HRTF calculation unit 37.
<Audio Processing by the Audio Processing Device of FIG. 14>
Next, audio processing by the audio processing device of FIG. 14 will be described with reference to the flowchart of FIG. 15. The processing of steps S91, S97, and S98 in the flowchart of FIG. 15 is the same as that of steps S11, S14, and S15 described with reference to the flowchart of FIG. 7, so its description is omitted.
That is, in step S92, the address information reading unit 32 reads the address information for the three corresponding points from the address information interpolation coefficient offset information storage unit 101 based on the selected elevation angle and horizontal angle, calculates, together with the header information stored in the header information storage unit 33, the addresses in the HRTF database 36 at which the desired HRTFs are stored, and supplies the addresses to the HRTF reading unit 35 and the interpolation calculation unit 102.
In step S93, the HRTF reading unit 35 accesses the corresponding addresses of the HRTF database 36 according to the three addresses supplied from the address information reading unit 32, reads the three corresponding HRTF data points, and supplies them to the interpolation calculation unit 102.
In step S94, the interpolation calculation unit 102 reads the spatial-direction interpolation coefficients corresponding to the three HRTF data points from the address information interpolation coefficient offset information storage unit 101. The spatial-direction interpolation coefficients here are the interpolation coefficients corresponding to α, β, and γ in equation (3) above.
In step S95, the interpolation calculation unit 102 reads the offset information corresponding to the three HRTF data points from the address information interpolation coefficient offset information storage unit 101.
In step S96, the interpolation calculation unit 102 interpolates the HRTF data for the desired direction by executing an interpolation calculation that takes the time-direction offset information and the interpolation coefficients into account, and supplies the result to the HRTF calculation unit 37.
More specifically, the interpolation calculation unit 102 first corrects the HRTF data so that the pulse positions are aligned in the time direction and then performs the spatial-direction interpolation by equation (3) above using the interpolation coefficients. It then calculates time-direction interpolation coefficients p, q, and r, which are set based on the time-direction offset information t1, t2, and t3 as shown in FIG. 16, and uses them to obtain the time-direction shift time T by the following equation (4).
T = t1 × p + t2 × q + t3 × r ... (4)
That is, the interpolation calculation unit 102 supplies to the HRTF calculation unit 37 the HRTF data that is the interpolation result obtained by equation (3) above, corrected by the offset of the time-direction shift time T obtained by equation (4).
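The sequence described for step S96 (align, interpolate with equation (3), then offset by T from equation (4)) might look like the following Python sketch. Several points are assumptions made for illustration: offsets are whole samples, T is rounded to the nearest sample, and the time-direction coefficients are set equal to the spatial ones. The function name and data are hypothetical.

```python
def aligned_interpolation(responses, offsets, space_weights, time_weights):
    """Align pulses, apply equation (3), then delay by T from equation (4)."""
    length = len(responses[0])
    # Remove each response's own offset so that all pulses line up.
    aligned = [r[t:] + [0.0] * t for r, t in zip(responses, offsets)]
    # Equation (3): spatial linear sum of the aligned responses.
    mixed = [sum(w * r[i] for w, r in zip(space_weights, aligned))
             for i in range(length)]
    # Equation (4): T = t1*p + t2*q + t3*r, rounded to whole samples
    # (the rounding is an assumption of this sketch).
    shift = round(sum(t * w for t, w in zip(offsets, time_weights)))
    # Re-apply the interpolated delay, keeping the original length.
    delayed = ([0.0] * shift + mixed)[:length]
    return delayed, shift

responses = [[0.0, 1.0, 0.5, 0.0, 0.0, 0.0],
             [0.0, 0.0, 1.0, 0.5, 0.0, 0.0],
             [0.0, 0.0, 0.0, 1.0, 0.5, 0.0]]
hrtf, shift = aligned_interpolation(responses, [1, 2, 3],
                                    [0.5, 0.25, 0.25], [0.5, 0.25, 0.25])
```

With these values T is 1.75 samples, which rounds to 2, so the interpolated response keeps its shape and starts at sample 2.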
In step S97, the HRTF calculation unit 37 generates a binaural signal by filtering the audio signal supplied from the sound source output unit 38 with the supplied interpolated HRTF data, and outputs the binaural signal to the headphones 39.
Alternatively, the result of the spatial-direction interpolation in step S96 may be supplied to the HRTF calculation unit 37 without the correction for the shift time T, together with the information on the time T, and the time correction by T may be applied after the filtering in the HRTF calculation unit 37.
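The two orderings give the same output because a pure delay commutes with convolution. The Python sketch below checks this for whole-sample delays. The signal, the response, and the helper names are illustrative assumptions.

```python
def convolve(signal, kernel):
    """Direct-form linear convolution of two lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out

def delay(samples, shift):
    """Delay by `shift` whole samples, extending the output length."""
    return [0.0] * shift + list(samples)

audio = [1.0, 2.0, 3.0]   # illustrative sound source signal
aligned_hrtf = [1.0, 0.5]  # interpolated response before the T correction
shift = 2                  # shift time T in samples

# Correct for T inside the HRTF data, then filter.
delay_in_filter = convolve(audio, delay(aligned_hrtf, shift))
# Filter with the uncorrected HRTF data, then correct for T.
delay_after_filter = delay(convolve(audio, aligned_hrtf), shift)
```

Applying the delay after filtering lets the filter work on the shorter, uncorrected response, which may be convenient when T changes often.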
Through the above processing, HRTF data can be obtained with high accuracy in the spatial direction while the data volume of the HRTF database is kept small, and the audio data can be convolved with the timing adjusted for the time-direction shift time T.
The above describes an example in which the interpolation calculation unit 102 calculates the time-direction interpolation coefficients. As with the spatial-direction interpolation coefficients, however, they may be stored in advance in the address information interpolation coefficient offset information storage unit 101 and read out for use in the interpolation calculation. The time-direction interpolation coefficients p, q, and r may also be replaced with the spatial-direction interpolation coefficients α, β, and γ, or derived from α, β, and γ by some formula. Conversely, the spatial-direction interpolation coefficients α, β, and γ may be replaced with the time-direction interpolation coefficients p, q, and r, or derived from p, q, and r by some formula.
The above describes an example in which a direction is specified by an elevation angle and a horizontal angle in polar coordinates, but other parameters may be used as long as the direction can be specified. For example, any two angles that specify a direction in polar coordinates may be used. The predetermined direction may also be an intersection of any grid set in any coordinate system.
<< 5. Application Example >>
<Example Executed by Software>
The series of processes described above can be executed by hardware, but can also be executed by software. When the series of processes is executed by software, the program constituting the software is installed from a recording medium onto a computer built into dedicated hardware, or onto, for example, a general-purpose personal computer that can execute various functions when various programs are installed.
FIG. 17 shows a configuration example of a general-purpose personal computer. The personal computer has a built-in CPU (Central Processing Unit) 1001. An input/output interface 1005 is connected to the CPU 1001 via a bus 1004. A ROM (Read Only Memory) 1002 and a RAM (Random Access Memory) 1003 are connected to the bus 1004.
Connected to the input/output interface 1005 are an input unit 1006 made up of input devices such as a keyboard and a mouse with which the user enters operation commands, an output unit 1007 that outputs processing operation screens and images of processing results to a display device, a storage unit 1008 made up of a hard disk drive or the like that stores programs and various data, and a communication unit 1009 made up of a LAN (Local Area Network) adapter or the like that executes communication processing via a network typified by the Internet. Also connected is a drive 1010 that reads and writes data from and to a removable medium 1011 such as a magnetic disk (including a flexible disk), an optical disc (including a CD-ROM (Compact Disc-Read Only Memory) and a DVD (Digital Versatile Disc)), a magneto-optical disk (including an MD (Mini Disc)), or a semiconductor memory.
The CPU 1001 executes various processes according to a program stored in the ROM 1002, or according to a program that is read from the removable medium 1011 such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, installed in the storage unit 1008, and loaded from the storage unit 1008 into the RAM 1003. The RAM 1003 also stores, as appropriate, data that the CPU 1001 needs to execute the various processes.
In the computer configured as described above, the series of processes described above is performed when the CPU 1001, for example, loads a program stored in the storage unit 1008 into the RAM 1003 via the input/output interface 1005 and the bus 1004 and executes it.
The program executed by the computer (CPU 1001) can be provided by being recorded on the removable medium 1011 as a package medium or the like. The program can also be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
In the computer, the program can be installed in the storage unit 1008 via the input/output interface 1005 by loading the removable medium 1011 into the drive 1010. The program can also be received by the communication unit 1009 via a wired or wireless transmission medium and installed in the storage unit 1008. Alternatively, the program can be installed in advance in the ROM 1002 or the storage unit 1008.
The program executed by the computer may be a program in which the processes are performed in time series in the order described in this specification, or a program in which the processes are performed in parallel or at the necessary timing, such as when a call is made.
In this specification, a system means a set of a plurality of components (devices, modules (parts), and the like), regardless of whether all the components are in the same housing. Accordingly, a plurality of devices housed in separate housings and connected via a network, and a single device in which a plurality of modules are housed in one housing, are both systems.
The embodiments of the present disclosure are not limited to the embodiments described above, and various modifications can be made without departing from the gist of the present disclosure.
For example, the present disclosure can take a cloud computing configuration in which one function is shared and jointly processed by a plurality of devices via a network.
Each step described in the flowcharts above can be executed by one device or shared among a plurality of devices.
Furthermore, when one step includes a plurality of processes, the plurality of processes included in that step can be executed by one device or shared among a plurality of devices.
 尚、本開示は、以下のような構成も取ることができる。
<1> 複数の方向に対応するHRTFデータを保持するHRTFデータ保持部と、
 所定の方向に対応する前記HRTFデータとの関係を示す関係情報に基づいて、前記所定の方向に対応する前記HRTFデータを、前記HRTFデータ保持部より読み込むHRTFデータ読込部と
 を含む音声処理装置。
<2> 前記所定の方向は、聴取者に対する方向、および/もしくは距離、または位置を含む
 <1>に記載の音声処理装置。
<3> 前記所定の方向は、聴取者の頭部を中心とした任意の極座標の方向を特定する2つの角度により特定される方向である
 <1>または<2>に記載の音声処理装置。
<4> 前記任意の極座標の方向を特定する2つの角度は、前記聴取者の頭部を中心とした仰角および水平角である
 <3>に記載の音声処理装置。
<5> 前記所定の方向は、前記複数の方向のいずれの方向にも一致しない
 <1>乃至<4>のいずれかに記載の音声処理装置。
<6> 前記所定の方向は任意の座標系の任意のグリットの交点である
 <1>乃至<5>のいずれかに記載の音声処理装置。
<7> 前記HRTFデータは、複数の方向に対して空間的に略等密度または略非等密度となるように定義されている
 <1>乃至<6>のいずれかに記載の音声処理装置。
<8> 前記HRTFデータは、複数の方向に対応して、前記空間的に、略球面状に略等密度または略非等密度となるように定義されている
 <7>に記載の音声処理装置。
<9> 前記HRTFデータは、複数の方向に対して空間的に略非等密度となるように、前記聴取者に対する複数の位置関係に応じて定義されている
 <1>乃至<8>のいずれかに記載の音声処理装置。
<10> 前記関係情報は、前記所定の方向と、対応する前記HRTFデータ保持部における前記HRTFデータが格納されたアドレスとの関係を示す情報であり、
 前記HRTFデータ読込部は、前記関係情報に基づいて、前記所定の方向に対するアドレスのHRTFデータを読み込む
 <1>乃至<9>のいずれかに記載の音声処理装置。
<11> 前記HRTFデータ読込部は、前記所定の方向に対応する前記HRTFデータとの関係を示す関係情報に基づいて、前記所定の方向に対応する複数の前記HRTFデータを、前記HRTFデータ保持部より読み込み、
 前記所定の方向に対応する前記複数のHRTFデータに基づいて、前記所定の方向に対応する補間HRTFデータを補間生成する補間部をさらに含む
 <1>乃至<10>のいずれかに記載の音声処理装置。
<12> 前記補間部は、前記複数のHRTFデータと、前記複数のHRTFデータのそれぞれに対応する補間係数とに基づいて、前記所定の方向に対応するHRTFデータを補間生成する
 <11>に記載の音声処理装置。
<13> 前記補間部は、前記複数のHRTFデータと、前記複数のHRTFデータのそれぞれに対応する空間方向の補間係数との線形和を算出することで、前記所定の方向に対応するHRTFデータを補間生成する
 <12>に記載の音声処理装置。
<14> 前記補間部は、前記複数のHRTFデータ、前記複数のHRTFデータにおけるパルス位置のずれを示す時間方向のオフセットの情報、並びに、前記複数のHRTFデータのそれぞれに対応する空間方向、および時間方向の補間係数に基づいて、前記所定の方向に対応するHRTFデータを補間生成する
 <13>に記載の音声処理装置。
<15> 前記補間部は、前記複数のHRTFデータにおける前記時間方向のオフセットを揃え、前記複数のHRTFデータのそれぞれに対応する空間方向の補間係数との線形和を算出し、前記複数のHRTFデータにおける前記時間方向のオフセットと、前記時間方向の補間係数との線形和として算出された時間だけオフセットした前記HRTFデータを、前記所定の方向に対応するHRTFデータとして出力する
 <14>に記載の音声処理装置。
<16> HRTFデータ読込部により読み込まれる前記HRTFデータを、音源の信号に畳み込んでバイノーラル信号を生成するバイノーラル信号生成部をさらに含む
 <1>乃至<15>のいずれかに記載の音声処理装置。
<17> 所定の方向に対応する前記HRTFデータとの関係を示す関係情報に基づいて、前記所定の方向に対応する前記HRTFデータを、複数の方向に対応するHRTFデータを保持するHRTFデータ保持部より読み込むステップ
 を含む音声処理方法。
<18> 複数の方向に対応するHRTFデータを保持するHRTFデータ保持部と、
 所定の方向に対応する前記HRTFデータとの関係を示す関係情報に基づいて、前記所定の方向に対応する前記HRTFデータを、前記HRTFデータ保持部より読み込むHRTFデータ読込部と
 してコンピュータを機能させるプログラム。
In addition, this indication can also take the following structures.
<1> an HRTF data holding unit for holding HRTF data corresponding to a plurality of directions;
A speech processing apparatus comprising: an HRTF data reading unit that reads the HRTF data corresponding to the predetermined direction from the HRTF data holding unit based on relationship information indicating a relationship with the HRTF data corresponding to a predetermined direction.
<2> The audio processing device according to <1>, wherein the predetermined direction includes a direction and / or a distance or a position with respect to a listener.
<3> The audio processing device according to <1> or <2>, wherein the predetermined direction is a direction specified by two angles that specify an arbitrary polar coordinate direction centered on a listener's head.
<4> The audio processing device according to <3>, wherein the two angles that specify the direction of the arbitrary polar coordinate are an elevation angle and a horizontal angle centered on the listener's head.
<5> The voice processing device according to any one of <1> to <4>, wherein the predetermined direction does not coincide with any of the plurality of directions.
<6> The speech processing apparatus according to any one of <1> to <5>, wherein the predetermined direction is an intersection of arbitrary grids in an arbitrary coordinate system.
<7> The speech processing apparatus according to any one of <1> to <6>, wherein the HRTF data is defined to be spatially substantially equal density or substantially unequal density in a plurality of directions.
<8> The audio processing apparatus according to <7>, wherein the HRTF data is defined so as to have a substantially spherical or substantially equal density or substantially non-uniform density corresponding to a plurality of directions. .
<9> The HRTF data is defined according to a plurality of positional relationships with the listener so as to be spatially substantially non-uniform in a plurality of directions. Any one of <1> to <8> A voice processing apparatus according to claim 1.
<10> The relationship information is information indicating a relationship between the predetermined direction and an address at which the HRTF data is stored in the corresponding HRTF data holding unit,
The speech processing apparatus according to any one of <1> to <9>, wherein the HRTF data reading unit reads HRTF data of an address with respect to the predetermined direction based on the relationship information.
<11> The HRTF data reading unit, based on relation information indicating a relationship with the HRTF data corresponding to the predetermined direction, a plurality of the HRTF data corresponding to the predetermined direction, the HRTF data holding unit Read more,
The speech processing according to any one of <1> to <10>, further including an interpolating unit that interpolates and generates interpolated HRTF data corresponding to the predetermined direction based on the plurality of HRTF data corresponding to the predetermined direction. apparatus.
<12> The interpolation unit generates the HRTF data corresponding to the predetermined direction by interpolation based on the plurality of HRTF data and an interpolation coefficient corresponding to each of the plurality of HRTF data. Voice processing device.
<13> The interpolation unit calculates a linear sum of the plurality of HRTF data and an interpolation coefficient in a spatial direction corresponding to each of the plurality of HRTF data, thereby obtaining HRTF data corresponding to the predetermined direction. The voice processing device according to <12>, wherein interpolation generation is performed.
<14> The audio processing device according to <13>, wherein the interpolation unit generates, by interpolation, the HRTF data corresponding to the predetermined direction based on the plurality of HRTF data, time-direction offset information indicating a shift in pulse position among the plurality of HRTF data, and spatial-direction and time-direction interpolation coefficients corresponding to each of the plurality of HRTF data.
<15> The audio processing device according to <14>, wherein the interpolation unit aligns the time-direction offsets of the plurality of HRTF data, calculates a linear sum of the aligned HRTF data and the spatial-direction interpolation coefficients corresponding to each of the plurality of HRTF data, and outputs, as the HRTF data corresponding to the predetermined direction, the resulting HRTF data offset by a time calculated as a linear sum of the time-direction offsets of the plurality of HRTF data and the time-direction interpolation coefficients.
<16> The audio processing device according to any one of <1> to <15>, further including a binaural signal generation unit that generates a binaural signal by convolving the HRTF data read by the HRTF data reading unit with a sound source signal.
<17> An audio processing method including a step of reading, based on relationship information indicating a relationship with HRTF data corresponding to a predetermined direction, the HRTF data corresponding to the predetermined direction from an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions.
<18> A program that causes a computer to function as: an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions; and
an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit, based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
 11 audio processing device, 31 direction selection unit, 32 pointer calculation unit, 33 header information storage unit, 34 address information storage unit, 35 HRTF reading unit, 36 HRTF database, 37 HRTF calculation unit, 38 sound source output unit, 39 headphones, 51 address calculation unit, 52 address information storage unit, 71 address information interpolation coefficient storage unit, 72 interpolation calculation unit, 101 address information interpolation coefficient offset information storage unit, 102 interpolation calculation unit
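
As an illustration of embodiments <10> to <13>, the following minimal Python sketch looks up stored HRTF data through relationship information that maps a requested direction to addresses, then forms the linear sum with spatial interpolation coefficients. The flat store layout, the names `hrtf_store`, `relation_info`, `read_hrtf` and `hrtf_for_direction`, the tap count, and all numeric values are assumptions made for this example; the disclosure does not specify them.

```python
from typing import Dict, List, Tuple

TAPS = 4  # hypothetical HRTF length in samples (real data is much longer)

# Hypothetical flat HRTF store: each measured direction occupies TAPS samples.
hrtf_store: List[float] = [
    1.0, 0.5, 0.0, 0.0,  # address 0
    0.0, 1.0, 0.5, 0.0,  # address 4
    0.5, 0.5, 0.5, 0.5,  # address 8
]

# Hypothetical relationship information: requested direction
# (azimuth, elevation in degrees) -> list of (address, spatial coefficient).
relation_info: Dict[Tuple[int, int], List[Tuple[int, float]]] = {
    (0, 0): [(0, 1.0)],                           # coincides with a stored direction
    (10, 0): [(0, 0.5), (4, 0.5)],                # between two stored directions
    (10, 10): [(0, 0.25), (4, 0.25), (8, 0.5)],   # between three stored directions
}


def read_hrtf(address: int) -> List[float]:
    """Read one HRTF record directly at the given address."""
    return hrtf_store[address:address + TAPS]


def hrtf_for_direction(azimuth: int, elevation: int) -> List[float]:
    """Return the HRTF for a direction as a linear sum of stored records."""
    result = [0.0] * TAPS
    for address, coeff in relation_info[(azimuth, elevation)]:
        record = read_hrtf(address)
        for i in range(TAPS):
            result[i] += coeff * record[i]
    return result
```

With these assumed values, `hrtf_for_direction(10, 0)` averages the records at addresses 0 and 4. Because the table already holds the addresses, no search over the stored directions is needed at read time.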

Claims (18)

  1.  An audio processing device comprising:
     an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions; and
     an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit, based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
  2.  The audio processing device according to claim 1, wherein the predetermined direction includes a direction and/or a distance, or a position, relative to a listener.
  3.  The audio processing device according to claim 1, wherein the predetermined direction is a direction specified by two angles that specify a direction in arbitrary polar coordinates centered on a listener's head.
  4.  The audio processing device according to claim 3, wherein the two angles that specify the direction in the arbitrary polar coordinates are an elevation angle and a horizontal angle centered on the listener's head.
  5.  The audio processing device according to claim 1, wherein the predetermined direction does not coincide with any of the plurality of directions.
  6.  The audio processing device according to claim 1, wherein the predetermined direction is an intersection of an arbitrary grid in an arbitrary coordinate system.
  7.  The audio processing device according to claim 1, wherein the HRTF data is defined so as to be of spatially substantially equal density or substantially unequal density with respect to the plurality of directions.
  8.  The audio processing device according to claim 7, wherein the HRTF data is defined so as to be, spatially, of substantially equal density or substantially unequal density on a substantially spherical surface, corresponding to the plurality of directions.
  9.  The audio processing device according to claim 1, wherein the HRTF data is defined according to a plurality of positional relationships with the listener so as to be of spatially substantially unequal density with respect to the plurality of directions.
  10.  The audio processing device according to claim 1, wherein
     the relationship information is information indicating a relationship between the predetermined direction and the address at which the corresponding HRTF data is stored in the HRTF data holding unit, and
     the HRTF data reading unit reads the HRTF data at the address for the predetermined direction based on the relationship information.
  11.  The audio processing device according to claim 1, wherein
     the HRTF data reading unit reads a plurality of pieces of the HRTF data corresponding to the predetermined direction from the HRTF data holding unit, based on the relationship information indicating the relationship with the HRTF data corresponding to the predetermined direction, and
     the audio processing device further comprises an interpolation unit that generates, by interpolation, interpolated HRTF data corresponding to the predetermined direction based on the plurality of HRTF data corresponding to the predetermined direction.
  12.  The audio processing device according to claim 11, wherein the interpolation unit generates, by interpolation, the HRTF data corresponding to the predetermined direction based on the plurality of HRTF data and an interpolation coefficient corresponding to each of the plurality of HRTF data.
  13.  The audio processing device according to claim 12, wherein the interpolation unit generates, by interpolation, the HRTF data corresponding to the predetermined direction by calculating a linear sum of the plurality of HRTF data and the spatial-direction interpolation coefficients corresponding to each of the plurality of HRTF data.
  14.  The audio processing device according to claim 11, wherein the interpolation unit generates, by interpolation, the HRTF data corresponding to the predetermined direction based on the plurality of HRTF data, time-direction offset information indicating a shift in pulse position among the plurality of HRTF data, and spatial-direction and time-direction interpolation coefficients corresponding to each of the plurality of HRTF data.
  15.  The audio processing device according to claim 14, wherein the interpolation unit aligns the time-direction offsets of the plurality of HRTF data, calculates a linear sum of the aligned HRTF data and the spatial-direction interpolation coefficients corresponding to each of the plurality of HRTF data, and outputs, as the HRTF data corresponding to the predetermined direction, the resulting HRTF data offset by a time calculated as a linear sum of the time-direction offsets of the plurality of HRTF data and the time-direction interpolation coefficients.
  16.  The audio processing device according to claim 1, further comprising a binaural signal generation unit that generates a binaural signal by convolving the HRTF data read by the HRTF data reading unit with a sound source signal.
  17.  An audio processing method comprising a step of reading, based on relationship information indicating a relationship with HRTF data corresponding to a predetermined direction, the HRTF data corresponding to the predetermined direction from an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions.
  18.  A program that causes a computer to function as:
     an HRTF data holding unit that holds HRTF data corresponding to a plurality of directions; and
     an HRTF data reading unit that reads the HRTF data corresponding to a predetermined direction from the HRTF data holding unit, based on relationship information indicating a relationship with the HRTF data corresponding to the predetermined direction.
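
The offset-aware interpolation of claims 14 and 15 and the convolution of claim 16 can be sketched roughly as follows. This is an illustrative Python sketch only: the function names `shift`, `interpolate_with_offsets`, `convolve` and `binaural`, the use of whole-sample offsets, and the rounding of the interpolated delay are assumptions of this example, not details taken from the claims.

```python
from typing import List, Sequence, Tuple


def shift(signal: Sequence[float], delay: int, length: int) -> List[float]:
    """Delay (or advance, if negative) by whole samples, zero-padded."""
    out = [0.0] * length
    for i, value in enumerate(signal):
        j = i + delay
        if 0 <= j < length:
            out[j] = value
    return out


def interpolate_with_offsets(
    hrirs: Sequence[Sequence[float]],
    offsets: Sequence[int],           # pulse position of each response, in samples
    spatial_coeffs: Sequence[float],
    time_coeffs: Sequence[float],
) -> List[float]:
    length = len(hrirs[0])
    # 1. Align: remove each response's own delay so the pulses coincide.
    aligned = [shift(h, -d, length) for h, d in zip(hrirs, offsets)]
    # 2. Linear sum of the aligned responses with the spatial coefficients.
    mixed = [
        sum(c * a[i] for c, a in zip(spatial_coeffs, aligned))
        for i in range(length)
    ]
    # 3. Interpolated delay: linear sum of offsets and time coefficients,
    #    rounded to whole samples (an assumption of this sketch).
    delay = round(sum(c * d for c, d in zip(time_coeffs, offsets)))
    return shift(mixed, delay, length)


def convolve(signal: Sequence[float], hrir: Sequence[float]) -> List[float]:
    """Direct-form convolution of a source signal with one impulse response."""
    out = [0.0] * (len(signal) + len(hrir) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(hrir):
            out[i + j] += s * h
    return out


def binaural(
    signal: Sequence[float],
    hrir_left: Sequence[float],
    hrir_right: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Convolve one source with the left-ear and right-ear responses."""
    return convolve(signal, hrir_left), convolve(signal, hrir_right)
```

For two responses whose pulses sit at samples 1 and 3, equal coefficients yield a single pulse at sample 2. A plain linear sum without alignment would instead leave two half-height pulses, which is the smearing the alignment step avoids.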
PCT/JP2017/001853 2016-02-04 2017-01-20 Audio processing device, audio processing method and program WO2017135063A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2016-019659 2016-02-04
JP2016019659 2016-02-04

Publications (1)

Publication Number Publication Date
WO2017135063A1 (en) 2017-08-10

Family

ID=59500684

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2017/001853 WO2017135063A1 (en) 2016-02-04 2017-01-20 Audio processing device, audio processing method and program

Country Status (1)

Country Link
WO (1) WO2017135063A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020026548A1 (en) * 2018-07-31 2020-02-06 ソニー株式会社 Information processing device, information processing method, and acoustic system
CN114598985A (en) * 2022-03-07 2022-06-07 安克创新科技股份有限公司 Audio processing method and device
US20230081104A1 (en) * 2021-09-14 2023-03-16 Sound Particles S.A. System and method for interpolating a head-related transfer function
US11785385B2 (en) 2021-02-09 2023-10-10 Yamaha Corporation Shoulder-mounted speaker, sound image localization method, and non-transitory computer-readable medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1042399A (en) * 1996-02-13 1998-02-13 Sextant Avionique Voice space system and individualizing method for executing it
JP2003284196A (en) * 2002-03-20 2003-10-03 Sony Corp Sound image localizing signal processing apparatus and sound image localizing signal processing method
JP2004279241A (en) * 2003-03-17 2004-10-07 Internatl Business Mach Corp <Ibm> System and method for capturing sound source position, sound reflective factor to be used for the system, and its forming method
JP2013211906A (en) * 2007-03-01 2013-10-10 Mahabub Jerry Sound spatialization and environment simulation

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020026548A1 (en) * 2018-07-31 2020-02-06 ソニー株式会社 Information processing device, information processing method, and acoustic system
CN112368768A (en) * 2018-07-31 2021-02-12 索尼公司 Information processing apparatus, information processing method, and acoustic system
US11659347B2 (en) 2018-07-31 2023-05-23 Sony Corporation Information processing apparatus, information processing method, and acoustic system
US11785385B2 (en) 2021-02-09 2023-10-10 Yamaha Corporation Shoulder-mounted speaker, sound image localization method, and non-transitory computer-readable medium
US20230081104A1 (en) * 2021-09-14 2023-03-16 Sound Particles S.A. System and method for interpolating a head-related transfer function
US12035126B2 (en) * 2021-09-14 2024-07-09 Sound Particles S.A. System and method for interpolating a head-related transfer function
CN114598985A (en) * 2022-03-07 2022-06-07 安克创新科技股份有限公司 Audio processing method and device
CN114598985B (en) * 2022-03-07 2024-05-03 安克创新科技股份有限公司 Audio processing method and device

Similar Documents

Publication Publication Date Title
JP7367785B2 (en) Audio processing device and method, and program
US12022277B2 (en) Method for generating customized spatial audio with head tracking
WO2017135063A1 (en) Audio processing device, audio processing method and program
US20110286601A1 (en) Audio signal processing device and audio signal processing method
WO2018008396A1 (en) Acoustic field formation device, method, and program
US10595148B2 (en) Sound processing apparatus and method, and program
CN108476365B (en) Audio processing apparatus and method, and storage medium
JP6834985B2 (en) Speech processing equipment and methods, and programs
US11004457B2 (en) Sound reproducing method, apparatus and non-transitory computer readable storage medium thereof
CN110832884B (en) Signal processing apparatus and method, and computer-readable storage medium
WO2020196004A1 (en) Signal processing device and method, and program
JP7260821B2 (en) Signal processing device, signal processing method and signal processing program
WO2021131385A1 (en) Sound equipment, sound processing method and recording medium
CN116600242B (en) Audio sound image optimization method and device, electronic equipment and storage medium
WO2023085186A1 (en) Information processing device, information processing method, and information processing program
Wang et al. Extension of the real-time Simulated Open Field Environment for fast binaural rendering
JP2023159690A (en) Signal processing apparatus, method for controlling signal processing apparatus, and program
CN116965062A (en) Clustering audio objects
CN116193196A (en) Virtual surround sound rendering method, device, equipment and storage medium
JP2022034267A (en) Binaural reproduction device and program

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 17747222; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 17747222; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)