WO2022220181A1 - Information processing method, information processing device, and program - Google Patents
- Publication number
- WO2022220181A1 (application PCT/JP2022/017167)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- structures
- sound
- information processing
- simplified
- shape
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
- H04S7/303—Tracking of listener position or orientation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/012—Head tracking input arrangements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/01—Enhancing the perception of the sound image or of the spatial distribution using head related transfer functions [HRTF's] or equivalents thereof, e.g. interaural time difference [ITD] or interaural level difference [ILD]
Definitions
- the present disclosure relates to an information processing method, an information processing device, and a program for reproducing stereophonic sound.
- the present disclosure provides an information processing method and the like that can reduce the processing load required to reproduce stereophonic sound.
- An information processing method acquires spatial information for reproducing a virtual space that includes a sound source and a first structure arranged in the virtual space; generates a plurality of second structures, each having a simplified shape that simplifies the shape of the first structure, where each simplified shape is a combination of one or more three-dimensional shapes among a plurality of predetermined types of simple three-dimensional shapes; calculates, for each of the plurality of second structures, a reflection index value related to the sound reflection efficiency of that second structure, thereby obtaining a plurality of reflection index values respectively corresponding to the plurality of second structures; selects one second structure from among the plurality of second structures based on the plurality of reflection index values; and generates a simplified space, in which the three-dimensional shape of the first structure is simplified, by replacing the first structure with the selected second structure.
- An information processing apparatus includes a processor and a memory. Using the memory, the processor acquires spatial information for reproducing a virtual space that includes a sound source and a first structure arranged in the virtual space; generates a plurality of second structures, each having a simplified shape that simplifies the shape of the first structure; calculates, for each of the plurality of second structures, a reflection index value related to the sound reflection efficiency of that second structure, thereby obtaining a plurality of reflection index values respectively corresponding to the plurality of second structures; selects one second structure from among the plurality of second structures based on the plurality of reflection index values; and generates a simplified space, in which the three-dimensional shape of the first structure is simplified, by replacing the first structure with the selected second structure.
- the information processing method and the like according to the present disclosure can reduce the processing load required for reproducing stereophonic sound.
- FIG. 2 is a table showing the weight pre-associated with each simple three-dimensional shape.
- FIG. 3 is a flowchart showing an example of the operation of the information processing device.
- FIG. 4 is a flowchart showing an example of processing for simplifying a virtual space.
- FIG. 5 is a diagram showing an example of a virtual space.
- Conventional sound reproduction technologies include methods based on wave acoustic theory that faithfully reproduce physical characteristics, such as the boundary element method described in Patent Document 1, and methods based on geometrical acoustics, such as the sound ray method.
- Methods based on wave acoustic theory have the problem that, for a complicated spatial shape, the amount of calculation needed to compute the impulse response increases, especially at high frequencies.
- Even with a method based on geometrical acoustics such as the sound ray method, the amount of real-time calculation is large in a 6DoF (six degrees of freedom) environment in which both the sound object and the user move.
- An information processing method according to the present disclosure acquires spatial information for reproducing a virtual space that includes a sound source and a first structure arranged in the virtual space; generates a plurality of second structures, each having a simplified shape that simplifies the shape of the first structure, where each simplified shape is a combination of one or more three-dimensional shapes among a plurality of predetermined types of simple three-dimensional shapes; calculates, for each of the plurality of second structures, a reflection index value related to the sound reflection efficiency of that second structure, thereby obtaining a plurality of reflection index values respectively corresponding to the plurality of second structures; selects one second structure from among the plurality of second structures based on the plurality of reflection index values; and generates a simplified space, in which the three-dimensional shape of the first structure is simplified, by replacing the first structure with the selected second structure.
- According to this, the first structure arranged in the virtual space is replaced with one second structure, selected based on the reflection index values from among the plurality of second structures generated by simplifying its shape. The first structure can therefore be replaced with a second structure that has a simplified shape yet similar characteristics that affect sound, which yields a simplified space that reduces the amount of calculation without changing those characteristics. The processing load required for stereophonic reproduction can therefore be reduced.
- The listening position of the listener in the virtual space may further be specified, and in generating the plurality of second structures, one or more of the plurality of types of simple three-dimensional shapes may be combined so that each second structure has the same projected area as the first structure when the first structure is viewed in plan from the listening position.
- Each of the plurality of second structures may be generated so that, on the sound propagation path between the sound source and the listening position, the sound reflection angles at three positions of the second structure, namely the center of gravity of its projected shape when viewed in plan from the listening position and two points sandwiching that center of gravity, are equal to the sound reflection angles at the corresponding three positions of the first structure, namely the center of gravity of its projected shape and two points sandwiching that center of gravity.
- the plurality of second structures may have shapes different from each other.
- the second structure having the smallest corresponding reflection index value may be selected as the one second structure from among the plurality of second structures.
- the structure with the smallest amount of calculation can be selected as one second structure from among the plurality of second structures.
- The position and posture of the listener's head in the virtual space may further be specified. Based on the simplified space, the head position and posture, and the position of the sound source, the arrival direction and the propagation distance may be calculated for at least one of the sound that arrives at the head directly from the sound source and the sound that arrives at the head after being reflected by the one second structure in the simplified space. A sound signal may then be generated by performing a convolution with a predetermined head-related transfer function according to the arrival direction and the propagation distance of the at least one sound, and the generated sound signal may be output.
- According to this, stereophonic sound processing is performed using the simplified space, which reduces the amount of calculation without changing the characteristics that affect the sound, so the processing load can be reduced.
- The position and posture of the head and the position of the sound source may be specified at each of a plurality of mutually different timings, and at each timing the propagation distance may be calculated, the audio signal generated, and the audio signal output.
- FIG. 1 is a diagram showing an example of a sound reproduction system according to an embodiment.
- A sound reproduction system 1 includes, for example, an information processing device 100, a terminal 200, and a controller 300, as shown in FIG. 1.
- These may be communicably connected to each other by dedicated wired communication or by wireless communication. They may be connected so as to communicate directly, or so as to communicate via a predetermined intermediate device.
- The information processing device 100 reproduces sound in the virtual space and outputs it to the terminal 200.
- the information processing apparatus 100 reproduces a virtual space and reproduces sounds that the user can hear in the virtual space.
- the virtual space includes structures, sound sources, listeners, and the like. A listener is a user. These structures, sound sources and listeners are virtual.
- the information processing apparatus 100 reproduces sounds that can be heard by the listener in the virtual space based on the size and position of the structure, the position of the sound source, and the position of the listener in the virtual space.
- The terminal 200 outputs the generated sound to the user and acquires the input that the controller 300 receives from the user.
- The position and posture of the listener in the virtual space change according to the input acquired by the terminal 200. The information processing apparatus 100 therefore changes the sound it reproduces according to the listener's updated position and posture in the virtual space.
- the information processing device 100 includes an acquisition unit 101, a candidate generation unit 102, a calculation unit 103, a selection unit 104, a decoding unit 105, a space generation unit 106, a rendering unit 107, and a communication unit 108.
- the information processing apparatus 100 can be realized by a processor executing a predetermined program using a memory. That is, the information processing device 100 is a computer.
- the acquisition unit 101 acquires acoustic information for reproducing acoustics in a virtual space.
- Acquisition unit 101 may acquire acoustic information from an external storage device via a network, or may acquire acoustic information from an internal storage device.
- The storage device may be a device that reads information recorded on a recording medium such as an optical disc or a memory card, or a device that has a built-in recording medium such as an HDD (Hard Disk Drive) or SSD (Solid State Drive) and reads information recorded on that medium.
- the external storage device may be, for example, a server connected via the Internet.
- the acoustic information includes, for example, an audio stream indicating sound from a sound source and spatial information indicating a virtual space.
- Spatial information includes mesh information for reproducing the first structure placed in the virtual space, sound source positions, and the like.
- the mesh information includes information such as structure size, shape, color, reflectance, and reverberation characteristics.
- Structures include man-made structures and natural structures. That is, structures include all virtual objects for defining space.
- the sound source position indicates the position where the sound is reproduced (output) in the structure. The sound source position may change over time.
- the sound source is, for example, an object-based, HOA-based, or channel-based sound source.
- the candidate generation unit 102 generates a plurality of second structures each having a simplified shape for simplifying the shape of the first structure.
- Each simplified shape of the plurality of second structures has a shape obtained by combining one or more types of three-dimensional shapes among a plurality of types of predetermined simple three-dimensional shapes.
- simple solid shapes include, for example, cuboids, cylinders, spheres, and cones.
- The second structure may be composed of, for example, one or more shapes of a single type among cuboids, cylinders, spheres, and cones, or one or more shapes of each of two or more of those types.
- the plurality of second structures have shapes different from each other.
- The candidate generation unit 102 generates the plurality of second structures by determining a plurality of combinations of simplified shapes that approximate the shape of the first structure. Specifically, it combines one or more of the plurality of types of simple three-dimensional shapes so that the projected area of each second structure equals the projected area of the first structure being processed when that first structure is viewed in plan from the listening position.
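- As a rough illustration of the equal-projected-area constraint, the sketch below sizes a given number of identical spheres so that their combined silhouette area matches a target area. It assumes the spheres do not overlap in the projection and that the target projected area is already known. The function name and the numbers are hypothetical and do not come from the disclosure.

```python
import math

def sphere_radius_for_area(target_area: float, count: int) -> float:
    """Radius of each of `count` identical, non-overlapping spheres whose
    silhouettes (discs of area pi * r**2) together cover `target_area`."""
    if target_area <= 0 or count <= 0:
        raise ValueError("target_area and count must be positive")
    return math.sqrt(target_area / (count * math.pi))

# Hypothetical example: a first structure with a projected area of 12.0
# (arbitrary units), approximated by 20 spheres.
radius = sphere_radius_for_area(12.0, 20)
total_area = 20 * math.pi * radius ** 2  # equals the target by construction
```

- The same idea would apply to the other simple shapes, with the silhouette formula changed to match the shape and the viewing direction.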
- Each of the plurality of second structures may also be generated so that, on the sound propagation path from the sound source to the listening position, the sound reflection angles at the center of gravity of its projected shape as viewed from the listening position, and at two points sandwiching that center of gravity, equal the corresponding reflection angles for the first structure.
- The calculation unit 103 calculates, for each of the plurality of second structures generated by the candidate generation unit 102, a reflection index value related to the sound reflection efficiency of that second structure, and thereby obtains a plurality of reflection index values respectively corresponding to the plurality of second structures. Specifically, the calculation unit 103 calculates, as the reflection index value, the sum of the weights pre-associated with the one or more simple three-dimensional shapes that make up each second structure. The reflection index value is larger the more efficiently a predetermined sound from a facing sound source is reflected. Reflecting efficiently means, for example, that the difference in frequency characteristics between the predetermined sound and the reflected sound is small. For example, the weights decrease in the order cuboid, cylinder, sphere, cone.
- Fig. 2 is a table showing the pre-associated weights of each simple three-dimensional shape.
- A cuboid is associated with a weight w1, a cylinder with a weight w2 smaller than w1, a sphere with a weight w3 smaller than w2, and a cone with a weight w4 smaller than w3.
- For example, when a second structure consists of one cuboid, the calculation unit 103 calculates the weight w1 as its reflection index value.
- When a second structure consists of one cylinder and two spheres, the calculation unit 103 calculates the sum of the weight w2 and the weight w3 × 2 as its reflection index value.
- When a second structure consists of four cones, the calculation unit 103 calculates the weight w4 × 4 as its reflection index value.
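- The weighted sum described above can be sketched as follows. The numeric weights are hypothetical placeholders, because the disclosure only states the ordering w1 > w2 > w3 > w4, and the function name is likewise an assumption made for illustration.

```python
from collections import Counter

# Hypothetical weights; only the ordering w1 > w2 > w3 > w4 comes from the text.
WEIGHTS = {"cuboid": 4.0, "cylinder": 3.0, "sphere": 2.0, "cone": 1.0}

def reflection_index(shapes):
    """Sum of the pre-associated weights of the simple shapes in one candidate."""
    counts = Counter(shapes)
    return sum(WEIGHTS[name] * n for name, n in counts.items())

one_cuboid = reflection_index(["cuboid"])                                  # w1
cylinder_two_spheres = reflection_index(["cylinder", "sphere", "sphere"])  # w2 + w3 * 2
four_cones = reflection_index(["cone"] * 4)                                # w4 * 4
```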
- the selection unit 104 selects one second structure from among the plurality of second structures based on the plurality of reflection index values respectively corresponding to the plurality of second structures. Specifically, the selection unit 104 selects the second structure having the smallest corresponding reflection index value among the plurality of second structures as one second structure.
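- A minimal sketch of this selection step, assuming the reflection index values have already been computed and are stored under hypothetical candidate names. When two candidates tie, Python's `min` returns the first one encountered; the disclosure does not say how ties should be handled.

```python
def select_candidate(index_by_candidate):
    """Return the name of the candidate with the smallest reflection index value."""
    if not index_by_candidate:
        raise ValueError("at least one candidate is required")
    return min(index_by_candidate, key=index_by_candidate.get)

# Hypothetical index values for three candidate second structures.
indices = {"candidate_a": 4.0, "candidate_b": 7.0, "candidate_c": 3.5}
selected = select_candidate(indices)
```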
- the decoding unit 105 decodes the first audio signal by performing decoding processing on the encoded audio stream.
- the space generation unit 106 generates a simplified space in which the three-dimensional shape of the first structure is simplified by replacing the first structure with one second structure selected by the selection unit 104.
- Based on the simplified space generated by the space generation unit 106, the position and posture of the listener's head, and the position of the sound source, the rendering unit 107 calculates the arrival direction and the propagation distance for at least one of the sound that arrives at the head directly from the sound source and the sound that arrives at the head after being reflected by the one second structure in the simplified space. The rendering unit 107 then generates a second audio signal by applying, to the first audio signal decoded by the decoding unit 105, a convolution with a predetermined head-related transfer function (HRTF) according to the arrival direction and the propagation distance of the at least one sound, and outputs the generated second audio signal.
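- The geometric part of this step can be illustrated with a small two-dimensional sketch. It computes the propagation distance and arrival azimuth for a direct path, and for one reflection handled with the image-source method, a standard geometrical-acoustics technique used here as a stand-in because the disclosure does not name its reflection calculation. The reflector is assumed to be an infinite flat wall on the line x = wall_x, with the source and listener on the same side. All names and coordinates are hypothetical.

```python
import math

def arrival(head_xy, head_yaw, source_xy):
    """Propagation distance and arrival azimuth (radians, relative to the
    direction the head faces, counter-clockwise positive) for a straight path."""
    dx = source_xy[0] - head_xy[0]
    dy = source_xy[1] - head_xy[1]
    distance = math.hypot(dx, dy)
    azimuth = math.atan2(dy, dx) - head_yaw
    # Wrap the angle into [-pi, pi).
    azimuth = (azimuth + math.pi) % (2 * math.pi) - math.pi
    return distance, azimuth

def mirror_across_wall(source_xy, wall_x):
    """Image of the source mirrored in an infinite flat wall on the line x = wall_x."""
    return (2 * wall_x - source_xy[0], source_xy[1])

head, yaw = (0.0, 0.0), 0.0  # listener at the origin, facing +x
source = (3.0, 4.0)
direct = arrival(head, yaw, source)
# The straight path from the image source has the same length and arrival
# direction as the reflected path from the real source.
reflected = arrival(head, yaw, mirror_across_wall(source, wall_x=5.0))
```

- A full renderer would also have to check that the reflection point actually lies on the simplified structure, which this sketch omits.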
- Based on the spatial information and the position and posture of the listener's head, the rendering unit 107 also generates a video signal representing the field of view the listener sees from the listening position in the listener's posture.
- The video signal shows the structures in the field of view as they appear in the virtual space, that is, without simplification. The rendering unit 107 identifies the listening position of the listener in the virtual space based on the listener information received by the communication unit 108.
- The communication unit 108 exchanges information with the terminal 200 by communicating with the terminal 200.
- the communication unit 108 transmits the second audio signal and the video signal for output to the terminal 200, for example.
- the communication unit 108 also receives, from the terminal 200, listener information indicating the position and posture of the listener's head, for example.
- The terminal 200 includes a communication unit 201, a control unit 202, a detection unit 203, an input reception unit 204, a display unit 205, and an audio output unit 206.
- the terminal 200 may be, for example, a VR headset worn on the user's head, or may be a mobile terminal such as a smart phone attached to a wearable device for wearing on the user's head.
- The communication unit 201 exchanges information with the information processing apparatus 100 by communicating with the information processing apparatus 100.
- The communication unit 201 transmits, for example, listener information indicating the position and posture of the listener's head to the information processing apparatus 100.
- The communication unit 201 receives, for example, the second audio signal and the video signal for output from the information processing apparatus 100.
- The control unit 202 outputs the second audio signal to the audio output unit 206 and outputs the video signal to the display unit 205. The control unit 202 also acquires the movement of the user's head (that is, changes in the position and posture of the head) detected by the detection unit 203, and acquires the input received by the input reception unit 204. The input indicates that the listener's position in the virtual space should be moved or that the listener's posture should be changed. Based on the acquired head movement and input, the control unit 202 generates listener information indicating the listening position of the listener and the posture of the listener's head, and transmits the listener information to the information processing apparatus 100 via the communication unit 201.
- the control unit 202 acquires head movements and inputs, and sequentially (that is, at regular time intervals) performs processing for generating listener information based on the acquired head movements and inputs.
- the constant time interval is, for example, less than 1 second.
- the detection unit 203 sequentially detects the motion of the user's head.
- the detection unit 203 detects changes in the position and posture of the user's head.
- the detector 203 includes, for example, an acceleration sensor and an angular velocity sensor.
- the detection unit 203 is, for example, an IMU (Inertial Measurement Unit).
- the input reception unit 204 receives an input from the controller 300 operated by the user, indicating that the position of the listener should be moved in the virtual space or the posture of the listener's head should be changed.
- the input reception unit 204 may receive input from the controller 300 through wireless communication with the controller 300 or may receive input from the controller 300 through wired communication.
- the communication unit 201 may have the function of the input reception unit 204 that receives input from the controller 300 .
- the input reception unit 204 may have a button, a touch sensor, or the like that directly receives an input from the user.
- The display unit 205 displays the video (moving image) indicated by the video signal output by the control unit 202.
- a moving image is a video made up of a plurality of frames.
- the video may be a still image.
- the display unit 205 is, for example, a liquid crystal display, an organic EL (Electro Luminescence) display, or the like.
- The audio output unit 206 outputs audio (including music) indicated by the audio signal output by the control unit 202.
- the audio output unit 206 is, for example, a speaker.
- The controller 300 is a device that receives input from the user and transmits the received input to the terminal 200.
- FIG. 3 is a flowchart showing an example of the operation of the information processing device.
- the information processing device 100 simplifies the virtual space included in the spatial information (S11). The details of the process of simplifying the virtual space will be described later.
- the information processing device 100 acquires listener information including the position and posture of the listener's head in the virtual space (S12).
- Based on the simplified space generated by the space generation unit 106, the position and posture of the listener's head, and the position of the sound source, the information processing apparatus 100 calculates the arrival direction and the propagation distance for at least one of the sound that arrives at the head directly from the sound source and the sound that arrives at the head after being reflected by the one second structure in the simplified space. The information processing apparatus 100 then generates a second audio signal by applying, to the decoded first audio signal, a convolution with a predetermined head-related transfer function (HRTF) according to the arrival direction and the propagation distance of the at least one sound (S13).
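- One way to picture step S13 is the sketch below, which renders a single propagation path. It is not the disclosed implementation: the propagation delay and the 1/distance attenuation are assumptions added for illustration, and the two-tap impulse responses are made-up stand-ins for the time-domain form of an HRTF (a head-related impulse response, HRIR) that would be chosen for the calculated arrival direction.

```python
def convolve(signal, impulse_response):
    """Direct-form discrete convolution of two finite sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out

def render_path(signal, distance_m, hrir_left, hrir_right,
                sample_rate=48000, speed_of_sound=343.0):
    """Delay and attenuate one propagation path, then apply a left/right HRIR pair."""
    delay = round(distance_m / speed_of_sound * sample_rate)  # delay in samples
    gain = 1.0 / max(distance_m, 1.0)  # assumed 1/r attenuation, clamped near the head
    delayed = [0.0] * delay + [gain * s for s in signal]
    return convolve(delayed, hrir_left), convolve(delayed, hrir_right)

# Toy data: a unit impulse and made-up two-tap HRIRs for one arrival direction.
left, right = render_path([1.0], distance_m=3.43,
                          hrir_left=[0.9, 0.1], hrir_right=[0.5, 0.3],
                          sample_rate=1000)
```

- With several paths (direct and reflected), the per-path outputs would be summed sample by sample for each ear.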
- the information processing device 100 outputs the generated second audio signal (S14).
- FIG. 4 is a flowchart showing an example of processing for simplifying a virtual space.
- the information processing device 100 acquires spatial information (S21).
- Spatial information is information for reproducing a virtual space.
- a virtual space includes a structure arranged in the virtual space and a sound source.
- the information processing device 100 acquires the listening position of the listener in the virtual space (S22).
- the information processing device 100 executes a loop including steps S23 to S26 for each of the one or more first structures in the virtual space.
- the information processing device 100 generates a plurality of second structures each having a simplified shape for simplifying the shape of the first structure to be processed (S23).
- For each of the plurality of second structures, the information processing apparatus 100 calculates a reflection index value related to the sound reflection efficiency of that second structure, and thereby obtains a plurality of reflection index values respectively corresponding to the plurality of second structures (S24).
- the information processing device 100 selects one second structure from among the plurality of second structures based on the plurality of reflection index values (S25).
- the information processing device 100 replaces the first structure with one selected second structure (S26).
- the information processing device 100 ends the loop after executing steps S23 to S26 for all of the one or more first structures. This creates a simplified space in which all the first structures are replaced with the second structures.
- the information processing apparatus 100 does not have to execute the above loop, and only needs to be able to execute the processing of steps S23 to S26 for each of the one or more first structures in the virtual space.
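- The loop over steps S23 to S26 can be sketched as follows. Candidate generation is passed in as a caller-supplied function because it depends on geometry processing that is outside the scope of this sketch. The weights, the function names, and the toy generator are all hypothetical.

```python
# Hypothetical weights; only the ordering cuboid > cylinder > sphere > cone is given.
WEIGHTS = {"cuboid": 4.0, "cylinder": 3.0, "sphere": 2.0, "cone": 1.0}

def simplify_space(first_structures, generate_candidates):
    """Replace every first structure with its lowest-index candidate (S23 to S26).

    `first_structures` maps a structure name to any description of it.
    `generate_candidates` returns a list of candidates for one structure,
    each candidate being a list of simple-shape names.
    """
    simplified = {}
    for name, structure in first_structures.items():
        candidates = generate_candidates(structure)                      # S23
        indices = [sum(WEIGHTS[s] for s in c) for c in candidates]       # S24
        best = min(range(len(candidates)), key=indices.__getitem__)      # S25
        simplified[name] = candidates[best]                              # S26
    return simplified

def toy_generator(structure):
    # Stand-in for real geometry processing: fixed candidates for any input.
    return [["cuboid"], ["cylinder", "sphere", "sphere"], ["cone"] * 3]

space = simplify_space({"wall": "curved mesh", "pillar": "round mesh"}, toy_generator)
```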
- FIG. 5 is a diagram showing a specific example of a virtual space.
- the virtual space VS100 includes a plurality of first structures 301, sound sources 302, and listeners 310.
- the information processing apparatus 100 replaces the arc-curved first structure 301 with a simpler-shaped second structure.
- A plurality of second structures are generated by combining one or more of the plurality of types of simple three-dimensional shapes so that the projected area equals that of the projected shape 311 of the first structure 301 when the first structure 301 is viewed from the listening position of the listener 310.
- the plurality of second structures may be, for example, one cuboid, a combination of five cylinders, a combination of twenty spheres, a combination of ten cones, or the like.
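- Applying the weighted-sum definition of the reflection index value to these four example candidates gives the expressions below, where w_1 to w_4 are the weights of the cuboid, cylinder, sphere, and cone, and R is a symbol introduced here for the reflection index value. The disclosure gives no numeric weights, so which candidate has the smallest value depends on the weights chosen.

```latex
\begin{aligned}
R_{\text{1 cuboid}} &= w_1, \\
R_{\text{5 cylinders}} &= 5\,w_2, \\
R_{\text{20 spheres}} &= 20\,w_3, \\
R_{\text{10 cones}} &= 10\,w_4,
\end{aligned}
\qquad w_1 > w_2 > w_3 > w_4 .
```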
- Information processing apparatus 100 performs the following information processing method.
- the information processing apparatus 100 acquires space information for reproducing a virtual space.
- the virtual space includes a first structure arranged in the virtual space and a sound source.
- the information processing apparatus 100 generates a plurality of second structures each having a simplified shape for simplifying the shape of the first structure.
- Each simplified shape of the plurality of second structures has a shape obtained by combining one or more types of three-dimensional shapes among a plurality of types of predetermined simple three-dimensional shapes.
- the information processing apparatus 100 calculates, for each of the plurality of second structures, a reflection index value related to the sound reflection efficiency of the second structure, thereby obtaining a plurality of reflection index values corresponding to the plurality of second structures.
- the information processing device 100 selects one second structure from among the plurality of second structures based on the plurality of reflection index values.
- the information processing apparatus 100 generates a simplified space in which the three-dimensional shape of the first structure is simplified by replacing the first structure with one selected second structure.
- According to this, the first structure arranged in the virtual space is replaced with the one second structure selected, based on the reflection index values, from among the plurality of second structures generated by simplifying the shape of the first structure. The first structure can therefore be replaced with a second structure that has a simplified shape and similar characteristics affecting sound, which yields a simplified space that reduces the amount of calculation without changing the characteristics that affect sound. The processing load required for stereophonic reproduction can thus be reduced.
- the information processing device 100 further identifies the listening position of the listener in the virtual space.
- When generating the plurality of second structures, the information processing apparatus 100 combines one or more of the plurality of types of simple three-dimensional shapes so that the projected area is equal to the projected area of the first structure when the first structure is viewed in plan from the listening position.
- each of the plurality of second structures is generated such that, on the sound propagation path from the sound source to the listening position, the sound reflection angles at three positions on the second structure are equal to the sound reflection angles at the corresponding three positions on the first structure. For the second structure, the three positions are the center of gravity of its projection shape when viewed in plan from the listening position and two points on either side of that center of gravity. For the first structure, they are the center of gravity of its projection shape when viewed in plan from the listening position and two points on either side of that center of gravity.
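One way to check the three-point condition is sketched below. The disclosure does not define how the reflection angle is computed, so this example assumes specular reflection and takes the reflection angle to be the angle between the reflected ray and the surface normal at the sampled point. All names (`reflection_angle`, `angles_match`) and the sample data are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _unit(a: Vec) -> Vec:
    length = math.sqrt(_dot(a, a))
    return (a[0] / length, a[1] / length, a[2] / length)


def reflection_angle(source: Vec, point: Vec, normal: Vec) -> float:
    """Angle (radians) between the specularly reflected ray and the surface normal."""
    d = _unit(_sub(point, source))  # incoming direction, from source to surface point
    n = _unit(normal)
    k = 2.0 * _dot(d, n)
    r = (d[0] - k * n[0], d[1] - k * n[1], d[2] - k * n[2])  # mirror d about the surface
    return math.acos(max(-1.0, min(1.0, _dot(r, n))))


def angles_match(
    source: Vec,
    samples_first: Sequence[Tuple[Vec, Vec]],
    samples_second: Sequence[Tuple[Vec, Vec]],
    tol: float = 1e-6,
) -> bool:
    """Compare reflection angles at paired (point, normal) samples of both structures."""
    return len(samples_first) == len(samples_second) and all(
        math.isclose(
            reflection_angle(source, p1, n1),
            reflection_angle(source, p2, n2),
            abs_tol=tol,
        )
        for (p1, n1), (p2, n2) in zip(samples_first, samples_second)
    )


# Center of gravity and two flanking points, each paired with an assumed surface normal.
source = (-1.0, 1.0, 0.0)
up = (0.0, 1.0, 0.0)
first = [((-0.5, 0.0, 0.0), up), ((0.0, 0.0, 0.0), up), ((0.5, 0.0, 0.0), up)]
flat_candidate = list(first)
tilted_candidate = [first[0], first[1], ((0.5, 0.0, 0.0), (0.3, 1.0, 0.0))]
print(angles_match(source, first, flat_candidate), angles_match(source, first, tilted_candidate))
```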
- the information processing apparatus 100 selects the second structure having the smallest corresponding reflection index value among the plurality of second structures as one second structure.
- the structure with the smallest amount of calculation can be selected as one second structure from among the plurality of second structures.
- the information processing device 100 further performs the following processing.
- the information processing apparatus 100 identifies the position and posture of the listener's head in the virtual space. Based on the simplified space, the position and posture of the head, and the position of the sound source, the information processing apparatus 100 calculates the arrival direction and the propagation distance of at least one of the sound arriving at the head from the sound source and the sound arriving at the head after being reflected by the one second structure in the simplified space.
- the information processing apparatus 100 generates an audio signal by convolving the arrival direction and propagation distance of at least one sound with a predetermined head-related transfer function. The information processing device 100 outputs the generated audio signal.
- According to this, stereophonic sound processing is performed using the simplified space, which reduces the amount of calculation without changing the characteristics that affect the sound, so the processing load can be reduced.
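The arrival direction and propagation distance for the direct sound and for one reflected sound can be sketched as below. This example assumes that the reflecting face of the second structure is an infinite plane and uses the image-source technique, which is a common geometric-acoustics approach but is not stated in the disclosure. It also expresses directions in world coordinates, ignoring head posture, and does not perform the head-related transfer function convolution. The function names are hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def _sub(a: Vec, b: Vec) -> Vec:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _unit(a: Vec) -> Vec:
    length = math.sqrt(_dot(a, a))
    return (a[0] / length, a[1] / length, a[2] / length)


def direct_path(source: Vec, head: Vec) -> Tuple[Vec, float]:
    """Return (unit vector from the head toward the source, propagation distance)."""
    v = _sub(source, head)
    return _unit(v), math.sqrt(_dot(v, v))


def reflected_path(source: Vec, head: Vec, plane_point: Vec, plane_normal: Vec) -> Tuple[Vec, float]:
    """First-order reflection off a plane, computed with a mirrored (image) source."""
    n = _unit(plane_normal)
    offset = _dot(_sub(source, plane_point), n)  # signed distance from source to plane
    image = (
        source[0] - 2.0 * offset * n[0],
        source[1] - 2.0 * offset * n[1],
        source[2] - 2.0 * offset * n[2],
    )
    # The reflected sound appears to arrive from the image source.
    return direct_path(image, head)


source = (0.0, 1.0, 0.0)
head = (2.0, 1.0, 0.0)
direct_dir, direct_dist = direct_path(source, head)
refl_dir, refl_dist = reflected_path(source, head, (0.0, 0.0, 0.0), (0.0, 1.0, 0.0))
print(direct_dir, direct_dist, refl_dir, refl_dist)
```

A fuller implementation would also confirm that the reflection point lies on the finite face of the second structure and would rotate the directions into the head's coordinate frame before selecting a head-related transfer function.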
- the position and posture of the head and the position of the sound source are specified at a plurality of mutually different timings. Calculation of the propagation distance, generation of the audio signal, and output of the audio signal are performed for each of a plurality of timings.
- the information processing apparatus 100 includes a processor and memory, and the processor uses the memory to perform the above processes.
- Each device in the above embodiment is specifically a computer system composed of a microprocessor, ROM, RAM, hard disk unit, display unit, keyboard, mouse, and the like.
- a computer program is recorded in the RAM or hard disk unit.
- Each device achieves its function by the microprocessor operating according to the computer program.
- the computer program is constructed by combining a plurality of instruction codes indicating instructions to the computer in order to achieve a predetermined function.
- a system LSI is an ultra-multifunctional LSI manufactured by integrating multiple components on a single chip. Specifically, it is a computer system that includes a microprocessor, ROM, RAM, and the like. A computer program is recorded in the RAM. The system LSI achieves its functions by the microprocessor operating according to the computer program.
- each part of the constituent elements constituting each of the devices described above may be individually integrated into one chip, or may be integrated into one chip so as to include part or all of them.
- system LSI may also be called IC, LSI, super LSI, or ultra LSI depending on the degree of integration.
- the method of circuit integration is not limited to LSI, and may be realized by a dedicated circuit or a general-purpose processor.
- An FPGA (Field Programmable Gate Array) or a reconfigurable processor that can reconfigure the connections and settings of the circuit cells inside the LSI may be used.
- the IC card or module is a computer system composed of a microprocessor, ROM, RAM and the like.
- the IC card or the module may include the super multifunctional LSI.
- the IC card or the module achieves its function by the microprocessor operating according to the computer program. This IC card or this module may be tamper resistant.
- the present disclosure may be the method shown above. Moreover, it may be a computer program for realizing these methods by a computer, or it may be a digital signal composed of the computer program.
- the present disclosure includes a computer-readable recording medium for the computer program or the digital signal, such as a flexible disk, hard disk, CD-ROM, MO, DVD, DVD-ROM, DVD-RAM, BD (Blu-ray (Registered Trademark) Disc), semiconductor memory, or the like. Moreover, it may be the digital signal recorded on these recording media.
- the computer program or the digital signal may be transmitted via an electric communication line, a wireless or wired communication line, a network represented by the Internet, data broadcasting, or the like.
- the present disclosure may also be a computer system comprising a microprocessor and memory, the memory storing the computer program, and the microprocessor operating according to the computer program.
- the present disclosure can be used for an information processing method, an information processing apparatus, a program, and the like that can reduce the processing load required for reproducing stereophonic sound.
- Sound reproduction system; 100 Information processing device; 101 Acquisition unit; 102 Candidate generation unit; 103 Calculation unit; 104 Selection unit; 105 Decoding unit; 106 Space generation unit; 107 Rendering unit; 108 Communication unit; 200 Terminal; 201 Communication unit; 202 Control unit; 203 Detection unit; 204 Input reception unit; 205 Display unit; 206 Audio output unit; 300 Controller; 301 First structure; 302 Sound source; 310 Listener; 311 Projection shape
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- General Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Signal Processing (AREA)
- Stereophonic System (AREA)
Abstract
Description
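Known conventional sound reproduction techniques include, for example, methods based on wave acoustics theory that faithfully reproduce physical properties, such as the boundary element method described in Patent Literature 1, and methods based on geometric acoustics, such as the sound ray method. Methods based on wave acoustics theory have the problem that, for complex spatial shapes, the amount of computation increases when calculating impulse responses, particularly at high frequencies. Methods based on geometric acoustics, such as the sound ray method, also have the problem that the amount of real-time computation is large in a 6DoF (six degrees of freedom) environment in which sound objects move or the user moves.
[1. Configuration]
First, the system configuration according to the present disclosure will be described.
Next, the operation of the information processing device 100, that is, the information processing method executed by the information processing device 100, will be described.
The information processing device 100 according to the present embodiment performs the following information processing method. The information processing device 100 acquires space information for reproducing a virtual space. The virtual space includes a first structure arranged in the virtual space and a sound source. The information processing device 100 generates a plurality of second structures each having a simplified shape for simplifying the shape of the first structure. The simplified shape of each of the plurality of second structures is a combination of one or more types of three-dimensional shapes among a plurality of predetermined types of simple three-dimensional shapes. The information processing device 100 calculates, for each of the plurality of second structures, a reflection index value related to the sound reflection efficiency of that second structure, thereby calculating a plurality of reflection index values respectively corresponding to the plurality of second structures. The information processing device 100 selects one second structure from among the plurality of second structures based on the plurality of reflection index values. The information processing device 100 generates a simplified space in which the three-dimensional shape of the first structure is simplified by replacing the first structure with the selected one second structure.
Although the present disclosure has been described above based on the embodiment, the present disclosure is of course not limited to the embodiment. The following cases are also included in the present disclosure.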
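100 Information processing device
101 Acquisition unit
102 Candidate generation unit
103 Calculation unit
104 Selection unit
105 Decoding unit
106 Space generation unit
107 Rendering unit
108 Communication unit
200 Terminal
201 Communication unit
202 Control unit
203 Detection unit
204 Input reception unit
205 Display unit
206 Audio output unit
300 Controller
301 First structure
302 Sound source
310 Listener
311 Projection shape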
Claims (9)
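- An information processing method comprising:
acquiring space information for reproducing a virtual space, the virtual space including a first structure arranged in the virtual space and a sound source;
generating a plurality of second structures each having a simplified shape for simplifying the shape of the first structure, the simplified shape of each of the plurality of second structures being a combination of one or more types of three-dimensional shapes among a plurality of predetermined types of simple three-dimensional shapes;
calculating, for each of the plurality of second structures, a reflection index value related to the sound reflection efficiency of that second structure, thereby calculating a plurality of reflection index values respectively corresponding to the plurality of second structures;
selecting one second structure from among the plurality of second structures based on the plurality of reflection index values; and
generating a simplified space in which the three-dimensional shape of the first structure is simplified by replacing the first structure with the selected one second structure.
- The information processing method according to claim 1, further comprising:
identifying a listening position of a listener in the virtual space,
wherein, in the generating of the plurality of second structures, the plurality of second structures are generated by combining the one or more types of three-dimensional shapes among the plurality of types of simple three-dimensional shapes so that the projected area is equal to the projected area of the first structure when the first structure is viewed in plan from the listening position.
- The information processing method according to claim 2,
wherein each of the plurality of second structures is generated such that, on the sound propagation path from the sound source to the listening position, the sound reflection angle at each of three positions, namely the center of gravity of the projection shape of that second structure when the second structure is viewed in plan from the listening position and two points on either side of the center of gravity, is equal to the sound reflection angle at each of three positions, namely the center of gravity of the projection shape of the first structure when the first structure is viewed in plan from the listening position and two points on either side of that center of gravity.
- The information processing method according to any one of claims 1 to 3,
wherein the plurality of second structures have mutually different shapes.
- The information processing method according to any one of claims 1 to 4,
wherein, in the selecting of the one second structure, the second structure having the smallest corresponding reflection index value among the plurality of second structures is selected as the one second structure.
- The information processing method according to any one of claims 1 to 5, further comprising:
identifying the position and posture of the head of a listener in the virtual space;
calculating, based on the simplified space, the position and posture of the head, and the position of the sound source, the arrival direction and the propagation distance traveled before arrival of at least one of the sound arriving at the head from the sound source and the sound arriving at the head after being reflected by the one second structure in the simplified space;
generating an audio signal by convolving the arrival direction and the propagation distance of the at least one sound with a predetermined head-related transfer function; and
outputting the generated audio signal.
- The information processing method according to claim 6,
wherein the position and posture of the head and the position of the sound source are identified at a plurality of mutually different timings, and
the calculation of the propagation distance, the generation of the audio signal, and the output of the audio signal are performed for each of the plurality of timings.
- A program for causing a computer to execute the information processing method according to any one of claims 1 to 7.
- An information processing device comprising:
a processor; and
a memory,
wherein the processor, using the memory:
acquires space information for reproducing a virtual space, the virtual space including a first structure arranged in the virtual space and a sound source;
generates a plurality of second structures each having a simplified shape for simplifying the shape of the first structure, the simplified shape of each of the plurality of second structures being a combination of one or more types of three-dimensional shapes among a plurality of predetermined types of simple three-dimensional shapes;
calculates, for each of the plurality of second structures, a reflection index value related to the sound reflection efficiency of that second structure, thereby calculating a plurality of reflection index values respectively corresponding to the plurality of second structures;
selects one second structure from among the plurality of second structures based on the plurality of reflection index values; and
generates a simplified space in which the three-dimensional shape of the first structure is simplified by replacing the first structure with the selected one second structure.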
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22788106.7A EP4325898A1 (en) | 2021-04-12 | 2022-04-06 | Information processing method, information processing device, and program |
CN202280027328.5A CN117178570A (zh) | 2021-04-12 | 2022-04-06 | 信息处理方法、信息处理装置和程序 |
JP2023514620A JPWO2022220181A1 (ja) | 2021-04-12 | 2022-04-06 | |
US18/372,793 US20240015461A1 (en) | 2021-04-12 | 2023-09-26 | Information processing method, information processing device, and recording medium |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202163173609P | 2021-04-12 | 2021-04-12 | |
US63/173,609 | 2021-04-12 | ||
JP2022054455 | 2022-03-29 | ||
JP2022-054455 | 2022-03-29 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/372,793 Continuation US20240015461A1 (en) | 2021-04-12 | 2023-09-26 | Information processing method, information processing device, and recording medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022220181A1 true WO2022220181A1 (ja) | 2022-10-20 |
Family
ID=83639670
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2022/017167 WO2022220181A1 (ja) | 2021-04-12 | 2022-04-06 | 情報処理方法、情報処理装置、及び、プログラム |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240015461A1 (ja) |
EP (1) | EP4325898A1 (ja) |
JP (1) | JPWO2022220181A1 (ja) |
WO (1) | WO2022220181A1 (ja) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000267675A (ja) * | 1999-03-16 | 2000-09-29 | Sega Enterp Ltd | 音響信号処理装置 |
JP2006128818A (ja) | 2004-10-26 | 2006-05-18 | Victor Co Of Japan Ltd | 立体映像・立体音響対応記録プログラム、再生プログラム、記録装置、再生装置及び記録メディア |
2022
- 2022-04-06 WO PCT/JP2022/017167 patent/WO2022220181A1/ja active Application Filing
- 2022-04-06 EP EP22788106.7A patent/EP4325898A1/en active Pending
- 2022-04-06 JP JP2023514620A patent/JPWO2022220181A1/ja active Pending
2023
- 2023-09-26 US US18/372,793 patent/US20240015461A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
EP4325898A1 (en) | 2024-02-21 |
JPWO2022220181A1 (ja) | 2022-10-20 |
US20240015461A1 (en) | 2024-01-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10397728B2 (en) | Differential headtracking apparatus | |
KR100606734B1 (ko) | 삼차원 입체음향 구현 방법 및 그 장치 | |
KR102506593B1 (ko) | 바이노럴 오디오를 생성하는 헤드-웨어러블 장치 | |
WO2020148120A2 (en) | Processing audio signals | |
Schröder et al. | Virtual reality system at RWTH Aachen University | |
US11429340B2 (en) | Audio capture and rendering for extended reality experiences | |
US10757528B1 (en) | Methods and systems for simulating spatially-varying acoustics of an extended reality world | |
CN112470102A (zh) | 高效渲染虚拟声场 | |
US9838790B2 (en) | Acquisition of spatialized sound data | |
TW202117500A (zh) | 用於音訊呈現之隱私分區及授權 | |
EP3994864A1 (en) | Password-based authorization for audio rendering | |
US11937065B2 (en) | Adjustment of parameter settings for extended reality experiences | |
KR102656969B1 (ko) | 불일치 오디오 비주얼 캡쳐 시스템 | |
WO2022220181A1 (ja) | 情報処理方法、情報処理装置、及び、プログラム | |
US11696085B2 (en) | Apparatus, method and computer program for providing notifications | |
CN117178570A (zh) | 信息处理方法、信息处理装置和程序 | |
WO2022220182A1 (ja) | 情報処理方法、プログラム、及び情報処理システム | |
US20230345193A1 (en) | Signal processing apparatus for generating virtual viewpoint video image, signal processing method, and storage medium | |
WO2023199815A1 (ja) | 音響処理方法、プログラム、及び音響処理システム | |
CN110428802B (zh) | 声音混响方法、装置、计算机设备及计算机存储介质 | |
US20240022854A1 (en) | Sound reproduction method, sound reproduction device, and recording medium | |
CN117063489A (zh) | 信息处理方法、程序和信息处理*** |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22788106; Country of ref document: EP; Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase | Ref document number: 2023514620; Country of ref document: JP |
WWE | Wipo information: entry into national phase | Ref document number: 2022788106; Country of ref document: EP |
NENP | Non-entry into the national phase | Ref country code: DE |
ENP | Entry into the national phase | Ref document number: 2022788106; Country of ref document: EP; Effective date: 20231113 |