WO2024094214A1 - Method, device and storage medium for implementing spatial sound effects based on a free viewpoint - Google Patents

Method, device and storage medium for implementing spatial sound effects based on a free viewpoint Download PDF

Info

Publication number
WO2024094214A1
WO2024094214A1 (PCT/CN2023/129967)
Authority
WO
WIPO (PCT)
Prior art keywords
target
audio
spatial
camera position
current camera
Prior art date
Application number
PCT/CN2023/129967
Other languages
English (en)
French (fr)
Inventor
赵俊哲
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Publication of WO2024094214A1

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00: Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40: Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43: Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439: Processing of audio elementary streams
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00: Details of transducers, loudspeakers or microphones
    • H04R1/20: Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32: Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40: Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers

Definitions

  • The present application relates to the field of audio and video technology, and in particular to a method, device and storage medium for implementing spatial sound effects based on a free viewpoint.
  • Current free-viewpoint devices do not provide spatial sound effects: no matter which viewing angle a user switches to while watching a live broadcast or playing a game, the direction of the sound cannot be distinguished, resulting in a poor user experience. How to remedy the poor playback experience of free-viewpoint audio has therefore become an urgent technical problem.
  • The embodiments of the present application provide a method, device and storage medium for implementing spatial sound effects based on a free viewpoint, aiming to solve this technical problem.
  • An embodiment of the present application provides a method for implementing spatial sound effects based on a free viewpoint, comprising: matching, in a storage unit, target orientation information corresponding to the current camera position based on the camera position number of the current camera position; acquiring audio stream data and decoding it into target audio frames; and performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio corresponding to the current camera position.
  • An embodiment of the present application further provides a device for implementing spatial sound effects based on a free viewpoint, comprising a processor, a memory, a computer program stored in the memory and executable by the processor, and a data bus for connection and communication between the processor and the memory; when the computer program is executed by the processor, any one of the spatial sound effect implementation methods provided in the description of this application is realized.
  • An embodiment of the present application further provides a storage medium for computer-readable storage, the storage medium storing one or more programs which can be executed by one or more processors to implement any of the spatial sound effect implementation methods provided in the description of this application.
  • FIG. 1 is a schematic flow chart of a first embodiment of a method for implementing spatial sound effects based on a free viewpoint provided by the present invention;
  • FIG. 2 is a schematic flow chart of a second embodiment of the method;
  • FIG. 3 is a schematic diagram of the surround playback mode of free-viewpoint spatial sound effect implementation provided by an embodiment of the present invention;
  • FIG. 4 is a schematic diagram of the linear playback mode of free-viewpoint spatial sound effect implementation provided by an embodiment of the present invention;
  • FIG. 5 is a schematic flow chart of a third embodiment of the method;
  • FIG. 6 is a schematic block diagram of the structure of a device for implementing spatial sound effects based on a free viewpoint provided by an embodiment of the present invention.
  • The embodiments of the present invention provide a method, device and storage medium for implementing spatial sound effects based on a free viewpoint. The method can be applied to a mobile terminal, which can be an electronic device such as a mobile phone, tablet computer, laptop computer, desktop computer, personal digital assistant or wearable device.
  • Please refer to FIG. 1, a flow chart of a first embodiment of the method for implementing spatial sound effects based on a free viewpoint provided by the present invention.
  • As shown in FIG. 1, the method includes steps S101 to S103.
  • Step S101: Based on the camera position number corresponding to the current camera position, match the target orientation information corresponding to the current camera position in a storage unit.
  • In this embodiment, the client can obtain media information such as each camera position's number, its sound orientation information and the default playback position number by parsing an index file or the video stream data, and store it in the client's storage unit, thereby obtaining the number and corresponding sound orientation information of every camera position.
  • In one embodiment, the user selects on the client the camera position to be rendered; each camera position has a unique position number. According to the number of the position selected by the user, the sound orientation information corresponding to that number is queried in the storage unit; this is the target orientation information of the current camera position, as the sketch below illustrates.
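  • As a concrete illustration of step S101, the Python sketch below shows how a client-side storage unit might map camera position numbers to sound orientation information and resolve the user's selection. The in-memory dict and the azimuth/elevation/distance fields are illustrative assumptions, not a data layout fixed by the patent.

```python
# Hypothetical storage unit: camera position number -> sound orientation info.
storage_unit = {
    1: {"azimuth": 0.0,  "elevation": 0.0, "distance": 5.0},
    2: {"azimuth": 45.0, "elevation": 0.0, "distance": 5.0},
    3: {"azimuth": 90.0, "elevation": 0.0, "distance": 5.0},
}

def match_target_orientation(camera_no: int, default_no: int = 1) -> dict:
    """Step S101: return the sound orientation info for the selected camera
    position, falling back to the default camera number from the index file."""
    return storage_unit.get(camera_no, storage_unit[default_no])
```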
  • Step S102: Acquire audio stream data and decode the audio stream data into target audio frames.
  • In this embodiment, the client can download the audio stream data that needs sound effect conversion and decode it into audio frames.
  • Specifically, audio information analysis usually covers the sampling rate, the number of samples and the sampling format. The sampling rate is the number of samples extracted per second from a continuous signal to form a discrete signal, expressed in hertz (Hz); its reciprocal is the sampling period, i.e. the time interval between samples. Informally, the sampling rate is how many signal samples the computer collects per second. The number of samples is the size of one audio frame. The sampling format is the storage format of the audio, such as 8-bit unsigned integer, 16-bit signed integer, 32-bit signed integer or single-precision floating point.
  • In pulse-code-modulation (PCM) audio data, an audio frame is defined in one of two ways: either as one sampling point, whose size in bytes is channels times bit depth; or as a fixed span of time, for example all audio data within a 1 s window counted as one frame.
  • In other, non-PCM data, audio frames may have a fixed size, a non-fixed size or a fixed duration. For frame types of non-fixed size or fixed duration, real-time parsing is required to determine the actual frame size.
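  • To make the frame-size arithmetic concrete, the helper below (a sketch, assuming PCM data and the fixed-duration definition of a frame) computes how many bytes one audio frame occupies:

```python
def pcm_frame_bytes(sample_rate_hz: int, channels: int, bits_per_sample: int,
                    frame_duration_s: float = 1.0) -> int:
    """Bytes in one fixed-duration PCM frame:
    samples per frame x channels x bytes per sample."""
    samples_per_frame = int(sample_rate_hz * frame_duration_s)
    return samples_per_frame * channels * (bits_per_sample // 8)

# 48 kHz stereo 16-bit audio with 1 s frames: 48000 * 2 * 2 = 192000 bytes.
assert pcm_frame_bytes(48000, 2, 16) == 192000
```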
  • Step S103: Based on the target orientation information, perform sound effect conversion on the target audio frames to obtain the target spatial audio corresponding to the current camera position.
  • In this embodiment, a head-related transfer function (HRTF) operation can be performed on the target audio frames according to the target orientation information matched to the current camera position, generating audio data with spatial sound effects for that position; the client can then render and output the converted spatial audio data to obtain the target spatial audio.
  • Specifically, the HRTF is a sound localization algorithm and an acoustic model: a function of spatial parameters (spherical coordinates relative to the centre of the listener's head), sound frequency (generally 20 Hz to 20 kHz, the range the human ear can perceive) and anthropometric parameters (the dimensions of the head, torso, pinnae and other structures that reflect and diffract sound waves).
  • HRTF processing exploits the frequency cues used by the human ear and brain to synthesize 3D sound. Through high-speed digital signal processing (DSP), HRTFs can spatialize virtual-world sound sources in real time: when the rendered waveform is played over headphones, the brain perceives a realistic sense of location, such as sound arriving from the front or back, above or below, or any direction in three-dimensional space.
  • Given a person's full-space HRTF database, the sound at any spatial position the listener wishes to hear can be rendered faithfully, either by convolving the time-domain HRIR of that spatial position with a monophonic signal, or by multiplying the HRTF with the Fourier transform of the monophonic signal.
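  • The time-domain path just described (convolving a mono signal with the HRIR pair measured for one spatial position) can be sketched in a few lines of Python. This is a minimal illustration, assuming the HRIR arrays come from some measured HRTF database; the patent does not specify a particular database or API.

```python
import numpy as np
from scipy.signal import fftconvolve

def render_binaural(mono: np.ndarray,
                    hrir_left: np.ndarray,
                    hrir_right: np.ndarray) -> np.ndarray:
    """Convolve a mono frame with the left/right HRIRs of one spatial
    position, producing a two-channel binaural signal of shape (N, 2)."""
    left = fftconvolve(mono, hrir_left, mode="full")
    right = fftconvolve(mono, hrir_right, mode="full")
    return np.stack([left, right], axis=-1)
```

Equivalently, the mono signal's Fourier transform could be multiplied by the HRTF in the frequency domain, as the text notes.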
  • This embodiment provides a method for implementing spatial sound effects based on a free viewpoint. The method looks up the target orientation information in a storage unit according to the camera position number selected by the user, as the sound orientation information of the current camera position; decodes the audio stream data into audio frames as the audio data of the current camera position; converts the audio frames to the corresponding spatial orientation according to the target orientation information to obtain the target spatial audio of the current camera position; and performs the corresponding conversion according to the orientation information of the other camera positions to obtain audio streams matching each position, so that the video stream and the audio stream achieve a spatially synchronized effect and the user experience improves.
  • This solves the technical problem of the poor playback experience of current free-viewpoint audio.
  • Please refer to FIG. 2, a flow chart of a second embodiment of the method for implementing spatial sound effects based on a free viewpoint provided by the present invention.
  • In this embodiment, before step S103, the method further includes:
  • Step S201: Determine an audio playback mode based on the current device distribution, the audio playback mode including a surround playback mode and a linear playback mode.
  • In this embodiment, the audio playback mode is determined according to the device distribution at the application site, so as to match the playback requirements of the current distribution; it can be either the surround playback mode or the linear playback mode.
  • As shown in FIG. 3, in the surround playback mode the distance from each camera position to the centre point is the same, so the effect of distance on the sound is not considered.
  • When the audio playback mode is the surround playback mode, the target audio frames are converted based on the orientation information corresponding to the current camera position number to obtain the target spatial audio of the current camera position.
  • In one embodiment, the client matches, from the storage unit, the sound orientation information corresponding to the camera position number currently selected by the user, and then performs the HRTF operation on that position's audio frames according to the matched information to achieve spatial sound effect conversion.
  • In the surround playback mode, after the spatial conversion of the current camera position is complete, the user can perform a view-switching operation, change the camera position, and repeat the spatial conversion for the next position.
  • When the audio playback mode is the linear playback mode, the target audio frames are first converted based on the target orientation information corresponding to the current camera position number to obtain a first spatial audio of the current camera position; the device distribution centre is then determined from the current device distribution; finally, the relative distance between the current camera position and the distribution centre is obtained, and the first spatial audio is corrected based on this distance to obtain a second spatial audio, which serves as the target spatial audio of the current camera position.
  • As shown in FIG. 4, in the linear playback mode the distances from the camera positions to the centre point differ, and distance attenuation is added so that the sound weakens as the distance grows, giving the user a better audio-visual experience.
  • After the user selects the camera position to be processed, the corresponding sound orientation information is matched in the storage unit and the HRTF operation is performed on the audio frames according to it; a distance term, based on the distance between that position and the centre point, is then added during the conversion to produce audio data with spatial sound effects. The client then corrects the distance parameter to generate spatial audio data for the corrected position, and renders and outputs the converted spatial audio data.
  • The distance calculation can use an audio processing method such as the Open Audio Library (OpenAL). OpenAL is a cross-platform audio application programming interface (API) from the free-software community, developed by Loki Software and used on Windows and Linux systems for audio buffering and listening.
  • OpenAL's main functionality is organized around source objects, audio buffers and a listener.
  • A source object contains a pointer to a buffer, the velocity, position and direction of the sound, and the sound intensity.
  • The listener object contains the velocity, position and direction of the listener, as well as an overall gain applied to all sounds.
  • A buffer contains 8- or 16-bit, mono or stereo PCM audio data, and the rendering engine performs all necessary calculations such as distance attenuation and the Doppler effect; a sketch of such an attenuation correction follows below.
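  • As a sketch of the distance correction in the linear playback mode, the following applies OpenAL's inverse-distance-clamped attenuation model to the first spatial audio. The reference distance, roll-off factor and the application of the correction as a simple per-frame gain are illustrative assumptions; the patent does not fix the exact correction formula.

```python
import numpy as np

def inverse_distance_gain(distance: float, ref_distance: float = 1.0,
                          rolloff: float = 1.0,
                          max_distance: float = 100.0) -> float:
    """OpenAL AL_INVERSE_DISTANCE_CLAMPED model: gain decreases as the
    source moves beyond the reference distance, clamped at max_distance."""
    d = min(max(distance, ref_distance), max_distance)
    return ref_distance / (ref_distance + rolloff * (d - ref_distance))

def correct_for_distance(first_spatial_audio: np.ndarray,
                         camera_pos, center_pos) -> np.ndarray:
    """Scale the first spatial audio by the attenuation for the relative
    distance between the camera position and the device distribution centre,
    yielding the second (distance-corrected) spatial audio."""
    distance = float(np.linalg.norm(np.asarray(camera_pos, dtype=float) -
                                    np.asarray(center_pos, dtype=float)))
    return first_spatial_audio * inverse_distance_gain(distance)
```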
  • In the linear playback mode, after the spatial conversion of the current camera position is complete, the user can perform a view-switching operation, change the camera position, and repeat the spatial conversion for the next position according to its distance from the centre point.
  • It should be understood that the audio playback mode can be, but is not limited to, the two modes above; this application uses them only as examples, and other audio playback modes applicable to the method provided by this application also fall within its scope of protection.
  • Further, before step S103 the method includes: recalculating the target orientation information based on fine-tuning parameters input by the user, to generate corrected orientation information as the target orientation information.
  • In this embodiment, after the sound orientation information of the current camera position is matched, the orientation can be fine-tuned, including azimuth, pitch and similar data, to generate corrected orientation information; the HRTF operation is then performed on the current position's audio frames according to the corrected information to achieve spatial sound effect conversion.
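  • A minimal sketch of the fine-tuning recalculation, assuming the orientation is kept as azimuth and elevation angles in degrees and the user supplies additive offsets; the patent does not fix the parameter format:

```python
def apply_fine_tuning(target_orientation: dict, offsets: dict) -> dict:
    """Recompute the target orientation from user fine-tuning input; the
    corrected orientation replaces the stored one before the HRTF stage."""
    corrected = dict(target_orientation)
    corrected["azimuth"] = (corrected["azimuth"] +
                            offsets.get("azimuth", 0.0)) % 360.0
    corrected["elevation"] = max(-90.0, min(90.0, corrected["elevation"] +
                                            offsets.get("elevation", 0.0)))
    return corrected
```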
  • Please refer to FIG. 5, a flow chart of a third embodiment of the method for implementing spatial sound effects based on a free viewpoint provided by the present invention.
  • In this embodiment, before step S101 the method further includes:
  • Step S301: Obtain an index file.
  • Step S302: Based on parsing of the index file, obtain the media information and store it in the storage unit, where the media information includes the camera position number of at least one camera position, the sound orientation information corresponding to each position number, and a default camera position number.
  • In this embodiment, the server can write the media information into an index file stored on the server; when needed, the client requests the server to download the index file.
  • After downloading, the client can parse the index file to obtain media information such as each camera position's number, its sound orientation information and the default playback position number, and transfer this media information to a preset storage unit in the client.
  • The default camera position number means that, when the user has not specified a position number, the default number in the index file is used as the number of the current camera position, and spatial sound effect conversion is performed on the current position's audio stream according to the sound orientation information corresponding to that default number.
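  • A sketch of steps S301 and S302, assuming a hypothetical JSON layout for the index file; the patent does not prescribe a concrete file format:

```python
import json

def parse_index_file(raw: bytes) -> dict:
    """Parse a downloaded index file into the media information kept in the
    storage unit: per-position sound orientation plus a default camera number.
    The JSON schema used here is a hypothetical illustration."""
    index = json.loads(raw)
    return {
        "default_camera_no": index["default_camera_no"],
        "positions": {
            int(cam["camera_no"]): cam["sound_position"]
            for cam in index["cameras"]
        },
    }

# Assumed schema, for illustration only:
# {"default_camera_no": 1,
#  "cameras": [{"camera_no": 1,
#               "sound_position": {"azimuth": 0.0, "elevation": 0.0,
#                                  "distance": 5.0}}]}
```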
  • Further, before step S101 the method also includes: obtaining at least one channel of video stream data; parsing the data header of the video stream data to obtain the sound orientation information corresponding to each channel; and storing, in the storage unit, the camera position number of each video stream and the sound orientation information corresponding to each position number.
  • In one embodiment, the server can write the sound orientation information of all camera positions into the data header of every video stream; the client parses the orientation information in the headers of all video streams and transfers it to the client's storage unit.
  • The video and audio stream data can be synchronized using the timestamp as the synchronization standard; after the spatial conversion of each camera position, audio-video synchronization with the spatial conversion effect can be achieved, as the sketch below illustrates.
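  • The timestamp-based synchronization can be sketched as choosing, for each video frame, the decoded audio frame whose presentation timestamp is closest. The (pts, samples) tuple layout and the tolerance value are assumptions for illustration:

```python
def pick_audio_frame(audio_frames, video_pts: float, tolerance: float = 0.02):
    """Timestamp-based A/V sync: return the audio frame (pts, samples) whose
    pts is nearest the video frame's pts, or None if none is close enough."""
    if not audio_frames:
        return None
    best = min(audio_frames, key=lambda frame: abs(frame[0] - video_pts))
    return best if abs(best[0] - video_pts) <= tolerance else None
```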
  • Further, after step S103 the method includes: switching to the next camera view as the current camera position based on a view-switching instruction executed by the user.
  • After the client completes the sound conversion of the current camera position, the user can issue a view-switching instruction, either stepping through the unprocessed camera positions in order of position number or clicking directly on the number of the unprocessed position to be handled next; after the switch, that position becomes the current camera position for the spatial sound conversion operation.
  • Please refer to FIG. 6, a schematic block diagram of the structure of a device for implementing spatial sound effects based on a free viewpoint provided by an embodiment of the present invention.
  • As shown in FIG. 6, the device 300 includes a processor 301 and a memory 302, connected via a bus 303, for example an I2C (Inter-Integrated Circuit) bus.
  • Specifically, the processor 301 provides the computing and control capability that supports the operation of the entire device. The processor 301 can be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; a general-purpose processor can be a microprocessor or any conventional processor.
  • Specifically, the memory 302 can be a flash memory chip, a read-only memory (ROM) disk, an optical disc, a USB flash drive, a removable hard disk, or the like.
  • Those skilled in the art will understand that FIG. 6 is merely a block diagram of the partial structure related to the embodiment of the present invention and does not limit the devices to which the embodiment applies; a specific device may include more or fewer components than shown, combine certain components, or arrange the components differently.
  • The processor 301 is configured to run a computer program stored in the memory 302 and, when executing it, to implement any of the methods for implementing spatial sound effects based on a free viewpoint provided by the embodiments of the present invention, for example the following steps: matching, in a storage unit, the target orientation information corresponding to the current camera position based on its camera position number; obtaining audio stream data and decoding it into target audio frames; and performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio of the current camera position.
  • In one embodiment, before performing the sound effect conversion, the processor 301 is further configured to determine the audio playback mode based on the current device distribution, the mode including a surround playback mode and a linear playback mode.
  • In one embodiment, when the audio playback mode is the surround playback mode, the processor 301 performs the sound effect conversion on the target audio frames based on the orientation information corresponding to the current camera position number to obtain the target spatial audio of the current camera position.
  • In one embodiment, when the audio playback mode is the linear playback mode, the processor 301 performs the sound effect conversion based on the target orientation information corresponding to the current camera position number to obtain a first spatial audio; determines the device distribution centre based on the current device distribution; and obtains the relative distance between the current camera position and the distribution centre, correcting the first spatial audio based on that distance to obtain a second spatial audio as the target spatial audio of the current camera position.
  • In one embodiment, before performing the sound effect conversion, the processor 301 is further configured to recalculate the target orientation information based on fine-tuning parameters input by the user, generating corrected orientation information as the target orientation information.
  • In one embodiment, before matching the target orientation information, the processor 301 is further configured to obtain an index file, parse it to obtain the media information, and store the media information in the storage unit, where the media information includes the camera position number of at least one camera position, the sound orientation information corresponding to each position number, and a default position number.
  • In one embodiment, before matching the target orientation information, the processor 301 is further configured to obtain at least one channel of video stream data, parse the data header of the video stream data to obtain the sound orientation information of each channel, and store each video stream's camera position number and the corresponding sound orientation information in the storage unit.
  • In one embodiment, after the sound effect conversion, the processor 301 is further configured to switch to the next camera view as the current camera position based on a view-switching instruction executed by the user.
  • An embodiment of the present invention also provides a storage medium for computer-readable storage, the storage medium storing one or more programs which can be executed by one or more processors to implement any of the methods for implementing spatial sound effects based on a free viewpoint provided in the description of the embodiments of the present invention.
  • The storage medium may be an internal storage unit of the device described in the foregoing embodiments, for example the device's hard disk or internal memory; it may also be an external storage device of the device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the device.
  • Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media).
  • Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable instructions, data structures, program modules or other data.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer.
  • Communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Stereophonic System (AREA)

Abstract

A method, device and storage medium for implementing spatial sound effects based on a free viewpoint, belonging to the field of audio and video technology. The method comprises: looking up, in a storage unit, the target orientation information corresponding to the camera position number of the camera position currently selected by the user, as the sound orientation information of the current camera position; decoding audio stream data into audio frames to obtain the audio data of the current camera position; performing, according to the target orientation information of the current camera position, sound effect conversion of the audio frames to the corresponding spatial orientation to obtain the target spatial audio of the current camera position; and performing the corresponding sound effect conversion on the audio frames according to the orientation information of camera positions in different orientations, so as to obtain audio streams corresponding to the orientation information of each camera position.

Description

Method, device and storage medium for implementing spatial sound effects based on a free viewpoint
Cross-Reference
This application claims priority to the Chinese patent application No. 202211378901.5, titled "Method, device and storage medium for implementing spatial sound effects based on a free viewpoint", filed with the Chinese Patent Office on November 4, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of audio and video technology, and in particular to a method, device and storage medium for implementing spatial sound effects based on a free viewpoint.
Background
With the arrival of the fifth-generation mobile communication technology (5G) era, greater bandwidth can provide users with a better viewing experience, and ultra-high-definition 4K/8K brings clearer, more detailed picture quality. Free-viewpoint video is now widely used in sports events, education and training, entertainment performances and similar scenarios; combined with virtual reality (VR)/augmented reality (AR) headsets, earphones and other devices, it can give users a better audio-visual experience.
However, current free-viewpoint devices do not provide spatial sound effects: when users watch live broadcasts or play games, no matter which viewing angle they switch to, they cannot distinguish the direction of the sound, resulting in a poor user experience. How to remedy the poor playback experience of free-viewpoint audio has therefore become an urgent technical problem.
Summary
The embodiments of the present application provide a method, device and storage medium for implementing spatial sound effects based on a free viewpoint, aiming to solve the technical problem of the poor playback experience of current free-viewpoint audio.
In a first aspect, an embodiment of the present application provides a method for implementing spatial sound effects based on a free viewpoint, comprising: matching, in a storage unit, target orientation information corresponding to the current camera position based on the camera position number of the current camera position; acquiring audio stream data and decoding the audio stream data into target audio frames; and performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio corresponding to the current camera position.
In a second aspect, an embodiment of the present application further provides a device for implementing spatial sound effects based on a free viewpoint, the device comprising a processor, a memory, a computer program stored in the memory and executable by the processor, and a data bus for connection and communication between the processor and the memory, wherein, when the computer program is executed by the processor, any one of the spatial sound effect implementation methods provided in the description of this application is realized.
In a third aspect, an embodiment of the present application further provides a storage medium for computer-readable storage, the storage medium storing one or more programs which can be executed by one or more processors to implement any one of the spatial sound effect implementation methods provided in the description of this application.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a first embodiment of a method for implementing spatial sound effects based on a free viewpoint provided by the present invention;
FIG. 2 is a schematic flow chart of a second embodiment of the method;
FIG. 3 is a schematic diagram of the surround playback mode of free-viewpoint spatial sound effect implementation provided by an embodiment of the present invention;
FIG. 4 is a schematic diagram of the linear playback mode of free-viewpoint spatial sound effect implementation provided by an embodiment of the present invention;
FIG. 5 is a schematic flow chart of a third embodiment of the method;
FIG. 6 is a schematic block diagram of the structure of a device for implementing spatial sound effects based on a free viewpoint provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The flow charts shown in the drawings are merely illustrative; they need not include all contents and operations/steps, nor must they be executed in the order described. For example, some operations/steps may be decomposed, combined or partially merged, so the actual execution order may change according to the actual situation.
It should be understood that the terminology used in this description is for the purpose of describing particular embodiments only and is not intended to limit the present invention. As used in this description and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
The embodiments of the present invention provide a method, device and storage medium for implementing spatial sound effects based on a free viewpoint. The method can be applied to a mobile terminal, which can be an electronic device such as a mobile phone, tablet computer, laptop computer, desktop computer, personal digital assistant or wearable device.
Some embodiments of the present invention are described in detail below with reference to the drawings. Where no conflict arises, the following embodiments and their features can be combined with one another.
The method provided by the embodiments of the present invention will be introduced in detail below in connection with the scenario in FIG. 1. Note that the scenario in FIG. 1 is only used to explain the method provided by the embodiments of the present invention and does not limit its application scenarios.
Please refer to FIG. 1, a schematic flow chart of a first embodiment of a method for implementing spatial sound effects based on a free viewpoint provided by the present invention.
As shown in FIG. 1, the method includes steps S101 to S103.
Step S101: Based on the camera position number corresponding to the current camera position, match the target orientation information corresponding to the current camera position in a storage unit.
In this embodiment, the client can obtain media information such as each camera position's number, its sound orientation information and the default playback position number by parsing an index file or the video stream data, and store it in the client's storage unit, thereby obtaining the number and corresponding sound orientation information of every camera position.
In one embodiment, the user selects on the client the camera position to be rendered; each camera position has a unique position number. According to the number of the position selected by the user, the sound orientation information corresponding to that number is queried in the storage unit; this is the target orientation information of the current camera position.
Step S102: Acquire audio stream data and decode the audio stream data into target audio frames.
In this embodiment, the client can download the audio stream data that needs sound effect conversion and decode it into audio frames.
Specifically, audio information analysis usually covers the sampling rate, the number of samples and the sampling format. The sampling rate is the number of samples extracted per second from a continuous signal to form a discrete signal, expressed in hertz (Hz); its reciprocal is the sampling period, i.e. the time interval between samples. Informally, the sampling rate is how many signal samples the computer collects per second. The number of samples is the size of one audio frame. The sampling format is the storage format of the audio, such as 8-bit unsigned integer, 16-bit signed integer, 32-bit signed integer or single-precision floating point.
In one embodiment, in pulse-code-modulation (PCM) audio data, an audio frame is defined in one of two ways: either as one sampling point, whose size in bytes is channels times bit depth; or as a fixed span of time, for example all audio data within a 1 s window counted as one frame.
Understandably, in other, non-PCM data, audio frames may have a fixed size, a non-fixed size or a fixed duration; for frame types of non-fixed size or fixed duration, real-time parsing is required to know the actual frame size.
Step S103: Based on the target orientation information, perform sound effect conversion on the target audio frames to obtain the target spatial audio corresponding to the current camera position.
In this embodiment, a head-related transfer function (HRTF) operation can be performed on the target audio frames according to the target orientation information matched for the current camera position, generating audio data with spatial sound effects corresponding to that position; the client can render and output the converted spatial audio data to obtain the target spatial audio.
Specifically, the HRTF is a sound localization algorithm and an acoustic model: a function of spatial parameters (spherical coordinates relative to the centre of the listener's head), sound frequency (generally 20 Hz to 20 kHz, the range the human ear can perceive) and anthropometric parameters (the dimensions of the head, torso, pinnae and other structures that reflect and diffract sound waves).
In one embodiment, HRTF processing uses the frequency cues of the human ear and brain to synthesize 3D sound. Through high-speed digital signal processing (DSP) calculation, HRTFs can process virtual-world sound sources in real time; when the sound chip computes a waveform containing 3D sound, the brain, listening through headphones, perceives a realistic sense of location, such as sound arriving from the front or back, above or below, or any direction in three-dimensional space.
In one embodiment, given a person's full-space HRTF database, the sound at any spatial position the listener wishes to hear can be rendered faithfully (by convolving the time-domain HRIR of a given spatial position with a monophonic signal, or by multiplying the HRTF with the Fourier transform of the monophonic signal).
This embodiment provides a method for implementing spatial sound effects based on a free viewpoint. The method looks up the target orientation information in a storage unit according to the camera position number selected by the user, as the sound orientation information of the current camera position; parses and decodes the audio stream data into audio frames, as the audio data of the current camera position; converts the audio frames to the corresponding spatial orientation according to the target orientation information to obtain the target spatial audio of the current camera position; and performs the corresponding conversion for the orientation information of the other camera positions to obtain audio streams matching each position, so that the video stream and the audio stream achieve a spatially synchronized effect and the user experience improves. This solves the technical problem of the poor playback experience of current free-viewpoint audio.
Please refer to FIG. 2, a schematic flow chart of a second embodiment of the method for implementing spatial sound effects based on a free viewpoint provided by the present invention.
In this embodiment, based on the embodiment shown in FIG. 1 above, before step S103 the method further includes:
Step S201: Based on the current device distribution, determine the audio playback mode, the audio playback mode including a surround playback mode and a linear playback mode.
In this embodiment, the corresponding audio playback mode can be determined according to the device distribution at the application site, to match the playback requirements of the current distribution.
In one embodiment, the audio playback mode can be the surround playback mode or the linear playback mode.
In one embodiment, as shown in FIG. 3, the surround playback mode means that the distance from each camera position to the centre point is the same, and the effect of distance on the sound is not considered.
In one embodiment, when the audio playback mode is the surround playback mode, the target audio frames are converted based on the orientation information corresponding to the current camera position number to obtain the target spatial audio of the current camera position.
In one embodiment, the client can match, from the storage unit, the sound orientation information corresponding to the camera position number currently selected by the user, and then perform the HRTF operation on that position's audio frames according to the matched information to achieve spatial sound effect conversion.
In one embodiment, in the surround playback mode, after the spatial conversion of the current camera position is complete, the user can perform a view-switching operation, change the camera position, and repeat the spatial conversion for the next position.
In one embodiment, when the audio playback mode is the linear playback mode, the target audio frames are converted based on the target orientation information corresponding to the current camera position number to obtain a first spatial audio of the current camera position; the device distribution centre is determined based on the current device distribution; and the relative distance between the current camera position and the distribution centre is obtained, the first spatial audio being corrected based on this distance to obtain a second spatial audio as the target spatial audio of the current camera position.
In one embodiment, as shown in FIG. 4, in the linear playback mode the distances from the camera positions to the centre point differ, and distance attenuation is added so that the sound weakens as the distance increases, giving the user a better audio-visual experience.
In one embodiment, after the user selects the camera position to be processed, the corresponding sound orientation information is matched in the storage unit and the HRTF operation is performed on the audio frames according to it; a distance term, based on the distance between that position and the centre point, is then added during the conversion to produce audio data with spatial sound effects. The client then corrects the distance parameter to generate spatial audio data for the corrected position, and renders and outputs the converted spatial audio data.
In one embodiment, the distance calculation can use an audio processing method such as the Open Audio Library (OpenAL). OpenAL is a cross-platform audio application programming interface (API) from the free-software community, developed by Loki Software and used on Windows and Linux systems for audio buffering and listening. OpenAL's main functionality is organized around source objects, audio buffers and a listener: a source object contains a pointer to a buffer, the velocity, position and direction of the sound, and the sound intensity; the listener object contains the listener's velocity, position and direction, as well as an overall gain for all sounds; a buffer contains 8- or 16-bit, mono or stereo PCM audio data; and the rendering engine performs all necessary calculations such as distance attenuation and the Doppler effect.
In one embodiment, in the linear playback mode, after the spatial conversion of the current camera position is complete, the user can perform a view-switching operation, change the camera position, and repeat the spatial conversion for the next position according to its distance from the centre point.
Understandably, those skilled in the art should know that the audio playback mode can be, but is not limited to, the two modes above; this application uses them only as examples, and other audio playback modes applicable to the method provided by this application also fall within its scope of protection.
Further, based on the embodiment shown in FIG. 1 above, before step S103 the method specifically further includes: recalculating the target orientation information based on fine-tuning parameters input by the user, to generate corrected orientation information as the target orientation information.
In this embodiment, after the sound orientation information of the current camera position is matched, the orientation can be fine-tuned, including azimuth, pitch and similar data, to generate corrected orientation information; HRTF operations are then performed on the current position's audio frames according to the corrected information to achieve spatial sound effect conversion.
Please refer to FIG. 5, a schematic flow chart of a third embodiment of the method for implementing spatial sound effects based on a free viewpoint provided by the present invention.
In this embodiment, based on the embodiment shown in FIG. 1 above, before step S101 the method further includes:
Step S301: Obtain an index file.
Step S302: Based on parsing of the index file, obtain the media information and store the media information in the storage unit, where the media information includes the camera position number of at least one camera position, the sound orientation information corresponding to each position number, and a default camera position number.
In this embodiment, the server can write the media information into an index file stored on the server; when it is needed, the client requests the server to download it, thereby obtaining the index file.
In one embodiment, after downloading the index file, the client can parse it to obtain media information such as each camera position's number, its sound orientation information and the default playback position number, and transfer this media information to a preset storage unit in the client.
In one embodiment, the default camera position number means that, when the user has not specified a position number, the default number in the index file is used as the number of the current camera position, and spatial sound effect conversion is performed on the current position's audio stream according to the sound orientation information corresponding to the default number.
Further, based on the embodiment shown in FIG. 1 above, before step S101 the method further includes: obtaining at least one channel of video stream data; based on parsing of the data header of the video stream data, obtaining the sound orientation information corresponding to each channel; and storing, in the storage unit, the camera position number of each video stream and the sound orientation information corresponding to each position number.
In one embodiment, the server can write the sound orientation information of all camera positions into the data header of every video stream; the client parses the orientation information in the headers of all video streams and transfers it to the client's storage unit.
In one embodiment, the video and audio stream data can be synchronized using the timestamp as the synchronization standard; after the spatial conversion of each camera position, audio-video synchronization with the spatial conversion effect can be achieved.
Further, based on the embodiment shown in FIG. 1 above, after step S103 the method specifically further includes: based on a view-switching instruction executed by the user, switching to the next camera view as the current camera position.
In one embodiment, after the client completes the sound conversion of the current camera position, the user can issue a view-switching instruction, either stepping through the unprocessed camera positions in order of position number or clicking directly on the number of the unprocessed position to be handled next; after the switch, that position becomes the current camera position for the spatial sound conversion operation.
Please refer to FIG. 6, a schematic block diagram of the structure of a device for implementing spatial sound effects based on a free viewpoint provided by an embodiment of the present invention.
As shown in FIG. 6, the device 300 includes a processor 301 and a memory 302, connected via a bus 303, for example an I2C (Inter-Integrated Circuit) bus.
Specifically, the processor 301 provides computing and control capability and supports the operation of the entire device. The processor 301 can be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.; a general-purpose processor can be a microprocessor or any conventional processor.
Specifically, the memory 302 can be a flash memory chip, a read-only memory (ROM) disk, an optical disc, a USB flash drive, a removable hard disk or the like.
Those skilled in the art can understand that the structure shown in FIG. 6 is only a block diagram of the partial structure related to the solution of the embodiment of the present invention and does not limit the devices to which the solution applies; a specific device may include more or fewer components than shown, combine certain components, or arrange the components differently.
The processor 301 is configured to run the computer program stored in the memory 302 and, when executing the computer program, to implement any of the methods for implementing spatial sound effects based on a free viewpoint provided by the embodiments of the present invention.
In one embodiment, when executing the computer program, the processor 301 implements the following steps: matching, in a storage unit, the target orientation information corresponding to the current camera position based on its camera position number; obtaining audio stream data and decoding it into target audio frames; and performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio of the current camera position.
In one embodiment, before the sound effect conversion, the processor 301 further determines the audio playback mode based on the current device distribution, the mode including a surround playback mode and a linear playback mode.
In one embodiment, when the audio playback mode is the surround playback mode, the processor 301 converts the target audio frames based on the orientation information corresponding to the current camera position number to obtain the target spatial audio of the current camera position.
In one embodiment, when the audio playback mode is the linear playback mode, the processor 301 converts the target audio frames based on the target orientation information corresponding to the current camera position number to obtain a first spatial audio; determines the device distribution centre based on the current device distribution; obtains the relative distance between the current camera position and the distribution centre; and corrects the first spatial audio based on that distance to obtain a second spatial audio as the target spatial audio of the current camera position.
In one embodiment, before the sound effect conversion, the processor 301 further recalculates the target orientation information based on fine-tuning parameters input by the user, generating corrected orientation information as the target orientation information.
In one embodiment, before matching the target orientation information, the processor 301 further obtains an index file, parses it to obtain the media information, and stores the media information in the storage unit, where the media information includes the camera position number of at least one camera position, the sound orientation information corresponding to each position number, and a default position number.
In one embodiment, before matching the target orientation information, the processor 301 further obtains at least one channel of video stream data, parses the data header of the video stream data to obtain the sound orientation information of each channel, and stores each video stream's camera position number and the corresponding sound orientation information in the storage unit.
In one embodiment, after the sound effect conversion, the processor 301 further switches to the next camera view as the current camera position based on a view-switching instruction executed by the user.
It should be noted that those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the device described above can refer to the corresponding process in the foregoing method embodiments, which is not repeated here.
An embodiment of the present invention also provides a storage medium for computer-readable storage, the storage medium storing one or more programs which can be executed by one or more processors to implement any of the methods for implementing spatial sound effects based on a free viewpoint provided in the description of the embodiments of the present invention.
The storage medium may be an internal storage unit of the device described in the foregoing embodiments, for example the device's hard disk or internal memory; it may also be an external storage device of the device, such as a plug-in hard disk, Smart Media Card (SMC), Secure Digital (SD) card or flash card provided on the device.
Those of ordinary skill in the art can understand that all or some of the steps of the methods disclosed above, and the functional modules/units of the systems and apparatuses, may be implemented as software, firmware, hardware and appropriate combinations thereof. In a hardware embodiment, the division between functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have several functions, or one function or step may be executed cooperatively by several physical components. Some or all of the physical components may be implemented as software executed by a processor such as a central processing unit, digital signal processor or microprocessor, or as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information (such as computer-readable instructions, data structures, program modules or other data). Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and can be accessed by a computer. In addition, communication media typically contain computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
It should be understood that the term "and/or" used in this description and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes these combinations. It should be noted that, herein, the terms "comprise", "include" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article or system that includes it.
The above serial numbers of the embodiments of the present invention are for description only and do not represent the relative merits of the embodiments. The above are only specific embodiments of the present invention, but the scope of protection of the present invention is not limited thereto; any person skilled in the art can easily conceive of various equivalent modifications or substitutions within the technical scope disclosed by the present invention, and these modifications or substitutions shall all be covered by the scope of protection of the present invention. Therefore, the scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims (10)

  1. A method for implementing spatial sound effects based on a free viewpoint, wherein the method comprises the following steps:
    matching, in a storage unit, target orientation information corresponding to a current camera position based on a camera position number corresponding to the current camera position;
    acquiring audio stream data, and decoding the audio stream data into target audio frames; and
    performing sound effect conversion on the target audio frames based on the target orientation information to obtain target spatial audio corresponding to the current camera position.
  2. The method for implementing spatial sound effects based on a free viewpoint according to claim 1, wherein, before the performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio corresponding to the current camera position, the method further comprises:
    determining an audio playback mode based on a current device distribution, the audio playback mode comprising a surround playback mode and a linear playback mode.
  3. The method for implementing spatial sound effects based on a free viewpoint according to claim 2, wherein the performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio corresponding to the current camera position comprises:
    in a case where the audio playback mode is the surround playback mode, performing sound effect conversion on the target audio frames based on the orientation information corresponding to the current camera position number to obtain the target spatial audio corresponding to the current camera position.
  4. The method for implementing spatial sound effects based on a free viewpoint according to claim 2, wherein the performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio corresponding to the current camera position comprises:
    in a case where the audio playback mode is the linear playback mode, performing sound effect conversion on the target audio frames based on the target orientation information corresponding to the current camera position number to obtain a first spatial audio corresponding to the current camera position;
    determining a device distribution centre based on the current device distribution; and
    obtaining a relative distance between the current camera position and the device distribution centre, and correcting the first spatial audio based on the relative distance to obtain a second spatial audio as the target spatial audio corresponding to the current camera position.
  5. The method for implementing spatial sound effects based on a free viewpoint according to claim 1, wherein, before the performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio corresponding to the current camera position, the method further comprises:
    recalculating the target orientation information based on fine-tuning parameters input by a user to generate corrected orientation information as the target orientation information.
  6. The method for implementing spatial sound effects based on a free viewpoint according to claim 1, wherein, before the matching, in the storage unit, the target orientation information corresponding to the current camera position based on the camera position number corresponding to the current camera position, the method further comprises:
    obtaining an index file; and
    obtaining media information based on parsing of the index file, and storing the media information in the storage unit, wherein the media information comprises a camera position number of at least one camera position, sound orientation information corresponding to each camera position number, and a default camera position number.
  7. The method for implementing spatial sound effects based on a free viewpoint according to claim 1, wherein, before the matching, in the storage unit, the target orientation information corresponding to the current camera position based on the camera position number corresponding to the current camera position, the method further comprises:
    obtaining at least one channel of video stream data;
    obtaining sound orientation information corresponding to each channel of video stream data based on parsing of a data header of the video stream data; and
    storing, in the storage unit, a camera position number of each channel of video stream data and the sound orientation information corresponding to each camera position number.
  8. The method for implementing spatial sound effects based on a free viewpoint according to any one of claims 1 to 7, wherein, after the performing sound effect conversion on the target audio frames based on the target orientation information to obtain the target spatial audio corresponding to the current camera position, the method further comprises:
    switching to a next camera view as the current camera position based on a view-switching instruction executed by the user.
  9. A device for implementing spatial sound effects based on a free viewpoint, wherein the device comprises a processor, a memory, a computer program stored in the memory and executable by the processor, and a data bus for connection and communication between the processor and the memory, wherein, when the computer program is executed by the processor, the steps of the method for implementing spatial sound effects based on a free viewpoint according to any one of claims 1 to 8 are implemented.
  10. A storage medium for computer-readable storage, wherein the storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to implement the steps of the method for implementing spatial sound effects based on a free viewpoint according to any one of claims 1 to 8.
PCT/CN2023/129967 2022-11-04 2023-11-06 Method, device and storage medium for implementing spatial sound effects based on a free viewpoint WO2024094214A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211378901.5A CN118042345A (zh) 2022-11-04 2022-11-04 Method, device and storage medium for implementing spatial sound effects based on a free viewpoint
CN202211378901.5 2022-11-04

Publications (1)

Publication Number Publication Date
WO2024094214A1 (zh)

Family

ID=90929784

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/129967 WO2024094214A1 (zh) 2022-11-04 2023-11-06 Method, device and storage medium for implementing spatial sound effects based on a free viewpoint

Country Status (2)

Country Link
CN (1) CN118042345A (zh)
WO (1) WO2024094214A1 (zh)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104869524A (zh) * 2014-02-26 2015-08-26 腾讯科技(深圳)有限公司 Sound processing method and apparatus in a three-dimensional virtual scene
US20180332422A1 (en) * 2017-05-12 2018-11-15 Microsoft Technology Licensing, Llc Multiple listener cloud render with enhanced instant replay
CN112771884A (zh) * 2018-04-13 2021-05-07 华为技术有限公司 Immersive media metrics for virtual reality content with multiple camera positions
CN111148013A (zh) * 2019-12-26 2020-05-12 上海大学 Virtual reality audio binaural reproduction system and method that dynamically follows the auditory perspective
CN111885414A (zh) * 2020-07-24 2020-11-03 腾讯科技(深圳)有限公司 Data processing method, apparatus and device, and readable storage medium
CN112492380A (zh) * 2020-11-18 2021-03-12 腾讯科技(深圳)有限公司 Sound effect adjustment method, apparatus, device and storage medium
CN114040318A (zh) * 2021-11-02 2022-02-11 海信视像科技股份有限公司 Spatial audio playback method and device
CN114630145A (zh) * 2022-03-17 2022-06-14 腾讯音乐娱乐科技(深圳)有限公司 Multimedia data synthesis method, device and storage medium

Also Published As

Publication number Publication date
CN118042345A (zh) 2024-05-14

Similar Documents

Publication Publication Date Title
US10674262B2 (en) Merging audio signals with spatial metadata
US11838742B2 (en) Signal processing device and method, and program
KR20170106063A (ko) Audio signal processing method and apparatus
EP3574662B1 (en) Ambisonic audio with non-head tracked stereo based on head position and time
US10075797B2 (en) Matrix decoder with constant-power pairwise panning
BR112020000759A2 (pt) Apparatus for generating a modified sound field description from a sound field description and metadata relating to spatial information of the sound field description; method for generating an enhanced sound field description; method for generating a modified sound field description from a sound field description and metadata relating to spatial information of the sound field description; computer program; enhanced sound field description
JP2018110366A (ja) 3D sound audiovisual equipment
US10595148B2 (en) Sound processing apparatus and method, and program
TW202105164A (zh) Audio rendering for low-frequency effects
TWI736542B (zh) Information processing device, data distribution server, information processing method, and non-transitory computer-readable recording medium
WO2024094214A1 (zh) Method, device and storage medium for implementing spatial sound effects based on a free viewpoint
US11503226B2 (en) Multi-camera device
US11483669B2 (en) Spatial audio parameters
CN113691927B (zh) Audio signal processing method and apparatus
CN113194400B (zh) Audio signal processing method, apparatus and device, and storage medium
KR20190079993A (ko) Method for authoring stereophonic sound content and application therefor
US20230089225A1 (en) Audio rendering method and apparatus
You et al. Using digital compass function in smartphone for head-tracking to reproduce virtual sound field with headphones
CN114128312A (zh) Audio rendering for low-frequency effects
CN115033201A (zh) Audio recording method, apparatus, system, device and storage medium
Ludovico et al. Head in space: A head-tracking based binaural spatialization system

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application

Ref document number: 23885138

Country of ref document: EP

Kind code of ref document: A1