WO2024119946A1 - Audio control method, audio control apparatus, medium, and electronic device - Google Patents

Audio control method, audio control apparatus, medium, and electronic device

Info

Publication number
WO2024119946A1
Authority
WO
WIPO (PCT)
Prior art keywords
audio
sound
played
target
type
Prior art date
Application number
PCT/CN2023/118788
Other languages
English (en)
French (fr)
Inventor
白金
严锋贵
林松
李鸿
姚津
Original Assignee
Guangdong OPPO Mobile Telecommunications Corp., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong OPPO Mobile Telecommunications Corp., Ltd.
Publication of WO2024119946A1

Definitions

  • the present disclosure relates to the technical field of audio processing, and in particular to an audio control method, an audio control device, a computer-readable storage medium, and an electronic device.
  • Sound image refers to the listener's sense of the sound position.
  • the sound image of audio is usually fixed. For example, most audio does not have spatial sound effects, and its sound image is on the left and right sides of the user or in a relatively close range by default.
  • an audio control method, an audio control device, a computer-readable storage medium, and an electronic device are provided.
  • an audio control method, comprising: providing a sound image setting control in an audio setting interface; in response to a sound image setting operation performed through the sound image setting control, determining sound image information of a target audio type according to the sound image setting operation; and when playing an audio to be played under the target audio type, rendering the audio to be played based on the sound image information of the target audio type.
  • an audio control device, comprising: a control providing module, for providing a sound image setting control in an audio setting interface; an information determining module, for responding to a sound image setting operation performed through the sound image setting control and determining the sound image information of a target audio type according to the sound image setting operation; and an audio rendering module, for rendering the audio to be played based on the sound image information of the target audio type when playing the audio to be played under the target audio type.
  • a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the audio control method of the first aspect and possible implementations thereof are implemented.
  • an electronic device, comprising: a processor; and a memory for storing executable instructions of the processor, wherein the processor is configured to execute the audio control method of the first aspect and possible implementations thereof by executing the executable instructions.
  • FIG. 1 shows a flow chart of an audio control method in this exemplary embodiment.
  • FIG. 2 is a schematic diagram showing an audio setting interface in this exemplary implementation.
  • FIG. 3 is a schematic diagram showing a virtual sound space in this exemplary embodiment.
  • FIG. 4 is a schematic diagram showing another audio setting interface in this exemplary embodiment.
  • FIG. 5 is a schematic diagram showing another virtual sound space in this exemplary embodiment.
  • FIG. 6 is a schematic diagram showing an interface of the front mode in this exemplary embodiment.
  • FIG. 7 is a schematic diagram showing an interface of the rear mode in this exemplary embodiment.
  • FIG. 8 is a schematic diagram showing an interface of the spatial tiling mode in this exemplary embodiment.
  • FIG. 9 shows a flowchart of the underlying architecture of an audio control method in this exemplary embodiment.
  • FIG. 10 is a schematic diagram showing an audio playback effect in this exemplary implementation.
  • FIG. 11 is a block diagram showing a structure of an audio control device in this exemplary embodiment.
  • FIG. 12 is a block diagram showing an electronic device in this exemplary embodiment.
  • FIG. 1 shows an exemplary process of an audio control method, including the following operations S110 to S130:
  • Operation S110: providing a sound image setting control in the audio setting interface.
  • the audio setting interface refers to a visual interface for interacting with the user to set audio, which may include the audio types to be controlled, sound image setting controls, and the like.
  • the sound image setting control refers to a control module for sound image setting, through which the sound image distribution of the audio during playback is controlled and adjusted.
  • the sound image setting control can be a selection control, and the user can determine the sound image information by making a selection.
  • the sound image setting control can also be an interactive control, and the user can determine the sound image information by inputting information.
  • the user can call up the audio setting interface through a preset operation to trigger display of the sound image setting control.
  • for example, the user can call up the sound image setting control through a voice operation, or click an audio setting shortcut option in an application to trigger a jump to the audio setting interface and display the sound image setting control.
  • Operation S120: in response to a sound image setting operation performed through the sound image setting control, determining sound image information of the target audio type according to the sound image setting operation.
  • the terminal device can play many different types of audio.
  • for example: the game sound in a game program, the voice when the user is having an online video or voice call, the music of a music player program, and the notification sound with a prompting function provided by the terminal to the user.
  • This exemplary embodiment can perform audio control on the audio under one or more of the above audio types.
  • the user can perform a sound image setting operation on the sound image setting control in the audio setting interface to first determine the sound image information of the target audio type, wherein the target audio type refers to the audio type that currently needs sound image control.
  • the audio setting interface can provide multiple audio types for the user to determine the target audio type.
  • the sound image information refers to the sound image layout information of the audio when it is played.
  • the sound image layout information can be the specific position information, in the virtual sound space, of the sound source corresponding to the audio, or it can be the type name of a sound image layout, such as the front mode, which corresponds to the sound source positions of the front mode in the virtual sound space.
  • by performing the sound image setting operation through the sound image setting control, the user can determine which audio type is controlled and what kind of sound image layout control is performed on the audio under that audio type.
  • the sound image setting operation can be a single click, a double click, a long press, a slide, or another combined operation performed by the user on the sound image setting control.
  • for example, the sound image information of the target audio type can be determined by performing a sliding sound image setting operation in the audio setting interface; or the user can first select the target audio type from a variety of audio types with a single-click operation, and then select the sound image mode with another single-click operation to determine what kind of sound image adjustment to make to the target audio type, that is, to determine the sound image information of the target audio type.
  • Figure 2 shows a schematic diagram of an audio setting interface, which can display the simulated position of a simulated user in a virtual sound space.
  • the audio setting interface can also include a plurality of audio type options 210 to be selected and a sound image setting control 220.
  • the user can perform sound image setting operations in the audio setting interface, determine the target audio type from the plurality of audio type options 210 to be selected, and determine the sound image information corresponding to the target audio type in the sound image setting control 220.
  • Operation S130: when playing the audio to be played of the target audio type, rendering the audio to be played based on the sound image information of the target audio type.
  • the target audio type may include multiple audios, for example, the music audio type may include different music audios.
  • this exemplary embodiment may apply the sound image information to all audios under the target audio type.
  • the audio to be played is the audio that needs to be played currently.
  • the audio to be played may be rendered based on the sound image information of the target audio type. For example, after the sound image information of the game audio type is determined, when the audio in game program A needs to be played, that audio may be rendered based on the sound image information of the game audio type.
  • rendering the audio to be played refers to an audio processing process performed according to the position information, in the virtual sound space, carried by the sound image information, so that when the audio to be played is played, it presents the sound effect of its sound source being located at a specific sound source position in the virtual sound space.
  • the virtual sound space refers to a virtual space simulated based on physical laws such as the structure of the human ear and the propagation characteristics of sound in the air, and the virtual space has spatial characteristics that match the physical space.
  • the sound effects of different real sound sources can be simulated, so that the user feels that the sound seems to be emitted from a virtual position in the three-dimensional space.
  • the sound source is a virtual sound point position in the virtual space.
  • a plurality of sound source positions can be included in the virtual sound space, and one or more sound source positions can constitute a sound image layout; different audio types can be assigned different sound image layouts to achieve a listening effect in which audio of different audio types does not interfere when played simultaneously.
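As a sketch of how per-type sound image layouts might be represented, the snippet below assigns each audio type its own set of sound objects. The class, names, and coordinates are illustrative assumptions, not taken from the disclosure:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SoundObject:
    """A virtual sound point at a position in the virtual sound space."""
    name: str
    position: tuple  # (x, y, z), with the simulated listener at the origin

# Hypothetical layouts: each audio type gets its own sound objects, so audio
# of different types played simultaneously occupies different regions of the
# space and interferes less.
LAYOUTS = {
    "game":  [SoundObject("S1", (-1.0, 1.0, 0.0)), SoundObject("S3", (1.0, 1.0, 0.0))],
    "music": [SoundObject("S6", (-1.0, -1.0, 0.0)), SoundObject("S4", (1.0, -1.0, 0.0))],
}

def layout_for(audio_type):
    """Return the sound image layout for an audio type (empty if unset)."""
    return LAYOUTS.get(audio_type, [])
```

Keeping the object sets of different types disjoint is what lets simultaneously played types occupy distinct regions of the space.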
  • in this exemplary embodiment, a sound image setting control is provided in the audio setting interface; in response to a sound image setting operation performed through the sound image setting control, the sound image information of the target audio type is determined according to the sound image setting operation; and when playing the audio to be played under the target audio type, the audio to be played is rendered based on the sound image information of the target audio type.
  • that is, this exemplary embodiment provides a method for setting corresponding sound image information for different audio types, and audio rendering is performed based on that sound image information, so that the audio can be played according to the set sound image information, avoiding the mutual interference during playback caused in the prior art by single, fixed sound image information.
  • moreover, by performing the sound image setting operation on the sound image setting control, the user can set the sound image information of the target audio type; the operation process is simple and convenient, meets the user's personalized sound image needs, and has a wide range of applications.
  • the sound image information of the target audio type includes the spatial sound source orientation of the target audio type; operation S130 may include:
  • the spatial sound source orientation of the target audio type refers to the orientation information, in the virtual sound space, of the sound source corresponding to audio of the target audio type, and may include information such as the position, direction, and distance of that sound source in the virtual sound space. The position may be a fixed coordinate of the sound source, such as a three-dimensional coordinate, or a coordinate set; the direction refers to the main propagation direction of the audio during playback.
  • in the virtual sound space, the sound effect of a direction facing the terminal device may also differ considerably from that of a direction facing away from it.
  • the specific direction may be represented by the relative angle between the sound source and the terminal device in the virtual sound space; the distance may be the distance between the sound source and the simulated user in the virtual sound space, or the distance from other sound sources, and so on.
  • a sound object refers to a virtual sound object predefined in the sound virtual space, and different sound objects can be located at different positions in the virtual sound space.
  • FIG. 3 shows sound objects located at different spatial sound source positions in the virtual sound space 300, which may include 6 sound objects at fixed positions S1 to S6, and 3 sound objects at non-fixed positions, corresponding to music M, game sound G, and voice V, respectively.
  • the terminal may also pre-configure other numbers of sound objects with fixed positions or non-fixed positions when it is factory set.
  • for example, 9 sound objects at fixed positions S1 to S9 may be set, or non-fixed sound objects such as music ML, music MR, game sound GL, game sound GR, voice VL, and voice VR, wherein music ML and music MR may correspond to the sound objects of the left channel and the right channel of music, respectively.
  • this exemplary embodiment can map the audio data of the audio to be played to a sound object located at the spatial sound source position in the virtual sound space, so as to associate the unrendered audio data with the sound object, and perform sound image rendering based on the audio data of the sound object, so that the rendered audio data presents the sound image effect corresponding to the sound object.
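For illustration only, the per-object rendering step can be approximated by constant-power stereo panning; the disclosure's actual rendering would involve per-ear filtering (e.g. HRTFs), which this sketch does not attempt, and the function names are assumptions:

```python
import math

def pan_gains(azimuth_deg):
    """Constant-power left/right gains for a source at the given azimuth.

    -90 degrees is hard left, +90 hard right, 0 centered. A crude stand-in
    for full binaural rendering: the two gains always satisfy gl^2 + gr^2 = 1,
    so perceived loudness stays constant as the source moves.
    """
    clamped = max(-90.0, min(90.0, azimuth_deg))
    theta = (clamped + 90.0) / 180.0 * (math.pi / 2)  # map to [0, pi/2]
    return math.cos(theta), math.sin(theta)

def render(samples, azimuth_deg):
    """Render mono samples to a stereo (left, right) pair using the gains."""
    gl, gr = pan_gains(azimuth_deg)
    return [s * gl for s in samples], [s * gr for s in samples]
```

A sound object directly in front (azimuth 0) feeds both ears equally, while one at the hard left feeds only the left channel.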
  • the above sound image setting control includes a sound image mode selection control; in response to a sound image setting operation performed through the sound image setting control, determining the sound image information of the target audio type according to the sound image setting operation includes:
  • a spatial sound source orientation of the target audio type is determined according to the sound image mode selected by the user.
  • the sound image mode refers to an option mode for determining the sound image layout style, which can be a pre-configured mode with a specific sound image layout, or a user-defined mode.
  • the sound image modes can include multiple types, and different sound image modes can have different audio playback effects.
  • the sound image setting control may include a sound image mode selection control, and the user can select from multiple sound image modes by a single click, a double click, or a long press, to determine the spatial sound source orientation corresponding to the current target audio type.
  • the sound image setting control 220 shown in Figure 2 is a sound image mode selection control, which includes multiple sound image modes such as the front mode, the rear mode, spatial tiling, and customization.
  • the above determining of the spatial sound source orientation of the target audio type according to the sound image mode selected by the user includes:
  • in response to the user selecting the custom mode, the spatial sound source orientation of the target audio type is determined according to the position of the sound image layout control of the target audio type; the sound image layout control may be moved.
  • the sound image layout control refers to a control in the audio setting interface that provides the user with a visual sound image layout effect.
  • the position of the sound image layout control in the audio setting interface can, to a certain extent, reflect the spatial sound source orientation, in the virtual sound space, of the sound object represented by that control. The user can determine the spatial sound source orientation of the target audio type by dragging the sound image layout control, to achieve personalized rendering of the audio data under the target audio type.
  • FIG. 4 is a schematic diagram of an audio setting interface, showing the simulated position of a simulated user 410 in a virtual sound space and the directional markers of front, back, left, and right in the space.
  • when the user selects the custom mode in the sound image mode selection control 420, the user can click or slide to drag the sound image layout control 430 for game sound, the sound image layout control 440 for voice, or the sound image layout control 450 for music.
  • the position to which a control is moved in the display interface corresponds to a spatial sound source position in the virtual sound space.
  • FIG. 4 is only a schematic illustration. In addition to music, game sound, and voice, other audio types may be present according to actual needs, and the present disclosure does not specifically limit this.
  • the above determining of the spatial sound source orientation of the target audio type according to the sound image mode selected by the user includes:
  • in response to the user selecting the front mode, determining that the spatial sound source orientation of the target audio type includes the left front, the right front, and directly in front.
  • the front mode is a sound image distribution mode in which the sound source orientation is in the area in front of the user, similar to the effect of front speakers.
  • the spatial sound source orientation of the target audio type can include one or more of the left front 510, the right front 530, and the front 520.
  • the audio setting interface can place the sound image layout controls 620, 630, and 640 corresponding to the custom mode in the lower right corner, or apply other de-emphasized display processing to them such as dimming, shrinking, or hiding, and highlight the indicator 610 corresponding to the current front mode, to visually present that the sound image control of the front mode corresponds to the current target audio type.
  • mapping audio data of the audio to be played to a sound object located at a spatial sound source position in a virtual sound space includes:
  • the left channel audio data of the audio to be played is mapped to the sound object in the left front
  • the right channel audio data of the audio to be played is mapped to the sound object in the right front
  • the mono audio data of the audio to be played is mapped to the sound object directly in front; the mono audio data is obtained by merging the left channel audio data and the right channel audio data.
  • the audio data can be mapped according to the orientation of the sound object in the front mode.
  • for example, the left channel audio data of the audio to be played can be mapped to the sound object S1 in the left front;
  • the right channel audio data of the audio to be played can be mapped to the sound object S3 in the right front.
  • this exemplary embodiment can also perform mixing processing on the audio data of the left and right channels to obtain mono audio data, which is mapped to the sound object S2 directly in front.
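The front-mode mapping just described can be sketched as follows; the function name and the representation of the downmix (averaging the two channels) are illustrative assumptions:

```python
def map_front_mode(left, right):
    """Map stereo audio to front-mode sound objects.

    Left channel -> front-left object S1, right channel -> front-right S3;
    a mono downmix (here, the average of both channels) -> the object S2
    directly in front.
    """
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    return {"S1": list(left), "S3": list(right), "S2": mono}
```

The rear mode of the later embodiment is symmetric: the same three streams would map to S6, S4, and S5 behind the listener instead.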
  • determining the spatial sound source orientation of the target audio type according to the sound image mode selected by the user includes:
  • the spatial sound source orientation of the target audio type includes left rear, right rear, and directly rear.
  • the rear mode is a sound image distribution mode in which the sound source is located in the area behind the user, similar to the effect of rear surround speakers.
  • the spatial sound source location of the target audio type can include one or more of the left rear 560, the right rear 540 and the rear 550.
  • the indicator 710 corresponding to the current rear mode can be highlighted in the audio setting interface, to visually present that the sound image control of the rear mode corresponds to the current target audio type.
  • mapping the audio data of the audio to be played to a sound object located at the spatial sound source position in the virtual sound space includes:
  • the left channel audio data of the audio to be played is mapped to the sound object at the left rear
  • the right channel audio data of the audio to be played is mapped to the sound object at the right rear
  • the mono audio data of the audio to be played is mapped to the sound object directly behind; the mono audio data is obtained by merging the left channel audio data and the right channel audio data.
  • the audio data can be mapped according to the orientation of the sound object in the rear mode.
  • the left channel audio data of the audio to be played can be mapped to the sound object S6 at the left rear
  • the right channel audio data of the audio to be played can be mapped to the sound object S4 at the right rear.
  • this exemplary embodiment can also perform mixing processing on the audio data of the left and right channels to obtain mono audio data, which is mapped to the sound object S5 directly behind.
  • determining the spatial sound source orientation of the target audio type according to the sound image mode selected by the user includes:
  • determining that the spatial sound source orientation of the target audio type includes left front, right front, front, left rear, right rear, and rear.
  • the spatial tiling mode is a sound image distribution mode in which the sound source orientations surround the user.
  • the spatial sound source orientation of the target audio type can be determined to include one or more of the left front 510, right front 530, front 520, left rear 560, right rear 540, and rear 550.
  • the indicator 810 corresponding to the current spatial tiling mode can be highlighted in the audio setting interface, to visually present that the sound image control of the spatial tiling mode corresponds to the current target audio type.
  • mapping audio data of the audio to be played to a sound object located at a spatial sound source position in a virtual sound space includes:
  • the mono audio data of the audio to be played is delayed and then mapped to the sound object in the left front, the sound object in the right front, the sound object in the front, the sound object in the left rear, the sound object in the right rear, and the sound object in the rear.
  • the audio data can be mapped according to the orientation of the sound objects in the spatial tiling mode.
  • the audio to be played can be mixed first to obtain mono audio data, to which a delay is then applied; the delayed data is mapped to the left front sound object S1, the right front sound object S3, the front sound object S2, the left rear sound object S6, the right rear sound object S4, and the rear sound object S5, as shown in FIG. 5.
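A minimal sketch of the spatial-tiling mapping, assuming per-object delays expressed in samples (the delay values and the zero-padded delay line are illustrative; the disclosure does not specify them):

```python
def map_spatial_tiling(left, right, delays_samples):
    """Downmix to mono, then feed each surrounding sound object a delayed copy.

    `delays_samples` maps object name -> delay in samples; a zero-padded
    shift truncated to the original length serves as a minimal delay line.
    """
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    tiled = {}
    for name, d in delays_samples.items():
        if d <= 0:
            tiled[name] = list(mono)
        else:
            # Prepend d samples of silence and drop the tail to keep length.
            tiled[name] = [0.0] * d + mono[: len(mono) - d]
    return tiled
```

Giving each surrounding object a slightly different delay spreads a single mono stream around the listener rather than collapsing it to one point.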
  • FIG. 9 shows a flowchart of the underlying architecture of an audio control method in this exemplary embodiment, which may specifically include:
  • Operation S910: obtaining audio to be played, and receiving a user's selection operation for a sound image mode;
  • Operation S9130: determining mono audio data Mono1 of the audio to be played according to the left channel audio data L1 and the right channel audio data R1 of the audio to be played;
  • Operation S9210: determining mono audio data Mono2 according to the left channel audio data L2 and the right channel audio data R2 of the audio to be played;
  • Operation S9220: mapping the mono audio data Mono2, after adding a delay, to the left front sound object S1, the right front sound object S3, the front sound object S2, the left rear sound object, the right rear sound object, and the rear sound object;
  • Operation S9330: determining mono audio data Mono3 of the audio to be played according to the left channel audio data L3 and the right channel audio data R3 of the audio to be played;
  • Operation S9410: determining mono audio data Mono4 according to the left channel audio data L4 and the right channel audio data R4 of the audio to be played;
  • Operation S9420: determining the audio type of the audio to be played;
  • Operation S920: performing sound image rendering based on the audio data of the sound objects;
  • Operation S940: outputting the signal through an earphone;
  • Operation S950: outputting the signal through a speaker.
  • the above sound image setting control is used to set the sound image information of multiple audio types, to support setting different sound image information for different audio types.
  • the sound image setting control can adjust the sound image information of various audio types.
  • that is, the audio setting interface can include multiple audio types, and the sound image setting operation can be performed on each of these audio types respectively, so as to support setting different sound image information for different audio types and achieve different sound image effects when audio of different audio types is played at the same time.
  • FIG. 10 shows a schematic diagram of different audio types corresponding to different sound image information.
  • the user can set the front mode for the game sound, so that the game sound is rendered based on the sound objects located in the virtual sound space area 1010; set the rear mode for the music, so that the music is rendered based on the sound objects located in the virtual sound space area 1020; and set the custom mode for the voice, so that the voice is rendered based on the sound object located at the custom position 1030 in the virtual sound space. In this way, when the game sound, music, and voice are played at the same time, each has a distinct, non-interfering sound effect, improving the user's listening experience.
  • the audio control method may further include:
  • when the mixing mode is turned on, blocking any audio type's request for audio focus.
  • audio focus refers to the mechanism by which the terminal device focuses on playing one type of audio. For example, when a user listens to music, the audio focus is on the currently playing music; when the user opens a game program while listening to music, the game sound is played and the music is stopped, and the audio focus changes from playing music to playing game sound.
  • Mixing mode refers to the mode in which the terminal device can play multiple types of audio at the same time. When the terminal device is in non-mixing mode, there is audio focus, and one type of audio will be played according to the actual situation; when the terminal device is in mixing mode, any audio type's request for audio focus will be blocked, so that all types of audio to be played can be played.
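The focus-blocking behaviour described above can be sketched as a small arbiter; the class and method names are assumptions for illustration, not the actual focus API of any platform:

```python
class AudioFocusArbiter:
    """Minimal audio-focus arbitration with a mixing mode.

    In non-mixing mode, a new focus request preempts the current holder
    (one audio type plays at a time); when mixing mode is on, every focus
    request is blocked, so no playing audio type is interrupted and all
    types can play concurrently.
    """
    def __init__(self):
        self.mixing_mode = False
        self.focus_holder = None

    def request_focus(self, audio_type):
        if self.mixing_mode:
            return False  # request blocked: the current holder keeps playing
        self.focus_holder = audio_type
        return True
```

Blocking the request, rather than granting shared focus, keeps every already-playing stream untouched when the mode is enabled.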
  • the audio control method may further include:
  • a volume setting control is provided in the audio setting interface for setting the volume of the target audio type.
  • the volume setting control refers to an option module for adjusting the volume of the audio, and through the volume setting control, the volume of the audio to be played can be customized.
  • the volume setting control can control the volume of the target audio type in a variety of ways. For example, the volume setting control can include a plurality of volume gears, and the volume of the target audio type can be adjusted to a gear by selecting the target volume gear; the volume setting control can also take the form of a slider, and when the user drags the slider, the volume is smoothly increased or decreased.
  • the audio setting interface shown in Figure 2 includes the audio type options 210 to be selected, the sound image mode selection control 220, and the volume setting control 230; the user can determine the target audio type by selecting an audio type option 210, determine the sound image mode of the target audio type through the sound image mode selection control 220, and determine the volume of the audio under the target audio type by operating the volume setting control 230.
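Both volume-control styles reduce to a gain in [0, 1]; a sketch, where the gear count and the linear mapping are assumptions rather than details from the disclosure:

```python
def gear_volume(gear, num_gears=5):
    """Discrete volume gears: gear 0 mutes, the top gear is full volume."""
    gear = max(0, min(int(gear), num_gears))  # clamp to the valid gear range
    return gear / num_gears

def slider_volume(position):
    """Continuous slider: clamp the drag position in [0, 1] to a gain."""
    return max(0.0, min(1.0, float(position)))
```

The gear form quantizes the volume to a few steps, while the slider form changes it smoothly, matching the two interaction styles described above.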
  • the audio control device 1100 may include: a control providing module, which is used to provide a sound image setting control in an audio setting interface; an information determining module, which is used to respond to a sound image setting operation performed through the sound image setting control and determine the sound image information of the target audio type according to the sound image setting operation; and an audio rendering module, which is used to render the audio to be played based on the sound image information of the target audio type when playing the audio to be played under the target audio type.
  • the sound image information of the target audio type includes the spatial sound source orientation of the target audio type;
  • the audio rendering module includes: a mapping unit, for mapping the audio data of the audio to be played to a sound object located at the spatial sound source position in the virtual sound space; and a rendering unit, for performing sound image rendering based on the audio data of the sound object.
  • the information determining module includes: a selection unit, for determining the spatial sound source orientation of the target audio type according to the sound image mode selected by the user, in response to an operation of selecting from multiple sound image modes of the target audio type through the sound image mode selection control.
  • the selection unit includes: a first selection subunit, for determining the spatial sound source orientation of the target audio type according to the position of the sound image layout control of the target audio type, in response to the user selecting the custom mode; the sound image layout control can be moved.
  • the selection unit includes: a second selection subunit, for determining, in response to the user selecting the front mode from the multiple sound image modes, that the spatial sound source orientation of the target audio type includes the left front, the right front, and directly in front.
  • the mapping unit includes: a first mapping sub-unit, used to map the left channel audio data of the audio to be played to the sound object in the left front, map the right channel audio data of the audio to be played to the sound object in the right front, and map the mono audio data of the audio to be played to the sound object directly in front; the mono audio data is obtained by merging the left channel audio data and the right channel audio data.
  • the selection unit includes: a third selection subunit, which is used to determine, in response to the user selecting the rear mode from multiple sound image modes, that the spatial sound source orientation of the target audio type includes left rear, right rear, and directly behind.
  • the mapping unit includes: a second mapping sub-unit, used to map the left channel audio data of the audio to be played to the sound object at the left rear, map the right channel audio data of the audio to be played to the sound object at the right rear, and map the mono audio data of the audio to be played to the sound object directly behind; the mono audio data is obtained by merging the left channel audio data and the right channel audio data.
  • the selection unit includes: a fourth selection subunit, for determining, in response to a user selecting a spatial tiling mode among multiple audio-visual modes, that the spatial sound source orientation of the target audio type includes left front, right front, front, left rear, right rear, and rear.
  • the mapping unit includes: a third mapping sub-unit, which is used to map the mono audio data of the audio to be played, after adding a delay, to the sound objects at the left front, right front, directly in front, left rear, right rear, and directly behind.
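For the spatial tiling case, the third mapping sub-unit can be sketched as below. The source does not state the delay length or whether it differs per object, so a single sample-count delay applied uniformly to all six sound objects is assumed here.

```python
def map_tiled_mode(left: list, right: list, delay_samples: int = 4) -> dict:
    """Downmix to mono, add a delay, and map to all six sound objects.

    The delay is realized by prepending silence; delay_samples is an
    illustrative parameter, not a value specified by the source.
    """
    mono = [(l + r) / 2.0 for l, r in zip(left, right)]
    delayed = [0.0] * delay_samples + mono
    # S1..S6: left front, directly in front, right front,
    # right rear, directly behind, left rear (per the figures)
    return {name: list(delayed) for name in ("S1", "S2", "S3", "S4", "S5", "S6")}
```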
  • the sound image setting control is used to set sound image information of multiple audio types, so as to support setting different sound image information for different audio types.
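One way to support distinct sound image information per audio type is a small per-type registry, along these lines (the class and method names are illustrative only):

```python
class SoundImageSettings:
    """Stores a sound image mode per audio type, with a shared default."""

    def __init__(self, default_mode: str = "front"):
        self._default = default_mode
        self._by_type = {}

    def set_mode(self, audio_type: str, mode: str) -> None:
        self._by_type[audio_type] = mode

    def mode_for(self, audio_type: str) -> str:
        # Types the user never configured fall back to the default mode.
        return self._by_type.get(audio_type, self._default)
```

At playback time the renderer would look up `mode_for` on the stream's audio type, so game sound, music, and voice can each follow a different layout.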
  • the audio control device further includes: a request blocking module, configured to block requests for audio focus from any audio type when the audio mixing mode is turned on.
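The request blocking module can be sketched as a small focus arbiter: with mix mode on, every focus request is rejected, so no audio type can silence the others and all streams keep playing. All names here are hypothetical.

```python
class AudioFocusArbiter:
    """Toy model of audio-focus arbitration with a mix-mode override."""

    def __init__(self):
        self.mix_mode = False
        self.focus_holder = None

    def request_focus(self, audio_type: str) -> bool:
        if self.mix_mode:
            # Mix mode on: block every focus request so that no single
            # audio type preempts the others.
            return False
        self.focus_holder = audio_type
        return True
```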
  • the audio control device further includes: a volume control module, configured to provide a volume setting control in the audio setting interface for setting the volume of the target audio type.
  • the exemplary embodiments of the present disclosure also provide a computer-readable storage medium, which may be implemented in the form of a program product including program code.
  • the program code is used to enable the terminal device to perform the operations according to various exemplary embodiments of the present disclosure described in the above "Exemplary Method" section of this specification, for example, any one or more operations in Figure 1 or Figure 10 can be performed.
  • the program product can be a portable compact disk read-only memory (CD-ROM) and include program code, and can be run on a terminal device, such as a personal computer.
  • the program product of the present disclosure is not limited to this.
  • the readable storage medium can be any tangible medium containing or storing a program, which can be used by or in combination with an instruction execution system, an apparatus or a device.
  • the program product may use any combination of one or more readable media.
  • the readable medium may be a readable signal medium or a readable storage medium.
  • the readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device or device, or any combination of the above. More specific examples of readable storage media (a non-exhaustive list) include: an electrical connection with one or more wires, a portable disk, a hard disk, a random access memory, a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disk read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • Computer readable signal media may include data signals propagated in baseband or as part of a carrier wave, in which readable program code is carried. Such propagated data signals may take a variety of forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above. Readable signal media may also be any readable medium other than a readable storage medium, which may send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • the program code embodied on the readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wired, optical cable, RF, etc., or any suitable combination of the foregoing.
  • Program code for performing the operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++, etc., and conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may be executed entirely on the user computing device, partially on the user device, as a separate software package, partially on the user computing device and partially on a remote computing device, or entirely on a remote computing device or server.
  • the remote computing device may be connected to the user computing device through any type of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
  • the exemplary embodiments of the present disclosure further provide an electronic device.
  • the electronic device may include a processor and a memory, wherein the memory is used to store executable instructions of the processor, and the processor is configured to execute the above-mentioned audio control method by executing the executable instructions.
  • the mobile terminal 1200 may specifically include: a processor 1201, a memory 1202, a bus 1203, a mobile communication module 1204, antenna 1, a wireless communication module 1205, antenna 2, a display screen 1206, a camera module 1207, an audio module 1208, a power module 1209 and a sensor module 1210.
  • Processor 1201 may include one or more processing units.
  • processor 1201 may include AP (Application Processor), modem processor, GPU (Graphics Processing Unit), ISP (Image Signal Processor), controller, encoder, decoder, DSP (Digital Signal Processor), baseband processor and/or NPU (Neural-Network Processing Unit), etc.
  • the encoder can encode (i.e. compress) an image or video to reduce the data size for storage or transmission.
  • the decoder can decode (i.e. decompress) the encoded data of an image or video to restore the image or video data.
  • the processor 1201 may be connected to the memory 1202 or other components via a bus 1203 .
  • the memory 1202 may be used to store computer executable program codes, which may include instructions.
  • the processor 1201 executes various functional applications and data processing of the mobile terminal 1200 by running the instructions stored in the memory 1202.
  • the memory 1202 may also store application data, such as images, videos, and other files.
  • the communication function of the mobile terminal 1200 can be implemented by the mobile communication module 1204, antenna 1, wireless communication module 1205, antenna 2, modulation and demodulation processor and baseband processor. Antenna 1 and antenna 2 are used to transmit and receive electromagnetic wave signals.
  • the mobile communication module 1204 can provide 3G, 4G, 5G and other mobile communication solutions applied to the mobile terminal 1200.
  • the wireless communication module 1205 can provide wireless communication solutions such as wireless LAN, Bluetooth, and near field communication applied to the mobile terminal 1200.
  • the display screen 1206 is used to implement display functions, such as displaying user interfaces, images, videos, etc.
  • the camera module 1207 is used to implement shooting functions, such as shooting images, videos, etc.
  • the audio module 1208 is used to implement audio functions, such as playing audio, collecting voice, etc.
  • the power module 1209 is used to implement power management functions, such as charging the battery, powering the device, monitoring the battery status, etc.
  • the sensor module 1210 may include one or more sensors to implement corresponding sensing detection functions.


Abstract

一种音频控制方法,包括:在音频设置界面中提供声像设置控件(S110);响应于通过所述声像设置控件进行的声像设置操作,根据所述声像设置操作确定目标音频类型的声像信息(S120);当播放所述目标音频类型下的待播放音频时,基于所述目标音频类型的声像信息对所述待播放音频进行渲染(S130)。

Description

音频控制方法、音频控制装置、介质与电子设备
相关申请的交叉引用
本申请要求于2022年12月08日提交中国专利局、申请号为202211574970.3、发明名称为“音频控制方法、音频控制装置、介质与电子设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本公开涉及音频处理技术领域,尤其涉及一种音频控制方法、音频控制装置、计算机可读存储介质与电子设备。
背景技术
声像,是指听音者对声音位置的感觉印象。现有技术中,音频的声像通常为固定的,例如大部分音频都不具有空间声音特效,其声像默认是在用户的左右两侧或较近范围内。
发明内容
根据本公开的各种实施例提供一种音频控制方法、音频控制装置、计算机可读存储介质与电子设备。
本公开的其他特性和优点将通过下面的详细描述变得显然,或部分地通过本公开的实践而习得。
根据本公开的第一方面,提供一种音频控制方法,包括:在音频设置界面中提供声像设置控件;响应于通过所述声像设置控件进行的声像设置操作,根据所述声像设置操作确定目标音频类型的声像信息;当播放所述目标音频类型下的待播放音频时,基于所述目标音频类型的声像信息对所述待播放音频进行渲染。
根据本公开的第二方面,提供一种音频控制装置,包括:控件提供模块,用于在音频设置界面中提供声像设置控件;信息确定模块,用于响应于通过所述声像设置控件进行的声像设置操作,根据所述声像设置操作确定目标音频类型的声像信息;音频渲染模块,用于当播放所述目标音频类型下的待播放音频时,基于所述目标音频类型的声像信息对所述待播放音频进行渲染。
根据本公开的第三方面,提供一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现上述第一方面的音频控制方法及其可能的实现方式。
根据本公开的第四方面,提供一种电子设备,包括:处理器;存储器,用于存储所述处理器的可执行指令。其中,所述处理器配置为经由执行所述可执行指令,来执行上述第一方面的音频控制方法及其可能的实现方式。
附图说明
此处的附图被并入说明书中并构成本说明书的一部分,示出了符合本公开的实施例,并与说明书一起用于解释本公开的原理。显而易见地,下面描述中的附图仅仅是本公开的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1示出本示例性实施方式中一种音频控制方法的流程图。
图2示出本示例性实施方式中一种音频设置界面的示意图。
图3示出本示例性实施方式中一种虚拟声音空间的示意图。
图4示出本示例性实施方式中另一种音频设置界面的示意图。
图5示出本示例性实施方式中另一种虚拟声音空间的示意图。
图6示出本示例性实施方式中前置模式的界面示意图。
图7示出本示例性实施方式中后置模式的界面示意图。
图8示出本示例性实施方式中空间平铺模式的界面示意图。
图9示出本示例性实施方式中一种音频控制方法的底层架构流程图。
图10示出本示例性实施方式中一种音频播放效果的示意图。
图11示出本示例性实施方式中一种音频控制装置的结构框图。
图12示出本示例性实施方式中一种电子设备的结构图。
具体实施方式
现在将参考附图更全面地描述示例实施方式。然而,示例实施方式能够以多种形式实施,且不应被理解为限于在此阐述的范例;相反,提供这些实施方式使得本公开将更加全面和完整,并将示例实施方式的构思全面地传达给本领域的技术人员。所描述的特征、结构或特性可以以任何合适的方式结合在一个或更多实施方式中。在下面的描述中,提供许多具体细节从而给出对本公开的实施方式的充分理解。然而,本领域技术人员将意识到,可以实践本公开的技术方案而省略特定细节中的一个或更多,或者可以采用其它的方法、组元、装置、操作等。在其它情况下,不详细示出或描述公知技术方案以避免喧宾夺主而使得本公开的各方面变得模糊。
此外,附图仅为本公开的示意性图解,并非一定是按比例绘制。图中相同的附图标记表示相同或类似的部分,因而将省略对它们的重复描述。附图中所示的一些方框图是功能实体,不一定必须与物理或逻辑上独立的实体相对应。可以采用软件形式来实现这些功能实体,或在一个或多个硬件模块或集成电路中实现这些功能实体,或在不同网络和/或处理器装置和/或微控制器装置中实现这些功能实体。
图1示出了音频控制方法的示例性流程,包括以下操作S110至S130:
操作S110,在音频设置界面中提供声像设置控件。
音频设置界面是指用于与用户进行交互,以对音频进行设置的可视化界面,其中,可以包括待控制的音频类型、音频或声像设置控件等。声像设置控件是指用于对音频进行声像设置的模块控件,通过声像设置控件,可以对 音频在播放时的声像分布进行控制和调整。声像设置控件可以是选择控件,用户通过选择操作即可以确定声像信息,声像设置控件也可以是交互控件,用户通过信息输入来确定声像信息等。
在本示例性实施例中,用户可以通过预设操作唤出音频设置界面以触发声像设置控件的显示,例如用户可以通过语音操作,唤出声像设置控件;或者在应用程序中对音频设置的快捷选项进行点击操作,以触发跳转音频设置界面,显示声像设置控件等。
操作S120，响应于通过声像设置控件进行的声像设置操作，根据声像设置操作确定目标音频类型的声像信息。
通常，终端设备可以播放多种不同类型的音频。例如游戏程序中的游戏音、用户进行网络视频或语音时的语音、音乐播放程序的音乐以及终端为用户提供的具有提示功能的通知音等。本示例性实施例可以对上述一种或多种音频类型下的音频进行音频控制，具体的，用户可以在音频设置界面中对声像设置控件进行声像设置操作，先确定目标音频类型的声像信息，其中，目标音频类型是指当前需要进行音频控制的音频类型，音频设置界面中可以提供多种音频类型，以供用户从中确定出目标音频类型。声像信息是指音频在播放时的声像布局信息，该声像布局信息可以是音频对应的声源在声音虚拟空间中的具体位置信息，也可以是声像布局的类型名称，例如前置模式，对应声音虚拟空间中前置模式下的声源位置等。用户通过声像设置控件进行声像设置操作，可以确定对哪一种音频类型进行控制，以及对该音频类型下的音频进行怎样的声像布局的控制。
声像设置操作可以是用户在声像设置控件中进行单击、双击、长按、滑动等一种操作或多种组合操作等,例如在确定目标音频类型后,可以通过在音频设置界面中进行滑动的声像设置操作,确定目标音频类型的声像信息;或者用户也可以先通过单击操作从多种音频类型中选择目标音频类型,再通过单击操作选择声像模式,以确定对目标音频类型进行怎样的声像调整,即确定目标音频类型的声像信息等。
图2示出了一种音频设置界面的示意图,音频设置界面中可以显示模拟用户在虚拟声音空间中的模拟位置,音频设置界面中还可以包括多种待选的音频类型选项210,以及声像设置控件220,用户可以通过在音频设置界面中进行声像设置操作,从多种待选的音频类型选项210中确定目标音频类型,并在声像设置控件220中确定目标音频类型对应的声像信息。
操作S130，当播放目标音频类型下的待播放音频时，基于目标音频类型的声像信息对待播放音频进行渲染。
在本示例性实施例中,目标音频类型下可以包括多个音频,例如音乐音频类型下可以包括不同的音乐音频,在确定目标音频类型的声像信息后,本示例性实施例可以将该声像信息应用于该目标音频类型下的所有音频。待播放音频即为当前需要播放的音频,基于目标音频类型的声像信息可以对待播放音频进行渲染,例如在确定游戏音频类型的声像信息后,当需要播放A游戏程序中的音频时,可以基于游戏音频类型的声像信息对A游戏程序中 的音频进行渲染。
对待播放音频进行渲染是指根据声像信息中音频在虚拟声音空间中的方位信息进行的音频处理过程,以使待播放音频在播放时,可以呈现出待播放音频的声源位于虚拟声音空间中的特定声源方位的声音效果。其中,虚拟声音空间是指基于人耳的构造、声音在空气中的传播特性等物理规律,模拟的虚拟空间,该虚拟空间具有与物理空间匹配的空间特征。当声源位于虚拟声音空间中的不同位置时,可以模拟出不同的真实声源的声音效果,使用户感受到声音似乎是从三维空间中的虚拟位置发出的。声源是虚拟空间中虚拟出来的一个发声点位置。在本示例性实施例中,虚拟空间中可以包括多个声源位置,一个或多个声源位置可以构成一种声像布局,不同音频类型可以设置不同的声像布局,以实现不同音频类型的音频在同时播放时,互相不干扰的听音效果。
现有技术中,音频的声像非常单调,且不便于用户辨识,当同时播放多种类型的音频时,往往由于其声像相同,导致互相干扰。因此,如何根据用户的实际需求,设置音频类型的声像信息,以进行声像渲染,提高用户的听音体验,是现有技术亟待解决的问题。
综上,本示例性实施方式中,在音频设置界面中提供声像设置控件;响应于通过声像设置控件进行的声像设置操作,根据声像设置操作确定目标音频类型的声像信息;当播放目标音频类型下的待播放音频时,基于目标音频类型的声像信息对待播放音频进行渲染。一方面,本示例性实施例通过对声像设置控件进行声像设置操作,可以根据声像设置操作对目标音频类型进行声像信息的设置,即本示例性实施例提供了一种能够对不同音频类型设置对应声像信息的方式,基于该声像信息进行音频渲染,可以使音频能够根据设置的声像信息进行播放,避免了现有技术中由于音频声像信息单一且固定,导致播放时互相干扰的问题;另一方面,在本示例性实施例中,用户可以通过对声像设置控件进行声像设置操作,即能够实现对目标音频类型声像信息的设置,操作过程简单、便捷,还能够满足用户对音频声像的个性化需求,具有较广的适用范围。
在一示例性实施例中,上述目标音频类型的声像信息包括目标音频类型的空间声源方位;上述操作S230,可以包括:
将待播放音频的音频数据映射至虚拟声音空间中位于空间声源方位的声音对象;
基于声音对象的音频数据进行声像渲染。
目标音频类型的空间声源方位是指,目标音频类型下的音频在声音虚拟空间中对应声源的方位信息,可以包括目标音频类型的声源在虚拟声音空间中的位置、朝向以及距离等信息,其中,位置可以是声源的固定坐标,如三维坐标,或者坐标集合等;朝向是指音频在播放时的主要的传播方向,在声音虚拟空间中相对终端设备的朝向与相背终端设备的朝向,其声音效果也会具有较大的差别,具体的朝向可以通过虚拟空间中声源与终端设备的相对角度表示;距离可以是声源距离声音虚拟空间中模拟用户的距离,或者与其他 声源的距离等等。声音对象是指在声音虚拟空间中预先定义的虚拟声音对象,不同的声音对象可以位于虚拟声音空间中的不同位置。例如图3示出了虚拟声音空间300中位于不同空间声源方位的声音对象,可以包括S1~S6 6个固定位置的声音对象,以及3个非固定位置的声音对象,分别对应音乐M、游戏音G和语音V,根据实际需要,终端在出厂设置时,还可以预先配置其他数量的具有固定方位或非固定方位的声音对象,具体可以根据实际场景需求进行设置,本公开对此不做具体限定,例如可以设置S1~S9 9个固定位置的声音对象,以及6个非固定位置的声音对象,如音乐ML、音乐MR、游戏音GL、游戏音GR、语音VL、语音VR,其中,音乐ML、音乐MR可以分别对应音乐左声道与音乐右声道的声音对象等。
本示例性实施例可以将待播放音频的音频数据映射至虚拟空间中位于空间声源方位的声音对象,以将未渲染的音频数据与声音对象相关联,基于声音对象的音频数据对待播放音频的音频数据进行声像渲染,可以使得渲染后的音频数据呈现声音对象对应的声像效果。
在一示例性实施例中,上述声像设置控件包括声像模式选择控件;响应于通过声像设置控件进行的声像设置操作,根据声像设置操作确定目标音频类型的声像信息,包括:
响应于通过声像模式选择控件在目标音频类型的多个声像模式中进行选择的操作,根据用户选择的声像模式确定目标音频类型的空间声源方位。
其中,声像模式是指用于确定声像布局样式的选项模式,其可以是预先配置好的具有特定声像布局的模式,也可以是用户自定义的模式。在本示例性实施例中,声像模式可以包括多种,不同的声像模式可以具有不同的音频播放效果。声像设置控件可以包括声像模式选择控件,用户可以通过简单的单击、双击或长按操作等在多个声像模式中进行选择,以确定当前目标音频类型对应的空间声源方位,例如图2所示的声像设置控件220即为一种声像模式选择控件,其中包括前置模式、后置模式、空间平铺以及自定义等多种声像模式。
在一示例性实施例中,上述根据用户选择的声像模式确定目标音频类型的空间声源方位,包括:
响应于用户选择自定义模式,根据目标音频类型的声像布置控件的位置确定目标音频类型的空间声源方位;声像布置控件可被移动。
其中,声像布置控件是指音频设置界面中,用于为用户提供可视化声像布局效果的控件,声像布置控件在音频设置界面中的位置,可以在一定程度上反映声像布置控件表征的声音对象在虚拟声音空间中的空间声源方位。用户可以通过拖动声像布置控件的方式,来确定目标音频类型的空间声源方位,以实现目标音频类型下音频数据的个性化渲染。
图4示出了一种音频设置界面的示意图,显示了模拟用户410在虚拟声音空间中的模拟位置以及空间中前、后、左、右的方向标识,当用户在声像模式选择控件420中选择自定义模式后,可以通过点击或滑动操作拖动游戏音的声像布置控件430、语音的声像布置控件440或音乐的声像布置控件450 在显示界面中进行移动,其移动到的位置对应其在虚拟声音空间中的空间声源方位。图4仅为示意性说明,根据实际需要,除了音乐、游戏音及语音外,还可以有其他音频类型,本公开对此不做具体限定。
在一示例性实施例中,上述根据用户选择的声像模式确定目标音频类型的空间声源方位,包括:
响应于用户在多个声像模式中选择前置模式,确定目标音频类型的空间声源方位包括左前方、右前方、正前方。
其中,前置模式为一种声源方位在用户前方区域的声像分布模式,类似于前置喇叭的效果,当用户在多个声像模式中选择前置模式时,如图5所示,则确定目标音频类型的空间声源方位可以包括左前方510、右前方530以及正前方520中的一个或多个。
在本示例性实施例中,当用户选择前置模式时,如图6所示,音频设置界面中可以将自定义模式对应的声像布置控件620、630和640置在右下角,或进行变暗、变小或不显示等其他的弱化显示处理,并突出显示当前前置模式对应的标识610,以可视化的呈现当前目标音频类型对应进行前置模式的声像控制。
在一示例性实施例中,将待播放音频的音频数据映射至虚拟声音空间中位于空间声源方位的声音对象,包括:
将待播放音频的左声道音频数据映射至左前方的声音对象,将待播放音频的右声道音频数据映射至右前方的声音对象,将待播放音频的单声道音频数据映射至正前方的声音对象;单声道音频数据由左声道音频数据和右声道音频数据合并得到。
进一步的,可以按照前置模式下的声音对象的方位进行音频数据的映射,如图5所示,可以将待播放音频的左声道音频数据映射至左前方的声音对象S1,将待播放音频的右声道音频数据映射至右前方的声音对象S3,为了使音频能够有更好的播放效果,本示例性实施例还可以根据左右声道的音频数据进行混音处理,得到一单声道音频数据,映射至正前方的声音对象S2
在一示例性实施例中,根据用户选择的声像模式确定目标音频类型的空间声源方位,包括:
响应于用户在多个声像模式中选择后置模式,确定目标音频类型的空间声源方位包括左后方、右后方、正后方。
其中,后置模式为一种声源方位在用户后方区域的声像分布模式,类似于后置环绕的效果,当用户在多个声像模式中选择后置模式时,如图5所示,则确定目标音频类型的空间声源方位可以包括左后方560、右后方540以及正后方550中的一个或多个。
在本示例性实施例中,当用户选择后置模式时,如图7所示,音频设置界面中可以突出显示当前后置模式对应的标识710,以可视化的呈现当前目标音频类型对应进行后置模式的声像控制。
在一示例性实施例中,将待播放音频的音频数据映射至虚拟声音空间中 位于空间声源方位的声音对象,包括:
将待播放音频的左声道音频数据映射至左后方的声音对象,将待播放音频的右声道音频数据映射至右后方的声音对象,将待播放音频的单声道音频数据映射至正后方的声音对象;单声道音频数据由左声道音频数据和右声道音频数据合并得到。
进一步的,可以按照后置模式下的声音对象的方位进行音频数据的映射,如图5所示,可以将待播放音频的左声道音频数据映射至左后方的声音对象S6,将待播放音频的右声道音频数据映射至右后方的声音对象S4,为了使音频能够有更好的播放效果,本示例性实施例还可以根据左右声道的音频数据进行混音处理,得到一单声道音频数据,映射至正后方的声音对象S5
在一示例性实施例中,根据用户选择的声像模式确定目标音频类型的空间声源方位,包括:
响应于用户在多个声像模式中选择空间平铺模式,确定目标音频类型的空间声源方位包括左前方、右前方、正前方、左后方、右后方、正后方。
其中,空间平铺模式为一种声源方位在用户周围区域的声像分布模式,当用户在多个声像模式中选择空间平铺模式时,如图5所示,则可以确定目标音频类型的空间声源方位包括左前方510、右前方530、正前方520、左后方560、右后方540、正后方550中的一个或多个。
在本示例性实施例中,当用户选择空间平铺时,如图8所示,音频设置界面中可以突出显示当前空间平铺模式对应的标识810,以可视化的呈现当前目标音频类型对应进行空间平铺模式的声像控制。
在一示例性实施例中,将待播放音频的音频数据映射至虚拟声音空间中位于空间声源方位的声音对象,包括:
将待播放音频的单声道音频数据施加延时后映射至左前方的声音对象、右前方的声音对象、正前方的声音对象、左后方的声音对象、右后方的声音对象、正后方的声音对象。
进一步的,可以按照空间平铺模式下的声音对象的方位进行音频数据的映射,在本示例性实施例中,为了能够得到更好的立体环绕空间平铺音效,可以先将待播放音频进行混音处理得到一单声道音频数据,然后对其施加延时,再将其映射至如图5所示的左前方的声音对象S1、右前方的声音对象S3、正前方的声音对象S2、左后方的声音对象S6、右后方的声音对象S4、正后方的声音对象S5
图9示出了本示例性实施例中一种音频控制方法的底层架构流程图,具体可以包括:
操作S910,获取待播放音频,并接收用户对声像模式的选择操作;
当用户选择前置模式时,可以执行
操作S9110,将待播放音频的左声道音频数据L1映射至左前方的声音对象S1
操作S9120,将待播放音频的右声道音频数据R1映射至右前方的声音 对象S3
操作S9130,根据待播放音频的左声道音频数据L1和待播放音频的右声道音频数据R1,确定待播放音频的单声道音频数据Mono1
操作S9140,将待播放音频的单声道音频数据Mono1映射至正前方的声音对象S2
当用户选择空间平铺模式时,可以执行
操作S9210,根据待播放音频的左声道音频数据L2和右声道音频数据R2,确定单声道音频Mono2
操作S9220,将单声道音频数据Mono2施加延时后映射至左前方的声音对象S1、右前方的声音对象S3、正前方的声音对象S2、左后方的声音对象、右后方的声音对象、正后方的声音对象;
当用户选择后置模式时,可以执行
操作S9310,将待播放音频的左声道音频数据L3映射至左后方的声音对象S6
操作S9320,将待播放音频的右声道音频数据R3映射至右后方的声音对象S4
操作S9330,根据待播放音频的左声道音频数据L3和待播放音频的右声道音频数据R3,确定待播放音频的单声道音频数据Mono3
操作S9340,将待播放音频的单声道音频数据Mono3映射至正后方的声音对象S5
当用户选择自定义模式时，可以执行
操作S9410,根据待播放音频的左声道音频数据L4和右声道音频数据R4,确定单声道音频Mono4
操作S9420,判断待播放音频的音频类型;
当待播放音频为游戏音时,执行
操作S9430,将待播放音频的单声道音频数据Mono4映射至游戏音的声音对象G;
当待播放音频为语音时,执行
操作S9440,将待播放音频的单声道音频数据Mono4映射至语音的声音对象V;
当待播放音频为音乐时,执行
操作S9450,将待播放音频的单声道音频数据Mono4映射至音乐的声音对象M;
操作S920,基于声音对象的音频数据进行声像渲染;
操作S930,确定信号输出方式;
操作S940,通过耳机信号输出;
操作S950,通过扬声器信号输出。
在一示例性实施例中,上述声像设置控件用于设置多种音频类型的声像信息,以支持对不同音频类型设置不同的声像信息。
在本示例性实施例中,声像设置控件可以对多种音频类型的声像信息进 行控制,即音频设置界面中可以包括多种音频类型,根据用户对声像设置控件的操作,可以对这多种音频类型分别进行声像设置操作,以支持对不同音频类型设置不同的声像信息,从而实现在同时播放不同音频类型的音频时,可以呈现出不同的声像效果。例如图10示出了一种不同音频类型对应不同的声像信息的示意图,用户可以对游戏音设置前置模式,以使游戏音可以基于位于虚拟声音空间区域1010的声音对象进行声像渲染;对音乐设置后置模式,以使音乐可以基于位于虚拟声音空间区域1020的声音对象进行声像渲染;对语音设置自定义模式,以使语音可以基于位于虚拟声音空间自定义位置1030的声音对象进行声像渲染,从而使得在同时播放游戏音、音乐和语音时,具有特定且互不干扰的声音效果,以提高用户的听音感受。
在一示例性实施例中,上述音频控制方法还可以包括:
在混音模式开启的状态下,阻止任一种音频类型对音频焦点的请求。
其中,音频焦点是指终端设备专注播放一种类型音频的机制,例如当用户收听音乐时,音乐焦点为播放当前播放的音乐;当用户在收听音乐时,同时打开游戏程序,则会播放游戏音,而停止音乐,此时音频焦点由播放音乐转变为播放游戏音。混音模式是指终端设备可以同时播放多种类型音频的模式,当终端设备处于非混音模式下,存在音频焦点,将根据实际情况播放一种类型的音频;当终端设备处于混音模式下,将会阻止任意一种音频类型对音频焦点的请求,使得所有类型的待播放音频都可以进行播放。
在一示例性实施例中,上述音频控制方法还可以包括:
在音频设置界面中提供音量设置控件,以用于设置目标音频类型的音量。
其中,音量设置控件是指用于对音频的音量进行调整的选项模块,通过音量设置控件,可以自定义调整待播放音频的音量大小。音量设置控件对目标音频类型的音量的控制可以包括多种方式,例如音量设置控件可以包括多个音量档位,通过对目标音量档位的选择,可以将目标音频类型的音量调整至该音量档位,音量设置控件也可以是以滑块的形式,当用户对滑块进行拖动时,音量可以平滑升高或降低。在图2所示的音频设置界面中,示出了其包括待选的音频类型选项210、声像模式选择控件220,以及音量设置控件230的示意图,用户可以通过对音频类型选项210的选择确定目标音频类型,对声像模式选择控件220的选择,确定目标音频类型的声像模式,对音量设置控件230的操作,确定目标音频类型下音频的音量大小。
本公开的示例性实施方式还提供一种音频控制装置。如图11所示,该音频控制装置1100可以包括:控件提供模块,用于在音频设置界面中提供声像设置控件;信息确定模块,用于响应于通过声像设置控件进行的声像设置操作,根据声像设置操作确定目标音频类型的声像信息;音频渲染模块,用于当播放目标音频类型下的待播放音频时,基于目标音频类型的声像信息对待播放音频进行渲染。
在一示例性实施例中,目标音频类型的声像信息包括目标音频类型的空间声源方位;音频渲染模块,包括:映射单元,用于将待播放音频的音频 数据映射至虚拟声音空间中位于空间声源方位的声音对象;渲染单元,用于基于声音对象的音频数据进行声像渲染。
在一示例性实施例中,信息确定模块,包括:选择单元,用于响应于通过声像模式选择控件在目标音频类型的多个声像模式中进行选择的操作,根据用户选择的声像模式确定目标音频类型的空间声源方位。
在一示例性实施例中,选择单元,包括:第一选择子单元,用于响应于用户选择自定义模式,根据目标音频类型的声像布置控件的位置确定目标音频类型的空间声源方位;声像布置控件可被移动。
在一示例性实施例中,选择单元,包括:第二选择子单元,用于响应于用户在多个声像模式中选择前置模式,确定目标音频类型的空间声源方位包括左前方、右前方、正前方。
在一示例性实施例中,映射单元,包括:第一映射子单元,用于将待播放音频的左声道音频数据映射至左前方的声音对象,将待播放音频的右声道音频数据映射至右前方的声音对象,将待播放音频的单声道音频数据映射至正前方的声音对象;单声道音频数据由左声道音频数据和右声道音频数据合并得到。
在一示例性实施例中,选择单元,包括:第三选择子单元,用于响应于用户在多个声像模式中选择后置模式,确定目标音频类型的空间声源方位包括左后方、右后方、正后方。
在一示例性实施例中,映射单元,包括:第二映射子单元,用于将待播放音频的左声道音频数据映射至左后方的声音对象,将待播放音频的右声道音频数据映射至右后方的声音对象,将待播放音频的单声道音频数据映射至正后方的声音对象;单声道音频数据由左声道音频数据和右声道音频数据合并得到。
在一示例性实施例中,选择单元,包括:第四选择子单元,用于响应于用户在多个声像模式中选择空间平铺模式,则确定目标音频类型的空间声源方位包括左前方、右前方、正前方、左后方、右后方、正后方。
在一示例性实施例中,映射单元,包括:第三映射子单元,用于将待播放音频的单声道音频数据施加延时后映射至左前方的声音对象、右前方的声音对象、正前方的声音对象、左后方的声音对象、右后方的声音对象、正后方的声音对象。
在一示例性实施例中,声像设置控件用于设置多种音频类型的声像信息,以支持对不同音频类型设置不同的声像信息。
在一示例性实施例中,上述音频控制装置还包括:请求阻止模块,用于在混音模式开启的状态下,阻止任一种音频类型对音频焦点的请求。
在一示例性实施例中,上述音频控制装置还包括:音量控制模块,用于在音频设置界面中提供音量设置控件,以用于设置目标音频类型的音量。
上述装置中各部分的具体细节在方法部分实施方式中已经详细说明,因而不再赘述。
本公开的示例性实施方式还提供了一种计算机可读存储介质，可以实现为程序产品的形式，包括程序代码，当程序产品在终端设备上运行时，程序代码用于使终端设备执行本说明书上述“示例性方法”部分中描述的根据本公开各种示例性实施方式的操作，例如可以执行图1或图10中任意一个或多个操作。该程序产品可以采用便携式紧凑盘只读存储器（CD-ROM）并包括程序代码，并可以在终端设备，例如个人电脑上运行。然而，本公开的程序产品不限于此，在本文件中，可读存储介质可以是任何包含或存储程序的有形介质，该程序可以被指令执行系统、装置或者器件使用或者与其结合使用。
程序产品可以采用一个或多个可读介质的任意组合。可读介质可以是可读信号介质或者可读存储介质。可读存储介质例如可以为但不限于电、磁、光、电磁、红外线、或半导体的***、装置或器件,或者任意以上的组合。可读存储介质的更具体的例子(非穷举的列表)包括:具有一个或多个导线的电连接、便携式盘、硬盘、随机存取存储器、只读存储器(ROM)、可擦式可编程只读存储器(EPROM或闪存)、光纤、便携式紧凑盘只读存储器(CD-ROM)、光存储器件、磁存储器件、或者上述的任意合适的组合。
计算机可读信号介质可以包括在基带中或者作为载波一部分传播的数据信号，其中承载了可读程序代码。这种传播的数据信号可以采用多种形式，包括但不限于电磁信号、光信号或上述的任意合适的组合。可读信号介质还可以是可读存储介质以外的任何可读介质，该可读介质可以发送、传播或者传输用于由指令执行系统、装置或者器件使用或者与其结合使用的程序。
可读介质上包含的程序代码可以用任何适当的介质传输,包括但不限于无线、有线、光缆、RF等等,或者上述的任意合适的组合。
可以以一种或多种程序设计语言的任意组合来编写用于执行本公开操作的程序代码,程序设计语言包括面向对象的程序设计语言—诸如Java、C++等,还包括常规的过程式程序设计语言—诸如“C”语言或类似的程序设计语言。程序代码可以完全地在用户计算设备上执行、部分地在用户设备上执行、作为一个独立的软件包执行、部分在用户计算设备上部分在远程计算设备上执行、或者完全在远程计算设备或服务器上执行。在涉及远程计算设备的情形中,远程计算设备可以通过任意种类的网络,包括局域网(LAN)或广域网(WAN),连接到用户计算设备,或者,可以连接到外部计算设备(例如利用因特网服务提供商来通过因特网连接)。
本公开的示例性实施方式还提供一种电子设备。该电子设备可以包括处理器与存储器，存储器用于存储处理器的可执行指令，处理器配置为经由执行可执行指令来执行上述音频控制方法。
下面以图12中的移动终端1200为例,对该电子设备的构造进行示例性说明。本领域技术人员应当理解,除了特别用于移动目的的部件之外,图12中的构造也能够应用于固定类型的设备。
如图12所示,移动终端1200具体可以包括:处理器1201、存储器 1202、总线1203、移动通信模块1204、天线1、无线通信模块1205、天线2、显示屏1206、摄像模块1207、音频模块1208、电源模块1209与传感器模块1210。
处理器1201可以包括一个或多个处理单元,例如:处理器1201可以包括AP(Application Processor,应用处理器)、调制解调处理器、GPU(Graphics Processing Unit,图形处理器)、ISP(Image Signal Processor,图像信号处理器)、控制器、编码器、解码器、DSP(Digital Signal Processor,数字信号处理器)、基带处理器和/或NPU(Neural-Network Processing Unit,神经网络处理器)等。
编码器可以对图像或视频进行编码(即压缩),以减小数据大小,便于存储或发送。解码器可以对图像或视频的编码数据进行解码(即解压缩),以还原出图像或视频数据。
处理器1201可以通过总线1203与存储器1202或其他部件形成连接。
存储器1202可以用于存储计算机可执行程序代码,可执行程序代码包括指令。处理器1201通过运行存储在存储器1202的指令,执行移动终端1200的各种功能应用以及数据处理。存储器1202还可以存储应用数据,例如存储图像,视频等文件。
移动终端1200的通信功能可以通过移动通信模块1204、天线1、无线通信模块1205、天线2、调制解调处理器以及基带处理器等实现。天线1和天线2用于发射和接收电磁波信号。移动通信模块1204可以提供应用在移动终端1200上3G、4G、5G等移动通信解决方案。无线通信模块1205可以提供应用在移动终端1200上的无线局域网、蓝牙、近场通信等无线通信解决方案。
显示屏1206用于实现显示功能,如显示用户界面、图像、视频等。摄像模块1207用于实现拍摄功能,如拍摄图像、视频等。音频模块1208用于实现音频功能,如播放音频,采集语音等。电源模块1209用于实现电源管理功能,如为电池充电、为设备供电、监测电池状态等。传感器模块1210可以包括一种或多种传感器,用于实现相应的感应检测功能。
所属技术领域的技术人员能够理解，本公开的各个方面可以实现为系统、方法或程序产品。因此，本公开的各个方面可以具体实现为以下形式，即：完全的硬件实施方式、完全的软件实施方式（包括固件、微代码等），或硬件和软件方面结合的实施方式，这里可以统称为“电路”、“模块”或“系统”。本领域技术人员在考虑说明书及实践这里公开的发明后，将容易想到本公开的其他实施方式。本公开旨在涵盖本公开的任何变型、用途或者适应性变化，这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施方式仅被视为示例性的，本公开的真正范围和精神由权利要求指出。
应当理解的是，本公开并不局限于上面已经描述并在附图中示出的精确结构，并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求来限定。

Claims (20)

  1. 一种音频控制方法,其特征在于,包括:
    在音频设置界面中提供声像设置控件;
    响应于通过所述声像设置控件进行的声像设置操作,根据所述声像设置操作确定目标音频类型的声像信息;及
    当播放所述目标音频类型下的待播放音频时,基于所述目标音频类型的声像信息对所述待播放音频进行渲染。
  2. 根据权利要求1所述的方法,其特征在于,所述目标音频类型的声像信息包括所述目标音频类型的空间声源方位;所述基于所述目标音频类型的声像信息对所述待播放音频进行渲染,包括:
    将所述待播放音频的音频数据映射至虚拟声音空间中位于所述空间声源方位的声音对象;及
    基于所述声音对象的音频数据进行声像渲染。
  3. 根据权利要求2所述的方法,其特征在于,所述声像设置控件包括声像模式选择控件;所述响应于通过所述声像设置控件进行的声像设置操作,根据所述声像设置操作确定目标音频类型的声像信息,包括:
    响应于通过所述声像模式选择控件在所述目标音频类型的多个声像模式中进行选择的操作,根据用户选择的声像模式确定所述目标音频类型的空间声源方位。
  4. 根据权利要求3所述的方法,其特征在于,所述根据用户选择的声像模式确定所述目标音频类型的空间声源方位,包括:
    响应于用户选择自定义模式,根据所述目标音频类型的声像布置控件的位置确定所述目标音频类型的空间声源方位;所述声像布置控件可被移动。
  5. 根据权利要求3所述的方法,其特征在于,所述根据用户选择的声像模式确定所述目标音频类型的空间声源方位,包括:
    响应于用户在所述多个声像模式中选择前置模式,确定所述目标音频类型的空间声源方位包括左前方、右前方、正前方。
  6. 根据权利要求5所述的方法,其特征在于,所述将所述待播放音频的音频数据映射至虚拟声音空间中位于所述空间声源方位的声音对象,包括:
    将所述待播放音频的左声道音频数据映射至左前方的声音对象,将所述待播放音频的右声道音频数据映射至右前方的声音对象,将所述待播放音频的单声道音频数据映射至正前方的声音对象;所述单声道音频数据由所述左声道音频数据和所述右声道音频数据合并得到。
  7. 根据权利要求3所述的方法,其特征在于,所述根据用户选择的声像模式确定所述目标音频类型的空间声源方位,包括:
    响应于用户在所述多个声像模式中选择后置模式,确定所述目标音频类型的空间声源方位包括左后方、右后方、正后方。
  8. 根据权利要求7所述的方法，其特征在于，所述将所述待播放音频的音频数据映射至虚拟声音空间中位于所述空间声源方位的声音对象，包括：
    将所述待播放音频的左声道音频数据映射至左后方的声音对象，将所述待播放音频的右声道音频数据映射至右后方的声音对象，将所述待播放音频的单声道音频数据映射至正后方的声音对象；所述单声道音频数据由所述左声道音频数据和所述右声道音频数据合并得到。
  9. 根据权利要求3所述的方法,其特征在于,所述根据用户选择的声像模式确定所述目标音频类型的空间声源方位,包括:
    响应于用户在所述多个声像模式中选择空间平铺模式,确定所述目标音频类型的空间声源方位包括左前方、右前方、正前方、左后方、右后方、正后方。
  10. 根据权利要求9所述的方法,其特征在于,所述将所述待播放音频的音频数据映射至虚拟声音空间中位于所述空间声源方位的声音对象,包括:
    将所述待播放音频的单声道音频数据施加延时后映射至左前方的声音对象、右前方的声音对象、正前方的声音对象、左后方的声音对象、右后方的声音对象、正后方的声音对象。
  11. 根据权利要求1所述的方法,其特征在于,所述声像设置控件用于设置多种音频类型的声像信息,以支持对不同音频类型设置不同的声像信息。
  12. 根据权利要求11所述的方法,其特征在于,所述方法还包括:
    在混音模式开启的状态下,阻止任一种所述音频类型对音频焦点的请求。
  13. 根据权利要求1所述的方法,其特征在于,所述方法还包括:
    在所述音频设置界面中提供音量设置控件,以用于设置所述目标音频类型的音量。
  14. 一种音频控制装置,其特征在于,包括:
    控件提供模块,用于在音频设置界面中提供声像设置控件;
    信息确定模块,用于响应于通过所述声像设置控件进行的声像设置操作,根据所述声像设置操作确定目标音频类型的声像信息;及
    音频渲染模块,用于当播放所述目标音频类型下的待播放音频时,基于所述目标音频类型的声像信息对所述待播放音频进行渲染。
  15. 根据权利要求14所述的装置,其特征在于,所述目标音频类型的声像信息包括所述目标音频类型的空间声源方位;所述音频渲染模块包括映射单元,所述映射单元用于将所述待播放音频的音频数据映射至虚拟声音空间中位于所述空间声源方位的声音对象;及基于所述声音对象的音频数据进行声像渲染。
  16. 根据权利要求15所述的装置，其特征在于，所述声像设置控件包括声像模式选择控件；所述信息确定模块包括选择单元，所述选择单元用于响应于通过所述声像模式选择控件在所述目标音频类型的多个声像模式中进行选择的操作，根据用户选择的声像模式确定所述目标音频类型的空间声源方位。
  17. 根据权利要求16所述的装置,其特征在于,所述选择单元包括第一选择子单元,所述第一选择子单元用于响应于用户选择自定义模式,根据所述目标音频类型的声像布置控件的位置确定所述目标音频类型的空间声源方位;所述声像布置控件可被移动。
  18. 根据权利要求16所述的装置,其特征在于,所述选择单元包括第二选择子单元,所述第二选择子单元用于响应于用户在所述多个声像模式中选择前置模式,确定所述目标音频类型的空间声源方位包括左前方、右前方、正前方。
  19. 一种计算机可读存储介质,其上存储有计算机程序,其特征在于,所述计算机程序被处理器执行时实现权利要求1至13任一项所述的方法。
  20. 一种电子设备,其特征在于,包括:
    处理器;
    存储器,用于存储所述处理器的可执行指令;
    其中,所述处理器配置为经由执行所述可执行指令来执行权利要求1至13任一项所述的方法。
PCT/CN2023/118788 2022-12-08 2023-09-14 音频控制方法、音频控制装置、介质与电子设备 WO2024119946A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211574970.3 2022-12-08
CN202211574970.3A CN118170339A (zh) 2022-12-08 2022-12-08 音频控制方法、音频控制装置、介质与电子设备

Publications (1)

Publication Number Publication Date
WO2024119946A1 true WO2024119946A1 (zh) 2024-06-13

Family

ID=91353255

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/118788 WO2024119946A1 (zh) 2022-12-08 2023-09-14 音频控制方法、音频控制装置、介质与电子设备

Country Status (2)

Country Link
CN (1) CN118170339A (zh)
WO (1) WO2024119946A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107493542A (zh) * 2012-08-31 2017-12-19 杜比实验室特许公司 用于在听音环境中播放音频内容的扬声器***
CN110972053A (zh) * 2019-11-25 2020-04-07 腾讯音乐娱乐科技(深圳)有限公司 构造听音场景的方法和相关装置
CN111638779A (zh) * 2020-04-27 2020-09-08 维沃移动通信有限公司 音频播放控制方法、装置、电子设备及可读存储介质
CN114023301A (zh) * 2021-11-26 2022-02-08 掌阅科技股份有限公司 音频编辑方法、电子设备及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107493542A (zh) * 2012-08-31 2017-12-19 杜比实验室特许公司 用于在听音环境中播放音频内容的扬声器***
CN110972053A (zh) * 2019-11-25 2020-04-07 腾讯音乐娱乐科技(深圳)有限公司 构造听音场景的方法和相关装置
CN111638779A (zh) * 2020-04-27 2020-09-08 维沃移动通信有限公司 音频播放控制方法、装置、电子设备及可读存储介质
CN114023301A (zh) * 2021-11-26 2022-02-08 掌阅科技股份有限公司 音频编辑方法、电子设备及存储介质

Also Published As

Publication number Publication date
CN118170339A (zh) 2024-06-11

Similar Documents

Publication Publication Date Title
US10514885B2 (en) Apparatus and method for controlling audio mixing in virtual reality environments
JP7053869B2 (ja) ビデオ生成の方法、装置、電子機器及びコンピュータ読み取り可能記憶媒体
JP6086188B2 (ja) 音響効果調整装置および方法、並びにプログラム
KR102548756B1 (ko) 향상된 3d 오디오 오서링과 렌더링을 위한 시스템 및 툴들
US11037600B2 (en) Video processing method and apparatus, terminal and medium
CN109068260B (zh) 配置经由家庭音频回放***的音频的回放的***和方法
JP2023529868A (ja) 共有方法、装置及び電子機器
US20150264502A1 (en) Audio Signal Processing Device, Position Information Acquisition Device, and Audio Signal Processing System
JP7348927B2 (ja) オーディオ再生方法及び装置、電子機器並びに記憶媒体
US20210225406A1 (en) Video acquisition method and device, terminal and medium
US20220253492A1 (en) Method, an apparatus, an electronic device and a storage medium for multimedia information processing
US20120317594A1 (en) Method and system for providing an improved audio experience for viewers of video
WO2021083145A1 (zh) 视频处理的方法、装置、终端及存储介质
US20170262073A1 (en) Display device and operating method thereof
US20240129427A1 (en) Video processing method and apparatus, and terminal and storage medium
JP2014103456A (ja) オーディオアンプ
CN112673651B (zh) 多视点多用户音频用户体验
WO2024119946A1 (zh) 音频控制方法、音频控制装置、介质与电子设备
WO2023071466A1 (zh) 音乐的音效播放方法及设备
WO2023087031A2 (en) Systems and methods for rendering spatial audio using spatialization shaders
WO2021036782A1 (zh) 声音处理方法、装置、电子设备及计算机可读存储介质
CN111176605B (zh) 一种音频输出方法及电子设备
WO2023240467A1 (zh) 音频播放方法、装置及存储介质
WO2024011937A1 (zh) 音频处理方法、***及电子设备
JP6281606B2 (ja) 音響効果調整装置および方法、並びにプログラム