WO2023284406A1 - A call method and electronic device (一种通话方法及电子设备) - Google Patents

A call method and electronic device (一种通话方法及电子设备)

Info

Publication number
WO2023284406A1
Authority
WO
WIPO (PCT)
Prior art keywords
electronic device
channel audio
audio signal
call
mode
Prior art date
Application number
PCT/CN2022/093888
Other languages
English (en)
French (fr)
Inventor
玄建永
杨枭
刘镇亿
夏日升
吴元友
Original Assignee
荣耀终端有限公司 (Honor Device Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 荣耀终端有限公司 (Honor Device Co., Ltd.)
Publication of WO2023284406A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/02 Constructional features of telephone sets
    • H04M 1/03 Constructional features of telephone transmitters or receivers, e.g. telephone hand-sets
    • H04M 1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M 1/7243 User interfaces with interactive means for internal management of messages
    • H04M 1/72433 User interfaces for voice messaging, e.g. dictaphones
    • H04M 1/72448 User interfaces with means for adapting the functionality of the device according to specific conditions
    • H04M 1/72454 User interfaces adapting the functionality according to context-related or environment-related conditions

Definitions

  • the present application relates to the field of terminal and communication technologies, and in particular to a call method and an electronic device.
  • the screen-to-body ratio is the ratio of the area of the screen of an electronic device to the area of its front face.
  • users have increasingly high expectations for the screen-to-body ratio of electronic devices.
  • most electronic devices now use full-screen designs, that is, the entire front of the device is screen, with bezel-less designs at all four edges, approaching a 100% screen-to-body ratio.
  • full-screen electronic devices have significantly improved the visual experience.
  • however, the full screen means that the earpiece of the electronic device can no longer be arranged on the front of the phone and can only be arranged on its side.
  • as a result, part of the audio signal output from the earpiece does not enter the human ear but escapes into the surrounding environment, resulting in sound leakage.
  • although another part of the audio signal does reach the human ear, its energy is reduced relative to the complete audio signal because of the sound leakage.
  • in addition, after a noise signal enters the human ear, it interferes with the user's recognition of this part of the audio signal, making the audio hard to hear clearly.
  • the application provides a call method and an electronic device.
  • the electronic device can process the audio signal sent to it by other electronic devices using different parameters in different call modes, generating different left-channel and right-channel audio signals adapted to the call environment.
  • in a first aspect, the present application provides a call method applied to an electronic device that includes a first sounder and a second sounder, where the second sounder is different from the first sounder, the first sounder corresponds to the left channel, and the second sounder corresponds to the right channel. The method includes: displaying a call application interface; determining that the electronic device is in a first call mode, where the first call mode corresponds to a first left-channel audio feature and a first right-channel audio feature (the audio features of the audio signals output on the left and right channels, respectively) and corresponds to a first call environment; determining that the electronic device is in a second call environment; and switching to a second call mode, where the second call mode corresponds to a second left-channel audio feature and a second right-channel audio feature, defined in the same way.
  • in this way, the audio played by the electronic device can be adjusted as the call environment changes, yielding audio adapted to that environment.
  • when the environment is noisy, the playback volume can be increased and the energy in the frequency band to which the human ear is sensitive and which has good directivity (for example, 1 kHz-3 kHz) can be boosted, so that the user can hear clearly in a noisy call environment.
  • when the environment is relatively quiet, the playback volume can be reduced while that sensitive, directional frequency band (for example, 1 kHz-3 kHz) is still emphasized, reducing sound leakage while ensuring that the user can hear clearly.
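For illustration, here is a minimal Python sketch of how the per-mode left-channel and right-channel playback parameters could be organized. All names and numeric values are assumptions of this note; the application specifies only the qualitative behaviour of the modes and the example 1 kHz-3 kHz band.

```python
# Hypothetical per-mode playback parameters. Only the qualitative ordering
# (noisy > normal > quiet in volume) and the 1 kHz-3 kHz sensitive band come
# from the text; the numbers below are invented for illustration.
SENSITIVE_BAND_HZ = (1000.0, 3000.0)

CALL_MODE_PARAMS = {
    "quiet":  {"left": {"volume_gain_db": -6.0, "band_boost_db": 3.0},
               "right": {"volume_gain_db": -3.0, "band_boost_db": 3.0}},
    "normal": {"left": {"volume_gain_db": 0.0, "band_boost_db": 4.0},
               "right": {"volume_gain_db": 0.0, "band_boost_db": 4.0}},
    "noisy":  {"left": {"volume_gain_db": 6.0, "band_boost_db": 6.0},
               "right": {"volume_gain_db": 6.0, "band_boost_db": 6.0}},
}
```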
  • with reference to the first aspect, in one embodiment, the method further includes: the electronic device receives downlink audio, where the downlink audio is audio sent to the electronic device by another electronic device during the call; in the first call mode, the electronic device processes the downlink audio to obtain first left-channel audio and first right-channel audio, where in the first left-channel audio the energy of low-frequency sound is greater than the energy of high-frequency sound, and in the first right-channel audio the energy of high-frequency sound is greater than the energy of low-frequency sound; the first sounder plays the first left-channel audio and the second sounder plays the first right-channel audio.
  • in this way, when the materials of the two sounders differ, with one suited to playing high-frequency audio and the other to playing low-frequency audio, the two audio channels generated by the electronic device (one whose low-frequency energy exceeds its high-frequency energy, and one whose high-frequency energy exceeds its low-frequency energy) can each be matched to the appropriate sounder, improving sound quality. A rough split of this kind is sketched below.
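The sketch derives a low-frequency-weighted left channel and a high-frequency-weighted right channel from one downlink stream with a Butterworth crossover; the filter order and the 1 kHz crossover frequency are assumptions, and only the resulting energy relationship comes from the application.

```python
import numpy as np
from scipy.signal import butter, lfilter

def split_downlink(audio: np.ndarray, fs: float, crossover_hz: float = 1000.0):
    """Split mono downlink audio so low-frequency energy dominates the left
    channel and high-frequency energy dominates the right channel."""
    b_lo, a_lo = butter(4, crossover_hz, btype="low", fs=fs)
    b_hi, a_hi = butter(4, crossover_hz, btype="high", fs=fs)
    left = lfilter(b_lo, a_lo, audio)   # low-frequency energy > high-frequency
    right = lfilter(b_hi, a_hi, audio)  # high-frequency energy > low-frequency
    return left, right
```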
  • in one embodiment, the first sounder is placed on the side of the electronic device and the second sounder is placed inside the screen of the electronic device; the target left-channel audio played by the first sounder is transmitted to the human ear through the air, and the target right-channel audio played by the second sounder is transmitted to the human ear through the bones.
  • because the second sounder is placed inside the screen of the electronic device and transmits sound through bone conduction, the user can hear clearly in any call mode.
  • the energy of the audio played by the first sounder can then be reduced appropriately, which still lets the user hear clearly while reducing sound leakage.
  • in one embodiment, processing the downlink audio to obtain the first left-channel audio and the first right-channel audio specifically includes: the electronic device obtains, from the downlink audio, the pre-processing first left-channel audio and the pre-processing first right-channel audio; timbre adjustment and volume adjustment are then applied to each of them to obtain the first left-channel audio and the first right-channel audio.
  • the timbre adjustment refers to adjusting the energy distribution of sound across different frequency bands in the audio.
  • the volume adjustment refers to adjusting the overall energy of the audio.
  • in this way, the electronic device can adjust the timbre and volume of the audio so that the processed audio suits the environment during the call: the audio is adjusted as the call environment changes, and audio adapted to the call environment is obtained. One possible form of the two adjustments is sketched below.
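The sketch assumes an FFT-based implementation (the application does not specify one): timbre adjustment rescales the energy of one frequency band, and volume adjustment rescales the whole signal; the default band and gains are placeholders.

```python
import numpy as np

def adjust_timbre_and_volume(audio, fs, band_hz=(1000.0, 3000.0),
                             band_gain_db=4.0, volume_gain_db=0.0):
    """Timbre adjustment: change the energy distribution across frequency
    bands (here, boost one band). Volume adjustment: scale overall energy."""
    spectrum = np.fft.rfft(audio)
    freqs = np.fft.rfftfreq(len(audio), d=1.0 / fs)
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    spectrum[in_band] *= 10.0 ** (band_gain_db / 20.0)   # timbre adjustment
    shaped = np.fft.irfft(spectrum, n=len(audio))
    return shaped * 10.0 ** (volume_gain_db / 20.0)      # volume adjustment
```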
  • in one embodiment, before the electronic device performs timbre adjustment and volume adjustment on the pre-processing first left-channel audio and the pre-processing first right-channel audio, the method further includes: the electronic device determines the parameters for processing them, the parameters including a left-channel timbre parameter, a right-channel timbre parameter, a left-channel volume parameter, and a right-channel volume parameter. The adjustments then specifically include: using the left-channel timbre parameter and the left-channel volume parameter to adjust the timbre and volume of the pre-processing left-channel audio, obtaining the first left-channel audio; and using the right-channel timbre parameter and the right-channel volume parameter to adjust the timbre and volume of the pre-processing right-channel audio, obtaining the first right-channel audio.
  • in different call modes, the parameters the electronic device uses to adjust timbre and volume differ, so the processed audio is adapted to the environment during the call and changes as the call environment changes.
  • in one embodiment, determining the parameters for processing the pre-processing left-channel audio and the pre-processing right-channel audio specifically includes: the electronic device determines a call environment type, where the types are quiet, normal, and noisy; when the type is quiet, the long-term energy of the noise in the first uplink audio is smaller than when the type is normal or noisy, and when the type is noisy, that long-term energy is greater than when the type is quiet or normal; the electronic device also determines the state between the user and the screen, which is either a close-to-screen state or a not-close-to-screen state, where the close-to-screen state is a state in which the distance between the user and the screen of the electronic device has been less than a preset value for longer than a preset time.
  • the electronic device determines the call mode from the call environment type together with the state between the user and the screen, which makes the determined call mode more accurate. For example, when the user is close to the screen and the environment is noisy, the mode can be determined to be the noisy mode, and the playback volume can then be increased so that the user can hear clearly. A simple classification of the environment type is sketched below.
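In the sketch, the two thresholds are invented, since the application defines only the ordering of the three types.

```python
QUIET_MAX_DB = 40.0  # assumed upper bound for a quiet environment
NOISY_MIN_DB = 65.0  # assumed lower bound for a noisy environment

def classify_environment(long_term_noise_db: float) -> str:
    """Map the long-term noise energy of the first uplink audio to a call
    environment type, following the ordering quiet < normal < noisy."""
    if long_term_noise_db < QUIET_MAX_DB:
        return "quiet"
    if long_term_noise_db > NOISY_MIN_DB:
        return "noisy"
    return "normal"
```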
  • the first call mode is one of the quiet mode, the normal mode, and the noisy mode, and the second call mode is another of the three.
  • in one embodiment, determining the call mode specifically includes: when the call environment type is normal and the user is close to the screen, or when the user is not close to the screen, the electronic device determines that the call mode is the normal mode and that the parameters corresponding to the normal mode are the parameters for processing the pre-processing first left-channel audio and the pre-processing first right-channel audio; when the call environment type is quiet and the user is close to the screen, the electronic device determines that the call mode is the quiet mode and uses the parameters corresponding to the quiet mode; and when the call environment type is noisy and the user is close to the screen, the electronic device determines that the call mode is the noisy mode and uses the parameters corresponding to the noisy mode.
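Read as a decision table, these rules reduce to a few lines; the sketch below is an assumed restatement, not code from the application.

```python
def determine_call_mode(env_type: str, close_to_screen: bool) -> str:
    """Quiet and noisy modes apply only while the user is close to the
    screen; in every other case the call mode is the normal mode."""
    if not close_to_screen:
        return "normal"
    return env_type  # close to screen: mode follows quiet/normal/noisy type
```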
  • the call mode can be divided into quiet mode, normal mode and noisy mode, and the characteristics of the processed audio obtained in these three modes are different.
  • in the normal mode, the overall energy of the audio signals played by the first sounder and the second sounder is set larger than in the quiet mode but smaller than in the noisy mode, and the energy of the sound signal in the first frequency band can be emphasized, keeping the sound clear while reducing sound leakage.
  • in the noisy mode, the overall energy of the audio signals played by the first sounder and the second sounder is set to the maximum, and the energy of the sound signal in the first frequency band is emphasized, so the sound remains clear even in a noisy environment.
  • in one embodiment, the parameters involved in calculating the long-term energy of the noise in the first uplink audio are set so that the call mode can only switch between adjacent modes: from the quiet mode to the normal mode, from the normal mode to the noisy mode, from the noisy mode to the normal mode, and from the normal mode to the quiet mode.
  • in this way, the call mode never switches directly from the quiet mode to the noisy mode or from the noisy mode to the quiet mode, so the sound the user hears changes gradually rather than suddenly becoming much louder or much softer. This stepwise constraint is sketched below.
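The constraint amounts to a one-step state machine over the ordered modes (an assumed restatement of the behaviour described above).

```python
MODE_ORDER = ["quiet", "normal", "noisy"]

def next_mode(current: str, target: str) -> str:
    """Move at most one step along quiet <-> normal <-> noisy per update,
    so the mode never jumps directly between quiet and noisy."""
    cur, tgt = MODE_ORDER.index(current), MODE_ORDER.index(target)
    if tgt > cur:
        return MODE_ORDER[cur + 1]
    if tgt < cur:
        return MODE_ORDER[cur - 1]
    return current
```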
  • in one embodiment, the method further includes: the electronic device determines that the user is in a call and that audio is being played through the first sounder and the second sounder.
  • the mode-switching scheme in this solution is used only when the electronic device plays audio through the first sounder and the second sounder; if audio is played another way, for example through a loudspeaker, other algorithms are used to process the audio. This improves the adaptability between the electronic device and its hardware.
  • in one embodiment, the electronic device defaults the call environment type to normal and defaults the state between the user and the screen to the close-to-screen state.
  • the default call mode is therefore the normal mode, so the sound the user hears stays at an average level, which suits most situations.
  • in one embodiment, the method further includes: the electronic device estimates an echo from a first reference signal and a second reference signal, where the first reference signal is the audio output after the first left-channel audio passes through the first power amplifier and the second reference signal is the audio output after the first right-channel audio passes through the second power amplifier; the echo is an estimate of the audio played by the first sounder and the second sounder that is collected by the microphone; the echo is removed from the first uplink audio to obtain the target uplink audio.
  • in this way, the echo is removed from the audio collected by the microphone, so the other party does not hear an echo picked up by the local device when communicating through a call app, improving call quality. One classical way to do this with two reference signals is sketched below.
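The application names the two power-amplifier outputs as reference signals but does not specify the adaptive algorithm; the sketch uses a dual-reference normalized LMS (NLMS) canceller, assuming the references are at least as long as the microphone signal.

```python
import numpy as np

def cancel_echo(mic, ref_left, ref_right, taps=128, mu=0.1, eps=1e-8):
    """Dual-reference NLMS echo canceller (illustrative). Each reference is
    the output of one power amplifier; the result is the target uplink audio."""
    w_l, w_r = np.zeros(taps), np.zeros(taps)
    out = np.zeros(len(mic))
    pad_l = np.concatenate([np.zeros(taps - 1), np.asarray(ref_left)])
    pad_r = np.concatenate([np.zeros(taps - 1), np.asarray(ref_right)])
    for n in range(len(mic)):
        x_l = pad_l[n:n + taps][::-1]      # latest left-reference samples
        x_r = pad_r[n:n + taps][::-1]      # latest right-reference samples
        echo_hat = w_l @ x_l + w_r @ x_r   # estimated echo from both paths
        e = mic[n] - echo_hat              # echo-removed uplink sample
        norm = x_l @ x_l + x_r @ x_r + eps
        w_l = w_l + mu * e * x_l / norm    # NLMS updates for both filters
        w_r = w_r + mu * e * x_r / norm
        out[n] = e
    return out
```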
  • in a second aspect, the present application provides an electronic device, which includes one or more processors and a memory; the memory is coupled to the one or more processors and stores computer program code comprising computer instructions. The one or more processors invoke the computer instructions to make the electronic device perform: displaying the call application interface; determining a first call mode, where the first call mode corresponds to a first left-channel audio feature and a first right-channel audio feature (the audio features of the audio signals output on the left and right channels, respectively) and corresponds to the first call environment; determining that the device is in the second call environment; and switching to a second call mode, where the second call mode corresponds to a second left-channel audio feature and a second right-channel audio feature, defined in the same way.
  • in this way, the audio played by the electronic device can be adjusted as the call environment changes, yielding audio adapted to that environment.
  • when the environment is noisy, the playback volume can be increased and the energy in the frequency band to which the human ear is sensitive and which has good directivity (for example, 1 kHz-3 kHz) can be boosted, so that the user can hear clearly in a noisy call environment.
  • when the environment is relatively quiet, the playback volume can be reduced while that sensitive, directional frequency band (for example, 1 kHz-3 kHz) is still emphasized, reducing sound leakage while ensuring that the user can hear clearly.
  • with reference to the second aspect, the one or more processors are further configured to invoke the computer instructions so that the electronic device performs: receiving downlink audio, where the downlink audio is audio sent to the electronic device by another electronic device during a call; in the first call mode, processing the downlink audio to obtain the first left-channel audio and the first right-channel audio, where in the first left-channel audio the energy of low-frequency sound is greater than the energy of high-frequency sound and in the first right-channel audio the energy of high-frequency sound is greater than the energy of low-frequency sound; and playing the first left-channel audio through the first sounder and the first right-channel audio through the second sounder.
  • in this way, when the materials of the two sounders differ, with one suited to playing high-frequency audio and the other to playing low-frequency audio, the two generated audio channels (one with more low-frequency energy, the other with more high-frequency energy) can each be matched to the appropriate sounder, improving sound quality.
  • in one embodiment, the one or more processors are specifically configured to invoke the computer instructions so that the electronic device performs: obtaining, from the downlink audio, the pre-processing first left-channel audio and the pre-processing first right-channel audio; and applying timbre adjustment and volume adjustment to each of them to obtain the first left-channel audio and the first right-channel audio.
  • the timbre adjustment refers to adjusting the energy distribution of sound across different frequency bands in the audio.
  • the volume adjustment refers to adjusting the overall energy of the audio.
  • in this way, the electronic device can adjust the timbre and volume of the audio so that the processed audio suits the environment during the call, and the audio is adjusted as the call environment changes.
  • in one embodiment, the one or more processors are further configured to invoke the computer instructions so that the electronic device performs: determining the parameters for processing the pre-processing first left-channel audio and the pre-processing first right-channel audio, the parameters including a left-channel timbre parameter, a right-channel timbre parameter, a left-channel volume parameter, and a right-channel volume parameter; using the left-channel timbre parameter and the left-channel volume parameter to adjust the timbre and volume of the pre-processing left-channel audio, obtaining the first left-channel audio; and using the right-channel timbre parameter and the right-channel volume parameter to adjust the timbre and volume of the pre-processing right-channel audio, obtaining the first right-channel audio.
  • in different call modes, the parameters the electronic device uses to adjust timbre and volume differ, so the processed audio is adapted to the environment during the call and changes as the call environment changes.
  • in one embodiment, the one or more processors are specifically configured to invoke the computer instructions so that the electronic device performs: determining a call environment type, where the types are quiet, normal, and noisy; when the type is quiet, the long-term energy of the noise in the first uplink audio is smaller than when the type is normal or noisy, and when the type is noisy, that long-term energy is greater than when the type is quiet or normal; and determining the state between the user and the screen, which is either a close-to-screen state or a not-close-to-screen state, where the close-to-screen state is a state in which the distance between the user and the screen has been less than a preset value for longer than a preset time, and the not-close-to-screen state is a state in which it has not.
  • the electronic device determines the call mode from the call environment type together with the state between the user and the screen, which makes the determined call mode more accurate. For example, when the user is close to the screen and the environment is noisy, the mode can be determined to be the noisy mode, and the playback volume can then be increased so that the user can hear clearly.
  • in one embodiment, the one or more processors are specifically configured to invoke the computer instructions so that the electronic device performs: when the call environment type is normal and the user is close to the screen, or when the user is not close to the screen, determining that the call mode is the normal mode and that the parameters corresponding to the normal mode are the parameters for processing the pre-processing first left-channel audio and the pre-processing first right-channel audio; when the call environment type is quiet and the user is close to the screen, determining that the call mode is the quiet mode and using the parameters corresponding to the quiet mode; and when the call environment type is noisy and the user is close to the screen, determining that the call mode is the noisy mode and using the parameters corresponding to the noisy mode.
  • the call mode can be divided into quiet mode, normal mode and noisy mode, and the characteristics of the processed audio obtained in these three modes are different.
  • in the normal mode, the overall energy of the audio signals played by the first sounder and the second sounder is set larger than in the quiet mode but smaller than in the noisy mode, and the energy of the sound signal in the first frequency band can be emphasized, keeping the sound clear while reducing sound leakage.
  • in the noisy mode, the overall energy of the audio signals played by the first sounder and the second sounder is set to the maximum, and the energy of the sound signal in the first frequency band is emphasized, so the sound remains clear even in a noisy environment.
  • in one embodiment, the one or more processors are further configured to invoke the computer instructions so that the electronic device determines whether, during a call, the user is playing audio through the first sounder and the second sounder.
  • the mode-switching scheme in this solution is used only when the electronic device plays audio through the first sounder and the second sounder; if audio is played another way, for example through a loudspeaker, other algorithms are used to process the audio. This improves the adaptability between the electronic device and its hardware.
  • in one embodiment, the one or more processors are further configured to invoke the computer instructions so that the electronic device performs: estimating the echo from the first reference signal and the second reference signal, where the first reference signal is the audio output after the first left-channel audio passes through the first power amplifier and the second reference signal is the audio output after the first right-channel audio passes through the second power amplifier, the echo being an estimate of the audio played by the first sounder and the second sounder that is collected by the microphone; and removing the echo from the first uplink audio to obtain the target uplink audio.
  • in this way, the echo is removed from the audio collected by the microphone, so the other party does not hear an echo picked up by the local device when communicating through a call app, improving call quality.
  • in a third aspect, the present application provides an electronic device, which includes one or more processors and a memory; the memory is coupled to the one or more processors and stores computer program code comprising computer instructions; the one or more processors invoke the computer instructions to make the electronic device execute the method described in the first aspect or any implementation of the first aspect.
  • in this way, the audio played by the electronic device can be adjusted as the call environment changes, yielding audio adapted to that environment.
  • when the environment is noisy, the playback volume can be increased and the energy in the frequency band to which the human ear is sensitive and which has good directivity (for example, 1 kHz-3 kHz) can be boosted, so that the user can hear clearly in a noisy call environment.
  • when the environment is relatively quiet, the playback volume can be reduced while that sensitive, directional frequency band (for example, 1 kHz-3 kHz) is still emphasized, reducing sound leakage while ensuring that the user can hear clearly.
  • in a fourth aspect, an embodiment of the present application provides a chip system applied to an electronic device; the chip system includes one or more processors, which are used to invoke computer instructions so that the electronic device executes the method described in the first aspect or any implementation of the first aspect.
  • in this way, the audio played by the electronic device can be adjusted as the call environment changes, yielding audio adapted to that environment.
  • when the environment is noisy, the playback volume can be increased and the energy in the frequency band to which the human ear is sensitive and which has good directivity (for example, 1 kHz-3 kHz) can be boosted, so that the user can hear clearly in a noisy call environment.
  • when the environment is relatively quiet, the playback volume can be reduced while that sensitive, directional frequency band (for example, 1 kHz-3 kHz) is still emphasized, reducing sound leakage while ensuring that the user can hear clearly.
  • in a fifth aspect, an embodiment of the present application provides a computer program product containing instructions; when the computer program product runs on the electronic device, the electronic device executes the method described in the first aspect or any implementation of the first aspect.
  • in this way, the audio played by the electronic device can be adjusted as the call environment changes, yielding audio adapted to that environment.
  • when the environment is noisy, the playback volume can be increased and the energy in the frequency band to which the human ear is sensitive and which has good directivity (for example, 1 kHz-3 kHz) can be boosted, so that the user can hear clearly in a noisy call environment.
  • when the environment is relatively quiet, the playback volume can be reduced while that sensitive, directional frequency band (for example, 1 kHz-3 kHz) is still emphasized, reducing sound leakage while ensuring that the user can hear clearly.
  • in a sixth aspect, an embodiment of the present application provides a computer-readable storage medium containing instructions; when the instructions run on the electronic device, the electronic device executes the method described in the first aspect or any implementation of the first aspect.
  • in this way, the audio played by the electronic device can be adjusted as the call environment changes, yielding audio adapted to that environment.
  • when the environment is noisy, the playback volume can be increased and the energy in the frequency band to which the human ear is sensitive and which has good directivity (for example, 1 kHz-3 kHz) can be boosted, so that the user can hear clearly in a noisy call environment.
  • when the environment is relatively quiet, the playback volume can be reduced while that sensitive, directional frequency band (for example, 1 kHz-3 kHz) is still emphasized, reducing sound leakage while ensuring that the user can hear clearly.
  • Fig. 1 shows a schematic diagram of a call algorithm;
  • Fig. 2 shows a schematic diagram of a sounder of an electronic device in one solution;
  • Fig. 3 shows a schematic diagram of the sounders of an electronic device in an embodiment of the present application;
  • Fig. 4 shows a schematic diagram of the call method involved in an embodiment of the present application;
  • Fig. 5 shows schematic scenarios in which the electronic device is not in the handheld call mode;
  • Figs. 6a-6d show schematic diagrams of the three call modes;
  • Figs. 7a-7d show a set of exemplary user interfaces for setting whether the call mode of the electronic device is adjustable;
  • Fig. 8 is a schematic flowchart of the call method involved in an embodiment of the present application;
  • Fig. 9 is a schematic illustration of a change in the call environment type provided by an embodiment of the present application;
  • Fig. 10 is a schematic flowchart of the electronic device processing the downlink audio signal in the normal mode;
  • Fig. 11 is a schematic flowchart of the electronic device removing the echo signal from the audio signal collected by the microphone;
  • Fig. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
  • Fig. 13 is a schematic diagram of a system structure of an electronic device provided by an embodiment of the present application.
  • the terms "first" and "second" are used for descriptive purposes only and should not be understood as indicating or implying relative importance or implicitly specifying the number of the indicated technical features; thus a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present application, unless otherwise specified, "multiple" means two or more.
  • the term "user interface" (UI) in the following embodiments of this application refers to the medium through which an application program or operating system interacts and exchanges information with the user; it converts between the internal form of information and a form acceptable to the user.
  • a user interface is typically defined by source code written in specific computer languages such as Java and the extensible markup language (XML).
  • the source code of the interface is parsed and rendered on the electronic device and finally presented as content the user can recognize.
  • the most common form of user interface is the graphical user interface (GUI), which refers to a user interface, related to computer operation, that is displayed graphically. It may include text, icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars, widgets, and other visible interface elements displayed on the screen of the electronic device.
  • the call algorithm includes an algorithm involved in a downlink call and an algorithm involved in an uplink call.
  • call downlink means that, after the electronic device receives an input audio signal sent to it by another electronic device, it applies first processing to the input audio signal to obtain an audio signal that can be played through a sounder or the like.
  • call uplink means that the electronic device collects sound signals through its microphones and applies second processing to them to generate output audio signals, which are then sent to the other electronic device.
  • the algorithm used in the first processing is the algorithm involved in the downlink call, and the algorithm used in the second processing is the algorithm involved in the uplink call.
  • Fig. 1 shows a schematic diagram of a call algorithm.
  • as shown in Fig. 1, the electronic device applies the first processing to the input audio signal transmitted to it by another electronic device through the base station.
  • the first processing includes: first decoding the signal through the modem into an audio signal the electronic device can recognize, passing it through the call downlink processing module, using the codec to convert it into an analog audio signal, amplifying it with the power amplifier, and finally driving the sounder to play it.
  • algorithms involved in the call downlink processing module may include noise reduction, timbre adjustment, and volume adjustment.
  • during call uplink, the microphone of the electronic device collects the sound signal, and the electronic device applies the second processing to it.
  • the second processing includes: first encoding the signal with a codec to obtain a digital audio signal, passing it through the call uplink processing module, and then modulating it with the modem to obtain an output audio signal the base station can recognize.
  • Algorithms involved in the call uplink processing module may include noise reduction, timbre adjustment, and volume adjustment.
  • the noise reduction, timbre adjustment, and volume adjustment in the call downlink processing module and the call uplink processing module work in the same way.
  • the noise reduction performs noise reduction on an audio signal, suppressing the noise and reverberation components in it.
  • the timbre adjustment adjusts the energy of the audio signal in different frequency bands to improve the voice timbre.
  • the unit of energy is the decibel (dB), which describes the strength of a sound signal; an audio signal with more energy sounds louder when played through the same sounder.
  • timbre refers to the proportions of the audio signal's energy in different frequency bands.
  • the volume adjustment adjusts the overall energy of the audio signal. Both quantities are illustrated in the sketch below.
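The sketch computes energy in dB and the share of energy in one band, the proportion that timbre describes; the reference level and default band edges are arbitrary illustration choices.

```python
import numpy as np

def energy_db(x: np.ndarray) -> float:
    """Mean energy of a signal in decibels (relative scale, illustrative)."""
    return 10.0 * np.log10(np.mean(x ** 2) + 1e-12)

def band_energy_ratio(x: np.ndarray, fs: float, band_hz=(1000.0, 3000.0)):
    """Fraction of spectral energy inside one band: the proportion that
    timbre describes and that timbre adjustment changes."""
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    in_band = (freqs >= band_hz[0]) & (freqs <= band_hz[1])
    return float(spec[in_band].sum() / (spec.sum() + 1e-12))
```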
  • in some solutions, to increase the screen-to-body ratio of the electronic device and achieve a full screen, the sounder is placed on the side of the electronic device, and a side slit or top opening transmits the audio signal played by the sounder to the human ear.
  • FIG. 2 shows a schematic diagram of a sound generator of an electronic device in a solution.
  • the user interface 20 is a call interface of the electronic device, and the content displayed in area 201 is the side of the electronic device, where the sounder is placed.
  • the electronic device can be provided with a side slit and a top opening so that the audio signal played by the sounder can be transmitted to the human ear; the side slit is shown in area 201A and the top opening in area 201B.
  • one part of the audio signal consists of audio signal a and audio signal b, and the other part is leaked sound a.
  • audio signal a can enter the human ear directly through the side slit, and audio signal b can enter the human ear directly through the top opening.
  • leaked sound a is the part of the audio signal played by the sounder that is not transmitted to the human ear.
  • because of leaked sound a, part of the audio played by the electronic device escapes into the environment, which may expose the user's privacy.
  • in a noisy environment, a noise signal, for example noise a, also enters the human ear and interferes with the user's recognition of audio signal a; if the energy of audio signal a is smaller than that of noise a, the audio becomes hard to hear.
  • in short, when the sounder is placed on the side of the electronic device and plays the audio signal during a call, sound leakage threatens the user's privacy, and the noise signal entering the human ear interferes with the user's recognition of the audio signal.
  • two sound generators are provided in the electronic device: the first sound generator is placed on the side of the electronic device, and the second sound generator is placed inside the screen of the electronic device.
  • the electronic device can process the audio signal sent to the device by other electronic devices to generate a processed left channel audio signal and a processed right channel audio signal.
  • the first sound generator is used to play the processed left channel audio signal, and the played left channel audio signal (processed) is transmitted to the human ear through the air.
  • the second sound generator is used to play the processed right channel audio signal, and the played right channel audio signal (after processing) is transmitted to the human ear through the bone, and the second sound generator may be called a bone conduction sound generator.
  • in the processed left-channel audio signal, the energy of the low-frequency sound signal is greater than the energy of the high-frequency sound signal; in the processed right-channel audio signal, the energy of the high-frequency sound signal is greater than the energy of the low-frequency sound signal.
  • unless otherwise noted, the left-channel audio signals played by the first sounder and the right-channel audio signals played by the second sounder below all refer to the processed signals.
  • Fig. 3 shows a schematic diagram of a sound generator of an electronic device in an embodiment of the present application.
  • the user interface 30 is a call interface of the electronic device; the first sounder is as described for the sounder in Fig. 2 above, and the second sounder can be arranged in area 301 as shown.
  • one part of the audio signal may consist of audio signal 1 and audio signal 2, where audio signal 1 can enter the human ear directly through the side slit and audio signal 2 can enter the human ear directly through the top opening.
  • another part of the audio signal may be leaked sound 1.
  • the electronic device can use the second sounder to play the right-channel audio signal, shown as audio signal 3.
  • although the first sounder causes sound leakage, the second sounder plays the right-channel audio signal to compensate, increasing the energy of the audio entering the human ear so that the user can hear clearly.
  • in a noisy environment, there is a noise signal around the human ear, for example noise 1.
  • the electronic device can also increase the energy of the processed left-channel and right-channel audio signals so that the user can still identify them, reducing the interference of noise signals with the user.
  • in a quiet environment, the electronic device may reduce the energy of the left-channel audio signal so that the first sounder leaks less sound, relying mainly on the second sounder's right-channel audio signal so that the user can still hear clearly.
  • the call method involved in the embodiment of the present application is applicable to the processes of downlink call and uplink call.
  • FIG. 4 shows a schematic diagram of the call method involved in the embodiment of the present application.
  • as shown in Fig. 4, when the electronic device determines during call downlink that the call mode is adjustable and the user is close to the screen, the call mode may be determined in combination with the call environment type.
  • in different call modes, the electronic device can set different parameters to process the downlink audio signal, obtaining processed left-channel and right-channel audio signals that differ in timbre and volume; it then uses the first sounder to play the processed left-channel audio signal and the second sounder to play the processed right-channel audio signal.
  • specifically, noise reduction, timbre adjustment, and volume adjustment can be performed on the downlink audio signal to obtain the processed left-channel and right-channel audio signals; in different call modes, the parameters involved in the timbre adjustment and the volume adjustment differ.
  • the electronic device amplifies the processed left-channel audio signal through the first power amplifier and drives the first sounder to play it, and amplifies the processed right-channel audio signal through the second power amplifier and drives the second sounder to play it.
  • the downlink audio signal is an audio signal sent to the local device by another electronic device.
  • the call mode can be divided into the quiet mode, the normal mode, and the noisy mode.
  • the electronic device can perform echo cancellation on the uplink audio signal. Specifically, the electronic device can use the reference signal output by the first power amplifier and the reference signal output by the second power amplifier to estimate the echo signal through the echo cancellation algorithm in the dual-device call uplink processing module, and then remove the echo signal from the uplink audio signal.
  • the call method of the embodiment of the present application is applicable when the electronic device is in the handheld call mode.
  • the handheld call mode means that the electronic device plays the audio signal through the first sounder and/or the second sounder; playing the audio signal through the loudspeaker or through other devices such as earphones does not belong to the handheld call mode. For example, as shown in (a) of Fig. 5, the electronic device plays the audio signal through the loudspeaker, and the speaker icon 501 in user interface 50 is shown in gray.
  • when the audio signal is played through other devices such as earphones, for example a Bluetooth earphone, as shown in (b) of Fig. 5, the electronic device plays the audio signal through a TWS earphone, and this is indicated by prompt icon 502 in user interface 51.
  • during call downlink, the electronic device can determine the call mode, so that the parameters for processing the downlink audio signal can be set according to the call mode, and a processed left-channel audio signal and a processed right-channel audio signal are obtained.
  • specifically, the downlink audio signal is first used to generate an unprocessed left-channel audio signal and an unprocessed right-channel audio signal; different parameters are then used to perform timbre adjustment and volume adjustment on each of them, obtaining processed left-channel and right-channel audio signals that suit the call environment.
  • the first sounder is used to play the processed left-channel audio signal and the second sounder is used to play the processed right-channel audio signal, so that two channels are used to play audio during call downlink.
  • for the detailed process of how the electronic device generates the processed left-channel and right-channel audio signals, refer to the description of step S108 below, which is not repeated here.
  • the call modes may include the normal mode, the quiet mode, and the noisy mode.
  • in different call modes, for the same downlink audio signal, the processed left-channel and right-channel audio signals obtained by the electronic device differ, and the difference can be reflected in volume and/or timbre, where the volume indicates the energy (loudness) of the audio signal and the timbre indicates the energy distribution (proportions) of the audio signal's sound across different frequency bands.
  • in terms of volume, the noisy mode is the loudest, the normal mode is next, and the quiet mode is the softest.
  • in terms of timbre, in the noisy mode the energy of the sound signal in the first frequency band exceeds the energy of the sound signals in other frequency bands by a first degree; in the normal mode, by a second degree; and in one option for the quiet mode, by a third degree. Alternatively, in the quiet mode the energy distribution across frequency bands may be left unadjusted (the same as the energy proportions of the pre-processing left-channel audio signal across frequency bands), or the energy in the first frequency band may be a fourth degree smaller than the energy in other frequency bands. The first, second, third, and fourth degrees can be measured in decibels and may be the same or different; generally, the first degree > the second degree > the third degree.
  • besides volume and/or timbre, in different call modes the processed left-channel and right-channel audio signals obtained by the electronic device for the same downlink audio signal may also differ in other respects; volume and timbre are used as examples in this application and should not be construed as limiting it.
  • after the user answers the call, the electronic device sets the call mode to the normal mode; then, when the call mode is adjustable, the electronic device can switch among the three call modes. Optionally, in response to the user answering the call, the electronic device sets the call mode to the normal mode and the user can start the call. It can be understood that, after the user answers the call, the electronic device could instead set the call mode to the quiet mode or the noisy mode; for convenience of description, the normal mode is used as the example.
  • when the electronic device determines that the user is close to the screen and the call environment type is normal, or that the user is not close to the screen, it may determine that the call mode is the normal mode.
  • the user not being close to the screen means that the distance between the user and the screen of the electronic device has been greater than a preset value for longer than a preset time.
  • the user being close to the screen means that the distance between the user and the screen of the electronic device has been less than a preset value for longer than a preset time. A debounced check of this kind is sketched below.
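In the sketch, the distance threshold and hold time stand in for the application's unspecified preset value and preset time, and for simplicity only entry into the close-to-screen state is debounced.

```python
import time

class ScreenProximity:
    """Report 'close to screen' only after the measured distance has stayed
    below the preset value for longer than the preset time (both assumed)."""
    def __init__(self, max_distance_cm: float = 5.0, hold_s: float = 1.0):
        self.max_distance_cm = max_distance_cm
        self.hold_s = hold_s
        self._below_since = None
        self.close_to_screen = False

    def update(self, distance_cm: float, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        if distance_cm < self.max_distance_cm:
            if self._below_since is None:
                self._below_since = now       # distance just dropped below
            if now - self._below_since > self.hold_s:
                self.close_to_screen = True   # held long enough to count
        else:
            self._below_since = None          # leaving resets immediately here
            self.close_to_screen = False
        return self.close_to_screen
```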
  • when the electronic device determines that the user is close to the screen and the call environment type is quiet, it may determine that the call mode is the quiet mode.
  • likewise, when the user is close to the screen and the call environment type is noisy, the electronic device may determine that the call mode is the noisy mode.
  • the call environment type may be used to describe the long-term energy of the noise in the surrounding environment of the electronic device during a call.
  • the long-term energy of the noise is the average energy of the noise within a period of time (such as 30s).
  • the call environment types can be divided into quiet, normal and noisy.
  • the electronic device can judge the call environment type from the magnitude of this long-term energy: if the long-term energy is large, the environment is noisy; if it is small, the environment is quiet; and if it is in between, the environment is normal.
  • here, "large" means the long-term energy is greater than one threshold, "small" means it is smaller than another threshold, and "in between" means it lies between the two thresholds.
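Putting the definition into code, the long-term energy is just the mean energy over a sliding window; the 30 s window length comes from the example above, while the framing details are assumptions.

```python
import numpy as np

def long_term_noise_energy_db(noise: np.ndarray, fs: float,
                              window_s: float = 30.0) -> float:
    """Average noise energy in dB over the most recent window, matching the
    'long-term energy' defined above (illustrative implementation)."""
    recent = noise[-int(window_s * fs):]   # last window_s seconds of noise
    return 10.0 * np.log10(np.mean(recent ** 2) + 1e-12)
```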
  • Figs. 6a-6d show schematic diagrams of the three call modes.
  • icons 611 and 612 represent noise, and their number represents the noise level.
  • icon 613 represents the processed left-channel audio signal played by the first sounder; more icons 613 indicate greater energy of the processed left-channel audio signal, that is, higher volume, and vice versa.
  • icon 614 represents the processed right-channel audio signal played by the second sounder, and the number of icons 614 likewise represents the energy level of the processed right-channel audio signal.
  • Fig. 6a shows an example in which the user is not close to the screen, while Figs. 6b-6d show examples in which the user is close to the screen.
  • in the case of Fig. 6a, the electronic device can determine that the call mode is the normal mode.
  • FIG. 6b it is a schematic diagram of another common mode.
  • the user is in close contact with the screen.
• There is noise around and the long-term energy of the noise is in the intermediate state, so the electronic device can determine that it is in the normal mode at this time.
• For the left channel audio signal in the normal mode: when the electronic device is close to the human ear and uses the first sound generator to play the processed left channel audio signal, the signal may include audio signal 1 and/or audio signal 2 (both are taken as examples below for convenience of description), where audio signal 1 may enter the human ear through an opening, a side slit or another physical channel, and audio signal 2 may enter the human ear through the top opening, a side slit or another physical channel.
  • the left channel audio signal may also include leakage 1 .
• The electronic device can use the second sound generator to play the right channel audio signal; the right channel audio signal includes audio signal 3, which is the audio signal played by the second sound generator.
  • the noise in the environment is Noise 1.
• The designation of left and right channels is exemplary: the left channel audio signal may instead correspond to audio signal 3, and the right channel audio signal may correspond to audio signal 1 and/or audio signal 2.
• In the normal mode, the second sound generator plays the right channel audio signal to make up for the energy of the audio signal entering the human ear, so that the user can hear the sound clearly; moreover, the addition of the second sound generator makes the path by which the audio signal enters the user's ear shorter and more directional, so compared with using only the first sound generator, the sound is heard more clearly.
• As shown in FIG. 6c, which is a schematic diagram of the quiet mode. At this point, the user is in close contact with the screen and there is no noise around, so the electronic device can determine that it is in the quiet mode at this time.
• For the left channel audio signal in the quiet mode: when the electronic device is close to the human ear and uses the first sound generator to play the processed left channel audio signal, the signal may include audio signal 1 and/or audio signal 2 (both are taken as examples below for convenience of description), where audio signal 1 may enter the human ear through an opening, a side slit or another physical channel, and audio signal 2 may enter the human ear through the opening, a side slit or another physical channel.
  • the left channel audio signal may also include leakage 1 .
  • the electronic device can use the second sound generator to play the right channel audio signal, and the right channel audio signal also includes an audio signal 3, and the audio signal 3 is an audio signal generated by the second sound generator.
• In the quiet mode, there is no noise in the environment.
• The designation of left and right channels is exemplary: the left channel audio signal may instead correspond to audio signal 3, and the right channel audio signal may correspond to audio signal 1 and/or audio signal 2.
• Because the energy of sound leakage 1 in the quiet mode is smaller than in the normal mode, the user's privacy can be protected in a quiet environment.
• As shown in FIG. 6d, which is a schematic diagram of the noisy mode. At this point, the user is in close contact with the screen, and there is noise around, so the electronic device can determine that it is in the noisy mode.
• A part of the audio signal may be audio signal 1 and audio signal 2, where audio signal 1 can enter the human ear directly through the side slit and audio signal 2 can enter the human ear directly through the top opening.
  • Another part of the audio signal may be sound leakage 1 .
  • the electronic device can use the second sound generator to play the right channel audio signal, and the right channel audio signal is shown as audio signal 3 .
• The noise signal in the noisy mode is noise 1; compared with noise 1 in the normal mode, its energy is larger. The played audio signal is therefore made louder, so that the user can hear the sound clearly even in a noisy environment.
• FIG. 6a-FIG. 6d show that, in different modes, the sound generators play audio signals at different volumes. In addition to different volumes, in different modes the sound generators may also play audio signals with different frequency-domain characteristics, and the frequency domains may be set according to the mode. For a specific description, reference may be made to the foregoing introduction to the frequency domain and to the related content in step S106 below.
• The characteristics of the processed left channel audio signal and the processed right channel audio signal obtained in the different call modes are summarized in Table 1 below.
• In the normal mode, the characteristics of the processed left channel audio signal are: its energy is the first energy; optionally, the energy of its low-frequency sound signal is greater than the energy of its high-frequency sound signal. The low-frequency and high-frequency sound signals are set according to actual needs, which is not limited in this embodiment of the present application; for example, low frequency may refer to sound signals below 2 kHz and high frequency to sound signals above 2 kHz.
• In the normal mode, the characteristics of the processed right channel audio signal are: its energy is the fourth energy, and the energy of its sound signal in the first frequency band is greater than the energy of the sound signals in other frequency bands by the first decibel (decibel, dB); optionally, the energy of its high-frequency sound signal is greater than the energy of its low-frequency sound signal.
• The first energy is the same as the fourth energy, or the fourth energy differs from the first energy only slightly.
• In the quiet mode, the characteristics of the processed left channel audio signal are: its energy is the second energy; to make the sound quieter than in the normal mode, the second energy is less than the first energy, and the energy of its sound signal in the second frequency band is smaller than the energy of the sound signals in other frequency bands by the second decibel; optionally, the energy of its low-frequency sound signal is greater than the energy of its high-frequency sound signal.
• In the quiet mode, the characteristics of the processed right channel audio signal are: its energy is the fifth energy, which is less than the fourth energy, and the energy of its sound signal in the first frequency band is greater than the energy of the sound signals in other frequency bands by the third decibel; optionally, the energy of its high-frequency sound signal is greater than the energy of its low-frequency sound signal.
• In the noisy mode, the characteristics of the processed left channel audio signal are: its energy is the third energy; to make the sound louder than in the normal mode, the third energy is greater than the first energy; optionally, the energy of its low-frequency sound signal is greater than the energy of its high-frequency sound signal.
• In the noisy mode, the characteristics of the processed right channel audio signal are: its energy is the sixth energy, which is greater than the fourth energy, and the energy of its sound signal in the first frequency band is greater than the energy of the sound signals in other frequency bands by the fourth decibel; optionally, the energy of its high-frequency sound signal is greater than the energy of its low-frequency sound signal.
• The sixth energy is the same as the third energy, or the sixth energy differs from the third energy only slightly.
• For how to implement these adjustments, existing technology can be used, and details are not described here.
• The timbre adjustment of the left channel audio signal is made to adapt to the situation of two sound generators; in some cases, other corresponding timbre adjustments are also made for the left channel audio signal in the normal mode and the noisy mode.
• The first decibel, the second decibel, the third decibel and the fourth decibel may be the same or different; generally, the first decibel ≤ the second decibel ≤ the fourth decibel.
• The sound signal in the first frequency band may be a sound signal in a frequency band to which the user's hearing is more sensitive and which has better directivity, for example, a sound signal of 1 kHz-3 kHz.
• The sound signal in the second frequency band may be a high-frequency sound signal, for example, a sound signal above 1 kHz.
• For example, the fourth decibel can be made the largest, the third decibel the next largest, and the first decibel the smallest, and the second decibel can be the same as the third decibel.
• For example, the first decibel can be 3 dB, the second and third decibels can be 6 dB, and the fourth decibel can be 9 dB.
• For example, the first energy can be in (-9 dB ~ -6 dB), the second energy in (-15 dB ~ -12 dB), the third energy in (-3 dB ~ 0 dB), and the fifth energy in (-12 dB ~ -9 dB).
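• The example values above can be collected into a small per-mode lookup table. The sketch below is illustrative only: the field names and the midpoint choices within each range are hypothetical, and the fourth and sixth energies are not given numeric examples in the text:

```python
# Illustrative per-mode targets drawn from the example values above (in dB).
MODE_TARGETS = {
    "normal": {"left_energy_db": -7.5,  "first_band_boost_db": 3.0},  # first energy, first decibel
    "quiet":  {"left_energy_db": -13.5, "first_band_boost_db": 6.0},  # second energy, third decibel
    "noisy":  {"left_energy_db": -1.5,  "first_band_boost_db": 9.0},  # third energy, fourth decibel
}

def targets_for(mode: str) -> dict:
    """Return the illustrative processing targets for a call mode."""
    return MODE_TARGETS[mode]
```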
  • the first energy of any processed left-channel audio signal and the fourth energy of any processed right-channel audio signal may be different or the same.
  • the second energy of any processed left-channel audio signal and the fifth energy of any processed right-channel audio signal may be different or the same.
  • the third energy of any processed left-channel audio signal and the sixth energy of any processed right-channel audio signal may be different or the same.
  • the first energy of the processed left channel audio signal may be greater than the fourth energy of the processed right channel audio signal and may also be the same.
• In the quiet mode, the electronic device can produce sound mainly through the second sound generator, so that sound leakage can be reduced.
• The sound signal in the first frequency band has better directivity and the user's hearing is more sensitive to it; highlighting the energy of this part of the audio signal can reduce the user's sound leakage in the quiet mode while still letting the user hear clearly.
• In the normal mode, the overall energy of the audio signals played by the first sound generator and the second sound generator is set to be larger than in the quiet mode but smaller than in the noisy mode, and the energy of the sound signal in the first frequency band is highlighted, which makes the sound clear while reducing sound leakage.
• In the noisy mode, the overall energy of the audio signals played by the first sound generator and the second sound generator is set to the maximum, and the energy of the sound signal in the first frequency band is highlighted, so that the sound is heard clearly even in a noisy environment.
  • whether the call mode is an adjustable mode can be set by the user.
  • FIG. 7a-7d show a set of exemplary user interfaces for setting whether the call mode of the electronic device is an adjustable mode.
  • the user interface 70 is a setting interface of the electronic device.
  • the user interface 70 includes a sound and vibration setting item 701 .
• In response to an operation (such as a click) on the sound and vibration setting item 701, the electronic device may display the user interface 71 shown in FIG. 7b.
  • the user interface 71 is a user interface corresponding to the setting content of the sound and vibration setting item 701 .
  • the user interface 71 may include a handheld call answering mode setting item 711.
• In response to an operation (such as a click) on the handheld call answering mode setting item 711, the electronic device may display the user interface 72 shown in FIG. 7c.
• The user interface 72 is a user interface corresponding to the setting content of the handheld call answering mode setting item 711; it is used to prompt the user whether to enable the on-ear automatic sound quality adjustment function when answering a call with the device held to the ear.
• Enabling the on-ear automatic sound quality adjustment function means that the electronic device can switch among the three call modes.
  • Turning off the on-ear automatic sound quality adjustment function means that the electronic device cannot switch between the three call modes, and always maintains a call mode, such as the normal mode.
• For example, the electronic device can enable the on-ear automatic sound quality adjustment function by default, and at this time the enable adjustment control 721 is grayed out; the electronic device can then switch among the three call modes.
• When the function is enabled, the electronic device can switch among the three call modes during a call; when it is disabled, the electronic device cannot switch among the three call modes during a call. In response to an operation on the close adjustment control 722, the electronic device turns off the on-ear automatic sound quality adjustment function and displays the user interface 73 shown in FIG. 7d. At this time, the close adjustment control 722 is grayed out; during a call, the electronic device then cannot switch among the three call modes and always maintains one call mode, such as the normal mode.
• When the on-ear automatic sound quality adjustment function is turned on, the electronic device can switch among the three call modes during a call; at this time, the electronic device may determine the call mode in combination with the state between the user and the screen and the call environment type.
• The state between the user and the screen can be divided into the close-to-screen state and the non-close-to-screen state.
• The close-to-screen state is a state in which the distance between the user and the screen of the electronic device is less than a preset value and the duration for which the distance stays less than the preset value is longer than a preset time; the non-close-to-screen state is a state in which the distance between the user and the screen of the electronic device is not less than a preset value and the duration for which the distance stays not less than the preset value is longer than a preset time.
• When the electronic device determines that it is in the non-close-to-screen state (that is, the electronic device determines that the user is in the non-close-to-screen state; the non-close-to-screen state may also be called the first state or the second state, etc., used to identify the state of the electronic device, and the same applies to the close-to-screen state and to other embodiments, which is not described again), if it determines that the distance between the user and the screen is less than a first preset value and the duration for which the distance stays less than the first preset value is longer than a first preset time, the electronic device switches from the non-close-to-screen state to the close-to-screen state; if the above conditions are not met, it remains in the non-close-to-screen state. Understandably, the switching condition may also be another condition.
• When the electronic device determines that it is in the close-to-screen state, if it determines that the distance between the user and the screen is greater than a second preset value and the duration for which the distance stays greater than the second preset value is longer than a second preset time, the electronic device switches from the close-to-screen state to the non-close-to-screen state; if the above conditions are not met, it remains in the close-to-screen state. Understandably, the switching condition may also be another condition.
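• A minimal sketch of the two-way switching logic above, assuming the distance is sampled once per frame (all names and the frame-based timing are hypothetical):

```python
class ScreenStateTracker:
    """Tracks the close-to-screen / non-close-to-screen state with the
    distance-and-duration hysteresis described above."""

    def __init__(self, near_value: float, far_value: float,
                 near_time: int, far_time: int):
        self.near_value = near_value   # first preset value (distance)
        self.far_value = far_value     # second preset value (distance)
        self.near_time = near_time     # first preset time (frames)
        self.far_time = far_time       # second preset time (frames)
        self.close = True              # default state: close to the screen
        self._held = 0                 # frames the switch condition has held

    def update(self, distance: float) -> bool:
        if self.close:
            # close -> non-close: distance stays above the second preset value
            # for longer than the second preset time
            self._held = self._held + 1 if distance > self.far_value else 0
            if self._held > self.far_time:
                self.close, self._held = False, 0
        else:
            # non-close -> close: distance stays below the first preset value
            # for longer than the first preset time
            self._held = self._held + 1 if distance < self.near_value else 0
            if self._held > self.near_time:
                self.close, self._held = True, 0
        return self.close
```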
• The foregoing takes the distance between the user and the screen of the electronic device and the duration of that distance as an example; there are other ways of judging the state between the user and the screen.
• For example, the state between the user and the screen may be determined by the pressure the user exerts on the screen and the duration of that pressure, that is, the distance in the aforementioned judging method can be replaced with pressure; alternatively, the electronic device can detect the contact area between human skin (including the face, ears, etc.) and the electronic device. Details are not described here.
  • the first preset value and the second preset value may be the same or different.
  • the first preset time and the second preset time may be the same or different.
  • the first preset time and/or the second preset time may be set by the user.
  • the user can control how long the user is in contact with the screen before the electronic device determines that the user is in close contact with the screen by setting the sensitivity of the sound quality adjustment.
• When the sensitivity adjustment control 723 is closer to the prompt text "fast", the electronic device can determine that the user is in close contact with the screen after only a short contact time; for example, the contact time may be 1 second to 5 seconds, such as 3 seconds.
• When the sensitivity adjustment control 723 is closer to the prompt text "slow", the electronic device determines that the user is in close contact with the screen only after the user has been in contact with the screen for a longer time; for example, the contact time may be more than 10 seconds, such as 10 seconds. It should be understood that the closer the sensitivity adjustment control 723 is to "slow", the longer the user must be in contact with the screen before the electronic device determines that the user is in close contact with the screen.
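• The sensitivity setting can be viewed as selecting the required contact time before close contact is declared; a toy mapping consistent with the examples above (the slider scale and the linear interpolation are assumptions):

```python
def required_contact_seconds(slider: float) -> float:
    """Map a sensitivity slider in [0, 1] (0 = 'fast', 1 = 'slow') to the
    contact time, in seconds, after which close contact is determined.
    Endpoints follow the examples above: about 3 s near 'fast' and
    10 s or more near 'slow'."""
    fast_s, slow_s = 3.0, 10.0
    return fast_s + slider * (slow_s - fast_s)
```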
• When a call starts, the call mode is first set to the normal mode. Then it is determined whether the call mode is an adjustable mode; if the call mode is an unadjustable mode, the call mode of the electronic device remains the normal mode and does not change.
• If the call mode is an adjustable mode, the call mode can be re-determined in combination with the state between the user and the screen and the call environment type, and mode switching can be performed. In this way, the parameters for processing the downlink audio signal can be set according to the different call modes.
• In this way, a processed left channel audio signal and a processed right channel audio signal are obtained; the first sound generator then plays the processed left channel audio signal, and the second sound generator plays the processed right channel audio signal.
  • Fig. 8 is a schematic flow chart of the call method involved in the embodiment of the present application.
• Regarding the call method involved in the embodiment of the present application, reference may be made to the following detailed description of step S101-step S114.
  • the electronic device starts a calling application
  • a call application is an APP that can provide a call function for an electronic device, and the call includes a voice call and a video call.
  • the electronic device displays an incoming call prompt, and in response to an operation on the answer control (such as a click operation), the electronic device can communicate with other electronic devices through the calling application, and the user can start a call through the electronic device.
  • a voice call refers to a communication method for real-time transmission of audio signals between an electronic device and at least one other electronic device.
  • a video call refers to a communication method for real-time transmission of audio signals and image signals between an electronic device and at least one other electronic device.
  • the electronic device continuously acquires downlink audio signals, that is, the electronic device can continuously receive audio signals sent to the device by other electronic devices.
  • the downlink audio signal is one or more frames of audio signals sent by other electronic devices to the local device.
• The specific duration of a frame of audio signal can be determined according to the processing capability of the electronic device; generally it can be 10 ms-50 ms, for example 10 ms, or a multiple of 10 ms such as 20 ms or 30 ms.
• After the electronic device starts the call application and receives the first frame of the downlink audio signal sent by another electronic device, and before processing that downlink audio signal, the electronic device can execute step S102, step S103, and so on to determine the manner of processing the first frame of the downlink audio signal.
• Aspects of the electronic device acquiring the downlink audio signal, such as the acquisition method and the acquisition time, are not limited in this embodiment of the present application. Step S102 and step S103 are described in detail below.
  • the electronic device determines whether the call process is in a handheld call mode
• The handheld call mode means that the electronic device has started a call application and plays the audio signal through the first sound generator or the second sound generator during the call; that is, during the call, the electronic device does not play audio signals through other sound generators such as a speaker or an earphone.
• FIG. 3 shows an exemplary user interface of an electronic device for a handheld call.
• During a call, when the electronic device detects that it is not connected to an earphone and is not playing the audio signal through another sound generator such as a speaker, but plays the audio signal through the first sound generator or the second sound generator, it can determine that the call is in the handheld call mode. When the electronic device detects that it is connected to an earphone or plays the audio signal through another sound generator such as a speaker, it can determine that the call is not in the handheld call mode.
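• A sketch of this check (the input flags are hypothetical; in practice they would come from the device's audio routing state):

```python
def is_handheld_call(earphone_connected: bool, routed_to_other_sounder: bool) -> bool:
    """Handheld call mode: no earphone and no other sound generator (e.g. the
    loudspeaker) is in use, so the audio plays through the first or second
    sound generator held to the ear."""
    return not earphone_connected and not routed_to_other_sounder
```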
• If the call is in the handheld call mode, step S104 to step S114 are performed.
• If the call is not in the handheld call mode, step S103 is executed.
  • the electronic device uses other algorithms to process the downlink audio signal
• When the call is not in the handheld call mode, the electronic device processes the downlink audio signal using other algorithms (such as noise reduction algorithms), and then plays the processed downlink audio signal through other sound generators; for example, the electronic device may play the processed downlink audio signal through a loudspeaker.
• When the call is in the handheld call mode, step S104 to step S114 are performed: determine the call mode, process the downlink audio signal according to the call mode to obtain the processed left channel audio signal and the processed right channel audio signal, and then play the processed left channel audio signal through the first sound generator and the processed right channel audio signal through the second sound generator.
• The detailed description of step S104-step S114 is as follows:
  • the electronic device determines whether the call mode is an adjustable mode
• The adjustable mode means that the electronic device can switch among the three call modes.
  • the electronic device can change whether the device is in an adjustable mode through user settings.
• FIG. 7a-FIG. 7d show a set of exemplary user interfaces for the user to set, through the electronic device, whether the call mode is an adjustable mode; for specific content, reference may be made to the relevant descriptions of FIG. 7a-FIG. 7d above.
• If the electronic device determines that the call mode is an adjustable mode, it then executes step S105-step S114.
• If the electronic device determines that the call mode is an unadjustable mode, it processes the downlink audio signal in the normal mode, that is, it processes the downlink audio signal using the parameters involved in the normal mode to obtain the processed left channel audio signal and the processed right channel audio signal, and then plays the processed left channel audio signal through the first sound generator and the processed right channel audio signal through the second sound generator.
• After determining that the call mode is an adjustable mode, the electronic device executes step S105 to step S114, where step S104 to step S113 are continuously performed until the call ends.
• In other embodiments, after determining in step S102 that the call is in the handheld call mode, the electronic device may directly determine that the call mode is the normal mode without performing step S104, and then process the downlink audio signal in the normal mode. Alternatively, after the user answers the call, the electronic device may set the call mode to the quiet mode or the noisy mode.
• Alternatively, step S105 may be performed instead of directly determining that the call mode is the normal mode, to determine whether the call mode of the electronic device is an adjustable mode; if yes, the call mode of the electronic device is re-determined.
• The electronic device determines whether the state between the user and the screen is the close-to-screen state.
• The state between the user and the screen can be divided into the close-to-screen state and the non-close-to-screen state; for their specific descriptions, reference may be made to the aforementioned relevant content, which is not repeated here.
• The electronic device may set the state between the user and the screen to the close-to-screen state by default, and then update the state between the user and the screen according to whether the user is close to the screen.
  • the electronic device may detect whether the user is in contact with the screen through a sensor on the screen, and if so, determine that the state between the user and the screen is a state of close contact with the screen. Otherwise, it is determined that the state between the user and the screen is not in close contact with the screen.
  • the judging conditions have been described above, and will not be repeated here.
  • how long the user is in contact with the screen before the electronic device determines that the user is in close contact with the screen can be set by the user.
  • the user can control how long the user is in contact with the screen before the electronic device determines that the user is in close contact with the screen by setting the sensitivity of the sound quality adjustment.
  • this process please refer to the above-mentioned description of related content in FIG. 7c , which will not be repeated here.
• If the state between the user and the screen is the close-to-screen state, step S106 may be performed to determine the call environment type, and the call mode is then determined based on the call environment type.
• If the state between the user and the screen is the non-close-to-screen state, step S107 may be performed to determine that the call mode is the normal mode.
• Then step S108 may be performed to process the downlink audio signal in the normal mode.
  • the electronic device may also determine that the call mode is a quiet mode or a noisy mode.
  • the electronic device determines the call environment type
  • the call environment type can be used to describe the long-term energy of the noise in the environment around the electronic device during a call.
  • the long-term energy of the noise is the average energy of the noise within a preset period of time.
  • the electronic device may determine the call environment type by calculating the long-term energy of the noise in the frame audio signal acquired by the microphone.
  • the call environment types can be divided into quiet, normal and noisy.
• Before updating the call environment type for the first time, the electronic device can set the call environment type to normal. Then the call environment type is updated according to the long-term energy of the noise in each frame of the audio signal. The electronic device can judge the call environment type by the magnitude of the long-term energy: if the long-term energy is large, the environment is noisy; if it is small, quiet; and if it is in between, normal.
  • the electronic device may obtain the long-term energy of the noise in the first uplink audio signal by using the energy of the noise in the first uplink audio signal acquired by the microphone and the long-term energy of the noise in the second uplink audio signal.
  • the first uplink audio signal is the t-th frame audio signal acquired by the microphone of the electronic device.
  • the second uplink audio signal is an audio signal different from the first uplink audio signal by X frames, where X is an integer greater than or equal to 1.
  • the value range of X is related to the processing capability of the electronic device, and can be 1-5.
  • the second uplink audio signal is the audio signal of the previous frame of the first uplink audio signal, that is, the t-1th frame audio signal
• The long-term energy of the noise in the first uplink audio signal can be understood as N_l(t) in formula (1) below, the energy of the noise in the first uplink audio signal as N_t(t), and the long-term energy of the noise in the second uplink audio signal as N_l(t-1).
• Noise can be divided into steady-state noise and non-steady-state noise, where steady-state noise is noise whose sound level fluctuation at the measured sound source is not greater than a certain threshold (such as 3 dB) within the measurement time, and non-steady-state noise is noise whose sound level fluctuation is greater than that threshold within the measurement time.
  • the relevant formula for calculating the long-term energy of the noise in the first uplink audio signal by the electronic device can refer to the following formula (1):
• N_l(t) = a * N_l(t-1) + (1 - a) * N_t(t), t > 1    formula (1)
• where N_l(t) is the long-term energy of the noise in the first uplink audio signal, and N_t(t) is the energy of the noise in the first uplink audio signal.
  • the noise can be steady-state noise, or include steady-state noise and non-stationary noise, which can be set according to requirements.
• N_l(t-1) represents the long-term energy of the noise in the second uplink audio signal.
  • a represents a smoothing factor, and its value range is (0.9,1). a can be a constant or a variable.
• The value of a can be adjusted based on the noise type included in N_t(t); for example, when the noise in N_t(t) includes steady-state noise but not non-steady-state noise, the value of a may be 0.9. The value of a may change according to other circumstances, which is not limited in this embodiment of the present application.
• N_t(t) may be obtained by a minima-controlled recursive averaging (MCRA) algorithm.
• When t = 1, N_l(t-1) in formula (1) cannot be calculated yet; at this time, N_l(t-1) is set to an initial value, and the size of the initial value can be obtained based on experience.
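• A direct transcription of formula (1) follows (the per-frame noise energy N_t(t) is supplied externally, e.g. by an MCRA-style estimator; the numeric values in the usage example are hypothetical):

```python
def long_term_noise_energy(nl_prev: float, nt_cur: float, a: float = 0.95) -> float:
    """Formula (1): N_l(t) = a * N_l(t-1) + (1 - a) * N_t(t), t > 1,
    with smoothing factor a in (0.9, 1)."""
    assert 0.9 < a < 1.0
    return a * nl_prev + (1.0 - a) * nt_cur

nl = -60.0                                # initial value for N_l(0), set empirically
for nt in (-58.0, -55.0, -40.0, -35.0):   # example per-frame noise energies
    nl = long_term_noise_energy(nl, nt)
```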
  • the electronic device can determine the call environment type according to the first energy threshold, the second energy threshold, and the long-term energy of noise in the first uplink audio signal, and the process can refer to the following formula (2).
• In formula (2), N_1 represents the first threshold and N_2 represents the second threshold, where the first threshold is smaller than the second threshold.
• The first threshold can be set within (-65 dB, -55 dB), for example -60 dB; the second threshold can be set within (-35 dB, -25 dB), for example -30 dB.
• When the long-term energy of the noise in the first uplink audio signal is less than the first threshold, the electronic device determines that the call environment type is quiet; when it is greater than the first threshold but less than the second threshold, the electronic device determines that the call environment type is normal; and when it is greater than the second threshold, the electronic device determines that the call environment type is noisy.
• The electronic device may set the threshold for switching from the normal mode to the quiet mode to be smaller than the threshold for switching from the quiet mode to the normal mode, so that after the electronic device switches from the quiet mode to the normal mode, the call environment needs to be even quieter before it switches back to the quiet mode.
• Similarly, the threshold for switching from the normal mode to the noisy mode is greater than the threshold for switching from the noisy mode to the normal mode, so that after the electronic device switches from the noisy mode to the normal mode, the call environment needs to be even noisier before it switches back to the noisy mode.
• The electronic device may compare the long-term energy of the noise in the first uplink audio signal against the third energy threshold, the fourth energy threshold, the fifth energy threshold, and the sixth energy threshold to determine the call environment type.
  • the process may refer to Formula (3) above.
• In formula (3), N_3 represents the third energy threshold, N_4 the fourth energy threshold, N_5 the fifth energy threshold, and N_6 the sixth energy threshold, where N_5 < N_6, N_3 < N_4, N_4 > N_6, and N_6 > N_3.
• When the change direction of the call environment type is from noisy to normal, from normal to quiet, or from noisy to quiet, formula (3) uses the thresholds N_5 and N_6: if the long-term energy of the noise in the first uplink audio signal is greater than N_6, the call environment type is determined to be noisy; if it is greater than N_5 and less than N_6, the call environment type is normal; and if it is less than N_5, it is quiet. When the change direction is the opposite (from quiet to normal, from normal to noisy, or from quiet to noisy), the thresholds N_3 and N_4 are used in the same way, with reference to formula (2).
  • the previous call environment type is the call environment type determined by the long-term energy of noise in the second uplink audio signal.
  • the smoothing factor can be set relatively large, for example, the value range of the smoothing factor can be (0.9,1), and it can usually be configured as 0.95.
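• A minimal sketch of this four-threshold classification, interpreting "change direction" as depending on the previously determined type (the threshold values below are hypothetical and merely respect the orderings N_5 < N_6, N_3 < N_4, N_4 > N_6, N_6 > N_3):

```python
QUIET, NORMAL, NOISY = "quiet", "normal", "noisy"

# Hypothetical threshold values respecting N5 < N6, N3 < N4, N4 > N6, N6 > N3.
N3, N4 = -55.0, -25.0   # applied when the type may move toward noisier
N5, N6 = -65.0, -35.0   # applied when the type may move toward quieter

def classify_environment(prev_type: str, long_term_energy: float) -> str:
    """Direction-dependent classification with hysteresis (formula (3) style)."""
    if prev_type == NOISY:
        # noisy -> normal / noisy -> quiet comparisons use N5 and N6
        if long_term_energy > N6:
            return NOISY
        return NORMAL if long_term_energy > N5 else QUIET
    if prev_type == QUIET:
        # quiet -> normal / quiet -> noisy comparisons use N3 and N4
        if long_term_energy < N3:
            return QUIET
        return NOISY if long_term_energy > N4 else NORMAL
    # From normal: entering quiet uses N5, entering noisy uses N4.
    if long_term_energy < N5:
        return QUIET
    if long_term_energy > N4:
        return NOISY
    return NORMAL
```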
• There is no particular order between the aforementioned step S105 and step S106; the electronic device may execute either first, or execute them at the same time, which is not limited in this embodiment of the present application.
  • the electronic device may determine the call mode in combination with the state between the user and the screen determined in step S105 and the call environment type determined in step S106.
  • a schematic logic for the electronic device to determine whether the call mode is the normal mode, the quiet mode or the noisy mode according to the state between the user and the screen and the type of the call environment can refer to Table 2 below.
• When the user is not in close contact with the screen, or when the user is in close contact with the screen and the call environment type is normal, the electronic device determines that the call mode is the normal mode. When the user is in close contact with the screen and the call environment type is quiet, the electronic device determines that the call mode is the quiet mode. When the user is in close contact with the screen and the call environment type is noisy, the electronic device determines that the call mode is the noisy mode.
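• The Table 2 logic amounts to a two-input lookup; a minimal sketch (the function and value names are hypothetical):

```python
def determine_call_mode(close_to_screen: bool, env_type: str) -> str:
    """Table 2: when the user is not in close contact with the screen the
    call mode is normal; otherwise it follows the call environment type."""
    if not close_to_screen:
        return "normal"
    return {"quiet": "quiet", "normal": "normal", "noisy": "noisy"}[env_type]
```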
  • the electronic device may also determine in other ways besides the ways shown in Table 2. For example, when the pressure detected by the screen is greater than a preset pressure value and lasts longer than a preset time value, the electronic device may determine that the call mode is the quiet mode.
• During the call, step S105 and step S106 are continuously executed.
• Each time the electronic device determines the call mode, it can obtain the state between the user and the screen and the call environment type, and, according to the description in Table 2, determine the call mode by combining the state between the user and the screen with the call environment type.
• Table 2 shows exemplary logic, and the electronic device may determine the call mode according to other logic, which is not limited in this embodiment of the present application.
• Before determining the call mode for the first time, the electronic device can set the state between the user and the screen to the close-to-screen state by default and set the call environment type to normal by default. Then the electronic device can re-determine the call mode, that is, update the call mode, according to the detected state between the user and the screen and the call environment type.
  • the electronic device may process the downlink audio signal with different parameters to obtain a processed left channel audio signal and a processed right channel audio signal.
  • the process of obtaining the processed left-channel audio signal and the processed right-channel audio signal can refer to the following description of step S107 and step S108.
• For the process in which the electronic device determines that the call mode is the quiet mode, reference may be made to the following descriptions of step S109 and step S110.
• For the process in which the electronic device determines that the call mode is the noisy mode, reference may be made to the following descriptions of step S111 and step S112.
• For the process in which the electronic device determines that the call mode is the normal mode, reference may be made to the following descriptions of step S107 and step S108.
  • the electronic device determines that the call mode is a normal mode
• When the user is not in close contact with the screen, or when the user is in close contact with the screen and the call environment type is normal, the electronic device may determine that the call mode is the normal mode.
• As shown in the aforementioned FIG. 6a and FIG. 6b, which are schematic diagrams of the normal mode.
  • the electronic device can use the downlink audio signal to obtain the unprocessed left channel audio signal and the unprocessed right channel audio signal. Then use the first parameter to process the unprocessed left channel audio signal and the unprocessed right channel audio signal respectively, to obtain the processed left channel audio signal and the processed right channel audio signal.
  • the first parameters include a first volume parameter and a first timbre parameter.
  • the first volume parameter includes a first right channel volume parameter and a first left channel volume parameter.
  • the first timbre parameter includes a first right-channel timbre parameter and a first left-channel timbre parameter.
  • the first left-channel timbre parameter is used to adjust the timbre of the unprocessed left-channel audio signal, so that the energy of the low-frequency sound signal of the processed left-channel audio signal is greater than the energy of the high-frequency sound signal.
  • the first left channel volume parameter is used to adjust the volume of the unprocessed left channel audio signal, so that the energy of the processed left channel audio signal is the first energy.
• The first right channel timbre parameter is used to adjust the timbre of the unprocessed right channel audio signal, so that the energy of the high-frequency sound signal of the processed right channel audio signal is greater than the energy of the low-frequency sound signal and the energy of the sound signal in the first frequency band is greater than the energy of the sound signals in other frequency bands by the first decibel (decibel, dB).
  • the first right channel volume parameter is used to adjust the volume of the unprocessed right channel audio signal, so that the energy of the processed right channel audio signal is the first energy.
• For how the electronic device processes the downlink audio signal in the normal mode, reference may be made to the description of step S108 below.
  • Fig. 10 is a schematic flowchart of processing a downlink audio signal by an electronic device in a normal mode.
  • the electronic device processes the downlink audio signal in a normal mode
  • the electronic device can use the downlink audio signal to obtain the unprocessed left channel audio signal and the unprocessed right channel audio signal. Then use the first parameter to process the unprocessed left channel audio signal and the unprocessed right channel audio signal respectively, to obtain the processed left channel audio signal and the processed right channel audio signal. For this process, reference may be made to the description of step S201-step S203 shown in FIG. 10 .
• The electronic device performs noise reduction on the downlink audio signal to obtain the unprocessed left channel audio signal and the unprocessed right channel audio signal.
• Specifically, the electronic device first performs noise reduction on the downlink audio signal to suppress the noise in the downlink audio signal, and then copies the noise-reduced downlink audio signal into two channels, one of which is used as the unprocessed left channel audio signal and the other as the unprocessed right channel audio signal.
• The formula by which the electronic device performs noise reduction on the downlink audio signal to obtain the noise-reduced downlink audio signal may refer to formula (4) below: x_1-d = x_1 - x_1-n    formula (4)
• where x_1-d represents the noise-reduced downlink audio signal, x_1 represents the downlink audio signal, and x_1-n represents the noise in the downlink audio signal.
• The electronic device can use one or a combination of the optimally modified log-spectral amplitude estimator (OMLSA) algorithm, the improved minima controlled recursive averaging (IMCRA) algorithm, and the spectral subtraction algorithm to calculate x_1-n in formula (4), that is, the noise in the downlink audio signal.
• Then the electronic device copies the noise-reduced downlink audio signal into two channels, one of which is used as the unprocessed left channel audio signal and the other as the unprocessed right channel audio signal.
• For the related formula, refer to formula (5): x_dl = x_dr = x_1-d    formula (5)
• where x_dl represents the unprocessed left channel audio signal and x_dr represents the unprocessed right channel audio signal.
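• Formulas (4) and (5) together amount to "subtract the noise estimate, then duplicate the result into two channels". A minimal sketch follows; the noise estimate x_1-n would in practice come from an estimator such as OMLSA or IMCRA, and the per-sample subtraction here is only the simplest reading of formula (4):

```python
import numpy as np

def denoise_and_split(x1: np.ndarray, x1_n: np.ndarray):
    """Formula (4): x_1-d = x_1 - x_1-n (noise-reduced downlink signal).
    Formula (5): x_dl = x_dr = x_1-d (copied into two channels)."""
    x1_d = x1 - x1_n          # noise reduction
    x_dl = x1_d.copy()        # unprocessed left channel audio signal
    x_dr = x1_d.copy()        # unprocessed right channel audio signal
    return x_dl, x_dr
```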
• The electronic device uses the first parameter to perform volume adjustment and timbre adjustment on the unprocessed left channel audio signal and the unprocessed right channel audio signal, obtaining the processed left channel audio signal and the processed right channel audio signal.
  • the timbre adjustment is used to adjust the proportion of energy of audio signals in different frequency bands in the audio signal to improve the voice timbre.
  • a common timbre adjustment algorithm is an equalizer (Equalizer, EQ) algorithm.
  • Other algorithms may also be used, which is not limited in this embodiment of the present application.
  • volume adjustment is used to adjust the energy of the audio signal.
  • Common volume adjustment algorithms may include one of a dynamic range control (dynamic range control, DRC) algorithm, an automatic gain control (automatic gain control, AGC) algorithm, or a combination of both.
  • DRC dynamic range control
  • AGC automatic gain control
  • Other algorithms may also be used, which is not limited in this embodiment of the present application.
• The electronic device can use the first left channel timbre parameter and the first left channel volume parameter to process the unprocessed left channel audio signal to obtain the processed left channel audio signal, so that the energy of the low-frequency sound signal in the processed left channel audio signal is greater than the energy of the high-frequency sound signal and the energy of the processed left channel audio signal is the first energy.
• The electronic device can use the first right channel timbre parameter and the first right channel volume parameter to process the unprocessed right channel audio signal to obtain the processed right channel audio signal, so that the energy of the high-frequency sound signal in the processed right channel audio signal is greater than the energy of the low-frequency sound signal, the energy of the sound signal in the first frequency band is greater than the energy of the sound signals in other frequency bands by the first decibel, and the energy of the processed right channel audio signal is the first energy.
• The process in which the electronic device uses the unprocessed left channel audio signal to generate the processed left channel audio signal is similar to the process of generating the processed right channel audio signal from the unprocessed right channel audio signal.
  • the following left channel audio signal is taken as an example to describe in detail:
  • the electronic device can adjust the timbre of the pre-processed left channel audio signal through an EQ algorithm.
• The first left channel timbre parameter is the parameter used in the EQ algorithm for filtering the left channel audio signal, i.e., a filter coefficient; in this case, the filter coefficient may also be referred to as the first left channel filter coefficient.
  • the first left channel timbre parameter is used to adjust the timbre of the unprocessed left channel audio signal, suppress or enhance the sound signals of different frequency bands of the unprocessed left channel audio signal, so that the processed left channel audio signal The energy of the low frequency sound signal of the channel audio signal is greater than the energy of the high frequency sound signal.
  • the electronic device can use the algorithm of DRC combined with AGC to adjust the volume of the left channel audio signal before processing.
• The first left channel volume parameter is the gain coefficient used in the DRC-combined-with-AGC algorithm for volume adjustment of the unprocessed left channel audio signal; in this case, the gain coefficient may also be referred to as the first left channel gain coefficient.
  • the first left channel volume parameter is used to make the energy of the processed left channel audio signal be the first energy.
• For the formula by which the electronic device obtains the processed left channel audio signal from the unprocessed left channel audio signal, refer to formula (6) below: x_1l = gain_1l * filter_1l(x_dl)    formula (6)
• where x_1l represents the processed left channel audio signal; filter_1l represents the first left channel timbre parameter, which may be, for example, the first left channel filter coefficient; gain_1l represents the first left channel volume parameter, which may be, for example, the first left channel gain coefficient; and gain_1l * filter_1l(x_dl) means using the first left channel filter coefficient to adjust the timbre of the unprocessed left channel audio signal and using the first left channel gain coefficient to adjust its volume.
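• Formula (6) is a timbre filter followed by a volume gain. A minimal sketch, using a short FIR filter to stand in for the first left channel filter coefficient (the tap values and gain below are placeholders, not values from this application):

```python
import numpy as np

def process_left_channel(x_dl: np.ndarray, filter_1l: np.ndarray, gain_1l: float) -> np.ndarray:
    """Formula (6): x_1l = gain_1l * filter_1l(x_dl)."""
    filtered = np.convolve(x_dl, filter_1l, mode="same")  # timbre adjustment (EQ-style filtering)
    return gain_1l * filtered                             # volume adjustment

# Placeholder taps that favour low frequencies, as the normal-mode left
# channel calls for; a real EQ would use tuned filter coefficients.
x_1l = process_left_channel(np.random.randn(480), np.array([0.25, 0.5, 0.25]), gain_1l=0.8)
```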
• In some embodiments, in order to avoid fluctuation in the energy of the processed left channel audio signal when the electronic device switches among the three call modes, the electronic device can introduce a smooth transition time when generating the processed left channel audio signal.
  • the following formula (7) may be referred to for the formula for obtaining the processed left channel audio signal by the electronic device using the left channel audio signal.
• where x_1l represents the processed left channel audio signal; T_s represents the smooth transition time, whose value is an integer greater than 1, in units of frames; and i indicates that x_1l is the i-th frame of the processed left channel audio signal calculated by the electronic device in the normal mode, with i taking integer values in (0, T_s). It should be understood that each time the electronic device switches to the normal mode again, the value of i starts from 1, and after each frame of the processed left channel audio signal is calculated, the value of i is incremented by 1.
• x_1l-1 = gain_1l-1 * filter_1l-1(x_dl), and x_1l-2 = gain_1l-2 * filter_1l-2(x_dl).
• filter_1l-1 indicates the timbre parameter used to calculate the processed left channel audio signal in the normal mode, that is, the first left channel timbre parameter, and gain_1l-1 indicates the volume parameter used in the normal mode, that is, the first left channel volume parameter; x_1l-1 is the processed left channel audio signal calculated from the unprocessed left channel audio signal using the first left channel timbre parameter and the first left channel volume parameter. Similarly, filter_1l-2 indicates the timbre parameter and gain_1l-2 the volume parameter of the call mode used before switching to the normal mode, and x_1l-2 is the processed left channel audio signal calculated from the unprocessed left channel audio signal using those parameters.
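• The extraction does not preserve the body of formula (7); the linear crossfade below is an assumption consistent with the surrounding description (the output moves from the previous mode's signal x_1l-2 to the normal-mode signal x_1l-1 over T_s frames):

```python
def smooth_transition(x_new, x_old, i: int, t_s: int):
    """Hypothetical formula (7): x_1l = (i / T_s) * x_1l-1 + (1 - i / T_s) * x_1l-2,
    with i the frame index since the switch into the current mode, 0 < i < T_s.
    Once i reaches T_s, the output is simply x_new, i.e. formula (6) alone."""
    w = i / t_s                     # blend weight grows frame by frame
    return w * x_new + (1.0 - w) * x_old
```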
  • the electronic device uses the first sound generator to play the processed left channel audio signal and the second sound generator to play the processed right channel audio signal.
• The electronic device can use a codec to decode the processed left channel audio signal into an analog electrical signal, obtaining the decoded processed left channel audio signal; the first power amplifier then performs power amplification and drives the first sound generator to play the decoded processed left channel audio signal.
• Similarly, the electronic device can use a codec to decode the processed right channel audio signal into an analog electrical signal, obtaining the decoded processed right channel audio signal; the second power amplifier then performs power amplification and drives the second sound generator to play the decoded processed right channel audio signal.
  • the electronic device determines that the call mode is a quiet mode
• When the user is in close contact with the screen and the call environment type is quiet, the electronic device may determine that the call mode is the quiet mode.
• As shown in the aforementioned FIG. 6c, which is a schematic diagram of the quiet mode; for specific content, reference may be made to the foregoing description of FIG. 6c, which is not repeated here.
  • the electronic device can use the downlink audio signal to obtain the unprocessed left channel audio signal and the unprocessed right channel audio signal. Then use the second parameter to process the unprocessed left channel audio signal and the unprocessed right channel audio signal respectively, to obtain the processed left channel audio signal and the processed right channel audio signal.
  • the second parameter includes a second volume parameter and a second tone color parameter.
  • the second volume parameter includes a second right channel volume parameter and a second left channel volume parameter.
  • the second timbre parameter includes a second right channel timbre parameter and a second left channel timbre parameter.
• The second left channel timbre parameter is used to adjust the timbre of the unprocessed left channel audio signal, so that the energy of the low-frequency sound signal of the processed left channel audio signal is greater than the energy of the high-frequency sound signal and the energy of its sound signal in the second frequency band is smaller than the energy of the sound signals in other frequency bands by the second decibel.
  • the second left channel volume parameter is used to adjust the volume of the unprocessed left channel audio signal, so that the energy of the processed left channel audio signal is the second energy.
• The second right channel timbre parameter is used to adjust the timbre of the unprocessed right channel audio signal, so that the energy of the high-frequency sound signal of the processed right channel audio signal is greater than the energy of the low-frequency sound signal and the energy of its sound signal in the first frequency band is greater than the energy of the sound signals in other frequency bands by the third decibel (dB).
  • the second right channel volume parameter is used to adjust the volume of the unprocessed right channel audio signal, so that the energy of the processed right channel audio signal is the second energy.
  • the electronic device processes the downlink audio signal in a quiet mode
• In the quiet mode, the electronic device can use the downlink audio signal to obtain the unprocessed left channel audio signal and the unprocessed right channel audio signal, and then use the second parameter to process them respectively to obtain the processed left channel audio signal and the processed right channel audio signal. For this process, reference may be made to the foregoing description of step S109.
• In step S110, the process in which the electronic device uses the unprocessed left channel audio signal to generate the processed left channel audio signal can refer to the description in the aforementioned step S108 of using the unprocessed left channel audio signal to generate the processed left channel audio signal, with the first left channel timbre parameter involved in formula (6) and formula (7) changed to the second left channel timbre parameter and the first left channel volume parameter changed to the second left channel volume parameter; other descriptions are similar and are not repeated here.
• In step S110, the process in which the electronic device uses the unprocessed right channel audio signal to generate the processed right channel audio signal can also refer to the description in the aforementioned step S108 of using the unprocessed left channel audio signal to generate the processed left channel audio signal, with the first left channel timbre parameter involved in formula (6) and formula (7) changed to the second right channel timbre parameter and the first left channel volume parameter changed to the second right channel volume parameter; other descriptions are similar and are not repeated here.
  • the electronic device determines that the call mode is a noisy mode
• When the user is in close contact with the screen and the call environment type is noisy, the electronic device may determine that the call mode is the noisy mode.
• As shown in the aforementioned FIG. 6d, which is a schematic diagram of the noisy mode; for specific content, reference may be made to the foregoing description of FIG. 6d, which is not repeated here.
  • the electronic device can use the downlink audio signal to obtain the unprocessed left channel audio signal and the unprocessed right channel audio signal. Then use the third parameter to process the unprocessed left channel audio signal and the unprocessed right channel audio signal respectively to obtain the processed left channel audio signal and the processed right channel audio signal.
  • the third parameter includes a third volume parameter and a third tone color parameter.
  • the third volume parameter includes a third right channel volume parameter and a third left channel volume parameter.
  • the third timbre parameter includes a third right-channel timbre parameter and a third left-channel timbre parameter.
  • the third left-channel timbre parameter is used to adjust the timbre of the unprocessed left-channel audio signal, so that the energy of the low-frequency sound signal of the processed left-channel audio signal is greater than the energy of the high-frequency sound signal.
  • the third left channel volume parameter is used to adjust the volume of the unprocessed left channel audio signal, so that the energy of the processed left channel audio signal is the third energy.
  • the third right channel timbre parameter is used to adjust the timbre of the unprocessed right channel audio signal, so that the energy of the high frequency sound signal of the processed right channel audio signal is greater than the energy of the low frequency sound signal, and the energy of the sound signal in the first frequency band is the fourth decibel (dB) greater than the energy of sound signals in other frequency bands.
  • the third right channel volume parameter is used to adjust the volume of the unprocessed right channel audio signal, so that the energy of the processed right channel audio signal is the third energy.
  • the electronic device processes the downlink audio signal in a noisy mode
  • the electronic device can use the downlink audio signal to obtain the unprocessed left channel audio signal and the unprocessed right channel audio signal, and then use the third parameter to process each of them to obtain the processed left channel audio signal and the processed right channel audio signal. For this process, reference may be made to the foregoing description of step S109.
  • in step S13, the process by which the electronic device generates the processed left channel audio signal from the unprocessed left channel audio signal may refer to the description of generating the processed left channel audio signal from the unprocessed left channel audio signal in the aforementioned step S108, with the first left channel timbre parameter involved in formula (6) and formula (7) changed to the third left channel timbre parameter, and the first left channel volume parameter involved in formula (6) and formula (7) changed to the third left channel volume parameter. Other descriptions are similar and will not be repeated here.
  • in step S13, the process by which the electronic device generates the processed right channel audio signal from the unprocessed right channel audio signal may also refer to the aforementioned process of generating the processed left channel audio signal, with the first left channel timbre parameter involved in formula (6) and formula (7) changed to the third right channel timbre parameter, and the first left channel volume parameter involved in formula (6) and formula (7) changed to the third right channel volume parameter. Other descriptions are similar and will not be repeated here.
  • the electronic device determines whether the call ends
  • when it is determined that the call is not over, the electronic device continues to acquire the next frame of the downlink audio signal and repeatedly executes step S104-step S113: it re-determines whether the call mode is an adjustable mode, obtains the processed left channel audio signal and the processed right channel audio signal, and then plays them.
  • alternatively, the electronic device continues to acquire the next frame of the downlink audio signal and repeats step S105-step S113 without performing step S104, so it does not need to re-determine whether the call mode is an adjustable mode.
  • if it is determined that the call has ended, the electronic device executes step S114.
  • the audio signal collected by the microphone includes echo signals of other electronic devices.
  • the echo signal is caused by the audio signal collected by the microphone including the audio signal played by the first sound generator and the second sound generator.
  • the electronic device can remove the echo signal in the audio signal collected by the microphone
  • FIG. 11 is a schematic flow chart for the electronic device to remove the echo signal in the audio signal collected by the microphone.
  • for the detailed description of this process, reference may be made to the following description of step S301-step S304.
  • the electronic device acquires an uplink audio signal
  • the uplink audio signal is a frame of audio signal collected by the microphone of the electronic device.
  • the specific duration of a frame of audio signal can be determined according to the processing capability of the electronic device, generally it can be 10ms-50ms, such as 10ms or a multiple of 10ms such as 20ms or 30ms.
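The framing mentioned in the bullet above can be illustrated with a short sketch; the 20 ms frame length and the 16 kHz sample rate are assumptions within the stated 10 ms-50 ms range, not values fixed by this application.

```python
import numpy as np

def split_frames(samples: np.ndarray, sample_rate: int = 16000, frame_ms: int = 20):
    """Split captured audio into fixed-length frames for per-frame processing."""
    frame_len = sample_rate * frame_ms // 1000   # 320 samples for 20 ms @ 16 kHz
    n_frames = len(samples) // frame_len
    return samples[: n_frames * frame_len].reshape(n_frames, frame_len)
```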
  • the uplink audio signal includes the sound signal around the electronic device and the user's sound signal, and also includes the echo signal caused by the audio signal played by the first sound generator and the second sound generator.
  • the electronic device may perform the following steps S302 to S304 to remove the echo signal.
  • the electronic device acquires a first reference signal and a second reference signal
  • the first reference signal is an audio signal output after the processed left channel audio signal passes through the first power amplifier.
  • the second reference signal is an audio signal output after the processed right channel audio signal passes through the second power amplifier.
  • the electronic device may acquire a frame of audio signals output by the first power amplifier as the first reference signal, and acquire a frame of audio signals output by the second power amplifier as the second reference signal.
  • the electronic device estimates the echo signal by using the first reference signal and the second reference signal
  • the echo signal is an estimate of the audio played by the first sound generator and the second sound generator as collected by the microphone.
  • the electronic device may combine the first reference signal and the second reference signal to estimate the echo signal.
  • the electronic device determines a related formula of the echo signal, and reference may be made to the following formula (8).
  • f l represents the transfer function from the first reference signal to the echo signal.
  • f r represents the transfer function from the second reference signal to the echo signal.
  • x' l (t, f) represents the first reference signal in the frequency domain, and x' r (t, f) represents the second reference signal in the frequency domain, where t represents a frame and f represents a frequency point.
  • the electronic device may refer to the following formula (9) for determining a related formula of the echo signal.
  • the transfer functions involved in the above formula (8) and formula (9) can be determined by using an acoustic echo cancellation (AEC) algorithm, or by other algorithms, which does not constitute a limitation on the embodiments of this application.
  • the electronic device removes the echo signal from the uplink audio signal to obtain a processed uplink audio signal.
  • the processed uplink audio signal is the part of the uplink audio signal after the echo signal is removed.
  • the electronic device uses the uplink audio signal and the echo signal to obtain a related formula of the processed uplink audio signal, which may refer to the following formula (10).
  • x 2-d represents the processed uplink audio signal, x 2 represents the uplink audio signal, and the remaining symbol in formula (10) indicates the estimated echo signal. A sketch of the whole estimate-and-subtract step follows below.
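Formulas (8)-(10) are rendered as images in the original. The sketch below shows one plausible realization of steps S303-S304 under the stated model: the echo is estimated per frequency bin as a linear combination of the two reference signals and then subtracted from the uplink signal. The NLMS update used here to track the transfer functions f_l and f_r is an assumption; the text only states that an AEC algorithm or other algorithms may determine them.

```python
import numpy as np

def aec_frame(X_l, X_r, X_up, F_l, F_r, mu=0.1, eps=1e-8):
    """All arguments are complex spectra of one frame (shape: bins,).
    X_l, X_r: first and second reference signals; X_up: uplink audio signal.
    F_l, F_r: current per-bin transfer function estimates."""
    echo = F_l * X_l + F_r * X_r     # formula-(8)-style estimated echo
    err = X_up - echo                # formula-(10)-style processed uplink signal
    # NLMS adaptation of the two transfer functions, bin by bin.
    norm = np.abs(X_l) ** 2 + np.abs(X_r) ** 2 + eps
    F_l = F_l + mu * err * np.conj(X_l) / norm
    F_r = F_r + mu * err * np.conj(X_r) / norm
    return err, F_l, F_r
```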
  • the quiet mode, normal mode and noisy mode involved in this application may also be referred to as the first call mode or the second call mode; when the first call mode is one of the three modes (quiet mode, normal mode and noisy mode), the second call mode can be another of the three modes. For example, when the first call mode is the quiet mode, the second call mode can be one of the normal mode or the noisy mode.
  • the call environment can be a quiet environment, an ordinary environment or a noisy environment.
  • the characteristic of the left-channel audio signal in the first mode may be referred to as a first left-channel audio characteristic
  • the characteristic of the right-channel audio signal in the first mode may be referred to as a first right-channel audio characteristic.
  • the state between the user and the screen and/or the type of call environment may be referred to as the first call environment.
  • the characteristic of the left-channel audio signal in the second mode may be referred to as a second left-channel audio characteristic, and the characteristic of the right-channel audio signal in the second mode may be referred to as a second right-channel audio characteristic.
  • the state and/or the type of call environment between the user and the screen may be referred to as a second call environment.
  • the first left channel audio feature may be different from the second left channel audio feature, and/or the first right channel audio feature may be different from the second right channel audio feature; the difference can be in volume and/or timbre.
  • for example, the volume of the first left channel audio signal is the first energy and the volume of the second left channel audio signal is the second energy; the first energy being greater than the second energy makes the first left channel audio feature different from the second left channel audio feature.
  • audio signals involved in the embodiments of the present application may also be referred to as audio, and the audio signals played by the sound generators (the first sound generator and the second sound generator), namely the left channel audio signal and the right channel audio signal, may also be referred to as output audio signals. Sound signals may also be referred to as sounds.
  • the first uplink audio signal mentioned in this embodiment of the present application may be the audio signal of the tth frame.
  • FIG. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.
  • an electronic device is taken as an example to describe the embodiment in detail. It should be understood that an electronic device may have more or fewer components than shown in the figures, two or more components may be combined, or may have a different configuration of components.
  • the various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software including one or more signal processing and/or application specific integrated circuits.
  • the electronic device may include: a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, bone conduction sensor 180M, etc.
  • the structure shown in the embodiment of the present invention does not constitute a specific limitation on the electronic device.
  • the electronic device may include more or fewer components than shown in the illustrations, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
  • the processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. Different processing units may be independent devices, or may be integrated in one or more processors.
  • the controller may be the nerve center and command center of the electronic equipment.
  • the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
  • a memory may also be provided in the processor 110 for storing instructions and data.
  • the memory in processor 110 is a cache memory.
  • the memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use the instructions or data again, they can be called directly from the memory. This avoids repeated access, reduces the waiting time of the processor 110, and thus improves the efficiency of the system.
  • the modem is used to decode the audio signal sent to the local device by other electronic devices to obtain the downlink audio signal. This downlink audio signal is then passed to the dual-device call algorithm.
  • the display screen 194 is used to display images, videos and the like.
  • the display screen 194 includes a display panel.
  • the display panel may be a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (organic light-emitting diode, OLED) or the like.
  • the electronic device may include 1 or N display screens 194, where N is a positive integer greater than 1.
  • the display screen 194 may also be referred to as a screen.
  • the electronic device can realize the shooting function through ISP, camera 193 , video codec, GPU, display screen 194 and application processor.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when an electronic device selects a frequency point, a digital signal processor is used to perform Fourier transform on the frequency point energy, etc.
  • the NPU is a neural-network (NN) computing processor.
  • Applications such as intelligent cognition of electronic devices can be realized through NPU, such as: image recognition, face recognition, speech recognition, text understanding, etc.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function. Such as saving music, video and other files in the external memory card.
  • the internal memory 121 may be used to store computer-executable program codes including instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device by executing instructions stored in the internal memory 121 .
  • the internal memory 121 may include an area for storing programs and an area for storing data.
  • the electronic device can implement audio functions through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor. Such as music playback, recording, etc.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signal.
  • the audio module 170 may also be used to encode and decode audio signals.
  • the audio module 170 may be set in the processor 110 , or some functional modules of the audio module 170 may be set in the processor 110 .
  • Speaker 170A, also referred to as a "horn", is used to convert audio electrical signals into sound signals.
  • the electronic device can listen to music through speaker 170A, or listen to hands-free calls.
  • Receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • when the electronic device receives a call or a voice message, the user can listen to the voice by placing the receiver 170B close to the ear.
  • the receiver 170B may also be called a sound generator, and the electronic device may include a first sound generator (not shown) and a second sound generator (not shown); the first sound generator is used to play the analog left channel audio signal.
  • the second sound generator is used to play the analog right channel audio signal.
  • the microphone 170C, also called a "mike" or a "mic", is used to convert sound signals into electrical signals.
  • the user can put his mouth close to the microphone 170C to make a sound, and input the sound signal to the microphone 170C.
  • the electronic device may be provided with at least one microphone 170C.
  • the electronic device can be provided with two microphones 170C, which can also implement a noise reduction function in addition to collecting sound signals.
  • the electronic device can also be equipped with three, four or more microphones 170C to realize sound signal collection, noise reduction, identify sound sources, and realize directional recording functions, etc.
  • the microphone may transmit the collected audio signal to a codec for encoding to obtain an uplink audio signal, and then transmit the uplink audio signal to a two-device call algorithm.
  • the dual-device call algorithm can combine the uplink audio signal to calculate the call environment type.
  • the earphone interface 170D is used for connecting wired earphones.
  • the earphone interface 170D can be a USB interface 130, or a 3.5mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the pressure sensor 180A is used to sense the pressure signal and convert the pressure signal into an electrical signal.
  • pressure sensor 180A may be disposed on display screen 194 .
  • in this embodiment of the present application, the pressure sensor can be used to determine the state between the user and the screen. For example, when the pressure sensor detects that the pressure between the user and the screen is greater than a preset pressure value and lasts longer than a preset time, the electronic device may determine that the state between the user and the screen is the state of being in close contact with the screen. When the pressure sensor detects that the pressure between the user and the screen is less than the preset pressure value, or that the duration is less than the preset time, the electronic device may determine that the state between the user and the screen is the state of not being in close contact with the screen. A sketch of this decision follows below.
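The following is an illustrative sketch of the pressure-based close-contact decision just described; the preset pressure value and preset time are hypothetical placeholders, not values from this application.

```python
PRESET_PRESSURE = 0.5   # hypothetical normalized pressure threshold
PRESET_TIME_S = 3.0     # hypothetical duration threshold, in seconds

class ContactDetector:
    """Tracks how long the screen pressure has stayed above the preset value."""
    def __init__(self):
        self._above_since = None

    def update(self, pressure: float, now: float) -> bool:
        """Returns True when the user is judged to be in close contact."""
        if pressure > PRESET_PRESSURE:
            if self._above_since is None:
                self._above_since = now
            return (now - self._above_since) >= PRESET_TIME_S
        self._above_since = None   # pressure dropped; restart the timer
        return False
```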
  • Proximity light sensor 180G may include, for example, light emitting diodes (LEDs) and light detectors, such as photodiodes.
  • the light emitting diodes may be infrared light emitting diodes.
  • the ambient light sensor 180L is used for sensing ambient light brightness.
  • the electronic device can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device is in the pocket to prevent accidental touch.
  • Touch sensor 180K is also known as a "touch panel".
  • the touch sensor 180K can be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, also called a “touch screen”.
  • Sensors on the display screen 194 can detect whether the user is in contact with the display screen 194 .
  • the keys 190 include a power key, a volume key and the like.
  • the keys 190 may be mechanical keys, or may be touch keys.
  • the electronic device can receive key input and generate key signal input related to user settings and function control of the electronic device.
  • the electronic device further includes a codec (not shown), a first power amplifier (not shown) and a second power amplifier (not shown).
  • the codec is used to encode an analog signal into a digital signal, and it can also be used to decode a digital signal into an analog signal.
  • for example, the digital processed left channel audio signal may be decoded to obtain the analog processed left channel audio signal.
  • the first power amplifier is used to amplify the power of the analog audio signal, and drive the receiver 170B to play the analog audio signal.
  • for example, the first power amplifier amplifies the power of the decoded processed left channel audio signal, and drives the first sound generator to play the analog processed left channel audio signal.
  • the second power amplifier is used to amplify the power of the analog audio signal, and drive the receiver 170B to play the analog audio signal. For example, it amplifies the power of the decoded processed right channel audio signal, and drives the second sound generator to play the analog processed right channel audio signal.
  • the processor 110 may invoke computer instructions stored in the internal memory 121, so that the electronic device executes the calling method in the embodiment of the present application.
  • FIG. 13 is a schematic diagram of a system structure of an electronic device according to an embodiment of the present application.
  • the system structure of the electronic device is introduced as an example below.
  • the layered architecture divides the system into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces. In some embodiments, the system is divided into four layers, which are application program layer, application program framework layer, hardware abstraction layer and hardware layer from top to bottom.
  • the application layer can consist of a series of application packages.
  • the application program package may include application programs (also called applications) such as phone and settings.
  • the settings application can provide a user interface for setting whether the call mode is an adjustable mode, and a user interface for setting the sound quality adjustment sensitivity, which controls how long the user must be in contact with the screen before the electronic device determines that the user is in close contact with the screen.
  • the aforementioned Figures 7a-7d may be related user interfaces.
  • the settings application can transmit the information about whether the call mode set by the user is an adjustable mode to the audio hardware abstraction in the hardware abstraction layer described below, and transmit the information about the sound quality adjustment sensitivity set by the user to the screen hardware abstraction in the hardware abstraction layer described below.
  • the phone application is an application used for calls; through it, the user can make a call with the electronic device.
  • the phone application can determine that the call has been connected through the phone manager of the application framework layer; the phone manager can then call the audio hardware abstraction of the hardware abstraction layer to start the hardware involved in the call process, such as the microphone, the first sound generator and the second sound generator, so that after the electronic device opens the call application the user can start the call.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer may include a telephony manager and the like.
  • the phone manager is used to provide communication functions of electronic devices. For example, the management of call status (including connected, hung up, etc.).
  • the phone manager can also determine whether the electronic device is in the handheld call mode, and pass the information about whether it is in the handheld call mode to the audio hardware abstraction.
  • the hardware abstraction layer is an interface layer between the application framework layer and the hardware layer, and provides a virtual hardware platform for the operating system.
  • the hardware abstraction layer may include audio hardware abstraction and screen hardware abstraction.
  • the audio hardware abstraction can be used to receive the information of the handheld call mode issued by the phone manager and the information of whether the call mode issued by the setting application is an adjustable mode, and store these two pieces of information in the built-in database.
  • when the audio hardware abstraction determines that the call state is the handheld call mode and the call mode is the adjustable mode, the screen hardware abstraction can be called to obtain the state between the user and the screen, and the dual-device call algorithm can be called to process the downlink audio signal.
  • when the audio hardware abstraction determines that the call state is the handheld call mode but the call mode is not the adjustable mode, it calls the dual-device call algorithm to process the downlink audio signal. When it determines that the call state is not the handheld call mode, other call algorithms are called to process the downlink audio signal.
  • the following description takes the case where the audio hardware abstraction determines that the call state is the handheld call mode and the call mode is the adjustable mode as an example; other situations can refer to this description.
  • the screen hardware abstraction can be used to receive the sound quality adjustment sensitivity information issued by the setting application, and store the information in the built-in database.
  • the screen hardware abstraction can obtain the information about the sound quality adjustment sensitivity from the built-in database and, in combination with this information, detect through the sensor on the screen whether the user is in close contact with the screen. The information about whether the user is in close contact with the screen is then sent to the dual-device call algorithm in the audio digital signal processor described below.
  • the hardware involved in the hardware layer may include: an audio digital signal processor, a codec, a modem, a screen, a first power amplifier, a second power amplifier, a first sound generator, a second sound generator, a microphone, and the like.
  • the call algorithm can be set in the audio digital signal processor.
  • the call algorithms may include the dual-device call algorithm and other call algorithms.
  • the dual-device call algorithm is the call algorithm involved in the embodiments of the present application.
  • the dual-device call algorithm can receive the downlink audio signal transmitted by the modem described below, and process the downlink audio signal to obtain a processed left channel audio signal and a processed right channel audio signal, then send the processed left channel audio signal and the processed right channel audio signal to the codec.
  • the dual-device call algorithm can also receive the uplink audio signal transmitted by the codec described below, and simultaneously obtain the first reference signal and the second reference signal transmitted by the codec. Echo cancellation is then performed on the uplink audio signal in combination with the first reference signal and the second reference signal to obtain a processed uplink audio signal.
  • after the modem decodes the audio signal sent to the local device by other electronic devices, the downlink audio signal can be obtained. The modem can then pass the downlink audio signal to the dual-device call algorithm.
  • the microphone can transmit the collected audio signal to the codec for encoding.
  • after the codec receives the processed left channel audio signal and the processed right channel audio signal, it can decode them to obtain the decoded processed left channel audio signal and the decoded processed right channel audio signal.
  • the decoded processed left channel audio signal is then transmitted to a first power amplifier and the decoded processed right channel audio signal is transmitted to a second power amplifier.
  • the codec can receive the audio signal collected by the microphone, encode it to obtain an uplink audio signal, and then transmit the uplink audio signal to the dual-device call algorithm.
  • the codec may also receive the decoded processed left channel audio signal transmitted by the first power amplifier and encode it to obtain the first reference signal, and receive the decoded processed right channel audio signal transmitted by the second power amplifier and encode it to obtain the second reference signal. It then transmits the first reference signal and the second reference signal to the dual-device call algorithm.
  • after receiving the decoded processed left channel audio signal, the first power amplifier may amplify its power, and drive the first sound generator to play the decoded processed left channel audio signal.
  • the second power amplifier After receiving the decoded and processed right channel audio signal, the second power amplifier can amplify its power, and drive the second sound generator to play the decoded and processed right channel audio signal.
  • the term "when" used in the above embodiments may be interpreted to mean "if", "after", "in response to determining" or "in response to detecting".
  • similarly, the phrases "when determining" or "if (a stated condition or event) is detected" may be interpreted to mean "if determining", "in response to determining", "on detecting (a stated condition or event)" or "in response to detecting (a stated condition or event)".
  • all or part of the above embodiments may be implemented by software, hardware, firmware or any combination thereof.
  • when implemented using software, they may be implemented in whole or in part in the form of a computer program product.
  • the computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on the computer, the processes or functions according to the embodiments of the present application will be generated in whole or in part.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable devices.
  • the computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server or data center to another website, computer, server or data center by wired (e.g., coaxial cable, optical fiber, DSL) or wireless (e.g., infrared, wireless, microwave) means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available media may be magnetic media (e.g., floppy disk, hard disk, magnetic tape), optical media (e.g., DVD), or semiconductor media (e.g., solid state drive), etc.
  • all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing related hardware.
  • the program can be stored in a computer-readable storage medium.
  • when the program is executed, the processes of the foregoing method embodiments may be included.
  • the aforementioned storage medium includes: ROM or random access memory RAM, magnetic disk or optical disk, and other various media that can store program codes.

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Business, Economics & Management (AREA)
  • General Business, Economics & Management (AREA)
  • Environmental & Geological Engineering (AREA)
  • Telephone Function (AREA)

Abstract

A call method and an electronic device. In the method, two sound generators are provided in the electronic device: a first sound generator placed on the side of the electronic device, and a second sound generator placed on the inner side of the screen of the electronic device. The electronic device can process the audio signal sent to it by another electronic device to generate a left channel audio signal and a right channel audio signal. The first sound generator is used to play the left channel audio signal, which is transmitted to the human ear through the air. The second sound generator is used to play the right channel audio signal, which is transmitted to the human ear through the bones; this second sound generator may be called a bone conduction sound generator. In the left channel audio signal, the energy of the low frequency sound signal is greater than the energy of the high frequency sound signal; in the right channel audio signal, the energy of the high frequency sound signal is greater than the energy of the low frequency sound signal.

Description

A call method and electronic device
This application claims priority to the Chinese patent application filed with the China Patent Office on October 13, 2021, with application number 202111194770.0 and entitled "A call method and electronic device", and to the Chinese patent application filed with the China Patent Office on July 13, 2021, with application number 202110791580.0 and entitled "An audio processing method and electronic device", the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of terminal and communication technologies, and in particular to a call method and an electronic device.
背景技术
屏占比为电子设备的屏幕占电子设备正面的比例。随着电子设备的不断发展,用户对电子设备的屏占比要求越来越高。发展到现阶段,多数电子设备都是全面屏,即该电子设备的正面全部都是屏幕,四个边框位置都是采用的无边框设计,是接近100%的屏占比。
虽然,全面屏电子设备对于视觉体验有显著的提升。但是,全面屏的出现导致电子设备的听筒不能设置在手机的正面,只能设置在电子设备的侧面。
这样,电子设备紧贴人耳在利用听筒播放音频信号时,部分音频信号从听筒输出到人耳之外,没有进入到人耳而是进入人耳周围环境中,导致漏音。另一部分音频信号可以虽然从听筒输出到人耳,但是由于漏音导致该部分音频信号的能量相比于完整的音频信号有所损失,这时,如果人耳周围环境存在噪声信号,该噪声信号进入人耳后,会干扰用户对该部分音频信号的识别,导致听音不清晰。
Summary
This application provides a call method and an electronic device. During a call, the electronic device can process the audio signal sent to it by another electronic device with different parameters in different call modes, generating different left channel audio signals and right channel audio signals so as to match the call environment.
In a first aspect, this application provides a call method applied to an electronic device including a first sound generator and a second sound generator, the second sound generator being different from the first, the first corresponding to the left channel and the second to the right channel. The method includes: displaying a call application interface; the electronic device determining a first call mode, the first call mode corresponding to a first left channel audio feature and a first right channel audio feature, where the first left channel audio feature is the audio feature of the audio signal output by the left channel and the first right channel audio feature is the audio feature of the audio signal output by the right channel, and the first call mode corresponding to a first call environment; determining that the electronic device is in a second call environment; the electronic device switching to a second call mode, the second call mode corresponding to a second left channel audio feature and a second right channel audio feature, where the second left channel audio feature is the audio feature of the audio signal output by the left channel and the second right channel audio feature is the audio feature of the audio signal output by the right channel, and the second call mode corresponding to the second call environment, the first call environment and the second call environment being different; the first left channel audio feature differs from the second left channel audio feature, and/or the first right channel audio feature differs from the second right channel audio feature.
In the above embodiment, during a call the audio played by the electronic device can be adjusted as the call environment changes, producing audio adapted to that environment. For example, in the noisy mode, where external noise is loud, the volume of the played audio can be increased and the energy of sound in the band to which the human ear is sensitive and which has good directivity (e.g., 1 kHz-3 kHz) can be raised, so that the user can hear clearly in a noisy call environment. In the quiet mode, where the surroundings are relatively quiet, the volume of the played audio can be reduced while the sound in that same sensitive and directive band (e.g., 1 kHz-3 kHz) is emphasized, reducing sound leakage while keeping the sound clear to the user.
With reference to the first aspect, in one embodiment, after the electronic device determines the first call mode, the method further includes: the electronic device receives downlink audio, the downlink audio being audio sent to the electronic device by another electronic device during the call; in the first call mode, the electronic device processes the downlink audio to obtain first left channel audio and first right channel audio, where in the first left channel audio the energy of low frequency sound is greater than that of high frequency sound, and in the first right channel audio the energy of high frequency sound is greater than that of low frequency sound; the electronic device plays the first left channel audio through the first sound generator and the first right channel audio through the second sound generator.
In the above embodiment, the two sound generators are usually made of different materials: one is suited to playing high frequency audio, while the other plays low frequency audio better. Generating two audio streams, one with more low frequency energy and one with more high frequency energy, thus matches the two different sound generators and improves sound quality.
With reference to the first aspect, in one embodiment, the first sound generator is placed on the side of the electronic device, and the second sound generator is placed on the inner side of the screen of the electronic device; the target left channel audio played by the first sound generator is transmitted to the human ear through the air, and the target right channel audio played by the second sound generator is transmitted to the human ear through the bones.
In the above embodiment, the second sound generator, placed on the inner side of the screen, transmits sound by bone conduction, so the user can hear clearly in any call mode. With the second sound generator present, the energy of the audio played by the first sound generator can be appropriately reduced, which still keeps the sound clear to the user while reducing sound leakage.
With reference to the first aspect, in one embodiment, processing the downlink audio to obtain the first left channel audio and the first right channel audio specifically includes: the electronic device obtains, from the downlink audio, unprocessed first left channel audio and unprocessed first right channel audio; timbre adjustment and volume adjustment are performed on each of them to obtain the first left channel audio and the first right channel audio, where timbre adjustment means adjusting the energy distribution of sound in different frequency bands of the audio, and volume adjustment means adjusting the energy of the audio.
In the above embodiment, the electronic device can adjust the timbre and volume of the audio so that the processed audio matches the call environment, allowing the audio to be adjusted as the call environment changes.
With reference to the first aspect, in one embodiment, after the electronic device obtains the unprocessed first left channel audio and the unprocessed first right channel audio from the downlink audio, and before performing the timbre and volume adjustment on them, the method further includes: the electronic device determines the parameters for processing the unprocessed first left channel audio and the unprocessed first right channel audio, the parameters including a left channel timbre parameter, a right channel timbre parameter, a left channel volume parameter and a right channel volume parameter. Performing the timbre and volume adjustment to obtain the first left channel audio and the first right channel audio specifically includes: using the left channel timbre parameter and the left channel volume parameter to adjust the timbre and volume of the unprocessed left channel audio to obtain the first left channel audio; and using the right channel timbre parameter and the right channel volume parameter to adjust the timbre and volume of the unprocessed right channel audio to obtain the first right channel audio.
In the above embodiment, the parameters used for timbre and volume adjustment differ across call environments, so that the processed audio matches the environment of the call, and the audio can be adjusted as the call environment changes.
With reference to the first aspect, in one embodiment, determining the parameters for processing the unprocessed left channel audio and the unprocessed right channel audio specifically includes: the electronic device determines the call environment type, the call environment type including quiet, normal and noisy, where, when the call environment type is quiet, the long-term energy of the noise in the corresponding first uplink audio is smaller than when the type is normal/noisy, and when the type is noisy, that long-term energy is larger than when the type is quiet/normal; the electronic device determines the state between the user and the screen, the state including a close-to-screen state and a not-close-to-screen state, where the close-to-screen state is a state in which the distance between the user and the screen of the electronic device is less than a preset value and the duration of this condition is longer than a preset time, and the not-close-to-screen state is a state in which that distance is not less than a preset value and the duration of this condition is longer than a preset time; and the call mode, which is one of the first call mode and the second call mode, is determined based on the call environment type and the state between the user and the screen.
In the above embodiment, the electronic device determines the call mode from the call environment type together with the state between the user and the screen, which makes the determined call mode more accurate. For example, when the user is close to the screen and the environment is noisy, the noisy mode can be determined, and the volume of the played audio can be increased so that the user hears clearly.
With reference to the first aspect, in one embodiment, the first mode is one of the quiet mode, the normal mode and the noisy mode, and the second mode is another of the three. Determining the call mode based on the call environment type and the state between the user and the screen specifically includes: when the call environment type is normal and the user is in the close-to-screen state, or when the user is in the not-close-to-screen state, the electronic device determines that the call mode is the normal mode and that the parameters corresponding to the normal mode are the parameters for processing the unprocessed first left channel audio and the unprocessed first right channel audio; when the call environment type is quiet and the user is in the close-to-screen state, the electronic device determines the quiet mode and the parameters corresponding to it; when the call environment type is noisy and the user is in the close-to-screen state, the electronic device determines the noisy mode and the parameters corresponding to it.
In the above embodiment, the call modes include the quiet mode, the normal mode and the noisy mode, and the processed audio obtained in the three modes has different characteristics. For example, in the normal mode the overall energy of the audio signals played by the first and second sound generators is set larger than in the quiet mode but smaller than in the noisy mode, and the energy of the sound signal in the first frequency band is emphasized, keeping the sound clear while reducing leakage. In the noisy mode the overall energy is set to the maximum and the first frequency band is emphasized, so the sound remains clear even in a noisy environment.
With reference to the first aspect, in one embodiment, the parameters involved in computing the long-term energy of the noise in the first uplink audio are set such that the call mode can only switch from the quiet mode to the normal mode, from the normal mode to the noisy mode, from the noisy mode to the normal mode, or from the normal mode to the quiet mode.
In the above embodiment, the call mode never switches suddenly from quiet to noisy or from noisy to quiet, so the sound heard by the user changes gently rather than suddenly becoming loud and then suddenly quiet.
With reference to the first aspect, in one embodiment, after the call application interface is displayed and before the electronic device determines the first call mode, the method further includes: the electronic device determines that audio is played through the first sound generator and the second sound generator during the user's call.
In the above embodiment, the mode switching scheme of this application is used only when the electronic device plays audio through the first and second sound generators. If audio is played otherwise, for example through the speaker, other algorithms are used to process the audio. This improves the adaptability between the electronic device and its hardware.
With reference to the first aspect, in one embodiment, the electronic device sets the call environment type to normal by default, and sets the state between the user and the screen to the close-to-screen state by default.
In the above embodiment, at the beginning of a call the electronic device thus determines the normal mode, keeping the sound heard by the user at an average level, which is more universally applicable.
With reference to the first aspect, in one embodiment, the method further includes: the electronic device estimates the echo from a first reference signal and a second reference signal, where the first reference signal is the audio output after the first left channel audio passes through the first power amplifier, the second reference signal is the audio output after the first right channel audio passes through the second power amplifier, and the echo is the estimated audio played by the first and second sound generators and collected by the microphone; the echo is removed from the first uplink audio to obtain the target uplink audio.
In the above embodiment, removing the echo from the audio collected by the microphone means that other devices communicating with this device through a call application do not hear the echo collected locally, which improves call quality.
In a second aspect, this application provides an electronic device, including: one or more processors and a memory; the memory is coupled to the one or more processors and is used to store computer program code, the computer program code including computer instructions; the one or more processors invoke the computer instructions to cause the electronic device to perform: displaying a call application interface; determining a first call mode, the first call mode corresponding to a first left channel audio feature and a first right channel audio feature, where the first left channel audio feature is the audio feature of the audio signal output by the left channel and the first right channel audio feature is the audio feature of the audio signal output by the right channel, and the first call mode corresponding to a first call environment; determining that the device is in a second call environment; and switching to a second call mode, the second call mode corresponding to a second left channel audio feature, a second right channel audio feature and the second call environment, the first and second call environments being different, where the first left channel audio feature differs from the second left channel audio feature and/or the first right channel audio feature differs from the second right channel audio feature. The beneficial effects are as described for the first aspect: the audio played during a call is adjusted to suit the call environment.
With reference to the second aspect, in one embodiment, the one or more processors further invoke the computer instructions to cause the electronic device to: receive downlink audio sent by another electronic device during the call; process it in the first call mode to obtain first left channel audio, in which low frequency energy exceeds high frequency energy, and first right channel audio, in which high frequency energy exceeds low frequency energy; and play them through the first and second sound generators respectively, matching the two different sound generators and improving sound quality as described above.
With reference to the second aspect, in one embodiment, the one or more processors specifically invoke the computer instructions to cause the electronic device to: obtain unprocessed first left channel audio and unprocessed first right channel audio from the downlink audio, and perform timbre adjustment and volume adjustment on each to obtain the first left channel audio and the first right channel audio, where timbre adjustment adjusts the energy distribution of sound across frequency bands and volume adjustment adjusts the energy of the audio, so that the processed audio matches the call environment.
With reference to the second aspect, in one embodiment, the one or more processors further invoke the computer instructions to cause the electronic device to: determine the parameters for processing the unprocessed first left and right channel audio, the parameters including a left channel timbre parameter, a right channel timbre parameter, a left channel volume parameter and a right channel volume parameter; and specifically to use the left channel timbre and volume parameters to adjust the timbre and volume of the unprocessed left channel audio to obtain the first left channel audio, and the right channel timbre and volume parameters to adjust the timbre and volume of the unprocessed right channel audio to obtain the first right channel audio. As in the first aspect, the parameters differ across call environments so that the processed audio matches the environment.
With reference to the second aspect, in one embodiment, the one or more processors specifically invoke the computer instructions to cause the electronic device to: determine the call environment type (quiet, normal or noisy, distinguished by the long-term energy of the noise in the corresponding first uplink audio, as in the first aspect); determine the state between the user and the screen (the close-to-screen state or the not-close-to-screen state, defined by the distance between the user and the screen and its duration, as in the first aspect); and determine the call mode, one of the first call mode and the second call mode, based on the call environment type and the state between the user and the screen, which makes the determined call mode more accurate.
With reference to the second aspect, in one embodiment, the one or more processors specifically invoke the computer instructions to cause the electronic device to: determine the normal mode when the call environment type is normal and the user is in the close-to-screen state, or when the user is in the not-close-to-screen state; determine the quiet mode when the type is quiet and the user is in the close-to-screen state; determine the noisy mode when the type is noisy and the user is in the close-to-screen state; and in each case determine the parameters corresponding to that mode as the parameters for processing the unprocessed first left and right channel audio. The characteristics of the processed audio in the three modes are as described for the first aspect.
With reference to the second aspect, in one embodiment, the one or more processors further invoke the computer instructions to cause the electronic device to: determine that audio is played through the first sound generator and the second sound generator during the user's call, so that the mode switching scheme is used only with these two sound generators, improving adaptability to the hardware.
With reference to the second aspect, in one embodiment, the one or more processors further invoke the computer instructions to cause the electronic device to: estimate the echo from a first reference signal (the audio output after the first left channel audio passes through the first power amplifier) and a second reference signal (the audio output after the first right channel audio passes through the second power amplifier), the echo being the estimated audio played by the two sound generators and collected by the microphone; and remove the echo from the first uplink audio to obtain target uplink audio, improving call quality as described above.
In a third aspect, this application provides an electronic device, including: one or more processors and a memory; the memory is coupled to the one or more processors and stores computer program code including computer instructions; the one or more processors invoke the computer instructions to cause the electronic device to perform the method described in the first aspect or any embodiment of the first aspect.
In a fourth aspect, an embodiment of this application provides a chip system applied to an electronic device; the chip system includes one or more processors configured to invoke computer instructions to cause the electronic device to perform the method described in the first aspect or any embodiment of the first aspect.
In a fifth aspect, an embodiment of this application provides a computer program product containing instructions; when the computer program product runs on an electronic device, the electronic device is caused to perform the method described in the first aspect or any embodiment of the first aspect.
In a sixth aspect, an embodiment of this application provides a computer-readable storage medium; when the instructions run on an electronic device, the electronic device is caused to perform the method described in the first aspect or any embodiment of the first aspect.
In each of the above aspects, during a call the audio played by the electronic device can be adjusted as the call environment changes, producing audio adapted to that environment: in the noisy mode the volume is increased and the energy of the band to which the ear is sensitive and which has good directivity (e.g., 1 kHz-3 kHz) is raised so the user hears clearly; in the quiet mode the volume is reduced while that band is emphasized, reducing sound leakage while keeping the sound clear.
Brief Description of the Drawings
Fig. 1 shows a schematic diagram of a call algorithm;
Fig. 2 shows a schematic diagram of the sound generator of an electronic device in one scheme;
Fig. 3 shows a schematic diagram of the sound generators of the electronic device in an embodiment of this application;
Fig. 4 shows a schematic diagram of the call method involved in an embodiment of this application;
Fig. 5 shows schematic scenarios in which the electronic device is not in a handheld call;
Figs. 6a-6d show three schematic diagrams of the call modes;
Figs. 7a-7d show a set of exemplary user interfaces in which the electronic device sets whether the call mode is an adjustable mode;
Fig. 8 is a schematic flowchart of the call method involved in an embodiment of this application;
Fig. 9 is a schematic illustration of changes in the call environment type provided by an embodiment of this application;
Fig. 10 is a schematic flowchart of the electronic device processing the downlink audio signal in the normal mode;
Fig. 11 is a schematic flowchart of the electronic device removing the echo signal from the audio signal collected by the microphone;
Fig. 12 is a schematic structural diagram of the electronic device provided by an embodiment of this application;
Fig. 13 is a schematic diagram of the system structure of the electronic device provided by an embodiment of this application.
Detailed Description
The terms used in the following embodiments of this application are only for the purpose of describing particular embodiments and are not intended to limit this application. The singular forms "a", "an", "the", "above", "this" used in the specification and the appended claims are intended to also include plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used in this application refers to and includes any or all possible combinations of one or more of the listed items.
The terms "first" and "second" below are used for descriptive purposes only and shall not be understood as implying relative importance or implicitly indicating the number of the indicated technical features. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of this application, unless otherwise stated, "multiple" means two or more.
The term "user interface (UI)" in the following embodiments is a medium interface for interaction and information exchange between an application or the operating system and the user; it converts between the internal form of information and a form acceptable to the user. A user interface is source code written in specific computer languages such as Java and extensible markup language (XML); the interface source code is parsed and rendered on the electronic device and finally presented as content the user can recognize. The common form of a user interface is a graphical user interface (GUI), a user interface displayed graphically and related to computer operations; it may include visual interface elements such as text, icons, buttons, menus, tabs, text boxes, dialog boxes, status bars, navigation bars and widgets displayed on the screen of the electronic device.
For ease of understanding, related terms and concepts involved in the embodiments of this application are introduced first.
(1) Call algorithm
The call algorithm includes the algorithms involved in the call downlink and the algorithms involved in the call uplink.
The call downlink means that after the electronic device receives the input audio signal sent to it by another electronic device, it applies first processing to that input audio signal to obtain an audio signal, which it can play through a sound generator or the like.
The call uplink means that the electronic device collects a sound signal through the microphone, applies second processing to it to generate an output audio signal, and then sends it to the other electronic device. The algorithm used in the first processing is the downlink call algorithm, and the algorithm used in the second processing is the uplink call algorithm.
Fig. 1 shows a schematic diagram of a call algorithm.
As shown in Fig. 1, in the call downlink, the electronic device applies first processing to the input audio signal transmitted to it by another electronic device through the base station. The first processing includes: the signal is first decoded by the modem into an audio signal the electronic device can recognize, then passes through the downlink call processing module, is then decoded by the codec into an analog audio signal, is power-amplified by the power amplifier, and finally drives the sound generator to play it. The algorithms in the downlink call processing module may include noise reduction, timbre adjustment and volume adjustment.
In the call uplink, the microphone of the electronic device collects the sound signal and applies second processing to it. The second processing includes: the signal is first encoded by the codec into a digital audio signal, then passes through the uplink call processing module, and is then modulated by the modem into an output audio signal the base station can recognize. The algorithms in the uplink call processing module may also include noise reduction, timbre adjustment and volume adjustment.
The noise reduction, timbre adjustment and volume adjustment involved in the downlink and uplink call processing modules are the same.
Noise reduction is used to denoise an audio stream, suppressing the noise signals and reverberation signals in it.
Timbre adjustment is used to adjust the energy of the audio signal in different frequency bands to improve the voice timbre. The unit of energy is the decibel (dB), which describes the strength of a sound signal; an audio signal with more energy sounds louder when played by the same sound generator.
It can be understood that the timbre is the proportion of energy of the audio signal in different frequency bands.
Volume adjustment is used to adjust the energy of the audio signal.
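As a reading aid, the two quantities just defined can be computed as in the short sketch below; the 2 kHz split between the two bands is an assumption used only for illustration.

```python
import numpy as np

def frame_energy_db(frame: np.ndarray) -> float:
    """Energy of one frame in decibels (mean-square power, log scale)."""
    return 10 * np.log10(np.mean(frame.astype(float) ** 2) + 1e-12)

def band_energy_share(frame, sample_rate, bands=((0, 2000), (2000, 8000))):
    """Timbre as the energy proportion of each frequency band."""
    spec = np.abs(np.fft.rfft(frame)) ** 2
    freqs = np.fft.rfftfreq(len(frame), 1 / sample_rate)
    total = spec.sum() + 1e-12
    return {b: spec[(freqs >= b[0]) & (freqs < b[1])].sum() / total for b in bands}
```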
In one scheme, to increase the screen-to-body ratio and achieve a full screen, the sound generator is placed on the side of the electronic device, and a side slit or a top opening is used so that the audio signal played by the sound generator can be transmitted to the human ear.
Fig. 2 shows a schematic diagram of the sound generator of an electronic device in this scheme.
As shown in (a) of Fig. 2, the user interface 20 is a call interface of the electronic device; the content displayed in area 201 is the side of the electronic device, on which the sound generator is placed. The electronic device may be provided with a side slit and a top opening so that the played audio signal can reach the ear; for example, the side slit may be as shown in area 201A and the top opening as shown in area 201B.
As shown in (b) of Fig. 2, during a call, when the electronic device is held against the ear and plays an audio signal through the sound generator, one part of the audio signal may be audio signal a and audio signal b, and another part may be leakage a. Audio signal a can enter the ear directly through the side slit and audio signal b directly through the top opening, while leakage a is the part of the played audio signal that is not transmitted to the ear.
On one hand, it can be understood that during a call the leakage a causes the played audio signal to leak, which may compromise the user's privacy.
On the other hand, when there is a noise signal around the ear, for example noise a as shown in (b) of Fig. 2, that noise enters the ear and interferes with the user's recognition of audio signal a; if the energy of audio signal a is smaller than that of noise a, the sound will be unclear.
Thus, if the sound generator is placed on the side of the electronic device, playing audio signals through it during a call leads both to privacy leakage caused by sound leakage and to noise entering the ear and interfering with the user's recognition of the audio signal.
In the embodiments of this application, two sound generators are provided in the electronic device: the first sound generator is placed on the side of the electronic device, and the second sound generator is placed on the inner side of the screen. The electronic device can process the audio signal sent to it by another electronic device to generate a processed left channel audio signal and a processed right channel audio signal. The first sound generator plays the processed left channel audio signal, which is transmitted to the ear through the air. The second sound generator plays the processed right channel audio signal, which is transmitted to the ear through the bones; the second sound generator may be called a bone conduction sound generator. In the left channel audio signal, the energy of the low frequency sound signal is greater than that of the high frequency sound signal; in the right channel audio signal, the energy of the high frequency sound signal is greater than that of the low frequency sound signal.
In the embodiments of this application, the left channel audio signals played by the first sound generator are all processed left channel audio signals, and the right channel audio signals played by the second sound generator are all processed right channel audio signals. For how the electronic device obtains the processed left/right audio signals, reference may be made to the description of steps S201-S203 below.
Fig. 3 shows a schematic diagram of the sound generators of the electronic device in an embodiment of this application.
As shown in (a) of Fig. 3, the user interface 30 is a call interface of the electronic device; for the first sound generator, reference may be made to the description of the sound generator in Fig. 2 above, and the second sound generator may be arranged as shown in area 301.
As shown in (b) of Fig. 3, during a call, when the electronic device is held against the ear and the first sound generator plays the left channel audio signal, one part of the audio signal may be audio signal 1, which can enter the ear directly through the side slit, and audio signal 2, which can enter directly through the top opening; another part may be leakage 1. Meanwhile, the electronic device can use the second sound generator to play the right channel audio signal, shown as audio signal 3. Although the first sound generator causes leakage, the second sound generator can compensate by playing the right channel audio signal, increasing the energy of the audio entering the ear so that the user hears clearly.
In some embodiments, in a noisy environment there is a noise signal around the ear, for example noise 1. The electronic device can also increase the energy of the processed left channel audio signal and the processed right channel audio signal so that the user can recognize them, reducing the interference of the noise signal.
In other embodiments, in a quiet environment, the electronic device can reduce the energy of the left channel audio signal so that the first sound generator leaks less sound, relying mainly on the second sound generator playing the right channel audio signal to keep the sound clear to the user.
The call method involved in the embodiments of this application is introduced below. It applies to both the call downlink and the call uplink.
Fig. 4 shows a schematic diagram of the call method involved in the embodiments of this application.
As shown in Fig. 4, in the call downlink, when the electronic device determines that the call mode is the adjustable mode and that the user is close to the screen, it can determine the call mode in combination with the call environment type. In different call modes, the electronic device can set different parameters to process the downlink audio signal, obtaining processed left channel audio signals and processed right channel audio signals that differ in timbre and volume, and then play the (processed) left channel audio signal through the first sound generator and the (processed) right channel audio signal through the second sound generator.
Specifically, the dual-device downlink call processing module in the dual-device call algorithm can perform noise reduction, timbre adjustment and volume adjustment on the downlink audio signal to obtain the processed left channel audio signal and the processed right channel audio signal, where the parameters involved in the timbre and volume adjustment differ across call modes.
The electronic device then power-amplifies the processed left channel audio signal through the first power amplifier and drives the first sound generator to play it, and power-amplifies the processed right channel audio signal through the second power amplifier and drives the second sound generator to play it.
The downlink audio signal is the audio signal sent to the local device by another electronic device.
In some embodiments, the call modes may be divided into the quiet mode, the normal mode and the noisy mode.
In the call uplink, the electronic device can perform echo cancellation on the uplink audio signal. Specifically, through the echo cancellation algorithm in the dual-device uplink call processing module, the electronic device can estimate the echo signal using the reference signal output by the first power amplifier and the reference signal output by the second power amplifier, and then remove that echo signal from the uplink audio signal.
The applicability of the call method in the embodiments of this application is introduced below.
The call method applies when the electronic device is in the handheld call mode, which means that the electronic device plays audio signals through the first sound generator and/or the second sound generator. Playing the audio signal through the speaker of the electronic device during a call does not belong to the handheld call mode, nor does playing it through earphones, external speakers or other sound generators. For playing through the speaker, as shown in (a) of Fig. 5, the electronic device plays the audio signal through the speaker and the speaker icon 501 in the user interface 50 is grayed. For playing through earphones or other sound generators, for example Bluetooth earphones, as shown in (b) of Fig. 5, the electronic device plays the audio signal through TWS earphones, and the earphone insertion prompt icon 502 is shown in the user interface 51.
When the electronic device determines that the call uses the call method involved in the embodiments of this application, it can determine the call mode, so that the parameters for processing the downlink audio signal can be set according to the different call modes, yielding the processed right channel audio signal and the processed left channel audio signal.
Specifically, the downlink audio signal is first used to generate the unprocessed left channel audio signal and the unprocessed right channel audio signal; then different parameters are used to perform timbre adjustment and volume adjustment on each, obtaining processed right and left channel audio signals suited to the call environment. The first sound generator plays the processed left channel audio signal and the second sound generator plays the processed right channel audio signal, achieving dual-channel playback of audio signals in the call downlink.
For the detailed process of how the electronic device generates the processed right channel audio signal and the processed left channel audio signal, reference may be made to the description of step S108 below, which is not repeated here.
The call modes involved in the call method of the embodiments of this application are introduced below.
In some embodiments, the call modes may include the normal mode, the quiet mode and the noisy mode.
In different call modes, for the same downlink audio signal, the processed left channel audio signal and the processed right channel audio signal obtained by the electronic device are different; the difference may lie in volume and/or timbre, where volume indicates the energy or loudness of the audio signal, and timbre indicates the distribution (proportion) of energy of the sound signal across frequency bands. For example, from the volume perspective: the processed left and right channel audio signals are largest in the noisy mode, next largest in the normal mode, and smallest in the quiet mode. From the timbre perspective: for the processed right channel audio signal, in the noisy mode the energy of the sound signal in the first frequency band is greater than that in other bands by a first degree; in the normal mode, by a second degree; in the quiet mode, by a third degree. For the processed left channel audio signal, in the noisy and normal modes the energy distribution across bands is not adjusted (the proportions are the same as before processing); in the quiet mode, the energy of the sound signal in the second frequency band is smaller than that of other bands by a fourth degree. The first, second, third and fourth degrees can be measured in decibels and may be equal or different; generally the first degree > the second degree > the third degree.
It should be understood that, besides volume and/or timbre, the processed left and right channel audio signals obtained in different call modes for the same downlink audio signal may differ in other ways as well; volume and/or timbre are used here as examples and should not limit this application.
After the user answers the call, the electronic device sets the call mode to the normal mode. Then, when the call mode is the adjustable mode, the electronic device can switch among the three call modes. Optionally, in response to the user answering the call, the electronic device sets the call mode to the normal mode and the user can start talking. Understandably, the electronic device could also set the quiet mode or the noisy mode after the call is answered; for ease of description, the normal mode is used as the example.
When the electronic device determines that the user is close to the screen and the call environment type is normal, or determines that the user is not close to the screen, it can determine that the call mode is the normal mode.
The user not being close to the screen means that the distance between the user and the screen of the electronic device is greater than a preset value and the duration of this condition is longer than a preset time; the user being close to the screen means that the distance is less than a preset value and the duration of this condition is longer than a preset time.
When the electronic device determines that the user is close to the screen and the call environment type is quiet, it can determine that the call mode is the quiet mode.
When the electronic device determines that the user is close to the screen and the call environment type is noisy, it can determine that the call mode is the noisy mode.
The call environment type can describe the long-term energy of the noise in the environment around the electronic device during the call. The long-term energy of the noise is the average energy of the noise over a period of time (e.g., 30 s). In some embodiments, the call environment type may be quiet, normal or noisy. The electronic device can judge the type from the magnitude of this long-term energy: large means noisy, small means quiet, and the intermediate state is normal. Here, "large" means the long-term energy is greater than one threshold, "small" means it is less than another threshold, and values between the two thresholds are the intermediate state. For the specifics of this process, reference may be made to the description of step S106 below.
Figs. 6a-6d show three schematic diagrams of the call modes.
In Figs. 6b-6d, icons 611 and 612 represent noise, and the number of icons 611 and 612 represents its magnitude: the more icons, the louder the noise. Icon 613 represents the processed audio signal played by the first sound generator (processed left channel); the more icons 613, the greater the energy of the processed left channel audio signal, i.e., the louder the volume, and conversely the fewer icons, the smaller the energy and the quieter the sound. Similarly, icon 614 represents the audio signal played by the second sound generator (processed right channel), and the number of icons 614 represents the energy of the processed right channel audio signal. Fig. 6a shows an example in which the user is not close to the screen, and (a) of Figs. 6b-6d show examples in which the user is close to the screen.
As shown in Fig. 6a, a schematic diagram of the normal mode: the user is not close to the screen, so the electronic device can determine that the call mode is the normal mode.
As shown in (a) of Fig. 6b, another schematic diagram of the normal mode: the user is close to the screen, there is noise around, and the long-term noise energy is in the intermediate state, so the electronic device can determine that this is the normal mode.
As shown in (b) of Fig. 6b, in the normal mode, during a call, when the electronic device is held against the ear and the first sound generator plays the (processed) left channel audio signal, that signal may include audio signal 1 and/or audio signal 2 (for ease of description, both are used as the example below); audio signal 1 can enter the ear through an opening, a side slit or another physical channel, and audio signal 2 through the top opening, a side slit or another physical channel. The left channel audio signal may also include leakage 1. Meanwhile, the electronic device can use the second sound generator to play the right channel audio signal, which includes audio signal 3, the audio signal played by the second sound generator. The noise in the environment is noise 1. Understandably, the left/right assignment is exemplary; the left channel audio signal could correspond to audio signal 3 and the right channel audio signal to audio signal 1 and/or audio signal 2.
In this way, although the first sound generator causes leakage and there is external noise, the second sound generator can compensate by playing the right channel audio signal, increasing the energy of the audio entering the ear so that the user hears clearly; with the added second sound generator, the path by which the audio signal enters the user's ear becomes shorter and more directional, so the sound is clearer than with the first sound generator alone.
As shown in (a) of Fig. 6c, a schematic diagram of the quiet mode: the user is close to the screen and there is no noise around, so the electronic device can determine that this is the quiet mode.
As shown in (b) of Fig. 6c, in the quiet mode, during a call, when the electronic device is held against the ear and the first sound generator plays the (processed) left channel audio signal, that signal may include audio signal 1 and/or audio signal 2 (for ease of description, both are used as the example below), which can enter the ear through an opening, a side slit or another physical channel. The left channel audio signal may also include leakage 1. Meanwhile, the second sound generator plays the right channel audio signal, which includes audio signal 3, the audio signal produced by the second sound generator. There is no noise in the environment. Understandably, the left/right assignment is again exemplary.
Comparing (b) of Fig. 6c with (b) of Fig. 6b, the energies of audio signal 1, audio signal 2 and audio signal 3 in the quiet mode are all smaller than in the normal mode, so the sound is quieter.
Since leakage 1 in the quiet mode has less energy than leakage 1 in the normal mode, the user's privacy can be protected in a quiet environment.
As shown in (a) of Fig. 6d, a schematic diagram of the noisy mode: the user is close to the screen and there is noise around, so the electronic device can determine that this is the noisy mode.
As shown in (b) of Fig. 6d, in the noisy mode, during a call, when the electronic device is held against the ear and the first sound generator plays the (processed) left channel audio signal, one part of the audio signal may be audio signal 1, which can enter the ear directly through the side slit, and audio signal 2, which can enter directly through the top opening; another part may be leakage 1. Meanwhile, the second sound generator plays the right channel audio signal, shown as audio signal 3.
Comparing (b) of Fig. 6d with (b) of Fig. 6b, the energies of audio signal 1, audio signal 2 and audio signal 3 in the noisy mode are all larger than in the normal mode, so the sound is louder. The noise signal in the noisy mode is noise 1; compared with noise 1 in the normal mode, its energy is larger, so it also sounds louder. Even in a noisy environment, the user can still hear clearly.
It should be understood that Figs. 6a-6d show that, in different modes, the different sound generators play audio signals at different volumes. Besides volume, the frequency-domain characteristics of the audio signals played by the different sound generators in different modes may also differ and can be set per mode; for details, refer to the introduction of the frequency domain above and to the description of the relevant content in step S106 below.
Optionally, in some embodiments, the characteristics of the processed left channel audio signal and the processed right channel audio signal obtained in the different modes are as shown in Table 1 below:
Table 1
Call mode | Processed left channel audio signal | Processed right channel audio signal
Normal mode | energy is the first energy; optionally, low frequency energy greater than high frequency energy | energy is the fourth energy; first frequency band the first decibel greater than other bands; optionally, high frequency energy greater than low frequency energy
Quiet mode | energy is the second energy (smaller than the first energy); second frequency band the second decibel smaller than other bands; optionally, low frequency energy greater than high frequency energy | energy is the fifth energy (smaller than the fourth energy); first frequency band the third decibel greater than other bands; optionally, high frequency energy greater than low frequency energy
Noisy mode | energy is the third energy (greater than the first energy); optionally, low frequency energy greater than high frequency energy | energy is the sixth energy (greater than the fourth energy); first frequency band the fourth decibel greater than other bands; optionally, high frequency energy greater than low frequency energy
As shown in Table 1, in the normal mode the processed left channel audio signal has the following characteristics: its energy is the first energy; optionally, the energy of its low frequency sound signal is greater than that of its high frequency sound signal, where what counts as low and high frequency is set according to actual needs and is not limited in the embodiments of this application; for example, low frequency may refer to sound signals below 2 kHz and high frequency to sound signals above 2 kHz. The processed right channel audio signal has the following characteristics: its energy is the fourth energy, and the energy of the sound signal in the first frequency band after processing is the first decibel (dB) greater than the energy of sound signals in other bands; optionally, the energy of its high frequency sound signal is greater than that of its low frequency sound signal. Optionally, the first energy and the fourth energy are the same, or differ only slightly.
In the quiet mode the processed left channel audio signal has the following characteristics: its energy is the second energy; to make the sound in the quiet mode quieter than in the normal mode, the second energy is smaller than the first energy; the energy of the sound signal in the second frequency band after processing is the second decibel smaller than the energy of sound signals in other bands; optionally, the energy of its low frequency sound signal is greater than that of its high frequency sound signal. The processed right channel audio signal has the following characteristics: its energy is the fifth energy, which is smaller than the fourth energy; the energy of the sound signal in the first frequency band after processing is the third decibel greater than that in other bands; optionally, the energy of its high frequency sound signal is greater than that of its low frequency sound signal.
In the noisy mode the processed left channel audio signal has the following characteristics: its energy is the third energy; to make the sound in the noisy mode louder than in the normal mode, the third energy is greater than the first energy; optionally, the energy of its low frequency sound signal is greater than that of its high frequency sound signal. The processed right channel audio signal has the following characteristics: its energy is the sixth energy, and the energy of the sound signal in the first frequency band after processing is the fourth decibel greater than that in other bands; optionally, the energy of its high frequency sound signal is greater than that of its low frequency sound signal. The sixth energy is greater than the fourth energy. Optionally, the sixth energy and the third energy are the same, or differ only slightly.
In the normal and noisy modes, existing techniques can be used for the timbre adjustment of the left channel audio signal, which is not repeated here. In the quiet mode, the timbre adjustment applied to the left channel audio signal adapts it to the case of two sound generators; in some cases, other corresponding timbre adjustments may also be applied to the left channel audio signal in the normal and noisy modes.
It should be understood that the first decibel, the second decibel, the third decibel and the fourth decibel may be the same or different; generally the first decibel < the second decibel < the fourth decibel.
The sound signal in the first frequency band may be a sound signal in a band to which human hearing is sensitive and whose directivity is good, for example a sound signal at 1 kHz-3 kHz. The sound signal in the second frequency band may be a high frequency sound signal, for example above 1 kHz. To keep the sound clear for the user in a noisy environment, the fourth decibel can be made the largest, the third decibel the next, and the first decibel the smallest, while the second decibel can equal the third decibel. Typically, the first decibel may be 3 dB, the second and third decibels 6 dB, and the fourth decibel 9 dB. The first energy may be (-9 dB to -6 dB), the second energy (-15 dB to -12 dB), the third energy (-3 dB to -0 dB), and the fifth energy (-12 dB to -9 dB).
It should be understood that the above figures, such as 1 kHz-3 kHz, 1 kHz, 3 dB, 6 dB, 9 dB, (-9 dB to -6 dB), (-15 dB to -12 dB) and (-3 dB to -0 dB), are only examples and can be changed according to actual needs; they do not limit the embodiments of this application. The ranges (-9 dB to -6 dB), (-15 dB to -12 dB) and (-3 dB to -0 dB) may be normalized values.
In some embodiments, the first energy of any processed left channel audio signal and the fourth energy of any processed right channel audio signal may be different or the same, and likewise for the second versus the fifth energies and the third versus the sixth energies. For example, the first energy of the processed left channel audio signal may be greater than, or equal to, the fourth energy of the processed right channel audio signal. In the quiet mode, the surroundings of the electronic device are quiet; making the energy of the bone-conducted right channel audio signal played by the second sound generator greater than that of the (processed) left channel audio signal played by the first sound generator lets the device produce sound mainly through the second sound generator, reducing leakage. The sound signal in the first frequency band has good directivity and human hearing is sensitive to it, so emphasizing its energy reduces outward leakage in the quiet mode while keeping the sound clear for the user.
For the normal mode, the overall energy of the audio signals played by the first and second sound generators is set greater than in the quiet mode but smaller than in the noisy mode, and the energy of the sound signal in the first frequency band is emphasized, keeping the sound clear while reducing leakage.
For the noisy mode, the overall energy of the audio signals played by the first and second sound generators is set to the maximum and the energy of the sound signal in the first frequency band is emphasized, so the sound remains clear even in a noisy environment.
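For illustration only, the per-mode targets described above can be collected into a parameter table like the following. The numeric values take the midpoints of the example ranges given in the text; treating the left and right targets as equal in the normal and noisy modes, and the first-band boost as applying to the right channel, follow the examples above but remain assumptions.

```python
# Hypothetical per-mode parameter table mirroring Table 1 (normalized dB).
MODE_PARAMS = {
    #           left target dB, right target dB, right-channel first-band boost dB
    "quiet":  {"left_db": -13.5, "right_db": -10.5, "boost_db": 6.0},
    "normal": {"left_db": -7.5,  "right_db": -7.5,  "boost_db": 3.0},
    "noisy":  {"left_db": -1.5,  "right_db": -1.5,  "boost_db": 9.0},
}
```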
Only when the electronic device determines that the call mode is the adjustable mode can it switch among the three call modes.
Optionally, whether the call mode is the adjustable mode can be set by the user.
Figs. 7a-7d show a set of exemplary user interfaces in which the electronic device sets whether the call mode is the adjustable mode.
As shown in Fig. 7a, the user interface 70 is a settings interface of the electronic device and includes the sound and vibration setting item 701. In response to the user's operation (e.g., a tap) on the sound and vibration setting control 701, the electronic device can display the user interface 71 shown in Fig. 7b.
As shown in Fig. 7b, the user interface 71 corresponds to the settings content of the sound and vibration setting item 701 and may include the handheld call answering mode setting item 711; in response to the user's operation (e.g., a tap) on the handheld call answering mode setting item 711, the electronic device can display the user interface 72 shown in Fig. 7c.
As shown in Fig. 7c, the user interface 72 corresponds to the settings content of the handheld call answering mode setting item 711 and prompts the user whether to enable the on-ear automatic sound quality adjustment function for handheld answering. Enabling this function means the electronic device can switch among the three call modes; disabling it means the device cannot switch and always keeps one call mode, for example the normal mode.
In some embodiments, the electronic device may enable the on-ear automatic sound quality adjustment function by default, so that it can switch among the three call modes.
For example, as shown in user interface 72, the function is enabled by default, and the enable adjustment control 721 is grayed. If the user does not change the setting, the device can switch among the three call modes during calls; if the user changes the setting, it cannot. For example, in response to the user's operation (e.g., a tap) on the disable adjustment control 722, the electronic device disables the on-ear automatic sound quality adjustment function and displays the user interface 73 shown in Fig. 7d.
As shown in Fig. 7d, in the user interface 73 the disable adjustment control 722 is grayed; during calls, the electronic device cannot switch among the three call modes and always keeps one call mode, for example the normal mode.
In some embodiments, with the on-ear automatic sound quality adjustment function enabled, the electronic device can switch among the three call modes during a call. In this case, the electronic device can determine the call mode in combination with the state between the user and the screen and the environment type.
The state between the user and the screen can be the close-to-screen state or the not-close-to-screen state.
Optionally, the close-to-screen state is a state in which the distance between the user and the screen of the electronic device is less than a preset value and the duration of this condition is longer than a preset time; the not-close-to-screen state is a state in which that distance is not less than a preset value and the duration of this condition is longer than a preset time.
Optionally, when the electronic device determines that it is in the not-close-to-screen state (i.e., it determines that the user is in the not-close-to-screen state; this state could also be called a first state or a second state, used to identify the state of the electronic device or its distance relative to the user, and the same applies to the close-to-screen state, here and in other embodiments), the user being close to the screen means that the distance between the user and the screen of the electronic device is less than a first preset value and the user maintains that distance for longer than a first preset time, upon which the electronic device switches from the not-close-to-screen state to the close-to-screen state; if these conditions are not met, it remains in the not-close-to-screen state. Understandably, other switching conditions are also possible.
Similarly, when the electronic device is in the close-to-screen state and determines that the distance between the user and the screen is greater than a second preset value and is maintained for longer than a second preset time, it switches from the close-to-screen state to the not-close-to-screen state; otherwise it remains in the close-to-screen state. Other switching conditions are also possible.
It should be understood that the above uses the distance between the user and the screen and its duration as the example; there are other ways to judge the state between the user and the screen. For example, the state can be determined from the pressure the user applies to the screen and its duration (replacing distance with pressure in the method above), or from the contact area detected between the user's skin (including the face and ear) and the electronic device, which is not repeated here.
Optionally, the first preset value and the second preset value may be the same or different, and likewise the first preset time and the second preset time.
Optionally, the first preset time and/or the second preset time can be set by the user. For example, as shown in user interface 72 in Fig. 7c, the user can set the sound quality adjustment sensitivity to control how long the user must be in contact with the screen before the electronic device determines that the user is close to the screen. For example, when the sensitivity adjustment control 723 is at the prompt "fast", the device can determine that the user is close to the screen after a short contact time, which may be 1-5 seconds, for example 3 seconds. When the control is at the prompt "slow", a longer contact time is required before the device determines closeness, for example 10 seconds or more, such as 10 seconds. It should be understood that the closer the sensitivity adjustment control 723 is to the prompt "slow", the longer the contact time required before the electronic device determines that the user is close to the screen.
The call method involved in the embodiments of this application is described in detail below.
In the embodiments of this application, for the call downlink, after the call starts and while in the handheld call mode, the electronic device first sets the call mode to the normal mode. It then determines whether the call mode is the adjustable mode: if it is non-adjustable, the call mode remains the normal mode; if it is adjustable, after determining the state between the user and the screen and the call environment type, the electronic device re-determines the call mode by combining the two and can switch modes. In this way, the parameters for processing the downlink audio signal can be set according to the different call modes, yielding the processed right channel audio signal and the processed left channel audio signal, which are then played by the first sound generator (processed left channel) and the second sound generator (processed right channel).
Fig. 8 is a schematic flowchart of the call method involved in the embodiments of this application. For the call method, reference may be made to the detailed description of steps S101-S114 below.
S101. The electronic device starts a call application.
A call application is an app that can provide the electronic device with call functions; calls include voice calls and video calls.
Optionally, the electronic device displays an incoming call prompt; in response to an operation (e.g., a tap) on the answer control, the electronic device can communicate with another electronic device through the call application, and through the electronic device the user can start the call.
A voice call is a communication mode in which audio signals are transmitted in real time between the electronic device and at least one other electronic device; a video call is a communication mode in which audio signals and image signals are transmitted in real time between the electronic device and at least one other electronic device.
Optionally, from the start to the end of the call, the electronic device continuously acquires the downlink audio signal, that is, it can continuously receive the audio signal sent to it by the other electronic device. The downlink audio signal is one or more frames of audio signal sent by the other electronic device to the local device. The duration of one frame of audio signal depends on the processing capability of the electronic device and is generally 10 ms-50 ms, for example 10 ms, or a multiple of 10 ms such as 20 ms or 30 ms.
Optionally, after starting the call application and receiving the first frame of the downlink audio signal sent by the other electronic device, and before processing that downlink audio signal, the electronic device may execute steps S102, S103 and so on to determine how this first frame of the downlink audio signal is to be processed. Aspects of how and when the electronic device acquires the downlink audio signal are not limited in the embodiments of this application. Steps S102 and S103 are described in detail below.
S102. The electronic device determines whether the call is in the handheld call mode.
The handheld call mode means that, after the call application is started and during the user's call, audio signals are played through the first sound generator or the second sound generator; that is, during the call, the electronic device does not play the audio signal through the speaker, earphones or other sound generators.
For example, (a) of Fig. 3 above shows an exemplary user interface of the electronic device in a handheld call; for details, refer to the description of (a) of Fig. 3 above. (a) and (b) of Fig. 5 above show exemplary user interfaces in which the electronic device is not in the handheld call mode; for details, refer to the descriptions of (a) and (b) of Fig. 5 above.
In some embodiments, by default the audio signal is played through the first sound generator or the second sound generator; when the electronic device detects that no earphones are connected to the local device and that audio is not being played through the speaker or another sound generator, it can determine that the call mode is the handheld call mode. When the electronic device detects that earphones are connected or that audio is played through the speaker or another sound generator, it can determine that the call is not in the handheld call mode.
When the electronic device determines that the call mode is the handheld call mode, it executes steps S104-S114.
When the electronic device determines that the call is not in the handheld call mode, it executes step S103.
S103. The electronic device processes the downlink audio signal using other algorithms.
When it is determined that the call is not in the handheld call mode, the electronic device processes the downlink audio signal using other algorithms (e.g., noise reduction algorithms), and then plays the processed downlink audio signal through another sound generator; for example, the electronic device can play it through the speaker.
When the electronic device determines that the call mode is the handheld call mode, it executes steps S104-S114: it determines the call mode, processes the downlink audio signal in that call mode to obtain the processed left channel audio signal and the processed right channel audio signal, and then plays the processed left channel audio signal through the first sound generator and the processed right channel audio signal through the second sound generator. Steps S104-S114 are described in detail as follows.
S104. The electronic device determines whether the call mode is the adjustable mode.
The adjustable mode means that the electronic device can switch among the three call modes.
In some embodiments, the user can change whether the local device is in the adjustable mode through settings. For example, Figs. 7a-7d above show a set of exemplary user interfaces for setting through the electronic device whether the call mode is the adjustable mode; for details, refer to the descriptions of Figs. 7a-7d above.
When the electronic device determines that the call mode is the adjustable mode, it executes steps S105-S114.
When the electronic device determines that the call mode is the non-adjustable mode, it processes the downlink audio signal in the normal mode, i.e., using the parameters involved in the normal mode, to obtain the processed left channel audio signal and the processed right channel audio signal, and then plays the processed left channel audio signal through the first sound generator and the processed right channel audio signal through the second sound generator. For the specific process, refer to the description of step S108 below, which is not repeated here.
After determining that the call mode is the adjustable mode, the electronic device executes steps S105-S114, where steps S104-S113 continue until the call ends.
It should be understood that, optionally, in one possible implementation, after determining the handheld call mode in step S102, the electronic device may skip step S104 and directly determine that the call mode is the normal mode, and then process the downlink audio signal in the normal mode; alternatively, the electronic device may set the quiet mode or the noisy mode after the user answers the call.
In another possible implementation, after determining the handheld call mode in step S102, the electronic device may not directly determine that the call mode is the normal mode but instead execute step S105, judging whether the call mode of the electronic device is the adjustable mode, and if so, re-determining the call mode of the electronic device.
S105. The electronic device determines whether the state between the user and the screen is the close-to-screen state.
The state between the user and the screen can be the close-to-screen state or the not-close-to-screen state; for the specific description, refer to the relevant content above, which is not repeated here.
Optionally, the electronic device may by default set the state between the user and the screen to the close state, and then update this state according to whether the user is close to the screen.
Specifically, the electronic device can detect through the sensor on the screen whether the user is in contact with the screen; if so, it determines that the state between the user and the screen is the close-to-screen state, otherwise the not-close-to-screen state. The judgment conditions have been explained above and are not repeated here.
In some embodiments, how long the user must be in contact with the screen before the electronic device determines that the user is close to the screen can be set by the user. For example, as shown in user interface 72 in Fig. 7c, the user can control this through the sound quality adjustment sensitivity; for the specific description of this process, refer to the description of the relevant content of Fig. 7c above, which is not repeated here.
When the electronic device determines that the state between the user and the screen is the close-to-screen state, it can execute step S106 to determine the call environment type, and then determine the call mode based on that call environment type.
Optionally, in some embodiments, when the electronic device determines that the state between the user and the screen is the not-close-to-screen state, it can execute step S107 and determine that the call mode is the normal mode.
In other embodiments, when the electronic device determines that the state between the user and the screen is the not-close-to-screen state, it can execute step S108 and process the downlink audio signal in the normal mode.
It should be understood that when the electronic device determines that the state between the user and the screen is the not-close-to-screen state, it could also determine that the call mode is the quiet mode or the noisy mode.
S106. The electronic device determines the call environment type;
The call environment type can describe the long-time energy of the noise in the environment around the electronic device during the call. The long-time energy of the noise is its average energy over a preset period of time.
Optionally, the electronic device can determine the call environment type by computing the long-time energy of the noise in frames of audio captured by the microphone.
In some embodiments, the call environment type can be quiet, normal, or noisy. Before the first update of the call environment type, the electronic device may set the call environment type to normal; it then updates the type according to the long-time noise energy of the frame audio signal. The electronic device can judge the call environment type from the size of this long-time energy: large means noisy, small means quiet, and in between means normal.
Specifically, the electronic device can use the noise energy of the first uplink audio signal captured by the microphone together with the long-time noise energy of the second uplink audio signal to obtain the long-time noise energy of the first uplink audio signal.
The first uplink audio signal is the t-th frame of audio captured by the electronic device's microphone.
The second uplink audio signal is the audio signal X frames away from the first uplink audio signal, where X is an integer greater than or equal to 1. The range of X depends on the processing capability of the electronic device and is typically 1–5. For example, when X = 1, the second uplink audio signal is the frame preceding the first uplink audio signal, i.e., the (t−1)-th frame. In this case, the long-time noise energy of the first uplink audio signal corresponds to N_l(t) in formula (1) below, the noise energy of the first uplink audio signal corresponds to N_t(t) in formula (1) below, and the long-time noise energy of the second uplink audio signal corresponds to N_l(t−1) in formula (1) below.
It should be understood here that noise can be divided into stationary noise and non-stationary noise: stationary noise is noise whose sound level, over the measurement time, fluctuates by no more than a certain threshold (e.g., 3 dB), while non-stationary noise is noise whose sound level fluctuates by no less than that threshold over the measurement time.
The formula by which the electronic device computes the long-time noise energy of the first uplink audio signal can be found in formula (1) below:
N_l(t) = a · N_l(t−1) + (1 − a) · N_t(t),  t > 1      formula (1)
In formula (1), N_l(t) is the long-time noise energy of the first uplink audio signal, and N_t(t) is the noise energy of the first uplink audio signal; in particular, the noise in N_t(t) may be stationary noise only, or may include both stationary and non-stationary noise, as configured. N_l(t−1) is the long-time noise energy of the second uplink audio signal. a is a smoothing factor whose range is (0.9, 1); a can be a constant or a variable. When it is a variable, its value can be adjusted based on the noise types included in N_t(t): for example, when the noise in N_t(t) includes stationary noise but not non-stationary noise, a may take the value 0.9. a may also change with other conditions, which is not limited in the embodiments of this application.
Optionally, N_t(t) can be obtained with a minima-controlled recursive averaging (MCRA) algorithm.
It should be understood that when the electronic device processes the first frame of the first uplink audio signal captured by the microphone, N_l(t−1) in formula (1) cannot yet be computed; in that case N_l(t−1) can be set to an initial value whose size can be chosen from experience.
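As an illustration, the recursive smoothing of formula (1) can be sketched as follows in Python, assuming a constant smoothing factor and an empirical initial value; in practice the per-frame noise energies N_t(t) would come from an estimator such as MCRA.

```python
def long_time_noise_energy(frame_noise_energies, a=0.95, initial=-60.0):
    """Recursively smooth per-frame noise energies per formula (1):
    N_l(t) = a * N_l(t-1) + (1 - a) * N_t(t).  Values are illustrative."""
    n_l = initial                       # N_l(0): empirical initial value
    history = []
    for n_t in frame_noise_energies:    # N_t(t) for t = 1, 2, ...
        n_l = a * n_l + (1 - a) * n_t
        history.append(n_l)
    return history
```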
In some embodiments, the electronic device can determine the call environment type from a first energy threshold, a second energy threshold, and the long-time noise energy of the first uplink audio signal; this process can refer to formula (2) below.
env(t) = quiet,  if N_l(t) < N_1
env(t) = normal, if N_1 < N_l(t) < N_2      formula (2)
env(t) = noisy,  if N_l(t) > N_2
In formula (2), N_1 is the first threshold and N_2 the second threshold, the first threshold being smaller than the second. Typically, the first threshold can be set within (−65 dB, −55 dB), for example −60 dB, and the second threshold within (−35 dB, −25 dB), for example −30 dB. When the long-time noise energy of the first uplink audio signal is below the first threshold, the electronic device determines that the call environment type is quiet; when it is above the first threshold but below the second, the electronic device determines that the call environment type is normal; when it is above the second threshold, the electronic device determines that the call environment type is noisy.
In other embodiments, to keep the call environment type from switching frequently among quiet, normal, and noisy, which would make the call mode switch frequently among quiet mode, normal mode, and noisy mode, the electronic device can set the threshold for going from normal to quiet lower than the threshold for going from quiet to normal, so that after the electronic device switches from quiet to normal, switching back to quiet requires an even quieter call environment; and set the threshold for going from normal to noisy higher than the threshold for going from noisy to normal, so that after the electronic device switches from noisy to normal, switching back to noisy requires an even noisier call environment.
Optionally, the electronic device can determine the call environment type from the long-time noise energy of the first uplink audio signal together with a third energy threshold, a fourth energy threshold, a fifth energy threshold, and a sixth energy threshold; this process can refer to formula (3) below.
When the call environment type is changing from quiet toward noisy (quiet→normal, normal→noisy, or quiet→noisy):
  env(t) = quiet,  if N_l(t) < N_3
  env(t) = normal, if N_3 < N_l(t) < N_4
  env(t) = noisy,  if N_l(t) > N_4      formula (3)
When the call environment type is changing from noisy toward quiet (noisy→normal, normal→quiet, or noisy→quiet):
  env(t) = noisy,  if N_l(t) > N_6
  env(t) = normal, if N_5 < N_l(t) < N_6
  env(t) = quiet,  if N_l(t) < N_5
In formula (3), N_3 is the third energy threshold, N_4 the fourth, N_5 the fifth, and N_6 the sixth, where N_5 < N_6, N_3 < N_4, N_4 > N_6, and N_6 > N_3. For a description of formula (3), see the description of Fig. 9. As shown in Fig. 9, when the call environment type is changing in the direction quiet→normal, normal→noisy, or quiet→noisy: if the long-time noise energy of the first uplink audio signal is below N_3, the call environment type is determined to be quiet; between N_3 and N_4, normal; above N_4, noisy. When the direction is noisy→normal, normal→quiet, or noisy→quiet: if the long-time noise energy of the first uplink audio signal is above N_6, the call environment type is determined to be noisy; between N_5 and N_6, normal; below N_5, quiet.
When the previous call environment type was quiet, formula (3) needs to be consulted to determine the call environment type. When the previous call environment type was noisy, formula (2) needs to be consulted. When the previous call environment type was normal and it is to be determined whether the type is noisy, formula (2) needs to be consulted; when the previous call environment type was normal and it is to be determined whether the type is quiet, formula (3) needs to be consulted.
Here, the previous call environment type is the call environment type determined from the long-time noise energy of the second uplink audio signal.
To keep the call environment type from switching frequently among quiet, normal, and noisy, and from jumping directly from quiet to noisy or from noisy to quiet, the smoothing factor in the aforementioned formula (1) can be set relatively large; for example, its range can be (0.9, 1), and it can typically be configured as 0.95.
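As an illustration of the direction-dependent thresholds above, the following Python sketch classifies the call environment type with hysteresis; the dB values are illustrative assumptions satisfying the ordering constraints given for formula (3) (here additionally assuming N_5 < N_3), and the tie-breaking at exact threshold values is likewise an assumption.

```python
def classify_environment(n_l_db, previous="normal",
                         n3=-60.0, n4=-30.0, n5=-65.0, n6=-35.0):
    """Hysteresis classification sketch: the thresholds used depend on the
    direction of change implied by the previous call environment type.
    Assumed values satisfy N5 < N3 < N6 < N4."""
    if previous == "quiet":
        # Environment can only get noisier: use the upward thresholds N3/N4.
        return "quiet" if n_l_db < n3 else ("normal" if n_l_db < n4 else "noisy")
    if previous == "noisy":
        # Environment can only get quieter: use the downward thresholds N5/N6.
        return "noisy" if n_l_db > n6 else ("normal" if n_l_db > n5 else "quiet")
    # previous == "normal": going noisy requires exceeding N4,
    # going quiet requires dropping below N5.
    if n_l_db > n4:
        return "noisy"
    if n_l_db < n5:
        return "quiet"
    return "normal"
```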
It should be understood that steps S105 and S106 have no fixed execution order; the electronic device may perform step S105 or step S106 first, or perform them at the same time, which is not limited in the embodiments of this application.
In some embodiments, the electronic device can combine the state between the user and the screen determined in step S105 with the call environment type determined in step S106 to determine the call mode.
Specifically, an illustrative logic by which the electronic device determines from the state between the user and the screen and the call environment type whether the call mode is normal mode, quiet mode, or noisy mode can be found in Table 2 below.
Table 2
State between user and screen | Call environment type | Call mode
Close-to-screen state | Normal | Normal mode
Close-to-screen state | Quiet | Quiet mode
Close-to-screen state | Noisy | Noisy mode
Non-close-to-screen state | Normal / Quiet / Noisy | Normal mode
As Table 2 shows, when the state between the user and the screen is the close-to-screen state: if the call environment type is normal, the electronic device determines that the call mode is normal mode; if the call environment type is quiet, the electronic device determines that the call mode is quiet mode; if the call environment type is noisy, the electronic device determines that the call mode is noisy mode. When the state between the user and the screen is the non-close-to-screen state, the electronic device determines that the call mode is normal mode.
Optionally, besides the manner shown in Table 2, quiet mode can also be determined in other ways; for example, when the pressure detected on the screen is greater than a preset pressure value for longer than a preset time, the electronic device can determine that the call mode is quiet mode.
During a call, steps S105 and S106 are performed continuously; when the electronic device determines the call mode, it can obtain the state between the user and the screen and the call environment type, and, as described in Table 2, determine the call mode by combining the state between the user and the screen with the call environment type.
Table 2 shows one illustrative logic; the electronic device may determine the call mode according to other logic, which is not limited in the embodiments of this application.
It should be understood that in some embodiments the electronic device may by default set the state between the user and the screen to the close-to-screen state and by default set the call environment type to normal; then the electronic device can re-determine, i.e., update, the call mode according to the state between the user and the screen and the call environment type.
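The decision logic of Table 2 can be sketched as follows; the string labels are illustrative assumptions.

```python
def decide_call_mode(screen_state: str, env_type: str) -> str:
    """Table 2 as code: only the close-to-screen state lets the call
    environment type select quiet/noisy mode; otherwise normal mode."""
    if screen_state == "close_to_screen":
        return {"quiet": "quiet_mode",
                "normal": "normal_mode",
                "noisy": "noisy_mode"}[env_type]
    return "normal_mode"   # non-close-to-screen: always normal mode
```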
For different call modes, the electronic device can process the downlink audio signal with different parameters to obtain the processed left-channel audio signal and the processed right-channel audio signal.
Optionally, in some embodiments, for the characteristics of the processed left- and right-channel audio signals obtained in the different modes, see the foregoing description of Table 1 and the related content, which is not repeated here.
Specifically, when the electronic device determines that the call mode is normal mode, the process of obtaining the processed left-channel audio signal and the processed right-channel audio signal can refer to the descriptions of steps S107 and S108 below. When the electronic device determines that the call mode is quiet mode, this process can refer to the descriptions of steps S109 and S110 below. When the electronic device determines that the call mode is noisy mode, this process can refer to the descriptions of steps S111 and S112 below.
For the description of the call mode being normal mode, refer to the descriptions of steps S107 and S108 below.
S107. The electronic device determines that the call mode is normal mode;
When the state between the user and the screen is the close-to-screen state and the call environment type is normal, or when the state between the user and the screen is the non-close-to-screen state, the electronic device can determine that the call mode is normal mode. The aforementioned Fig. 6a and Fig. 6b show schematic diagrams of normal mode; for details, see the foregoing descriptions of Fig. 6a and Fig. 6b, which are not repeated here.
The electronic device can obtain a pre-processing left-channel audio signal and a pre-processing right-channel audio signal from the downlink audio signal, then process each of them with the first parameters to obtain the processed left-channel audio signal and the processed right-channel audio signal.
The first parameters include a first volume parameter and a first timbre parameter. The first volume parameter includes a first right-channel volume parameter and a first left-channel volume parameter. The first timbre parameter includes a first right-channel timbre parameter and a first left-channel timbre parameter.
The first left-channel timbre parameter is used to adjust the timbre of the pre-processing left-channel audio signal so that, in the processed left-channel audio signal, the energy of the low-frequency sound is greater than that of the high-frequency sound.
The first left-channel volume parameter is used to adjust the volume of the pre-processing left-channel audio signal so that the energy of the processed left-channel audio signal is a first energy.
The first right-channel timbre parameter is used to adjust the timbre of the pre-processing right-channel audio signal so that, in the processed right-channel audio signal, the energy of the high-frequency sound is greater than that of the low-frequency sound and the energy of the sound in a first frequency band is greater than that of other bands by a first number of decibels (dB).
The first right-channel volume parameter is used to adjust the volume of the pre-processing right-channel audio signal so that the energy of the processed right-channel audio signal is the first energy.
Specifically, for how the electronic device processes the downlink audio signal in normal mode, see the description of step S108 below.
Fig. 10 is a schematic flowchart of the electronic device processing the downlink audio signal in normal mode.
S108. The electronic device processes the downlink audio signal in normal mode;
The electronic device can obtain the pre-processing left-channel audio signal and the pre-processing right-channel audio signal from the downlink audio signal, then process each of them with the first parameters to obtain the processed left-channel audio signal and the processed right-channel audio signal. This process can refer to the description of steps S201–S203 shown in Fig. 10.
S201. The electronic device performs noise reduction on the downlink audio to obtain the pre-processing left-channel audio signal and the pre-processing right-channel audio signal.
The electronic device first performs noise reduction on the downlink audio signal to suppress its noise, then copies the denoised downlink audio signal into two paths of denoised downlink audio, one path serving as the pre-processing left-channel audio signal and the other as the pre-processing right-channel audio signal.
The formula by which the electronic device performs noise reduction on the downlink audio signal to obtain the denoised downlink audio signal can be found in formula (4) below.
x_1-d = x_1 − x_1-n      formula (4)
In formula (4), x_1-d is the denoised downlink audio signal, x_1 the downlink audio signal, and x_1-n the noise in the downlink audio signal.
The electronic device can compute x_1-n in formula (4), i.e., the noise in the downlink audio signal, using one of, or a combination of, algorithms such as the optimally modified log-spectral amplitude estimator (OMLSA) algorithm, the improved minima controlled recursive averaging (IMCRA) algorithm, and spectral subtraction.
The electronic device then copies the denoised downlink audio signal into two paths, one path serving as the pre-processing left-channel audio signal and the other as the pre-processing right-channel audio signal; the related formula can refer to formula (5).
x_dl = x_1-d
x_dr = x_1-d      formula (5)
In formula (5), x_dl is the pre-processing left-channel audio signal and x_dr the pre-processing right-channel audio signal.
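As an illustration of step S201, the following Python sketch applies formulas (4) and (5), assuming a per-sample noise estimate is already available (in practice it would come from OMLSA, IMCRA, or spectral subtraction):

```python
import numpy as np

def denoise_and_split(x1: np.ndarray, x1_n: np.ndarray):
    """Formulas (4) and (5): subtract the noise estimate, then duplicate
    the denoised frame into the left/right pre-processing signals."""
    x1_d = x1 - x1_n       # formula (4): denoised downlink frame
    x_dl = x1_d.copy()     # formula (5): pre-processing left channel
    x_dr = x1_d.copy()     #              pre-processing right channel
    return x_dl, x_dr
```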
S202. Using the first parameters, the electronic device performs volume adjustment and timbre adjustment on the pre-processing left-channel audio signal and the pre-processing right-channel audio signal to obtain the processed left-channel audio signal and the processed right-channel audio signal.
For the first parameters, see the foregoing description of the related content in step S107, which is not repeated here.
Timbre adjustment adjusts the energy proportions of different frequency bands in the audio signal to improve the voice timbre; a common timbre-adjustment algorithm is the equalizer (EQ) algorithm, though other algorithms may also be used, which is not limited in the embodiments of this application.
Volume adjustment adjusts the energy of the audio signal. Common volume-adjustment algorithms can include one of, or a combination of, the dynamic range control (DRC) algorithm and the automatic gain control (AGC) algorithm; other algorithms may also be used, which is not limited in the embodiments of this application.
The electronic device can process the pre-processing left-channel audio signal with the first left-channel timbre parameter and the first left-channel volume parameter to obtain the processed left-channel audio signal, so that the energy of the low-frequency sound in the processed left-channel audio signal is greater than that of the high-frequency sound and the energy of the processed left-channel audio signal is the first energy.
The electronic device can process the pre-processing right-channel audio signal with the first right-channel timbre parameter and the first right-channel volume parameter to obtain the processed right-channel audio signal, so that the energy of the high-frequency sound in the processed right-channel audio signal is greater than that of the low-frequency sound, the energy of the sound in the first frequency band is greater than that of other bands by the first number of decibels, and the energy of the processed right-channel audio signal is the first energy.
It should be understood that the process by which the electronic device generates the processed left-channel audio signal from the pre-processing left-channel audio signal is similar to the process of generating the processed right-channel audio signal from the pre-processing right-channel audio signal; the following describes in detail the generation of the processed left-channel audio signal as an example:
Optionally, the electronic device can adjust the timbre of the pre-processing left-channel audio signal with the EQ algorithm; in this case, the first left-channel timbre parameter is the filter coefficients the EQ algorithm uses to filter the left-channel audio signal, which may also be called the first left-channel filter coefficients. The first left-channel timbre parameter is used to adjust the timbre of the pre-processing left-channel audio signal, suppressing or boosting its different frequency bands so that in the processed left-channel audio signal the low-frequency energy exceeds the high-frequency energy. The electronic device can adjust the volume of the pre-processing left-channel audio signal with a DRC-plus-AGC algorithm; in this case, the first left-channel volume parameter is the gain coefficient that algorithm applies to the pre-processing left-channel audio signal, which may also be called the first left-channel gain coefficient. The first left-channel volume parameter is used to make the energy of the processed left-channel audio signal the first energy.
In some embodiments, the formula by which the electronic device obtains the processed left-channel audio signal from the pre-processing left-channel audio signal can refer to formula (6) below.
x_1l = gain_1l · filter_1l(x_dl)      formula (6)
In formula (6), x_1l is the processed left-channel audio signal; filter_1l is the first left-channel timbre parameter, for example the first left-channel filter coefficients; gain_1l is the first left-channel volume parameter, for example the first left-channel gain coefficient; gain_1l · filter_1l(x_dl) means adjusting the timbre of the pre-processing left-channel audio signal with the first left-channel filter coefficients and adjusting its volume with the first left-channel gain coefficient.
In other embodiments, to keep the energy of the processed left-channel audio signal from jumping up and down when the electronic device switches among the three call modes, a smooth transition time can be introduced when the electronic device generates the processed left-channel audio signal. The formula by which the electronic device obtains the processed left-channel audio signal can then refer to formula (7) below.
x_1l = (i/T_s) · x_1l-1 + (1 − i/T_s) · x_1l-2, for 0 < i < T_s;  x_1l = x_1l-1, for i ≥ T_s      formula (7)
In formula (7), x_1l is the processed left-channel audio signal; T_s is the smooth transition time, an integer greater than 1, in units of frames. i indicates that x_1l is the i-th frame of processed left-channel audio the electronic device computes in normal mode, taking integer values in (0, T_s). It should be understood that each time the device switches back to normal mode, i restarts from 1 and increases by 1 after each frame of processed left-channel audio is computed. Here, x_1l-1 = gain_1l-1 · filter_1l-1(x_dl) and x_1l-2 = gain_1l-2 · filter_1l-2(x_dl). gain_1l-1 is the volume parameter for computing the processed left-channel audio signal in normal mode, i.e., the first left-channel volume parameter, and filter_1l-1 is the timbre parameter for computing it in normal mode, i.e., the first left-channel timbre parameter; x_1l-1 is the processed left-channel audio signal computed from the pre-processing left-channel audio signal with the first left-channel timbre parameter and the first left-channel volume parameter. gain_1l-2 is the volume parameter for computing the processed left-channel audio signal in the call mode in effect before switching to normal mode, and filter_1l-2 is the corresponding timbre parameter; x_1l-2 is the processed left-channel audio signal computed from the pre-processing left-channel audio signal with that timbre parameter and volume parameter.
The process by which the electronic device generates the processed right-channel audio signal from the pre-processing right-channel audio signal can refer to the foregoing description of generating the processed left-channel audio signal from the pre-processing left-channel audio signal, replacing the first left-channel timbre parameter in formulas (6) and (7) with the first right-channel timbre parameter and the first left-channel volume parameter in formulas (6) and (7) with the first right-channel volume parameter; the rest of the description is similar and is not repeated here.
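As an illustration of formulas (6) and (7), the following per-frame Python sketch performs the timbre adjustment with an FIR filter and blends the outputs of the current and previous modes; the filter coefficients, gains, and the linear crossfade weights are illustrative assumptions, not the tuned parameters of the embodiments.

```python
import numpy as np

def process_channel(x_d, gain_new, fir_new, gain_old, fir_old, i, t_s):
    """Formula (6) evaluated for the current and the previous mode,
    blended per formula (7) over a smooth transition of t_s frames."""
    x_new = gain_new * np.convolve(x_d, fir_new, mode="same")  # target-mode output
    if i >= t_s:                        # transition finished: pure target mode
        return x_new
    x_old = gain_old * np.convolve(x_d, fir_old, mode="same")  # previous-mode output
    w = i / t_s                         # crossfade weight grows with frame index i
    return w * x_new + (1 - w) * x_old
```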
S203. The electronic device plays the processed left-channel audio signal through the first sound generator and the processed right-channel audio signal through the second sound generator.
In some embodiments, the electronic device can use the codec to decode the processed left-channel audio signal into an analog electrical signal, obtaining the decoded processed left-channel audio signal, then amplify it with the first power amplifier to drive the first sound generator to play the decoded processed left-channel audio signal.
The electronic device can use the codec to decode the processed right-channel audio signal into an analog electrical signal, obtaining the decoded processed right-channel audio signal, then amplify it with the second power amplifier to drive the second sound generator to play the decoded processed right-channel audio signal.
S109. The electronic device determines that the call mode is quiet mode;
When the state between the user and the screen is the close-to-screen state and the call environment type is quiet, the electronic device can determine that the call mode is quiet mode. The aforementioned Fig. 6c shows a schematic diagram of quiet mode; for details, see the foregoing description of Fig. 6c, which is not repeated here.
The electronic device can obtain the pre-processing left-channel audio signal and the pre-processing right-channel audio signal from the downlink audio signal, then process each of them with the second parameters to obtain the processed left-channel audio signal and the processed right-channel audio signal.
The second parameters include a second volume parameter and a second timbre parameter. The second volume parameter includes a second right-channel volume parameter and a second left-channel volume parameter. The second timbre parameter includes a second right-channel timbre parameter and a second left-channel timbre parameter.
The second left-channel timbre parameter is used to adjust the timbre of the pre-processing left-channel audio signal so that, in the processed left-channel audio signal, the low-frequency energy exceeds the high-frequency energy and the energy of the sound in a second frequency band is smaller than that of other bands by a second number of decibels.
The second left-channel volume parameter is used to adjust the volume of the pre-processing left-channel audio signal so that the energy of the processed left-channel audio signal is a second energy.
The second right-channel timbre parameter is used to adjust the timbre of the pre-processing right-channel audio signal so that, in the processed right-channel audio signal, the high-frequency energy exceeds the low-frequency energy and the energy of the sound in the first frequency band exceeds that of other bands by a third number of decibels (dB).
The second right-channel volume parameter is used to adjust the volume of the pre-processing right-channel audio signal so that the energy of the processed right-channel audio signal is the second energy.
S110. The electronic device processes the downlink audio signal in quiet mode;
The electronic device can obtain the pre-processing left-channel audio signal and the pre-processing right-channel audio signal from the downlink audio signal, then process each of them with the second parameters to obtain the processed left-channel audio signal and the processed right-channel audio signal. This process can refer to the foregoing description of step S109.
In step S110, the process by which the electronic device generates the processed left-channel audio signal from the pre-processing left-channel audio signal can refer to the description of that process in step S108, replacing the first left-channel timbre parameter in formulas (6) and (7) with the second left-channel timbre parameter and the first left-channel volume parameter in formulas (6) and (7) with the second left-channel volume parameter; the rest of the description is similar and is not repeated here.
In step S110, the process by which the electronic device generates the processed right-channel audio signal from the pre-processing right-channel audio signal can refer to the foregoing description of generating the processed left-channel audio signal, replacing the first left-channel timbre parameter in formulas (6) and (7) with the second right-channel timbre parameter and the first left-channel volume parameter in formulas (6) and (7) with the second right-channel volume parameter; the rest of the description is similar and is not repeated here.
S111. The electronic device determines that the call mode is noisy mode;
When the state between the user and the screen is the close-to-screen state and the call environment type is noisy, the electronic device can determine that the call mode is noisy mode. The aforementioned Fig. 6d shows a schematic diagram of noisy mode; for details, see the foregoing description of Fig. 6d, which is not repeated here.
The electronic device can obtain the pre-processing left-channel audio signal and the pre-processing right-channel audio signal from the downlink audio signal, then process each of them with the third parameters to obtain the processed left-channel audio signal and the processed right-channel audio signal.
The third parameters include a third volume parameter and a third timbre parameter. The third volume parameter includes a third right-channel volume parameter and a third left-channel volume parameter. The third timbre parameter includes a third right-channel timbre parameter and a third left-channel timbre parameter.
The third left-channel timbre parameter is used to adjust the timbre of the pre-processing left-channel audio signal so that, in the processed left-channel audio signal, the low-frequency energy exceeds the high-frequency energy.
The third left-channel volume parameter is used to adjust the volume of the pre-processing left-channel audio signal so that the energy of the processed left-channel audio signal is a third energy.
The third right-channel timbre parameter is used to adjust the timbre of the pre-processing right-channel audio signal so that, in the processed right-channel audio signal, the high-frequency energy exceeds the low-frequency energy and the energy of the sound in the first frequency band exceeds that of other bands by a fourth number of decibels (dB).
The third right-channel volume parameter is used to adjust the volume of the pre-processing right-channel audio signal so that the energy of the processed right-channel audio signal is the third energy.
S112. The electronic device processes the downlink audio signal in noisy mode;
The electronic device can obtain the pre-processing left-channel audio signal and the pre-processing right-channel audio signal from the downlink audio signal, then process each of them with the third parameters to obtain the processed left-channel audio signal and the processed right-channel audio signal. This process can refer to the foregoing description of step S111.
In step S112, the process by which the electronic device generates the processed left-channel audio signal from the pre-processing left-channel audio signal can refer to the description of that process in step S108, replacing the first left-channel timbre parameter in formulas (6) and (7) with the third left-channel timbre parameter and the first left-channel volume parameter in formulas (6) and (7) with the third left-channel volume parameter; the rest of the description is similar and is not repeated here.
In step S112, the process by which the electronic device generates the processed right-channel audio signal from the pre-processing right-channel audio signal can refer to the foregoing description of generating the processed left-channel audio signal, replacing the first left-channel timbre parameter in formulas (6) and (7) with the third right-channel timbre parameter and the first left-channel volume parameter in formulas (6) and (7) with the third right-channel volume parameter; the rest of the description is similar and is not repeated here.
S113. The electronic device determines whether the call has ended;
If the call has not ended, the electronic device continues to obtain the next frame of downlink audio and repeats steps S104–S113: it re-determines whether the call mode is adjustable, then obtains the processed left-channel audio signal and the processed right-channel audio signal and plays them.
Optionally, if the call has not ended, the electronic device may continue to obtain the next frame of downlink audio and repeat steps S105–S113, skipping step S104 and not re-determining whether the call mode is adjustable.
If the call has ended, the electronic device performs step S114.
S114. The electronic device exits the call application.
In the embodiments of this application, for the uplink of a call: when the electronic device captures audio signals through the microphone, besides the audio around the device it can also pick up the audio played by the first and second sound generators, so the microphone-captured audio includes an echo signal of the other electronic device's audio. The echo signal results from the microphone-captured audio containing the audio played by the first and second sound generators.
The electronic device can remove the echo signal from the microphone-captured audio. Fig. 11 is a schematic flowchart of the electronic device removing the echo signal from the microphone-captured audio.
For the details of this process, see the following description of steps S301–S304.
S301. The electronic device obtains the uplink audio signal;
The uplink audio signal is one frame of audio captured by the electronic device's microphone. How long one frame of audio is can depend on the processing capability of the electronic device; it is generally 10 ms–50 ms, for example a multiple of 10 ms such as 10 ms, 20 ms, or 30 ms.
The uplink audio signal includes the sound around the electronic device and the user's voice, and also includes the echo signal caused by the audio played by the first and second sound generators. The electronic device can perform steps S302–S304 below to remove this echo signal.
S302. The electronic device obtains a first reference signal and a second reference signal;
The first reference signal is the audio signal output after the processed left-channel audio signal passes through the first power amplifier.
The second reference signal is the audio signal output after the processed right-channel audio signal passes through the second power amplifier.
The electronic device can take one frame of the audio signal output by the first power amplifier as the first reference signal and one frame of the audio signal output by the second power amplifier as the second reference signal.
S303. The electronic device estimates the echo signal from the first reference signal and the second reference signal;
The echo signal is the estimate of the audio, played by the first and second sound generators, that the microphone captures.
In some embodiments, the electronic device can combine the first reference signal and the second reference signal to estimate the echo signal.
In some embodiments, the formula by which the electronic device determines the echo signal can refer to formula (8) below.
ŷ(t,f) = f_l · x′_l(t,f) + f_r · x′_r(t,f)      formula (8)
In formula (8), ŷ(t,f) is the echo signal; f_l is the transfer function from the first reference signal to the echo signal, and f_r is the transfer function from the second reference signal to the echo signal; x′_l(t,f) is the first reference signal in the frequency domain and x′_r(t,f) the second reference signal in the frequency domain, where t denotes the frame and f the frequency bin.
In other embodiments, the formula by which the electronic device determines the echo signal can refer to formula (9) below.
ŷ(t,f) = max( f_l · x′_l(t,f), f_r · x′_r(t,f) )      formula (9)
In formula (9), ŷ(t,f) is the echo signal and max takes the larger value at each time-frequency bin; for the definitions of the other symbols, see the description of formula (8).
The transfer functions involved in the foregoing formulas (8) and (9) can be determined with an acoustic echo cancellation (AEC) algorithm, or with other algorithms; this does not constitute a limitation on the embodiments of this application.
S304. The electronic device removes the echo signal from the uplink audio signal to obtain the processed uplink audio signal.
The processed uplink audio signal is the part of the uplink audio signal remaining after the echo signal is removed.
In some embodiments, the formula by which the electronic device obtains the processed uplink audio signal from the uplink audio signal and the echo signal can refer to formula (10) below.
x_2-d = x_2 − ŷ      formula (10)
In formula (10), x_2-d is the processed uplink audio signal, x_2 the uplink audio signal, and ŷ the echo signal.
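As an illustration of steps S303 and S304, the following Python sketch estimates the echo per formula (8) (or formula (9)) on magnitude spectra of one frame and subtracts it per formula (10); the fixed per-bin transfer functions and the flooring at zero are illustrative assumptions, since in practice an AEC algorithm would estimate and adapt the transfer functions.

```python
import numpy as np

def remove_echo(x2_mag, xl_ref_mag, xr_ref_mag, h_l, h_r, use_max=False):
    """Sketch of S303/S304 on magnitude spectra of one frame.
    h_l/h_r stand in for the transfer functions f_l/f_r of formulas (8)/(9)."""
    if use_max:
        echo = np.maximum(h_l * xl_ref_mag, h_r * xr_ref_mag)  # formula (9)
    else:
        echo = h_l * xl_ref_mag + h_r * xr_ref_mag             # formula (8)
    return np.maximum(x2_mag - echo, 0.0)   # formula (10), floored at 0 (assumption)
```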
It should be understood that the quiet mode, normal mode, and noisy mode in this application may also be called one of a first call mode and a second call mode: when the first mode is one of the three modes (quiet mode, normal mode, and noisy mode), the second mode can be another of the three; for example, when the first call mode is quiet mode, the second mode can be one of normal mode and noisy mode. The call environment can be a quiet environment, a normal environment, or a noisy environment.
The characteristics of the left-channel audio signal in the first mode can be called a first left-channel audio feature, and the characteristics of the right-channel audio signal in the first mode a first right-channel audio feature. In the first mode, the state between the user and the screen and/or the call environment type can be called a first call environment.
The characteristics of the left-channel audio signal in the second mode can be called a second left-channel audio feature, and the characteristics of the right-channel audio signal in the second mode a second right-channel audio feature. In the second mode, the state between the user and the screen and/or the call environment type can be called a second call environment.
From the related content of Table 1, the first left-channel audio feature may differ from the second left-channel audio feature, and/or the first right-channel audio feature may differ from the second right-channel audio feature; the difference can be reflected in volume and/or timbre. For example, when the first mode is normal mode and the second mode is quiet mode, the volume of the first left-channel audio signal is the first energy and that of the second left-channel audio signal is the second energy; the first energy being greater than the second energy makes the first left-channel audio feature differ from the second left-channel audio feature.
All the audio signals involved in the embodiments of this application may also be called audio; the sound generators (the first sound generator and the second sound generator) playing audio signals (the left-channel audio signal and the right-channel audio signal) may also be called outputting audio signals. Sound signals may also be called sound.
It should be understood that the first uplink audio signal mentioned in the embodiments of this application can be the t-th frame of audio.
The following first describes an exemplary electronic device provided in the embodiments of this application.
Fig. 12 is a schematic structural diagram of the electronic device provided in the embodiments of this application.
The embodiments are described below using the electronic device as an example. It should be understood that the electronic device may have more or fewer components than shown in the figure, may combine two or more components, or may have a different component configuration. The various components shown in the figure may be implemented in hardware including one or more signal-processing and/or application-specific integrated circuits, in software, or in a combination of hardware and software.
The electronic device may include: a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, antenna 1, antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display 194, a subscriber identification module (SIM) card interface 195, and so on. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, a barometric pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and so on.
It can be understood that the structure illustrated in the embodiments of the present invention does not constitute a specific limitation on the electronic device. In other embodiments of this application, the electronic device may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units; for example, the processor 110 may include an application processor (AP), a modem processor (modem for short), a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU), etc. The different processing units may be independent devices or may be integrated into one or more processors.
The controller may be the nerve center and command center of the electronic device. The controller can generate operation control signals according to instruction opcodes and timing signals to control instruction fetching and execution.
A memory may also be provided in the processor 110 to store instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory can hold instructions or data the processor 110 has just used or uses cyclically; if the processor 110 needs the instruction or data again, it can be called directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.
The modem is configured to decode audio signals received from other electronic devices to obtain the downlink audio signal, and then pass the downlink audio signal to the dual-device call algorithm.
The display 194 is used to display images, video, and the like. The display 194 includes a display panel, which may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), or the like. In some embodiments, the electronic device may include 1 or N displays 194, N being a positive integer greater than 1.
In the embodiments of this application, the display 194 may also be called the screen.
The electronic device can implement the photographing function through the ISP, the camera 193, the video codec, the GPU, the display 194, the application processor, and so on.
The digital signal processor processes digital signals; besides digital image signals, it can process other digital signals. For example, when the electronic device selects a frequency point, the digital signal processor performs a Fourier transform or the like on the frequency-point energy.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the transfer mode between human brain neurons, it processes input information quickly and can also learn continuously. Applications such as intelligent cognition of the electronic device, e.g., image recognition, face recognition, speech recognition, and text understanding, can be implemented through the NPU.
The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device. The external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example saving music, video, and other files in the external memory card.
The internal memory 121 can be used to store computer-executable program code, the executable program code including instructions. The processor 110 runs the instructions stored in the internal memory 121 to perform the various functional applications and data processing of the electronic device. The internal memory 121 may include a program storage area and a data storage area.
The electronic device can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone jack 170D, the application processor, and so on.
The audio module 170 is used to convert digital audio information into an analog audio signal for output, and to convert analog audio input into a digital audio signal. The audio module 170 can also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be located in the processor 110, or some functional modules of the audio module 170 may be located in the processor 110.
The speaker 170A, also called the "loudspeaker", is used to convert audio electrical signals into sound signals. The electronic device can play music or hands-free calls through the speaker 170A.
The receiver 170B, also called the "earpiece", is used to convert audio electrical signals into sound signals. When the electronic device answers a call or a voice message, the receiver 170B can be brought close to the human ear to listen to the voice.
In the embodiments of this application, the receiver 170B may also be called a sound generator; the electronic device may include a first sound generator (not shown) and a second sound generator (not shown), the first sound generator being used to play the analog left-channel audio signal and the second sound generator the analog right-channel audio signal.
The microphone 170C, also called the "mic" or "mouthpiece", is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal. The electronic device can be provided with at least one microphone 170C. In other embodiments, the electronic device can be provided with two microphones 170C to implement a noise-reduction function in addition to capturing sound signals. In still other embodiments, the electronic device can be provided with three, four, or more microphones 170C to capture sound signals, reduce noise, identify sound sources, implement directional recording, and so on.
In some embodiments, the microphone can pass the captured audio signal to the codec for encoding to obtain the uplink audio signal, and then pass the uplink audio signal to the dual-device call algorithm; the dual-device call algorithm can use the uplink audio signal to compute the call environment type. The earphone jack 170D is used to connect wired earphones; it may be the USB interface 130, or a 3.5 mm open mobile terminal platform (OMTP) standard interface or a cellular telecommunications industry association of the USA (CTIA) standard interface.
The pressure sensor 180A is used to sense pressure signals and can convert pressure signals into electrical signals. In some embodiments, the pressure sensor 180A may be located on the display 194. There are many kinds of pressure sensors 180A, such as resistive, inductive, and capacitive pressure sensors.
In some embodiments, the pressure sensor can be used to determine the state between the user and the screen: for example, when the pressure sensor detects that the pressure between the user and the screen is greater than a preset pressure value for longer than a preset time, the electronic device can determine that the state between the user and the screen is the close-to-screen state; when the pressure sensor detects that the pressure is less than the preset pressure value, or the duration is less than the preset time, the electronic device can determine that the state is the non-close-to-screen state.
The proximity light sensor 180G may include, for example, a light-emitting diode (LED) and a light detector, for example a photodiode. The light-emitting diode may be an infrared light-emitting diode.
The ambient light sensor 180L is used to perceive ambient brightness. The electronic device can adaptively adjust the brightness of the display 194 according to the perceived ambient brightness. The ambient light sensor 180L can also be used to automatically adjust white balance when photographing, and can cooperate with the proximity light sensor 180G to detect whether the electronic device is in a pocket, to prevent accidental touches.
The touch sensor 180K is also called the "touch panel". The touch sensor 180K can be located on the display 194; the touch sensor 180K and the display 194 together form the touchscreen, also called the "touch screen".
Sensors on the display 194, such as the touch sensor, can detect whether the user is in contact with the display 194.
The keys 190 include a power key, volume keys, and so on. The keys 190 may be mechanical keys or touch keys. The electronic device can receive key input and generate key-signal input related to the user settings and function control of the electronic device.
In the embodiments of this application, the electronic device further includes a codec (not shown), a first power amplifier (not shown), and a second power amplifier (not shown).
The codec is used to decode digital signals into analog signals, and can also be used to encode analog signals into digital signals. For example, it can decode the digital processed left-channel audio signal to obtain the analog left-channel audio signal.
The first power amplifier is used to amplify the analog audio signal and drive the receiver 170B to play the analog audio signal; for example, it amplifies the decoded processed left-channel audio signal and drives the first sound generator to play the analog processed left-channel audio signal.
The second power amplifier is used to amplify the analog audio signal and drive the receiver 170B to play the analog audio signal; for example, it amplifies the decoded processed right-channel audio signal and drives the second sound generator to play the analog processed right-channel audio signal.
In the embodiments of this application, the processor 110 can invoke the computer instructions stored in the internal memory 121 so that the electronic device performs the call method in the embodiments of this application.
Fig. 13 is a schematic diagram of the system structure of the electronic device in the embodiments of this application.
The system structure of the electronic device is described below by way of example.
The layered architecture divides the system into several layers, each with a clear role and division of labor; the layers communicate with each other through software interfaces. In some embodiments, the system is divided into four layers: from top to bottom, the application layer, the application framework layer, the hardware abstraction layer, and the hardware layer.
The application layer may include a series of application packages.
As shown in Fig. 13, the application packages may include applications (also called apps) such as Phone and Settings.
The Settings application can provide the user interface for setting whether the call mode is adjustable, and the user interface for setting the sound-quality adjustment sensitivity that controls how long the user must be in contact with the screen before the electronic device determines that the user is pressed against it. For example, the aforementioned Fig. 7a–Fig. 7d can be the related user interfaces.
In some embodiments, the Settings application can pass the user-set information on whether the call mode is adjustable to the audio hardware abstraction in the hardware abstraction layer described below, and pass the user-set sound-quality adjustment sensitivity information to the screen hardware abstraction in the hardware abstraction layer described below.
The Phone application is a call application; once it is started, the user can make calls through the electronic device. For example, in response to the user's operation of answering a call in the Phone application, the Phone application can determine through the telephony manager of the application framework layer that the call is connected; the telephony manager can then call the audio hardware abstraction of the abstraction layer to start the microphone, the first sound generator, the second sound generator, and the other hardware involved in the call, so that the electronic device runs the call application and the user can start the call.
The application framework layer provides application programming interfaces (APIs) and a programming framework for the applications of the application layer; the application framework layer includes some predefined functions.
In some embodiments, the application framework layer can include a telephony manager and the like.
The telephony manager is used to provide the communication functions of the electronic device, for example management of the call state (including connected, hung up, and so on).
In some embodiments, the telephony manager can also determine whether the electronic device is in handheld call mode, and pass the information on whether it is handheld call mode to the audio hardware abstraction.
The hardware abstraction layer is the interface layer located between the application framework layer and the hardware layer, and provides a virtual hardware platform for the operating system.
In the embodiments of this application, the hardware abstraction layer can include the audio hardware abstraction and the screen hardware abstraction.
The audio hardware abstraction can receive the handheld-call-mode information delivered by the telephony manager and the adjustable-mode information delivered by the Settings application, and store both pieces of information in a built-in database.
In some embodiments, when the audio hardware abstraction determines that the call state is handheld call mode and the call mode is adjustable, it can call the screen hardware abstraction to obtain the state between the user and the screen, and call the dual-device call algorithm to process the downlink audio signal. When the telephony manager determines that the call state is handheld call mode and the call mode is not adjustable, the dual-device call algorithm is called to process the downlink audio signal. When the telephony manager determines that the call state is not handheld call mode, other call algorithms are called to process the downlink audio signal.
The following description takes, as an example, the case where the audio hardware abstraction determines that the call state is handheld call mode and the call mode is adjustable; the other cases can refer to this description.
The screen hardware abstraction can receive the sound-quality adjustment sensitivity information delivered by the Settings application, and store that information in a built-in database.
In some embodiments, after receiving the audio hardware abstraction's instruction to obtain the state between the user and the screen, the screen hardware abstraction can fetch the sound-quality adjustment sensitivity information from its built-in database and, combining that information, detect through the sensors on the screen whether the user is pressed against the screen. The information on whether the user is pressed against the screen is then sent to the dual-device call algorithm in the audio digital signal processor described below.
In the embodiments of this application, the hardware involved in the hardware layer can include: the audio digital signal processor, the codec, the modem, the screen, the first power amplifier, the second power amplifier, the first sound generator, the second sound generator, the microphone, and so on.
For the related functions of the codec and the other hardware, see the foregoing description of the related content of Fig. 12, which is not repeated here.
The call algorithms can be located in the audio digital signal processor.
The call algorithms can include the dual-device call algorithm and other call algorithms.
The dual-device call algorithm is the call algorithm involved in the embodiments of this application. The dual-device call algorithm can receive the downlink audio signal transmitted by the modem described below, process the downlink audio signal to obtain the processed left-channel audio signal and the processed right-channel audio signal, and then deliver the processed left-channel audio signal and the processed right-channel audio signal to the codec.
The dual-device call algorithm can also receive the uplink audio signal transmitted by the codec described below, while obtaining the first reference signal and the second reference signal transmitted by the codec; it then performs echo cancellation on the uplink audio signal using the first and second reference signals to obtain the processed uplink audio signal.
After the audio signals transmitted to the device by other electronic devices are decoded by the modem, the downlink audio signal can be obtained; the modem can then transmit the downlink audio signal to the dual-device call algorithm.
The microphone can transmit the captured audio signal to the codec for encoding.
After receiving the processed left-channel audio signal and the processed right-channel audio signal, the codec can decode them to obtain the decoded processed left-channel audio signal and the decoded processed right-channel audio signal, then transmit the decoded processed left-channel audio signal to the first power amplifier and the decoded processed right-channel audio signal to the second power amplifier.
The codec can receive the audio signal captured by the microphone, encode it to obtain the uplink audio signal, and then transmit the uplink audio signal to the dual-device call algorithm.
The codec can also receive the decoded processed left-channel audio signal transmitted by the first power amplifier and encode it to obtain the first reference signal, and receive the decoded processed right-channel audio signal transmitted by the second power amplifier and encode it to obtain the second reference signal; it then transmits the first reference signal and the second reference signal to the dual-device call algorithm.
After receiving the decoded processed left-channel audio signal, the first power amplifier can amplify it and drive the first sound generator to play the decoded processed left-channel audio signal.
After receiving the decoded processed right-channel audio signal, the second power amplifier can amplify it and drive the second sound generator to play the decoded processed right-channel audio signal.
The foregoing embodiments are merely intended to describe the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they can still modify the technical solutions recorded in the foregoing embodiments or make equivalent replacements of some of the technical features therein; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of this application.
As used in the foregoing embodiments, depending on the context, the term "when" can be interpreted to mean "if", "after", "in response to determining", or "in response to detecting". Similarly, depending on the context, the phrase "when determining" or "if (a stated condition or event) is detected" can be interpreted to mean "if it is determined", "in response to determining", "when (the stated condition or event) is detected", or "in response to detecting (the stated condition or event)".
In the foregoing embodiments, implementation may be wholly or partly by software, hardware, firmware, or any combination thereof. When software is used, implementation may be wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions according to the embodiments of this application are wholly or partly generated. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another by wire (for example, coaxial cable, optical fiber, or digital subscriber line) or wirelessly (for example, infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to the computer, or a data storage device such as a server or data center integrating one or more available media. The available media may be magnetic media (for example, floppy disks, hard disks, or magnetic tapes), optical media (for example, DVDs), semiconductor media (for example, solid-state drives), and so on.
Those of ordinary skill in the art can understand that all or part of the procedures of the methods in the foregoing embodiments may be completed by a computer program instructing related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the procedures of the foregoing method embodiments. The aforementioned storage media include: ROM, random access memory (RAM), magnetic disks, optical discs, and other media capable of storing program code.

Claims (15)

  1. A call method, applied to an electronic device including a first sound generator and a second sound generator, the second sound generator being different from the first sound generator, the first sound generator corresponding to a left channel and the second sound generator corresponding to a right channel, the method comprising:
    displaying a call application interface;
    the electronic device determining a first call mode, the first call mode corresponding to a first left-channel audio feature and a first right-channel audio feature, the first left-channel audio feature being the audio feature of the audio signal output by the left channel, the first right-channel audio feature being the audio feature of the audio signal output by the right channel, the first call mode corresponding to a first call environment;
    determining that the electronic device is in a second call environment;
    the electronic device switching to a second call mode, the second call mode corresponding to a second left-channel audio feature and a second right-channel audio feature, the second left-channel audio feature being the audio feature of the audio signal output by the left channel, the second right-channel audio feature being the audio feature of the audio signal output by the right channel, the second call mode corresponding to the second call environment, the first call environment and the second call environment being different, wherein
    the first left-channel audio feature differs from the second left-channel audio feature, and/or the first right-channel audio feature differs from the second right-channel audio feature.
  2. The method according to claim 1, wherein after the electronic device determines the first call mode, the method further comprises:
    the electronic device receiving downlink audio, the downlink audio being audio sent to the electronic device by another electronic device during the call;
    the electronic device processing the downlink audio in the first call mode to obtain first left-channel audio and first right-channel audio, wherein in the first left-channel audio the energy of low-frequency sound is greater than that of high-frequency sound, and in the first right-channel audio the energy of high-frequency sound is greater than that of low-frequency sound;
    the electronic device playing the first left-channel audio through the first sound generator and the first right-channel audio through the second sound generator.
  3. The method according to claim 1 or 2, wherein:
    the first sound generator is placed on a side face of the electronic device, and the second sound generator is placed inside the screen of the electronic device; the target left-channel audio played by the first sound generator is transmitted to the human ear through the air, and the target right-channel audio played by the second sound generator is transmitted to the human ear through bone.
  4. The method according to claim 2 or 3, wherein processing the downlink audio to obtain the first left-channel audio and the first right-channel audio specifically comprises:
    the electronic device obtaining pre-processing first left-channel audio and pre-processing first right-channel audio from the downlink audio;
    performing timbre adjustment and volume adjustment on each of the pre-processing first left-channel audio and the pre-processing first right-channel audio to obtain the first left-channel audio and the first right-channel audio, wherein timbre adjustment refers to adjusting the energy distribution of the sound in different frequency bands of the audio, and volume adjustment refers to adjusting the energy of the audio.
  5. The method according to claim 4, wherein after the electronic device obtains the pre-processing first left-channel audio and the pre-processing first right-channel audio from the downlink audio, and before performing timbre adjustment and volume adjustment on each of the pre-processing first left-channel audio and the pre-processing first right-channel audio, the method further comprises:
    the electronic device determining the parameters for processing the pre-processing first left-channel audio and the pre-processing first right-channel audio, the parameters including a left-channel timbre parameter, a right-channel timbre parameter, a left-channel volume parameter, and a right-channel volume parameter;
    performing timbre adjustment and volume adjustment on each of the pre-processing first left-channel audio and the pre-processing first right-channel audio to obtain the first left-channel audio and the first right-channel audio specifically comprises:
    performing timbre adjustment and volume adjustment on the pre-processing left-channel audio with the left-channel timbre parameter and the left-channel volume parameter to obtain the first left-channel audio; performing timbre adjustment and volume adjustment on the pre-processing right-channel audio with the right-channel timbre parameter and the right-channel volume parameter to obtain the first right-channel audio.
  6. The method according to claim 5, wherein determining the parameters for processing the pre-processing left-channel audio and the pre-processing right-channel audio specifically comprises:
    the electronic device determining the call environment type, the call environment type including quiet, normal, and noisy, wherein when the call environment type is quiet, the long-time energy of the noise in the first uplink audio is smaller than when the call environment type is normal/noisy, and when the call environment type is noisy, the long-time energy of the noise in the first uplink audio is larger than when the call environment type is quiet/normal;
    the electronic device determining the state between the user and the screen, the state including a close-to-screen state and a non-close-to-screen state, the close-to-screen state being a state in which the distance between the user and the screen of the electronic device is less than a preset value and the duration of remaining less than that value exceeds a preset time, and the non-close-to-screen state being a state in which the distance between the user and the screen of the electronic device is not less than the preset value and the duration of remaining not less than that value exceeds a preset time;
    determining the call mode based on the call environment type and the state between the user and the screen, the call mode being one of the first call mode and the second call mode.
  7. The method according to any one of claims 1–6, wherein the first mode is one of quiet mode, normal mode, and noisy mode, the second mode is another of quiet mode, normal mode, and noisy mode, and determining the call mode based on the call environment type and the state between the user and the screen specifically comprises:
    when the call environment type is normal and the state between the user and the screen is the close-to-screen state, or when the state between the user and the screen is the non-close-to-screen state, the electronic device determining that the call mode is normal mode;
    the electronic device determining the parameters corresponding to normal mode as the parameters for processing the pre-processing first left-channel audio and the pre-processing first right-channel audio;
    when the call environment type is quiet and the state between the user and the screen is the close-to-screen state, the electronic device determining that the call mode is quiet mode;
    the electronic device determining the parameters corresponding to quiet mode as the parameters for processing the pre-processing first left-channel audio and the pre-processing first right-channel audio;
    when the call environment type is noisy and the state between the user and the screen is the close-to-screen state, determining that the call mode is noisy mode;
    the electronic device determining the parameters corresponding to noisy mode as the parameters for processing the pre-processing first left-channel audio and the pre-processing first right-channel audio.
  8. The method according to claim 7, wherein:
    the parameters involved in computing the long-time energy of the noise in the first uplink audio are set such that the call mode can only switch from quiet mode to normal mode, from normal mode to noisy mode, from noisy mode to normal mode, and from normal mode to quiet mode.
  9. The method according to any one of claims 1–8, wherein after displaying the call application interface and before the electronic device determines the first call mode, the method further comprises:
    the electronic device determining that, during the user's call, audio is played through the first sound generator and the second sound generator.
  10. The method according to any one of claims 6–9, wherein:
    the electronic device sets the call environment type to normal by default;
    the electronic device sets the state between the user and the screen to the close-to-screen state by default.
  11. The method according to any one of claims 1–10, wherein the method further comprises:
    the electronic device estimating an echo from a first reference signal and a second reference signal, the first reference signal being the audio output after the first left-channel audio passes through a first power amplifier, the second reference signal being the audio output after the first right-channel audio passes through a second power amplifier, the echo being the estimate of the audio played by the first sound generator and the second sound generator and captured by the microphone;
    removing the echo from the first uplink audio to obtain target uplink audio.
  12. An electronic device, comprising: one or more processors and a memory; the memory is coupled to the one or more processors and is configured to store computer program code, the computer program code comprising computer instructions, and the one or more processors invoke the computer instructions to cause the electronic device to perform the method according to any one of claims 1–11.
  13. A chip system, applied to an electronic device, the chip system comprising one or more processors, the processors being configured to invoke computer instructions to cause the electronic device to perform the method according to any one of claims 1–11.
  14. A computer program product containing instructions, wherein when the computer program product runs on an electronic device, the electronic device is caused to perform the method according to any one of claims 1–11.
  15. A computer-readable storage medium comprising instructions, wherein when the instructions run on an electronic device, the electronic device is caused to perform the method according to any one of claims 1–11.
PCT/CN2022/093888 2021-07-13 2022-05-19 Call method and electronic device WO2023284406A1 (zh)

Applications Claiming Priority (4)

Application Number | Priority Date | Filing Date | Title
CN202110791580.0 | 2021-07-13 | |
CN202110791580 | 2021-07-13 | |
CN202111194770.0 | 2021-10-13 | |
CN202111194770.0A (CN115623121B) | 2021-07-13 | 2021-10-13 | Call method, electronic device, chip system and storage medium

Publications (1)

Publication Number | Publication Date
WO2023284406A1 | 2023-01-19

Family ID: 84855470

Country Status (2)

Country | Link
CN (1) | CN115623121B (zh)
WO (1) | WO2023284406A1 (zh)

Also Published As

Publication Number | Publication Date
CN115623121A (zh) | 2023-01-17
CN115623121B (zh) | 2024-04-05


Legal Events

Code | Description
121 | EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22841046; Country of ref document: EP; Kind code of ref document: A1)
NENP | Non-entry into the national phase (Ref country code: DE)