WO2022228089A1 - Method for audio reception, apparatus, and related electronic device - Google Patents

Method for audio reception, apparatus, and related electronic device

Info

Publication number
WO2022228089A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
scene
designated
objects
microphone
Prior art date
Application number
PCT/CN2022/085899
Other languages
French (fr)
Chinese (zh)
Inventor
孙冉
韩博
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司
Publication of WO2022228089A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/10 Terrestrial scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers

Definitions

  • The present application relates to the technical field of sound pickup for electronic devices, and in particular to a sound pickup method, an apparatus, and an electronic device.
  • Smartphones and other electronic devices need to record audio when shooting video.
  • Some electronic devices are equipped with multiple microphones, which allows them to perform directional sound pickup.
  • Front-facing sound pickup is enabled by default to ensure that voices in front of the device are picked up clearly and are unaffected by ambient noise.
  • However, the directivity of current directional sound pickup cannot be adjusted.
  • When the front directional sound pickup function is turned on, it usually points directly in front of the camera, and the algorithm cannot be set to its strongest level. For example, when a smartphone live-streams using front-facing sound pickup, if the front-facing pickup algorithm is set to its strongest level, the front-facing pickup has the strongest directivity. When the user walks out of the frame, the smartphone can no longer pick up the user's voice, which is clearly not what the user expects. Therefore, because the directivity of front-facing pickup currently cannot be adjusted, the default directional pickup algorithm cannot be set to its strongest level, which also prevents users from enjoying the best and cleanest front directional pickup.
  • The embodiments of the present application provide a sound pickup method, an apparatus, and a related electronic device, which can perform sound pickup more effectively.
  • In a first aspect, an embodiment of the present application provides a sound pickup method, which is applied to an electronic device including a microphone.
  • The method includes: recognizing an image captured in real time and determining a scene corresponding to the image, where the scene is one of the preset scenes; and adjusting the sound pickup directivity of the microphone so that it matches the scene according to a preset rule.
  • The sound pickup method of this embodiment adjusts the sound pickup directivity of the microphone according to the scene corresponding to the image captured in real time; adjusting the directivity promptly once the pickup scene is determined helps improve the pickup quality and the user experience.
  • Recognizing the image captured in real time includes recognizing one or more of the following in the image: the number of designated objects, the clarity of the designated objects, the proportion of the designated objects in the image, and the position of the designated objects in the image.
  • The information for determining the preset scene includes one or more of the following: the number of designated objects in the image, the clarity of the designated objects, the proportion of the designated objects in the image, or the position of the designated objects in the image.
  • The designated object includes a face image, and recognizing the image captured in real time includes performing face recognition on the image.
  • The preset scenes include one or more of the following: a first scene, a second scene, a third scene, and a fourth scene.
  • The first scene: the image includes one or more designated objects, the designated objects are in the central area of the image, and the proportion of the designated objects in the central area of the image exceeds a first threshold.
  • The second scene: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the designated objects in the image exceeds a second threshold.
  • The third scene: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the designated objects in the image does not exceed a third threshold.
  • The fourth scene: the image does not include a designated object.
  • The preset rule includes: the sound pickup directivity corresponding to the first, second, third, and fourth scenes weakens in that order.
  • Before recognizing the image captured in real time, the method further includes: switching the shooting mode of the electronic device to the front mode or the rear mode.
  • Before recognizing the image captured in real time, the method further includes: determining a designated object in the image captured in real time.
  • Adjusting the sound pickup directivity of the microphone includes: when the shooting mode of the electronic device is the front mode, adjusting the front sound pickup directivity of the microphone; or, when the shooting mode of the electronic device is the rear mode, adjusting the rear sound pickup directivity of the microphone.
  • Determining a designated object in the image captured in real time includes: acquiring a user's click operation on the image, and determining the object of the click operation as the designated object.
  • Before determining the designated object in the image captured in real time, the method includes: taking a photo to obtain a first picture and determining the designated object from the first picture; or acquiring a second picture from a picture library and determining the designated object from the second picture; or determining the designated object according to a pre-acquired description of the designated object.
  • An embodiment of the present application provides a sound pickup apparatus, comprising a microphone and further comprising: a first processing unit configured to recognize an image captured in real time and determine a scene corresponding to the image, where the scene is one of the preset scenes; and an adjustment unit configured to adjust the sound pickup directivity of the microphone so that the front sound pickup directivity of the microphone matches the scene according to a preset rule.
  • The sound pickup apparatus of this embodiment adjusts the sound pickup directivity of the microphone according to the scene corresponding to the image captured in real time; adjusting the directivity promptly once the pickup scene is determined helps improve the pickup quality and the user experience.
  • The first processing unit is specifically configured to recognize one or more of the following in the image: the number of designated objects, the clarity of the designated objects, the proportion of the designated objects in the image, and the position of the designated objects in the image.
  • The information for determining the preset scene includes one or more of the following: the number of designated objects in the image, the clarity of the designated objects, the proportion of the designated objects in the image, or the position of the designated objects in the image.
  • The designated object includes a face image, and recognizing the image captured in real time includes performing face recognition on the image.
  • The preset scenes include one or more of the following: a first scene, a second scene, a third scene, and a fourth scene.
  • The first scene: the image includes one or more designated objects, the designated objects are in the central area of the image, and the proportion of the designated objects in the central area of the image exceeds a first threshold.
  • The second scene: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the designated objects in the image exceeds a second threshold.
  • The third scene: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the designated objects in the image does not exceed a third threshold.
  • The fourth scene: the image does not include a designated object.
  • The preset rule includes: the sound pickup directivity corresponding to the first, second, third, and fourth scenes weakens in that order.
  • The apparatus further includes a switching unit configured to switch the shooting mode of the electronic device to the front mode or the rear mode before the first processing unit recognizes the image captured in real time.
  • The first processing unit is further configured to determine a designated object in the image captured in real time before recognizing the image.
  • The adjustment unit is specifically configured to adjust the front sound pickup directivity of the microphone when the shooting mode of the electronic device is the front mode, or to adjust the rear sound pickup directivity of the microphone when the shooting mode of the electronic device is the rear mode.
  • In terms of determining a designated object in the image captured in real time, the first processing unit is specifically configured to acquire a user's click operation on the image and determine the object of the click operation as the designated object.
  • The apparatus further includes a second processing unit configured to, before the first processing unit determines the designated object in the image captured in real time: take a photo to obtain a first picture and determine the designated object from the first picture; or acquire a second picture from a picture library and determine the designated object from the second picture; or determine the designated object according to a pre-acquired description of the designated object.
  • Embodiments of the present application provide an electronic device, including a microphone, a memory, and a processor, where the memory stores preset scenes, a preset rule, and computer program code, and the computer program code includes instructions; when the instructions are executed by the processor, they cause the electronic device to execute the sound pickup method of the first aspect or of one or more of the possible implementations of the first aspect.
  • Embodiments of the present application provide a computer-readable storage medium that stores program code for execution by an electronic device; when the program code is executed, the electronic device executes the sound pickup method of the first aspect or of one or more of the possible implementations of the first aspect.
  • Embodiments of the present application provide a computer program product that, when run on a computer, causes the computer to execute the sound pickup method of the first aspect or of one or more of the possible implementations of the first aspect.
  • The front directivity of the microphone can be adjusted according to the scene corresponding to the image captured in real time; adjusting the directivity promptly once the sound pickup scene is determined helps improve the pickup quality and the user experience.
  • FIG. 1 is a schematic flowchart of a method for collecting audio according to an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a voice pickup method provided by another embodiment of the present application.
  • FIG. 3A is a schematic diagram of an image corresponding to a scene in an embodiment of the present application.
  • FIG. 3B is a schematic diagram of an image corresponding to a scene in another embodiment of the present application.
  • FIG. 3C is a schematic diagram of an image corresponding to a scene in another embodiment of the present application.
  • FIG. 3D is a schematic diagram of an image corresponding to a scene in another embodiment of the present application.
  • FIG. 4A is a schematic diagram of a simulation of a sound collection effect in an embodiment of the present application.
  • FIG. 4B is a schematic diagram of simulation of a sound collection effect in another embodiment of the present application.
  • FIG. 4C is a schematic diagram of simulation of a sound collection effect in another embodiment of the present application.
  • FIG. 4D is a schematic diagram of simulation of a sound collection effect in another embodiment of the present application.
  • FIG. 5A is a schematic diagram of the range and intensity of prompt sound collection according to an embodiment of the present application.
  • FIG. 5B is a schematic diagram of the range and intensity of prompt sound collection in another embodiment of the present application.
  • FIG. 5C is a schematic diagram of the range and intensity of prompt sound collection in another embodiment of the present application.
  • FIG. 5D is a schematic diagram of the range and intensity of prompt sound collection in another embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a sound pickup device according to an embodiment of the present application.
  • FIG. 7A is a schematic structural diagram of a sound pickup device according to another embodiment of the present application.
  • FIG. 7B is a schematic structural diagram of a sound pickup device according to another embodiment of the present application.
  • FIG. 8 is a schematic structural diagram of an electronic device according to another embodiment of the present application.
  • the electronic device may be a terminal with a microphone, such as a smart phone with a microphone, a portable wearable device (such as a smart watch, etc.), or a tablet computer.
  • the user is sometimes required to operate the electronic device.
  • the solution usually adopted in the prior art is that when the recording mode is switched to the front mode, the front directivity of the microphone is fixed.
  • A fixed front directivity means that parameters such as the range of sound received and the sound intensity are fixed; the front directivity is not adjusted for different scenes, such as how far away the people are or where they appear in the image.
  • As a result, existing sound pickup methods are prone to defects such as unclear pickup and high noise.
  • In the sound pickup method disclosed in the present application, in the front mode, the image captured in real time is recognized, the scene corresponding to the image is determined, and the front directivity of the microphone is adjusted according to that scene, which helps improve the pickup quality and the user experience.
  • FIG. 1 is a schematic flowchart of a method for collecting audio according to an embodiment of the present application.
  • the method may include the following steps.
  • Step 101 Identify a real-time captured image, and determine a scene corresponding to the image; the scene is one of preset scenes.
  • Recognizing an image captured in real time includes recognizing one or more of the following in the image: the number of designated objects, the clarity of the designated objects, the proportion of the designated objects in the image, the position of the designated objects in the image, and so on.
  • The designated object may be a face image. For example, if the captured image includes one face, one face image is identified in this embodiment. In other embodiments, if the captured image includes two faces, two face images are identified. If there is no face image in the image, the number of identified face images is 0.
  • The clarity of the recognized face image can also be evaluated. For example, the designated object can be judged to be clear or unclear; alternatively, multiple clarity levels can be set, and the clarity of the designated objects is determined from the actual image.
  • The proportion of the designated object in the image can be identified. For example, based on the area the designated object occupies in the image, the proportion of the face image may be identified as 30%, 10%, 0, 50%, and so on.
  • The proportion of the designated object in the central area of the image can also be identified. A certain area near the center point of the image can be determined as the central area; for example, a circular area around the center point with a certain radius is determined as the central area, and then the proportion of the designated object in the central area is determined. It is understood that the central area may also be defined in other ways.
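The two measurements described above can be sketched as follows. This is only an illustration under stated assumptions: the `Box` type, the function names, and the use of an axis-aligned bounding box are hypothetical, and the "in the central area" test is simplified to checking whether the box center falls inside the circle. The source does not specify how either quantity is computed.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned bounding box of a designated object, in pixels."""
    left: float
    top: float
    width: float
    height: float

    @property
    def area(self) -> float:
        return self.width * self.height

    @property
    def center(self) -> Tuple[float, float]:
        return (self.left + self.width / 2, self.top + self.height / 2)


def proportion_in_image(box: Box, image_w: float, image_h: float) -> float:
    """Fraction of the whole image area covered by the bounding box."""
    return box.area / (image_w * image_h)


def in_central_area(box: Box, image_w: float, image_h: float, radius: float) -> bool:
    """Simplified test: is the box center inside the circle around the image center?"""
    cx, cy = image_w / 2, image_h / 2
    bx, by = box.center
    return math.hypot(bx - cx, by - cy) <= radius
```

For a 1280x720 frame, a 480x576 box centered in the frame covers 30% of the image and passes the central-area test for any non-negative radius.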
  • The position of the designated object in the image may also be identified. The position may be: the center, the upper part, the lower part, the left part, the right part, the upper left, the lower left, the upper right, or the lower right of the image, and so on.
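One simple way to produce such position labels is to divide the image into a 3x3 grid and look up the cell containing the object's center. The grid approach, the function name `locate`, and the label strings are assumptions for illustration; the source only lists the possible positions.

```python
# Labels for a 3x3 grid, indexed as POSITION_LABELS[row][col].
# Assumes image coordinates with the origin at the top-left and y growing downward.
POSITION_LABELS = [
    ["upper left", "upper", "upper right"],
    ["left", "center", "right"],
    ["lower left", "lower", "lower right"],
]


def locate(center_x: float, center_y: float, image_w: float, image_h: float) -> str:
    """Return the grid label for a point with 0 <= x <= image_w and 0 <= y <= image_h."""
    col = min(int(3 * center_x / image_w), 2)  # clamp so x == image_w stays in the last column
    row = min(int(3 * center_y / image_h), 2)  # clamp so y == image_h stays in the last row
    return POSITION_LABELS[row][col]
```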
  • Step 102 Adjust the sound pickup directivity of the microphone, and match the sound pickup directivity of the microphone with the scene according to a preset rule.
  • the preset scene may include one or more of the following scenes: a first scene, a second scene, a third scene, and a fourth scene.
  • the first scene includes: the image includes one or more designated objects, the designated objects are in a central area of the image, and the proportion of the designated objects in the central area of the image exceeds a first threshold.
  • For example, the designated object may be a face image. As shown in FIG. 3A, recognition determines that the image includes one designated object located in the central area of the image, and the first threshold may be 30%.
  • The central area may lie within the square area marked by the dotted line in FIG. 3A; for example, it may be a circular area centered at the center of the square with a radius of one quarter of the square's side length. It can be understood that the first threshold and the central area can be fixed, or set and adjusted as needed.
  • the second scenario includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the designated objects in the image exceeds a second threshold.
  • For example, the designated object may be a face image. As shown in FIG. 3B, recognition determines that the image includes two designated objects that are not located in the central area of the image; the second threshold may be 28%, and the two designated objects in FIG. 3B occupy more than 28% of the image.
  • The third scene includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the designated objects in the image does not exceed a third threshold. For example, the designated object may be a face image. As shown in FIG. 3C, recognition determines that the image includes two designated objects that are not located in the central area of the image; the third threshold may be 15%, and the two designated objects in FIG. 3C occupy less than 15% of the image.
  • the fourth scene includes: the specified object is not included in the image, as shown in FIG. 3D.
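The four scene definitions can be read as a small decision procedure. The sketch below uses the example thresholds from the text (30%, 28%, 15%); the `Scene` enum, the function name, and its parameters are hypothetical. The text does not say which scene applies when an object is in the central area but below the first threshold, or off-center with a proportion between the third and second thresholds, so the sketch returns `None` in those cases rather than guessing.

```python
from enum import IntEnum
from typing import Optional


class Scene(IntEnum):
    FIRST = 1
    SECOND = 2
    THIRD = 3
    FOURTH = 4


def classify_scene(
    num_objects: int,
    in_central_area: bool,
    proportion_in_center: float,
    proportion_in_image: float,
    first_threshold: float = 0.30,   # example value from the text
    second_threshold: float = 0.28,  # example value from the text
    third_threshold: float = 0.15,   # example value from the text
) -> Optional[Scene]:
    """Map recognition results to one of the preset scenes, or None if undefined."""
    if num_objects == 0:
        return Scene.FOURTH  # no designated object in the image
    if in_central_area:
        if proportion_in_center > first_threshold:
            return Scene.FIRST
        return None  # case not covered by the scene definitions
    if proportion_in_image > second_threshold:
        return Scene.SECOND
    if proportion_in_image <= third_threshold:
        return Scene.THIRD
    return None  # off-center, between the third and second thresholds
```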
  • The preset rule may include: the front sound pickup directivity corresponding to the first, second, third, and fourth scenes weakens in that order; the pickup effect is shown in FIG. 4A to FIG. 4D.
  • In these figures, the smartphone is placed at the center, perpendicular to the plane of the figure; 0 degrees is the direction the rear camera faces, and 180 degrees is the direction the front camera faces.
  • Black indicates the range where sound is picked up, and gray indicates the range where no sound is picked up.
  • The preset rule includes: the sound pickup directivity corresponding to the first, second, third, and fourth scenes weakens in that order.
  • the sound pickup method adopted in the embodiment of the present application can adjust the sound pickup directivity of the microphone according to the scene corresponding to the captured real-time image, and the timely adjustment of the directivity by determining the sound pickup scene is beneficial to improve the sound pickup effect and user experience.
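The text does not say how "stronger" or "weaker" directivity is realized. As one possible illustration only, the sketch below models each scene with a textbook first-order pickup pattern, gain(θ) = a + (1 − a)·cos θ, whose directivity factor is Q = 1 / (a² + (1 − a)²/3), and checks that a hypothetical scene-to-pattern table weakens monotonically from the first scene to the fourth. The table values and all names are assumptions, not the patent's rule.

```python
# Hypothetical mapping from scene number to the first-order pattern parameter a.
# a = 0.25 is a hypercardioid, a = 0.5 a cardioid, a = 1.0 omnidirectional.
PATTERN_BY_SCENE = {1: 0.25, 2: 0.5, 3: 0.75, 4: 1.0}


def directivity_factor(a: float) -> float:
    """Directivity factor Q of the pattern a + (1 - a) * cos(theta)."""
    b = 1.0 - a
    return 1.0 / (a * a + b * b / 3.0)


def rule_is_weakening(patterns: dict) -> bool:
    """True if Q strictly decreases from the first scene to the last."""
    factors = [directivity_factor(patterns[scene]) for scene in sorted(patterns)]
    return all(earlier > later for earlier, later in zip(factors, factors[1:]))
```

With this table the directivity factor runs from 4 (first scene) through 3 (second scene) down to 1 (fourth scene, no directional preference), which satisfies the "weakens in turn" ordering.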
  • The method further includes: switching the shooting mode of the electronic device to the front mode or the rear mode.
  • Before recognizing the image captured in real time, the method further includes: determining a designated object in the image captured in real time.
  • Adjusting the sound pickup directivity of the microphone includes: when the shooting mode of the electronic device is the front mode, adjusting the front sound pickup directivity of the microphone; when the shooting mode of the electronic device is the rear mode, adjusting the rear sound pickup directivity of the microphone.
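The mode-dependent choice of pickup direction can be sketched with the angle convention used for FIG. 4A to FIG. 4D, where 0 degrees faces the rear camera and 180 degrees faces the front camera. The enum, the table, and the helper functions are hypothetical names introduced for illustration.

```python
from enum import Enum


class ShootingMode(Enum):
    FRONT = "front"
    REAR = "rear"


# Angle convention from the text: 0 degrees faces the rear camera,
# 180 degrees faces the front camera.
STEERING_ANGLE_DEG = {ShootingMode.FRONT: 180.0, ShootingMode.REAR: 0.0}


def steering_angle(mode: ShootingMode) -> float:
    """Direction (in degrees) whose pickup directivity is adjusted in this mode."""
    return STEERING_ANGLE_DEG[mode]


def angular_offset(mode: ShootingMode, source_deg: float) -> float:
    """Smallest absolute angle between the steering direction and a sound source."""
    diff = (source_deg - steering_angle(mode)) % 360.0
    return min(diff, 360.0 - diff)
```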
  • Determining the designated object in the image captured in real time includes: acquiring a user's click operation on the image, and determining the object of the click operation as the designated object. For example, if a picture captured in real time includes a designated object, the user can click to confirm it; the current scene is then determined from the size, clarity, position, and other information of the designated object in the picture, and the sound pickup directivity of the microphone is adjusted according to the scene. The scenes and the corresponding sound pickup directivity of the microphone can be preset as needed to better pick up sound from the designated object.
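Resolving a click to a designated object can be sketched as a hit test against detected bounding boxes. The `DetectedObject` type and the tie-breaking choice (prefer the smallest box when several contain the click) are assumptions for illustration; the text only states that the clicked object becomes the designated object.

```python
from typing import NamedTuple, Optional, Sequence


class DetectedObject(NamedTuple):
    """Hypothetical detection result: a label plus a bounding box in pixels."""
    label: str
    left: float
    top: float
    width: float
    height: float


def object_at_click(
    x: float, y: float, objects: Sequence[DetectedObject]
) -> Optional[DetectedObject]:
    """Return the smallest detected object whose box contains the click, if any."""
    hits = [
        o for o in objects
        if o.left <= x <= o.left + o.width and o.top <= y <= o.top + o.height
    ]
    # Assumption: when boxes overlap, the smallest one is the likeliest target.
    return min(hits, key=lambda o: o.width * o.height, default=None)
```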
  • the method may further include: taking a picture to obtain a first picture, and determining the designated object from the first picture.
  • The method may further include: acquiring a picture by other means, such as from a gallery, from another electronic device, or through a screenshot or other operation on the electronic device, and selecting the designated object in the acquired picture.
  • The method may further include: determining the designated object according to a description of the designated object.
  • The designated object can be described by text or voice input, for example: a person wearing a hat, a person wearing a necklace, a person wearing a dress, or a person holding a microphone. It is understood that the designated object is not limited to a person; it can also be something else that can make sounds, such as a robot or a small animal (a kitten, puppy, bird, etc.). The description of the designated object is used to identify whether the image captured in real time includes the designated object.
  • FIG. 2 is a schematic flowchart of a method for collecting audio according to another embodiment of the present application.
  • the method may include the following steps.
  • Step 201 Switch the shooting mode of the electronic device to the front mode.
  • the shooting mode can be switched through a hardware switch, and the shooting mode of the electronic device can also be switched to the front mode by operating a virtual key in the display interface.
  • Step 202 when the shooting mode is the front mode, identify the image captured in real time, and determine the scene corresponding to the image; the scene is one of the preset scenes.
  • Recognizing an image captured in real time includes recognizing one or more of the following in the image: the number of designated objects, the clarity of the designated objects, the proportion of the designated objects in the image, the position of the designated objects in the image, and so on.
  • The designated object may be a face image. For example, if the captured image includes one face, one face image is identified in this embodiment. In other embodiments, if the captured image includes two faces, two face images are identified. If there is no face image in the image, the number of identified face images is 0.
  • The clarity of the recognized face image can also be evaluated. For example, the designated object can be judged to be clear or unclear; alternatively, multiple clarity levels can be set, and the clarity of the designated objects is determined from the actual image.
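The text does not name a clarity measure. One common heuristic, used here purely as a stand-in, is the variance of the Laplacian response over a grayscale patch, bucketed into levels. The thresholds, level meanings, and function names below are hypothetical.

```python
from statistics import pvariance
from typing import Sequence


def laplacian_variance(gray: Sequence[Sequence[float]]) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels (needs >= 3x3)."""
    height, width = len(gray), len(gray[0])
    responses = [
        gray[y - 1][x] + gray[y + 1][x] + gray[y][x - 1] + gray[y][x + 1]
        - 4 * gray[y][x]
        for y in range(1, height - 1)
        for x in range(1, width - 1)
    ]
    return pvariance(responses)


def clarity_level(score: float, thresholds: Sequence[float] = (10.0, 100.0)) -> int:
    """Count how many (hypothetical) thresholds the score reaches: 0 = unclear."""
    return sum(score >= t for t in thresholds)
```

A uniformly gray patch has zero Laplacian variance and lands in the lowest level, while a high-contrast checkerboard lands in the highest.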
  • The proportion of the designated object in the image can be identified. For example, based on the area the designated object occupies in the image, the proportion of the face image may be identified as 30%, 10%, 0, 50%, and so on.
  • The proportion of the designated object in the central area of the image can also be identified. A certain area near the center point of the image can be determined as the central area; for example, a circular area around the center point with a certain radius is determined as the central area, and then the proportion of the designated object in the central area is determined. It is understood that the central area may also be defined in other ways.
  • The position of the designated object in the image may also be identified. The position may be: the center, the upper part, the lower part, the left part, the right part, the upper left, the lower left, the upper right, or the lower right of the image, and so on.
  • Step 203 Adjust the front directivity of the microphone so that it matches the scene according to a preset rule.
  • the preset scene may include one or more of the following scenes: a first scene, a second scene, a third scene, and a fourth scene.
  • the first scene includes: the image includes one or more designated objects, the designated objects are in a central area of the image, and the proportion of the designated objects in the central area of the image exceeds a first threshold.
  • For example, the designated object may be a face image. As shown in FIG. 3A, recognition determines that the image includes one designated object located in the central area of the image, and the first threshold may be 30%.
  • The central area may lie within the square area marked by the dotted line in FIG. 3A; for example, it may be a circular area centered at the center of the square with a radius of one quarter of the square's side length. It can be understood that the first threshold and the central area can be fixed, or set and adjusted as needed.
  • the second scenario includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the designated objects in the image exceeds a second threshold.
  • For example, the designated object may be a face image. As shown in FIG. 3B, recognition determines that the image includes two designated objects that are not located in the central area of the image; the second threshold may be 28%, and the two designated objects in FIG. 3B occupy more than 28% of the image.
  • The third scene includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the designated objects in the image does not exceed a third threshold. For example, the designated object may be a face image. As shown in FIG. 3C, recognition determines that the image includes two designated objects that are not located in the central area of the image; the third threshold may be 15%, and the two designated objects in FIG. 3C occupy less than 15% of the image.
  • the fourth scene includes: the specified object is not included in the image, as shown in FIG. 3D.
  • the preset rules may include: the radio front directivity corresponding to the first scene, the second scene, the third scene and the fourth scene respectively weakens sequentially, and the radio effect is shown in FIG. 4A to FIG. 4D .
  • the smartphone is placed at the center of the screen perpendicular to the image, 0 degrees is the direction facing the rear camera, and 180 degrees is the facing direction of the front camera.
  • the black refers to the range where sound is picked up
  • the gray refers to the range where no sound is picked up.
  • the sound collection method adopted in the embodiment of the present application can adjust the forward direction directivity of the microphone according to the scene corresponding to the captured real-time image, and the timely adjustment of the directivity by determining the sound collection scene is beneficial to improve the sound collection effect and user experience.
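The four preset scenes can be read as a small decision procedure. The Python sketch below is one illustrative reading, not the implementation of the application: the `Detection` fields, the function name `classify_scene`, the rule that an image counts as centered if any designated object lies in the central area, and the summing of proportions over several objects are all assumptions. The thresholds default to the example values of 30%, 28% and 15%. Combinations that the text does not assign to any scene return `None`.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Scene(Enum):
    FIRST = 1   # designated object in the central area, large share of that area
    SECOND = 2  # designated objects off-center, large share of the whole image
    THIRD = 3   # designated objects off-center, small share of the whole image
    FOURTH = 4  # no designated object in the image


@dataclass
class Detection:
    """Hypothetical per-object recognition result (not from the application)."""
    in_central_area: bool  # True if the object lies in the central area
    central_ratio: float   # share of the central area covered by the object
    image_ratio: float     # share of the whole image covered by the object


def classify_scene(dets: List[Detection],
                   t1: float = 0.30,
                   t2: float = 0.28,
                   t3: float = 0.15) -> Optional[Scene]:
    """Map recognition results to one of the preset scenes."""
    if not dets:
        return Scene.FOURTH
    centered = [d for d in dets if d.in_central_area]
    if centered and sum(d.central_ratio for d in centered) > t1:
        return Scene.FIRST
    total = sum(d.image_ratio for d in dets)
    if not centered and total > t2:
        return Scene.SECOND
    if not centered and total <= t3:
        return Scene.THIRD
    # The text does not say which scene applies here (for example an
    # off-center object between the third and second thresholds).
    return None
```

A caller would decide separately how to treat the `None` case, for example by keeping the previous directivity.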
  • FIG. 6 is a schematic structural diagram of a sound pickup apparatus 600 according to an embodiment of the present application.
  • The sound pickup apparatus 600 includes a microphone 601, a first processing unit 602 and an adjustment unit 603.
  • The first processing unit 602 recognizes an image captured in real time and determines the scene corresponding to the image; the scene is one of the preset scenes.
  • Recognizing the image captured in real time includes recognizing one or more of the following items of information in the image: the number of designated objects, the clarity of the designated objects, the proportion of the image occupied by the designated objects, and the position of the designated objects in the image.
  • The designated object may be a face image. For example, if the captured image includes one human face, one face image is recognized; if the captured image includes two human faces, two face images are recognized; if the image contains no face, the number of recognized face images is 0.
  • The clarity of the recognized face image can also be evaluated. For example, the designated object may be judged clear or unclear, or multiple clarity levels may be set and the clarity of the designated objects determined from the actual image.
  • The proportion of the image occupied by the designated object can be recognized. For example, based on the size of the area that the designated object occupies in the image, the proportion of the face image may be recognized as 30%, 10%, 0 or 50%.
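As a rough illustration of the area-based proportion, the sketch below divides the total bounding-box area of the designated objects by the image area. Representing each object as a `(left, top, right, bottom)` pixel tuple, the name `image_ratio`, and the assumption that the boxes do not overlap are illustrative choices, not details from the application.

```python
from typing import Iterable, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom), in pixels


def image_ratio(boxes: Iterable[Box], image_size: Tuple[int, int]) -> float:
    """Share of the image covered by the boxes (assumed non-overlapping)."""
    width, height = image_size
    covered = sum(max(0, right - left) * max(0, bottom - top)
                  for left, top, right, bottom in boxes)
    return covered / (width * height)
```

With overlapping boxes this sum would count the shared region twice, so a real detector output might need a union-area computation instead.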
  • The proportion of the designated object within the central area of the image can also be recognized, where a certain area near the center point of the image is taken as the central area.
  • For example, a circular area of a given radius is taken as the central area, and the proportion of the designated object within the central area is then determined. It can be understood that the central area may also be defined in other ways.
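One way to estimate the proportion of a circular central area covered by a designated object is to sample pixel centers. The sketch below assumes the circle is centered on the image center, that the object is given as a `(left, top, right, bottom)` bounding box, and that a per-pixel count is accurate enough; the function name `central_area_ratio` is hypothetical.

```python
from typing import Tuple


def central_area_ratio(box: Tuple[float, float, float, float],
                       image_size: Tuple[int, int],
                       radius: float) -> float:
    """Approximate share of the circular central area covered by box."""
    width, height = image_size
    cx, cy = width / 2.0, height / 2.0
    left, top, right, bottom = box
    circle_pixels = 0
    covered_pixels = 0
    for py in range(height):
        for px in range(width):
            # Use the pixel center as the sample point.
            x, y = px + 0.5, py + 0.5
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                circle_pixels += 1
                if left <= x < right and top <= y < bottom:
                    covered_pixels += 1
    return covered_pixels / circle_pixels if circle_pixels else 0.0
```

For a 100 by 100 pixel square, a radius of 25 corresponds to the quarter-side-length example given for FIG. 3A.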
  • The position of the designated object in the image may also be recognized. The position may be: at the center of the image, above, below, to the left, to the right, at the top left, at the bottom left, at the top right, at the bottom right, and so on.
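The nine positions listed above can be obtained, for instance, by dividing the image into a 3 by 3 grid and looking up the cell that contains the center of the designated object. The equal-thirds grid and the name `locate` are assumptions made for this sketch; the application does not state how the regions are delimited.

```python
from typing import Tuple

# Row-major 3x3 layout; image coordinates with y growing downward.
_REGIONS = [
    ["top left", "above", "top right"],
    ["left", "center", "right"],
    ["bottom left", "below", "bottom right"],
]


def locate(object_center: Tuple[float, float],
           image_size: Tuple[int, int]) -> str:
    """Name the grid cell that contains the object's center point."""
    x, y = object_center
    width, height = image_size
    col = min(int(3 * x / width), 2)   # clamp points on the right edge
    row = min(int(3 * y / height), 2)  # clamp points on the bottom edge
    return _REGIONS[row][col]
```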
  • The adjustment unit 603 is configured to adjust the front sound pickup directivity of the microphone 601 so that it matches the scene according to a preset rule.
  • The preset scenes may include one or more of the following scenes: a first scene, a second scene, a third scene, and a fourth scene.
  • The first scene includes: the image includes one or more designated objects, the designated objects are in the central area of the image, and the proportion of the central area occupied by the designated objects exceeds a first threshold.
  • For example, the designated object may be a face image. As shown in FIG. 3A, recognition determines that the image includes one designated object and that the designated object is located in the central area of the image; the first threshold may be 30%.
  • The central area may lie within the square region marked by the dotted line in FIG. 3A; for example, it may be a circular area centered on the center of the square with a radius of one quarter of the square's side length. It can be understood that the first threshold and the central area may be fixed, or may be set and adjusted as needed.
  • The second scene includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the image occupied by the designated objects exceeds a second threshold.
  • For example, the designated object may be a face image. As shown in FIG. 3B, recognition determines that the image includes two designated objects that are not located in the central area of the image; the second threshold may be 28%, and the two designated objects in FIG. 3B occupy more than 28% of the image.
  • The third scene includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the image occupied by the designated objects does not exceed a third threshold. For example, the designated object may be a face image. As shown in FIG. 3C, recognition determines that the image includes two designated objects that are not located in the central area of the image; the third threshold may be 15%, and the two designated objects in FIG. 3C occupy less than 15% of the image.
  • The fourth scene includes: the image does not include a designated object, as shown in FIG. 3D.
  • The preset rule may include: the front sound pickup directivity corresponding to the first scene, the second scene, the third scene and the fourth scene weakens in that order; the resulting pickup patterns are shown in FIG. 4A to FIG. 4D.
  • In these figures, the smartphone is placed at the center with its screen perpendicular to the plane of the figure; 0 degrees is the direction the rear camera faces, and 180 degrees is the direction the front camera faces.
  • Black marks the range in which sound is picked up.
  • Gray marks the range in which no sound is picked up.
  • The preset rule includes: the sound pickup directivity corresponding to the first scene, the second scene, the third scene and the fourth scene weakens in that order.
  • The sound pickup apparatus of the embodiments of the present application can adjust the front sound pickup directivity of the microphone according to the scene corresponding to the image captured in real time; adjusting the directivity promptly once the pickup scene is determined helps improve the sound pickup effect and the user experience.
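To make the preset rule concrete, the sketch below models directivity with a cardioid-family pattern, gain = ((1 + cos θ) / 2)^n, where a larger exponent n gives a narrower lobe and n = 0 is omnidirectional. This pattern family, the exponent values in `DIRECTIVITY_ORDER`, and measuring the angle from the pickup axis are all assumptions chosen for illustration; the application only states that directivity weakens from the first scene to the fourth.

```python
import math

# Hypothetical mapping: scene number -> pattern exponent (larger = more directive).
DIRECTIVITY_ORDER = {1: 4, 2: 2, 3: 1, 4: 0}


def pickup_gain(scene: int, angle_deg: float) -> float:
    """Linear gain for a source at angle_deg away from the pickup axis."""
    n = DIRECTIVITY_ORDER[scene]
    base = (1.0 + math.cos(math.radians(angle_deg))) / 2.0
    return base ** n
```

Under this model, a source on the pickup axis always has gain 1, while a source at 90 degrees is attenuated most in the first scene and not at all in the fourth.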
  • The sound pickup apparatus 700 may further include a switching unit 701 for switching the shooting mode of the electronic device to the front mode or the rear mode.
  • The first processing unit 702 may also be configured to determine a designated object in the image captured in real time before recognizing that image.
  • The adjustment unit 703 is specifically configured to adjust the front sound pickup directivity of the microphone 704 when the shooting mode of the electronic device is the front mode, and to adjust the rear sound pickup directivity of the microphone 704 when the shooting mode of the electronic device is the rear mode.
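The choice between front and rear directivity can be sketched as picking the pickup axis from the shooting mode. The angles follow the convention used for FIG. 4A to FIG. 4D (0 degrees faces the rear camera side, 180 degrees the front camera side); the mode strings and the function names are hypothetical.

```python
def pickup_axis_deg(shooting_mode: str) -> float:
    """Direction of the main pickup lobe for the given shooting mode."""
    if shooting_mode == "front":
        return 180.0  # direction the front camera faces
    if shooting_mode == "rear":
        return 0.0    # direction the rear camera faces
    raise ValueError(f"unknown shooting mode: {shooting_mode!r}")


def angle_from_axis(source_deg: float, shooting_mode: str) -> float:
    """Smallest angle, in degrees, between a source and the pickup axis."""
    diff = abs(source_deg - pickup_axis_deg(shooting_mode)) % 360.0
    return min(diff, 360.0 - diff)
```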
  • The first processing unit 702 is specifically configured to acquire a user's click operation on the image and determine the object of the click operation as the designated object. For example, if an image captured in real time includes a designated object, the user can click to confirm the designated object; the current scene is then determined from information such as the size, clarity and position of the designated object in the image, and the sound pickup directivity of the microphone is adjusted according to the scene.
  • The scenes and the corresponding sound pickup directivity of the microphone can be preset as needed, so as to better pick up the sound from the designated object.
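A click can be resolved to a designated object by testing which detected bounding box contains the click point. In the sketch below, the box format `(left, top, right, bottom)`, the name `object_at_click`, and the tie-break that prefers the smallest containing box are assumptions, not details from the application.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (left, top, right, bottom)


def object_at_click(click: Tuple[float, float],
                    boxes: List[Box]) -> Optional[Box]:
    """Return the smallest box containing the click, or None if none does."""
    x, y = click
    hits = [b for b in boxes if b[0] <= x < b[2] and b[1] <= y < b[3]]
    if not hits:
        return None
    return min(hits, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))
```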
  • The sound pickup apparatus 700 may further include a second processing unit 705, configured to, before the first processing unit determines the designated object in the image captured in real time, take a photo to obtain a first picture and determine the designated object from the first picture.
  • The second processing unit 705 may also be configured to acquire pictures through other channels, such as from a gallery, from other electronic devices, or through operations on the electronic device such as taking a screenshot, and to select the designated object from the acquired picture.
  • The second processing unit 705 may also be configured to determine the designated object according to a description of the designated object.
  • A designated object can be described by text or by voice, for example: a person wearing a hat, a person wearing a necklace, a person wearing a dress, a person holding a microphone, a square object, a bird, and so on.
  • It can be understood that the designated object is not limited to people; it can also be anything else that can make sound, such as a robot or a small animal (a kitten, a puppy, a bird, etc.). Recognition then determines whether the image includes the designated object.
  • The sound pickup method of this embodiment can adjust the sound pickup directivity of the microphone according to the scene corresponding to the image captured in real time; adjusting the directivity promptly once the pickup scene is determined helps improve the sound pickup effect and the user experience.
  • FIG. 7A is a schematic structural diagram of a sound pickup apparatus 700 according to an embodiment of the present application.
  • The sound pickup apparatus 700 includes a switching unit 701, a first processing unit 702, an adjustment unit 703 and a microphone 704.
  • The switching unit 701 is configured to switch the shooting mode of the electronic device to the front mode before the first processing unit 702 recognizes the image captured in real time.
  • The shooting mode can be switched through a hardware switch, or switched to the front mode by operating a virtual key in the display interface.
  • When the shooting mode is the front mode, the first processing unit 702 recognizes the image captured in real time and determines the scene corresponding to the image; the scene is one of the preset scenes.
  • The first processing unit 702 is specifically configured to recognize one or more of the following items of information in the image: the number of designated objects, the clarity of the designated objects, the proportion of the image occupied by the designated objects, and the position of the designated objects in the image.
  • The designated object may be a face image. For example, if the captured image includes one human face, one face image is recognized; if the captured image includes two human faces, two face images are recognized; if the image contains no face, the number of recognized face images is 0.
  • The clarity of the recognized face image can also be evaluated. For example, the designated object may be judged clear or unclear, or multiple clarity levels may be set and the clarity of the designated objects determined from the actual image.
  • The proportion of the image occupied by the designated object can be recognized. For example, based on the size of the area that the designated object occupies in the image, the proportion of the face image may be recognized as 30%, 10%, 0 or 50%.
  • The proportion of the designated object within the central area of the image can also be recognized, where a certain area near the center point of the image is taken as the central area.
  • For example, a circular area of a given radius is taken as the central area, and the proportion of the designated object within the central area is then determined. It can be understood that the central area may also be defined in other ways.
  • The position of the designated object in the image may also be recognized. The position may be: at the center of the image, above, below, to the left, to the right, at the top left, at the bottom left, at the top right, at the bottom right, and so on.
  • The adjustment unit 703 is configured to adjust the front sound pickup directivity of the microphone 704 so that it matches the scene according to a preset rule.
  • The preset scenes may include one or more of the following scenes: a first scene, a second scene, a third scene, and a fourth scene.
  • The first scene includes: the image includes one or more designated objects, the designated objects are in the central area of the image, and the proportion of the central area occupied by the designated objects exceeds a first threshold.
  • For example, the designated object may be a face image. As shown in FIG. 3A, recognition determines that the image includes one designated object and that the designated object is located in the central area of the image; the first threshold may be 30%.
  • The central area may lie within the square region marked by the dotted line in FIG. 3A; for example, it may be a circular area centered on the center of the square with a radius of one quarter of the square's side length. It can be understood that the first threshold and the central area may be fixed, or may be set and adjusted as needed.
  • The second scene includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the image occupied by the designated objects exceeds a second threshold.
  • For example, the designated object may be a face image. As shown in FIG. 3B, recognition determines that the image includes two designated objects that are not located in the central area of the image; the second threshold may be 28%, and the two designated objects in FIG. 3B occupy more than 28% of the image.
  • The third scene includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the image occupied by the designated objects does not exceed a third threshold. For example, the designated object may be a face image. As shown in FIG. 3C, recognition determines that the image includes two designated objects that are not located in the central area of the image; the third threshold may be 15%, and the two designated objects in FIG. 3C occupy less than 15% of the image.
  • The fourth scene includes: the image does not include a designated object, as shown in FIG. 3D.
  • The preset rule may include: the front sound pickup directivity corresponding to the first scene, the second scene, the third scene and the fourth scene weakens in that order; the resulting pickup patterns are shown in FIG. 4A to FIG. 4D.
  • In these figures, the smartphone is placed at the center with its screen perpendicular to the plane of the figure; 0 degrees is the direction the rear camera faces, and 180 degrees is the direction the front camera faces.
  • Black marks the range in which sound is picked up.
  • Gray marks the range in which no sound is picked up.
  • The sound pickup apparatus of the embodiments of the present application can adjust the front sound pickup directivity of the microphone according to the scene corresponding to the image captured in real time; adjusting the directivity promptly once the pickup scene is determined helps improve the sound pickup effect and the user experience.
  • FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
  • The electronic device 800 includes a radio frequency unit 810, a memory 820, an input unit 830, a camera 840, an audio circuit 850, a processor 860, an external interface 870 and a power supply 880.
  • The input unit 830 includes a touch screen 831 and other input devices 832.
  • The audio circuit 850 includes a speaker 851, a microphone 852 and an earphone jack 853.
  • The touch screen 831 may be a display screen with a touch function.
  • The user can switch the shooting mode of the electronic device 800 to the front mode or the rear mode by clicking the switch button displayed on the touch screen 831.
  • After switching to the front mode, the electronic device may be shooting video, in a video call, webcasting, and so on.
  • The processor 860 recognizes the image captured in real time and determines the scene corresponding to the image; the scene is one of the preset scenes, and as the captured video changes, the processor 860 can identify different scenes in real time.
  • The processor 860 adjusts the front sound pickup directivity of the microphone 852 so that it matches the scene according to a preset rule.
  • In this way, the sound pickup directivity of the microphone can be adjusted according to the scene corresponding to the image captured in real time; adjusting the directivity promptly once the pickup scene is determined helps improve the sound pickup effect and the user experience.
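The real-time behavior described for the processor 860 can be pictured as a loop over captured frames that re-applies the directivity only when the recognized scene changes. The sketch below is a minimal illustration under that assumption; `run_pickup_loop`, `classify` and `set_directivity` are hypothetical names, and the two callbacks stand in for the recognition and adjustment steps.

```python
from typing import Callable, Iterable, Optional, TypeVar

Frame = TypeVar("Frame")
SceneT = TypeVar("SceneT")


def run_pickup_loop(frames: Iterable[Frame],
                    classify: Callable[[Frame], SceneT],
                    set_directivity: Callable[[SceneT], None]) -> Optional[SceneT]:
    """Classify each frame and adjust the microphone when the scene changes."""
    current = None
    for frame in frames:
        scene = classify(frame)
        if scene != current:
            set_directivity(scene)  # apply the preset rule for the new scene
            current = scene
    return current
```

Skipping the adjustment while the scene is unchanged is a design choice of this sketch; a real device might also smooth transitions to avoid audible jumps.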
  • Embodiments of the present application further provide an electronic device, including a microphone, a camera, a memory and a processor, where the memory is used to store preset rules and computer program code, and the computer program code includes instructions; when the instructions are run by the processor, the electronic device performs some or all of the steps of the sound pickup method described in any of the foregoing method embodiments.
  • Embodiments of the present application further provide a computer program product; when the computer program product runs on a computer, the computer performs some or all of the steps of the sound pickup method described in any of the foregoing method embodiments.
  • Embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium stores program code for execution by an electronic device.
  • When the program code is executed, the electronic device performs the method described in any of the foregoing method embodiments.
  • The division of the above apparatus into modules is only a division of logical functions; in actual implementation, the modules may be fully or partially integrated into one physical entity, or may be physically separate.
  • Each of the above modules can be a separately established processing element, or can be integrated into a chip of the terminal. In addition, a module can be stored in a storage element of the controller in the form of program code,
  • and a processing element of the processor then calls and executes the functions of the above modules.
  • The modules can be integrated together or implemented independently.
  • The processing element described here may be an integrated circuit chip with signal processing capability.
  • Each step of the above method, or each of the above modules, can be completed by an integrated logic circuit of hardware in the processing element or by instructions in the form of software.
  • The processing element may be a general-purpose processor, such as a central processing unit (CPU), or may be one or more integrated circuits configured to implement the above method, such as one or more application-specific integrated circuits (ASICs), one or more microprocessors such as digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs).

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Otolaryngology (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Human Computer Interaction (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

Disclosed in embodiments of the present application are a method for audio reception, an apparatus, and a related electronic device. The method is used in an electronic device that has a microphone, and said method comprises: performing recognition on an image photographed in real time, and determining a scenario corresponding to said image, the scenario being one scenario among preset scenarios; adjusting the audio reception directivity of a microphone, and causing the audio reception directivity of the microphone to match the scenario according to a preset rule. The method for audio reception utilized in an embodiment of the present application allows for adjustment of the audio reception directivity of a microphone according to a scenario corresponding to an image photographed in real time; timely adjustment of the directivity by means of determining an audio reception scenario facilitates audio reception effect improvement, and the experience of a user is improved.

Description

A sound pickup method, apparatus and related electronic device
This application claims priority to Chinese patent application No. 202110478055.3, entitled "A sound pickup method, apparatus and related electronic device" and filed with the China National Intellectual Property Administration on April 29, 2021, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of sound pickup for electronic devices, and in particular to a sound pickup method, an apparatus, and an electronic device.
Background
When an electronic device such as a smartphone shoots video, it needs to pick up audio. To improve the pickup effect, some electronic devices are fitted with multiple microphones and can perform directional sound pickup. When the user shoots with the phone's front camera, front directional pickup is enabled by default so that voices from the front direction are picked up clearly and are not affected by ambient noise.
Owing to limitations of technology and hardware architecture, directional pickup currently cannot be steered to an arbitrary direction. When front directional pickup is enabled, it usually points directly in front of the camera. To avoid losing sound in situations such as the user briefly leaving the frame, the front directional pickup algorithm cannot be set to its strongest level. For example, when a smartphone is used for live streaming with front directional pickup and the algorithm is set to its strongest level, the front pickup directivity is at its strongest, and when the user walks out of the frame the smartphone can no longer pick up the user's voice, which is clearly not what the user expects. Therefore, because the front pickup directivity currently cannot be adjusted, the default directional pickup algorithm cannot be set to its strongest level, and the user cannot enjoy the best and cleanest front directional pickup.
Therefore, there is an urgent need for a sound pickup method, apparatus, electronic device and computer-readable storage medium that can solve the above problems.
Summary of the Invention
Embodiments of the present application provide a sound pickup method, an apparatus and a related electronic device that enable better sound pickup.
In a first aspect, an embodiment of the present application provides a sound pickup method applied to an electronic device that includes a microphone. The method includes: recognizing an image captured in real time and determining the scene corresponding to the image, the scene being one of the preset scenes; and adjusting the sound pickup directivity of the microphone so that it matches the scene according to a preset rule.
The sound pickup method of the embodiments of the present application can adjust the sound pickup directivity of the microphone according to the scene corresponding to the image captured in real time; adjusting the directivity promptly once the pickup scene is determined helps improve the sound pickup effect and the user experience.
According to the first aspect, in a possible implementation of the sound pickup method, recognizing the image captured in real time includes recognizing one or more of the following items of information in the image: the number of designated objects, the clarity of the designated objects, the proportion of the image occupied by the designated objects, and the position of the designated objects in the image.
According to the first aspect, in a possible implementation of the sound pickup method, the information used to determine the preset scene includes one or more of the following: the number of designated objects in the image, the clarity of the designated objects, the proportion of the image occupied by the designated objects, or the position of the designated objects in the image.
According to the first aspect, in a possible implementation of the sound pickup method, the designated object includes a face image, and recognizing the image captured in real time includes performing face recognition on the image.
According to the first aspect, in a possible implementation of the sound pickup method, the preset scenes include one or more of the following scenes: a first scene, a second scene, a third scene and a fourth scene. The first scene includes: the image includes one or more designated objects, the designated objects are in the central area of the image, and the proportion of the central area occupied by the designated objects exceeds a first threshold. The second scene includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the image occupied by the designated objects exceeds a second threshold. The third scene includes: the image includes one or more designated objects, the designated objects are not in the central area of the image, and the proportion of the image occupied by the designated objects does not exceed a third threshold. The fourth scene includes: the image does not include a designated object. The preset rule includes: the sound pickup directivity corresponding to the first scene, the second scene, the third scene and the fourth scene weakens in that order.
According to the first aspect, in a possible implementation of the sound pickup method, before recognizing the image captured in real time, the method further includes: switching the shooting mode of the electronic device to the front mode or the rear mode.
According to the first aspect, in a possible implementation of the sound pickup method, before recognizing the image captured in real time, the method further includes: determining a designated object in the image captured in real time;
adjusting the sound pickup directivity of the microphone includes: when the shooting mode of the electronic device is the front mode, adjusting the front sound pickup directivity of the microphone; or, when the shooting mode of the electronic device is the rear mode, adjusting the rear sound pickup directivity of the microphone.
According to the first aspect, in a possible implementation of the sound pickup method, determining a designated object in the image captured in real time includes: acquiring a user's click operation on the image, and determining the object of the click operation as the designated object.
According to the first aspect, in a possible implementation of the sound pickup method, before determining the designated object in the image captured in real time, the method includes: taking a photo to obtain a first picture and determining the designated object from the first picture; or obtaining a second picture from a picture library and determining the designated object from the second picture; or determining the designated object according to a previously obtained description of the designated object.
In a second aspect, an embodiment of the present application provides a sound pickup apparatus that includes a microphone and further includes: a first processing unit, configured to recognize an image captured in real time and determine the scene corresponding to the image, the scene being one of the preset scenes; and an adjustment unit, configured to adjust the sound pickup directivity of the microphone so that, according to a preset rule, the front sound pickup directivity of the microphone matches the scene.
The sound pickup apparatus used in the embodiments of the present application can adjust the sound pickup directivity of the microphone according to the scene corresponding to the image captured in real time. Adjusting the directivity promptly once the pickup scene has been determined helps improve the pickup quality and the user experience.
According to the second aspect, in a possible implementation of the apparatus, with regard to recognizing the image captured in real time, the first processing unit is specifically configured to recognize one or more of the following items of information in the image: the number of designated objects, the sharpness of the designated objects, the proportion of the image occupied by the designated objects, and the position of the designated objects in the image.
According to the second aspect, in a possible implementation of the apparatus, the information used to determine the preset scene includes one or more of the following: the number of designated objects in the image, the sharpness of the designated objects, the proportion of the image occupied by the designated objects, or the position of the designated objects in the image.
According to the second aspect, in a possible implementation of the apparatus, the designated object includes a face image, and recognizing the image captured in real time includes performing face recognition on the image.
According to the second aspect, in a possible implementation of the apparatus, the preset scenes include one or more of the following: a first scene, a second scene, a third scene, and a fourth scene. The first scene includes: the image includes one or more designated objects, the designated objects are within the central region of the image, and their proportion within the central region exceeds a first threshold. The second scene includes: the image includes one or more designated objects, the designated objects are not within the central region of the image, and their proportion in the image exceeds a second threshold. The third scene includes: the image includes one or more designated objects, the designated objects are not within the central region of the image, and their proportion in the image does not exceed a third threshold. The fourth scene includes: the image includes no designated object. The preset rule includes: the sound pickup directivities corresponding to the first scene, the second scene, the third scene, and the fourth scene weaken in that order.
According to the second aspect, in a possible implementation of the apparatus, the apparatus further includes: a switching unit, configured to switch the shooting mode of the electronic device to a front-facing mode or a rear-facing mode before the first processing unit recognizes the image captured in real time.
According to the second aspect, in a possible implementation of the apparatus, the first processing unit is further configured to determine a designated object in the image captured in real time before recognizing that image; and the adjustment unit is specifically configured to adjust the front sound pickup directivity of the microphone when the shooting mode of the electronic device is the front-facing mode, or to adjust the rear sound pickup directivity of the microphone when the shooting mode of the electronic device is the rear-facing mode.
According to the second aspect, in a possible implementation of the apparatus, with regard to determining the designated object in the image captured in real time, the first processing unit is specifically configured to acquire a click operation performed by the user on the image and determine the object of the click operation as the designated object.
According to the second aspect, in a possible implementation of the apparatus, the apparatus further includes: a second processing unit, configured to, before the first processing unit determines the designated object in the image captured in real time, take a photograph to obtain a first picture and determine the designated object from the first picture; or configured to, before the first processing unit determines the designated object in the image captured in real time, acquire a second picture from a picture library and determine the designated object from the second picture; or configured to determine the designated object according to a previously acquired description of the designated object.
In a third aspect, an embodiment of the present application provides an electronic device including a microphone, a memory, and a processor, where the memory is configured to store preset scenes, preset rules, and computer program code, the computer program code including instructions; when the instructions are run by the processor, the electronic device is caused to perform the sound pickup method of the first aspect or of one or more of the possible implementations of the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium that stores program code for execution by an electronic device; when the program code is executed, the electronic device performs the sound pickup method of the first aspect or of one or more of the possible implementations of the first aspect.
In a fifth aspect, an embodiment of the present application provides a computer program product that, when run on a computer, causes the computer to perform the sound pickup method of the first aspect or of one or more of the possible implementations of the first aspect.
Any of the possible technical solutions above may be combined with one another, provided the combination does not violate the laws of nature.
With the technical solutions provided by the embodiments of the present application, the front sound pickup directivity of the microphone can be adjusted according to the scene corresponding to the image captured in real time. Adjusting the directivity promptly once the pickup scene has been determined helps improve the pickup quality and the user experience.
Brief Description of the Drawings
FIG. 1 is a schematic flowchart of a sound pickup method provided by an embodiment of the present application.
FIG. 2 is a schematic flowchart of a sound pickup method provided by another embodiment of the present application.
FIG. 3A is a schematic diagram of an image corresponding to a scene in an embodiment of the present application.
FIG. 3B is a schematic diagram of an image corresponding to a scene in another embodiment of the present application.
FIG. 3C is a schematic diagram of an image corresponding to a scene in another embodiment of the present application.
FIG. 3D is a schematic diagram of an image corresponding to a scene in another embodiment of the present application.
FIG. 4A is a schematic diagram of a simulated sound pickup effect in an embodiment of the present application.
FIG. 4B is a schematic diagram of a simulated sound pickup effect in another embodiment of the present application.
FIG. 4C is a schematic diagram of a simulated sound pickup effect in another embodiment of the present application.
FIG. 4D is a schematic diagram of a simulated sound pickup effect in another embodiment of the present application.
FIG. 5A is a schematic diagram indicating the sound pickup range and intensity in an embodiment of the present application.
FIG. 5B is a schematic diagram indicating the sound pickup range and intensity in another embodiment of the present application.
FIG. 5C is a schematic diagram indicating the sound pickup range and intensity in another embodiment of the present application.
FIG. 5D is a schematic diagram indicating the sound pickup range and intensity in another embodiment of the present application.
FIG. 6 is a schematic structural diagram of a sound pickup apparatus provided by an embodiment of the present application.
FIG. 7A is a schematic structural diagram of a sound pickup apparatus provided by another embodiment of the present application.
FIG. 7B is a schematic structural diagram of a sound pickup apparatus provided by another embodiment of the present application.
FIG. 8 is a schematic structural diagram of an electronic device provided by another embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments in the present application without creative effort fall within the scope of protection of the present application.
In the embodiments of the present application, the electronic device may be a terminal with a microphone, for example a smartphone, a portable wearable device (such as a smart watch), or a tablet computer with a microphone. In actual use, the user sometimes needs to operate the electronic device. In the prior art, when sound is picked up with the shooting mode switched to the front-facing mode, the front directivity of the microphone is usually fixed. For example, when a smartphone switches to front-facing shooting, the front directivity of the microphone stays the same regardless of whether anyone is in the captured image and regardless of how much of the image a person occupies; that is, parameters such as the range over which sound is received and the pickup intensity are fixed. The front directivity is not adjusted for different scenes, such as how far away a person is or where the person is in the image. Existing sound pickup methods are therefore prone to defects such as unclear pickup and loud noise.
In the sound pickup method disclosed in the present application, when the shooting mode is the front-facing mode, the image captured in real time is recognized, the scene corresponding to the image is determined, and the front sound pickup directivity of the microphone is adjusted according to that scene, which helps improve the pickup quality and the user experience.
The technical solutions of the present application are described in detail below through specific embodiments.
Embodiment 1
Refer to FIG. 1, which is a schematic flowchart of a sound pickup method provided by an embodiment of the present application. In this embodiment, the method may include the following steps.
Step 101: Recognize an image captured in real time and determine the scene corresponding to the image; the scene is one of the preset scenes.
In some possible implementations, recognizing the image captured in real time includes recognizing one or more of the following items of information in the image: the number of designated objects, the sharpness of the designated objects, the proportion of the image occupied by the designated objects, and the position of the designated objects in the image. In some possible implementations, the designated object may be a face image. For example, if the captured image includes one face, one face image is recognized in this embodiment. In other embodiments, if the captured image includes two faces, two face images are recognized. If the image contains no face, the number of face images recognized in this embodiment is 0.
In some embodiments, the sharpness of the recognized face image may also be evaluated. For example, the designated object may be classed as sharp or not sharp, or several sharpness levels may be defined and the sharpness of the designated object in the actual image determined against them.
In some embodiments, the proportion of the image occupied by the designated object may be recognized. For example, based on the size of the region that the designated object occupies in the image, the face image may be found to occupy 30%, 10%, 0, or 50% of the image.
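As an illustration of the proportion described above, the following is a minimal sketch, assuming that each designated object is reported as an axis-aligned bounding box in pixels. The `Box` type and the `image_share` function are hypothetical names introduced here; the application does not prescribe how the proportion is computed.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Box:
    """Hypothetical bounding box in pixels; (x, y) is the top-left corner."""
    x: float
    y: float
    w: float
    h: float


def image_share(boxes: Iterable[Box], image_w: float, image_h: float) -> float:
    """Return the fraction of the image area covered by the boxes.

    Assumption: overlapping boxes are simply summed, and the result is
    capped at 1.0. A stricter version would compute the area of the union.
    """
    if image_w <= 0 or image_h <= 0:
        raise ValueError("image dimensions must be positive")
    covered = sum(b.w * b.h for b in boxes)
    return min(covered / (image_w * image_h), 1.0)
```

With a 100 x 100 image, a single 50 x 60 box gives a share of 0.3, which corresponds to the 30% example in the text, and an image with no detected object gives 0.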
In some embodiments, the proportion of the central region of the image occupied by the designated object may also be recognized. A certain region around the center point of the image may be taken as the central region, for example the circular region centered on the image center point with a radius of half the image width. The proportion of the designated object within that central region is then determined. It can be understood that the central region may also be defined in other ways.
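The central-region proportion can be sketched as follows, assuming the example definition given above (a circle centered on the image center with a radius of half the image width) and assuming that the proportion means the fraction of that region covered by the object's bounding box. The function name and the grid-sampling approach are illustrative choices, not part of the application.

```python
from typing import Tuple

# Hypothetical box format: (x, y, w, h) in pixels, (x, y) is the top-left corner.
BoxT = Tuple[float, float, float, float]


def center_region_share(box: BoxT, image_w: float, image_h: float,
                        samples: int = 200) -> float:
    """Estimate the fraction of the central region covered by the box.

    The central region is the part of the circle (center = image center,
    radius = image_w / 2) that lies inside the image. The area ratio is
    approximated by sampling a regular grid of points over the image.
    """
    cx, cy = image_w / 2.0, image_h / 2.0
    radius_sq = (image_w / 2.0) ** 2
    x0, y0, w, h = box
    in_region = 0
    in_both = 0
    for i in range(samples):
        px = (i + 0.5) * image_w / samples
        for j in range(samples):
            py = (j + 0.5) * image_h / samples
            if (px - cx) ** 2 + (py - cy) ** 2 <= radius_sq:
                in_region += 1
                if x0 <= px < x0 + w and y0 <= py < y0 + h:
                    in_both += 1
    return in_both / in_region if in_region else 0.0
```

Grid sampling keeps the sketch short; an exact rectangle and circle intersection would remove the approximation error at the cost of more geometry.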
In some embodiments, the position of the designated object in the image may also be recognized. The designated object may be located at the center of the image, above, below, to the left, to the right, at the upper left, at the lower left, at the upper right, or at the lower right of the image, and so on.
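One way to obtain the nine position labels listed above is to divide the image into a 3 x 3 grid and look up the cell that contains the center of the designated object. The sketch below assumes equal thirds and a point `(cx, cy)` inside the image; both the grid boundaries and the function name are assumptions made for illustration.

```python
def position_label(cx: float, cy: float, image_w: float, image_h: float) -> str:
    """Map an object's center point to one of nine coarse position labels.

    Assumption: the image is split into equal thirds horizontally and
    vertically, with the origin at the top-left corner.
    """
    labels = [
        ["upper left", "above", "upper right"],
        ["left", "center", "right"],
        ["lower left", "below", "lower right"],
    ]
    col = min(int(3 * cx / image_w), 2)  # clamp points on the right edge
    row = min(int(3 * cy / image_h), 2)  # clamp points on the bottom edge
    return labels[row][col]
```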
Step 102: Adjust the sound pickup directivity of the microphone so that, according to a preset rule, it matches the scene.
In some possible implementations, the preset scenes may include one or more of the following: a first scene, a second scene, a third scene, and a fourth scene.
The first scene includes: the image includes one or more designated objects, the designated objects are within the central region of the image, and their proportion within the central region exceeds a first threshold. For example, the designated object may be a face image. As shown in FIG. 3A, recognition determines that the image includes one designated object located in the middle region of the image, and the first threshold may be 30%. The central region may lie within the square region marked by the dashed line in FIG. 3A; for example, it may be the circular region centered on the center of the square with a radius of one quarter of the square's side length. It can be understood that the first threshold and the central region may be fixed, or may be set and adjusted as needed.
The second scene includes: the image includes one or more designated objects, the designated objects are not within the central region of the image, and their proportion in the image exceeds a second threshold. For example, the designated object may be a face image. As shown in FIG. 3B, recognition determines that the image includes two designated objects that are not located in the central region of the image; the second threshold may be 28%, and the two designated objects in FIG. 3B occupy more than 28% of the image.
The third scene includes: the image includes one or more designated objects, the designated objects are not within the central region of the image, and their proportion in the image does not exceed a third threshold. For example, the designated object may be a face image. As shown in FIG. 3C, recognition determines that the image includes two designated objects that are not located in the central region of the image; the third threshold may be 15%, and the two designated objects in FIG. 3C occupy less than 15% of the image.
The fourth scene includes: the image includes no designated object, as shown in FIG. 3D.
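The four scene definitions above can be read as a small decision procedure. The following sketch assumes the recognition step already supplies the object count, whether the objects lie in the central region, and the two proportions. The default thresholds are the example values from the text (30%, 28%, 15%); the names `Scene` and `classify_scene` are hypothetical. The text does not say what happens when no definition applies (for example, objects outside the central region occupying between 15% and 28% of the image), so the sketch returns `None` in those cases.

```python
from enum import Enum
from typing import Optional


class Scene(Enum):
    FIRST = 1
    SECOND = 2
    THIRD = 3
    FOURTH = 4


def classify_scene(num_objects: int, in_center: bool,
                   center_share: float, image_share: float,
                   t1: float = 0.30, t2: float = 0.28,
                   t3: float = 0.15) -> Optional[Scene]:
    """Return the preset scene that matches the recognized information.

    Returns None when none of the four definitions applies; how such
    cases are handled is left open by the text.
    """
    if num_objects == 0:
        return Scene.FOURTH          # no designated object in the image
    if in_center:
        return Scene.FIRST if center_share > t1 else None
    if image_share > t2:
        return Scene.SECOND          # off-center, large share of the image
    if image_share <= t3:
        return Scene.THIRD           # off-center, small share of the image
    return None
```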
In some possible implementations, the preset rule may include: the front sound pickup directivities corresponding to the first scene, the second scene, the third scene, and the fourth scene weaken in that order; the pickup effects are shown in FIG. 4A to FIG. 4D. In these figures the smartphone is placed at the center of the plot, perpendicular to the image; 0 degrees is the direction the rear camera faces and 180 degrees is the direction the front camera faces. The front sound pickup directivity shown in FIG. 4A to FIG. 4D weakens from one figure to the next. For ease of understanding, refer to FIG. 5A to FIG. 5D, in which black indicates the range where sound is picked up and gray indicates the range where it is not.
The preset rule includes: the sound pickup directivities corresponding to the first scene, the second scene, the third scene, and the fourth scene weaken in that order.
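To make "weakening directivity" concrete, the sketch below models the pickup pattern as a first-order polar pattern steered toward the front camera direction (180 degrees in the angle convention of FIG. 4A to FIG. 4D). This model, the parameter `a`, and the per-scene values in `SCENE_DIRECTIVITY` are assumptions chosen for illustration; the application states only that the directivity weakens from the first scene to the fourth and does not specify a beam pattern or numeric values.

```python
import math

# Hypothetical directivity parameter per scene: 0.5 is a cardioid aimed
# at the front, 0.0 is omnidirectional. Only the ordering comes from the text.
SCENE_DIRECTIVITY = {"first": 0.5, "second": 0.35, "third": 0.2, "fourth": 0.0}

FRONT_DEG = 180.0  # direction the front camera faces


def front_gain(theta_deg: float, a: float) -> float:
    """Linear gain of a first-order pattern (1 - a) + a * cos(angle off front)."""
    if not 0.0 <= a <= 0.5:
        raise ValueError("a must be in [0, 0.5] to keep the gain non-negative")
    return (1.0 - a) + a * math.cos(math.radians(theta_deg - FRONT_DEG))
```

Under this model the gain toward the front stays at 1 for every scene, while the gain toward the rear (0 degrees) rises from 0 in the first scene to 1 in the fourth, so the pattern becomes progressively less directional.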
The sound pickup method used in the embodiments of the present application can adjust the sound pickup directivity of the microphone according to the scene corresponding to the image captured in real time. Adjusting the directivity promptly once the pickup scene has been determined helps improve the pickup quality and the user experience.
In some possible implementations, the method further includes: switching the shooting mode of the electronic device to a front-facing mode or a rear-facing mode.
In some possible implementations, before the image captured in real time is recognized, the method further includes: determining a designated object in the image captured in real time.
In some possible implementations, adjusting the sound pickup directivity of the microphone includes: adjusting the front sound pickup directivity of the microphone when the shooting mode of the electronic device is the front-facing mode, and adjusting the rear sound pickup directivity of the microphone when the shooting mode of the electronic device is the rear-facing mode.
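The choice between front and rear directivity can be sketched as a lookup from shooting mode to steering direction, reusing the angle convention of FIG. 4A to FIG. 4D (0 degrees toward the rear camera, 180 degrees toward the front camera). The `ShootingMode` enum and `steering_angle_deg` function are hypothetical names introduced for illustration.

```python
from enum import Enum


class ShootingMode(Enum):
    FRONT = "front"
    REAR = "rear"


# Direction, in degrees, toward which the pickup directivity is adjusted.
_STEERING_DEG = {ShootingMode.FRONT: 180.0, ShootingMode.REAR: 0.0}


def steering_angle_deg(mode: ShootingMode) -> float:
    """Return the direction whose directivity is adjusted for this mode."""
    return _STEERING_DEG[mode]
```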
In some possible implementations, determining the designated object in the image captured in real time includes: acquiring a click operation performed by the user on the image, and determining the object of the click operation as the designated object. For example, if the picture captured in real time includes the designated object, the user can confirm it by a click operation. The current scene is then determined from information such as the size, sharpness, and position of the designated object in the picture, and the sound pickup directivity of the microphone is adjusted according to the scene. The correspondence between scenes and the microphone's sound pickup directivity can be preset as needed, so that sound emitted by the designated object is picked up better.
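Resolving a click to a designated object amounts to a hit test against the detected objects. The sketch below assumes the detector returns bounding boxes as `(x, y, w, h)` tuples in image pixels and that, when several boxes contain the click point, the smallest one is preferred. Both the box format and the tie-breaking rule are assumptions; the application does not describe this step in detail.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical box format: (x, y, w, h) in pixels, (x, y) is the top-left corner.
BoxT = Tuple[float, float, float, float]


def pick_object(tap_x: float, tap_y: float,
                boxes: Sequence[BoxT]) -> Optional[int]:
    """Return the index of the box containing the click point, or None.

    If several boxes contain the point, the one with the smallest area
    is chosen, on the assumption that it is the most specific target.
    """
    hits = [i for i, (x, y, w, h) in enumerate(boxes)
            if x <= tap_x < x + w and y <= tap_y < y + h]
    if not hits:
        return None
    return min(hits, key=lambda i: boxes[i][2] * boxes[i][3])
```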
In some possible implementations, before the designated object is determined in the image captured in real time, the method may further include: taking a photograph to obtain a first picture, and determining the designated object from the first picture.
In some possible implementations, before the designated object is determined in the image captured in real time, the method may further include: obtaining a picture by other means, for example from a picture gallery, from another electronic device, or through an operation such as a screenshot on the electronic device, and selecting the designated object in the obtained picture.
In some possible implementations, before the designated object is determined in the image captured in real time, the method may further include: determining the designated object according to a description of the designated object. For example, the designated object may be described through recorded text information, voice, or similar means, such as a person wearing a hat, a person wearing a necklace, a person wearing formal dress, or a person holding a microphone. It can be understood that the designated object is not limited to a person; it may be anything else that can make sound, such as a robot or a small animal (a kitten, a puppy, a bird, and so on). The description of the designated object is used to recognize whether the image captured in real time includes the designated object.
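Matching a description against the objects found in a frame can be sketched as a set comparison, assuming some upstream recognizer (not specified in the application) tags each detected object with attribute labels such as "person" or "hat". The function name, the label vocabulary, and the data layout below are all hypothetical.

```python
from typing import Dict, List, Set


def match_description(detections: Dict[str, Set[str]],
                      required: Set[str]) -> List[str]:
    """Return the ids of detected objects whose labels include every
    attribute in the description.

    `detections` maps a hypothetical object id to the attribute labels an
    upstream recognizer assigned to it. An empty description matches all.
    """
    return sorted(obj_id for obj_id, attrs in detections.items()
                  if required <= attrs)
```

For a description such as "a person wearing a hat", the required set would be `{"person", "hat"}`, and an empty result means the frame does not contain the designated object.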
Embodiment 2
Refer to FIG. 2, which is a schematic flowchart of a sound pickup method provided by another embodiment of the present application. In this embodiment, the method may include the following steps.
Step 201: Switch the shooting mode of the electronic device to the front-facing mode.
In some possible implementations, the shooting mode may be switched with a hardware switch, or the shooting mode of the electronic device may be switched to the front-facing mode by operating a virtual button in the display interface.
Step 202: When the shooting mode is the front-facing mode, recognize the image captured in real time and determine the scene corresponding to the image; the scene is one of the preset scenes.
In some possible implementations, recognizing the image captured in real time includes recognizing one or more of the following items of information in the image: the number of designated objects, the sharpness of the designated objects, the proportion of the image occupied by the designated objects, and the position of the designated objects in the image. In some possible implementations, the designated object may be a face image. For example, if the captured image includes one face, one face image is recognized in this embodiment. In other embodiments, if the captured image includes two faces, two face images are recognized. If the image contains no face, the number of face images recognized in this embodiment is 0.
In some embodiments, the sharpness of the recognized face image may also be evaluated. For example, the designated object may be classed as sharp or not sharp, or several sharpness levels may be defined and the sharpness of the designated object in the actual image determined against them.
In some embodiments, the proportion of the image occupied by the designated object may be recognized. For example, based on the size of the region that the designated object occupies in the image, the face image may be found to occupy 30%, 10%, 0, or 50% of the image.
In some embodiments, the proportion of the central region of the image occupied by the designated object may also be recognized. A certain region around the center point of the image may be taken as the central region, for example the circular region centered on the image center point with a radius of half the image width. The proportion of the designated object within that central region is then determined. It can be understood that the central region may also be defined in other ways.
In some embodiments, the position of the designated object in the image may also be recognized. The designated object may be located at the center of the image, above, below, to the left, to the right, at the upper left, at the lower left, at the upper right, or at the lower right of the image, and so on.
Step 203: Adjust the front sound pickup directivity of the microphone so that, according to a preset rule, it matches the scene.
In some possible implementations, the preset scenes may include one or more of the following: a first scene, a second scene, a third scene, and a fourth scene.
The first scene includes: the image includes one or more designated objects, the designated objects are within the central region of the image, and their proportion within the central region exceeds a first threshold. For example, the designated object may be a face image. As shown in FIG. 3A, recognition determines that the image includes one designated object located in the middle region of the image, and the first threshold may be 30%. The central region may lie within the square region marked by the dashed line in FIG. 3A; for example, it may be the circular region centered on the center of the square with a radius of one quarter of the square's side length. It can be understood that the first threshold and the central region may be fixed, or may be set and adjusted as needed.
The second scene includes: the image includes one or more designated objects, the designated objects are not within the central region of the image, and their proportion in the image exceeds a second threshold. For example, the designated object may be a face image. As shown in FIG. 3B, recognition determines that the image includes two designated objects that are not located in the central region of the image; the second threshold may be 28%, and the two designated objects in FIG. 3B occupy more than 28% of the image.
The third scene includes: the image includes one or more designated objects, the designated objects are not within the central region of the image, and their proportion in the image does not exceed a third threshold. For example, the designated object may be a face image. As shown in FIG. 3C, recognition determines that the image includes two designated objects that are not located in the central region of the image; the third threshold may be 15%, and the two designated objects in FIG. 3C occupy less than 15% of the image.
The fourth scene includes: the image includes no designated object, as shown in FIG. 3D.
在一些可能的实施方式中,预设规则可以包括:第一场景、第二场景、第三场景和第四场景分别对应的收音前置指向性依次减弱,收音效果如图4A至图4D所示。智能手机处在画面中心垂直于图像的位置放置,0度是后置摄像头正对方向,180度是前置摄像头正对方向,图4A至图4D所示的收音前置指向性依次减弱。为了便于理解可以参考图5A至图5D所示,黑色指收音的范围,灰色指没有收音的范围。In some possible implementations, the preset rules may include: the radio front directivity corresponding to the first scene, the second scene, the third scene and the fourth scene respectively weakens sequentially, and the radio effect is shown in FIG. 4A to FIG. 4D . . The smartphone is placed at the center of the screen perpendicular to the image, 0 degrees is the direction facing the rear camera, and 180 degrees is the facing direction of the front camera. For easy understanding, please refer to FIG. 5A to FIG. 5D , the black refers to the range where sound is picked up, and the gray refers to the range where no sound is picked up.
The sound pickup method of this embodiment adjusts the microphone's front pickup directivity according to the scene corresponding to the image captured in real time. Determining the pickup scene and adjusting the directivity promptly helps improve the pickup result and the user experience.
Embodiment 3
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of a sound pickup apparatus 600 according to an embodiment of the present application. The sound pickup apparatus includes a microphone 601, a first processing unit 602 and an adjusting unit 603.
The first processing unit 602 is configured to recognize an image captured in real time and determine a scene corresponding to the image; the scene is one of the preset scenes.
In some possible implementations, recognizing the image captured in real time includes recognizing one or more of the following items of information in the image: the number of designated objects, the clarity of the designated object, the proportion of the image occupied by the designated object, and the position of the designated object in the image. In some possible implementations, the designated object may be a face image. For example, if the captured image contains the image of one face, one face image is recognized in that embodiment. In other embodiments, if the captured image contains the images of two faces, two face images are recognized. If the image contains no face image, the number of face images recognized is 0.
In some embodiments, the clarity of the recognized face image may also be evaluated; for example, the designated object may be judged clear or unclear, or several clarity levels may be defined and the clarity of the designated object determined from the actual image.
In some embodiments, the proportion of the image occupied by the designated object may be recognized; for example, from the size of the region the designated object occupies in the image, the face image may be found to occupy 30%, 10%, 0 or 50% of the image, and so on.
In some embodiments, the proportion of the designated object within the central area of the image may also be recognized. A region around the center point of the image may be defined as the central area; for example, the circular region centered on the image's center point with a radius of half the image width may be taken as the central area, and the proportion of the designated object within that area is then determined. It will be understood that the central area may also be defined in other ways.
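The circular central area described above can be sketched as follows. This is a minimal illustration that assumes the designated object is given as an axis-aligned bounding box `(x, y, w, h)` in pixels and reads "proportion within the central area" as the fraction of the box lying inside the circle; the function name `fraction_in_center` and both of those choices are assumptions, not part of the application.

```python
def fraction_in_center(box, image_w, image_h, radius=None):
    """Fraction of the box's pixels whose centers lie inside the circular central area.

    The circle is centered on the image center; its radius defaults to half the
    image width, following the example in the text.
    """
    x0, y0, w, h = box
    cx, cy = image_w / 2.0, image_h / 2.0
    r = image_w / 2.0 if radius is None else radius
    total = w * h
    if total <= 0:
        return 0.0
    inside = 0
    for py in range(y0, y0 + h):
        for px in range(x0, x0 + w):
            # Use the pixel center (px + 0.5, py + 0.5) for the distance test.
            dx, dy = px + 0.5 - cx, py + 0.5 - cy
            if dx * dx + dy * dy <= r * r:
                inside += 1
    return inside / total
```

Counting pixels keeps the sketch simple; a production implementation would more likely use a segmentation mask or a closed-form overlap.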
In some embodiments, the position of the designated object in the image may also be recognized. The position may be: the center of the image, above center, below center, the left side, the right side, the upper left, the lower left, the upper right, the lower right, and so on.
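One simple way to obtain such position labels is to divide the image into a 3x3 grid and look up the cell containing the center of the object's bounding box. The grid, the label strings and the function name `position_label` are illustrative assumptions; the application does not say how positions are computed.

```python
# Rows run top to bottom and columns left to right (image coordinates, y grows downward).
LABELS = [
    ["upper left", "above", "upper right"],
    ["left", "center", "right"],
    ["lower left", "below", "lower right"],
]


def position_label(box, image_w, image_h):
    """Label the position of a bounding box (x, y, w, h) using a 3x3 grid."""
    x0, y0, w, h = box
    cx, cy = x0 + w / 2.0, y0 + h / 2.0
    # Clamp so that boxes touching or crossing the image border still map to a valid cell.
    col = max(0, min(int(3 * cx / image_w), 2))
    row = max(0, min(int(3 * cy / image_h), 2))
    return LABELS[row][col]
```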
The adjusting unit 603 is configured to adjust the front pickup directivity of the microphone 601 so that, according to a preset rule, it matches the scene.
In some possible implementations, the preset scenes may include one or more of the following: a first scene, a second scene, a third scene and a fourth scene.
The first scene includes: the image contains one or more designated objects, the designated objects are within the central area of the image, and their proportion within the central area exceeds a first threshold. For example, the designated object may be a face image. As shown in FIG. 3A, recognition determines that the image contains one designated object located in the middle of the image. The first threshold may be 30%, and the central area may lie within the square marked by the dotted line in FIG. 3A; for example, it may be the circular region centered on the center of the square with a radius of one quarter of the square's side length. It will be understood that the first threshold and the central area may be fixed, or may be set and adjusted as needed.
The second scene includes: the image contains one or more designated objects, the designated objects are not within the central area of the image, and their proportion of the image exceeds a second threshold. For example, the designated object may be a face image. As shown in FIG. 3B, recognition determines that the image contains two designated objects, neither of which is located in the central area of the image; the second threshold may be 28%, and the two designated objects in FIG. 3B occupy more than 28% of the image.
The third scene includes: the image contains one or more designated objects, the designated objects are not within the central area of the image, and their proportion of the image does not exceed a third threshold. For example, the designated object may be a face image. As shown in FIG. 3C, recognition determines that the image contains two designated objects, neither of which is located in the central area of the image; the third threshold may be 15%, and the two designated objects in FIG. 3C occupy less than 15% of the image.
The fourth scene includes: the image contains no designated object, as shown in FIG. 3D.
In some possible implementations, the preset rule may include: the front pickup directivity corresponding to the first scene, the second scene, the third scene and the fourth scene weakens in that order, with the pickup effect shown in FIG. 4A to FIG. 4D. In these figures the smartphone is placed at the center of the plot, perpendicular to the image; 0 degrees is the direction the rear camera faces and 180 degrees is the direction the front camera faces. The front pickup directivity shown in FIG. 4A to FIG. 4D weakens in that order. For ease of understanding, refer to FIG. 5A to FIG. 5D, in which black indicates the range where sound is picked up and gray indicates the range where it is not.
The preset rule includes: the pickup directivity corresponding to the first scene, the second scene, the third scene and the fourth scene weakens in that order.
The sound pickup apparatus of this embodiment adjusts the microphone's front pickup directivity according to the scene corresponding to the image captured in real time. Determining the pickup scene and adjusting the directivity promptly helps improve the pickup result and the user experience.
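The four preset scenes can be expressed as a small decision function. The sketch below uses the example thresholds from the text (30%, 28%, 15%) as defaults and assumes the recognition step has already produced the object count, whether the objects are in the central area, and the two proportions. The function name `classify_scene` and its parameters are hypothetical. The text does not say which scene applies when the objects are in the central area but do not exceed the first threshold, or when off-center objects fall between the third and second thresholds, so the sketch returns `None` in those cases rather than guessing.

```python
def classify_scene(count, in_center, center_share, image_share,
                   t1=0.30, t2=0.28, t3=0.15):
    """Return 1, 2, 3 or 4 for the preset scenes, or None when no scene is defined.

    count        -- number of designated objects recognized in the image
    in_center    -- True if the designated objects are within the central area
    center_share -- proportion of the designated objects within the central area
    image_share  -- proportion of the whole image occupied by the designated objects
    """
    if count == 0:
        return 4  # fourth scene: no designated object
    if in_center:
        return 1 if center_share > t1 else None
    if image_share > t2:
        return 2  # off-center, large share of the image
    if image_share <= t3:
        return 3  # off-center, small share of the image
    return None  # between t3 and t2: not covered by the described scenes
```

A caller might keep the previous directivity when `None` is returned; that policy is a design choice outside what the text describes.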
In some possible implementations, as shown in FIG. 7A, the sound pickup apparatus 700 may further include a switching unit 701 configured to switch the shooting mode of the electronic device to a front mode or a rear mode.
In some possible implementations, the first processing unit 702 may further be configured to determine the designated object in the image captured in real time before recognizing that image. The adjusting unit 703 is specifically configured to adjust the front pickup directivity of the microphone 704 when the shooting mode of the electronic device is the front mode, and to adjust the rear pickup directivity of the microphone 704 when the shooting mode is the rear mode.
In some possible implementations, in determining the designated object in the image captured in real time, the first processing unit 702 is specifically configured to: obtain the user's tap operation on the image, and determine the object that was tapped as the designated object. For example, if the picture captured in real time contains the designated object, the user can confirm it by tapping; the current scene is then determined from information such as the size, clarity and position of the designated object in the picture, and the pickup directivity of the microphone is adjusted according to the scene. The correspondence between scenes and microphone pickup directivity can be preset as needed so that the sound emitted by the designated object is picked up better.
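Selecting the designated object from a tap amounts to a hit test against the detected objects. The sketch below assumes each detected object is an axis-aligned box `(x, y, w, h)` in screen pixels and, when boxes overlap, prefers the smallest one containing the tap; the function name `pick_designated` and the tie-breaking rule are assumptions for illustration.

```python
def pick_designated(tap, boxes):
    """Return the index of the smallest box containing the tap point, or None."""
    tx, ty = tap
    hits = [
        i for i, (x, y, w, h) in enumerate(boxes)
        if x <= tx < x + w and y <= ty < y + h
    ]
    if not hits:
        return None
    # Prefer the smallest box so that a tap on a nested object selects that object.
    return min(hits, key=lambda i: boxes[i][2] * boxes[i][3])
```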
In some possible implementations, as shown in FIG. 7B, the sound pickup apparatus 700 may further include a second processing unit 705 configured to, before the first processing unit determines the designated object in the image captured in real time, take a photograph to obtain a first picture and determine the designated object from the first picture.
In some possible implementations, the second processing unit 705 may further be configured to obtain a picture by other means, such as from the gallery, from another electronic device, or through an operation such as a screenshot on the electronic device, and to select the designated object in the obtained picture.
In some possible implementations, the second processing unit 705 may further be configured to determine the designated object from a description of it. For example, the designated object may be described by written information or by voice, such as a person wearing a hat, a person wearing a necklace, a person in formal dress, a person holding a microphone, a square object or a bird. It will be understood that the designated object is not limited to a person; it may be anything else that can make sound, such as a robot or a small animal (a kitten, a puppy, a bird and so on). The description of the designated object is used to recognize whether the image captured in real time contains the designated object.
The sound pickup method of this embodiment adjusts the microphone's pickup directivity according to the scene corresponding to the image captured in real time. Determining the pickup scene and adjusting the directivity promptly helps improve the pickup result and the user experience.
Embodiment 4
Referring to FIG. 7A, FIG. 7A is a schematic structural diagram of a sound pickup apparatus 700 according to an embodiment of the present application. The sound pickup apparatus 700 includes a switching unit 701, a first processing unit 702, an adjusting unit 703 and a microphone 704.
The switching unit 701 is configured to switch the shooting mode of the electronic device to the front mode before the first processing unit 702 recognizes the image captured in real time.
In some possible implementations, the shooting mode may be switched with a hardware switch, or the shooting mode of the electronic device may be switched to the front mode by operating a virtual button in the display interface.
The first processing unit 702 is configured to, when the shooting mode is the front mode, recognize the image captured in real time and determine the scene corresponding to the image; the scene is one of the preset scenes.
In some possible implementations, in recognizing the image captured in real time, the first processing unit 702 is specifically configured to recognize one or more of the following items of information in the image: the number of designated objects, the clarity of the designated object, the proportion of the image occupied by the designated object, and the position of the designated object in the image. In some possible implementations, the designated object may be a face image. For example, if the captured image contains the image of one face, one face image is recognized in that embodiment. In other embodiments, if the captured image contains the images of two faces, two face images are recognized. If the image contains no face image, the number of face images recognized is 0.
In some embodiments, the clarity of the recognized face image may also be evaluated; for example, the designated object may be judged clear or unclear, or several clarity levels may be defined and the clarity of the designated object determined from the actual image.
In some embodiments, the proportion of the image occupied by the designated object may be recognized; for example, from the size of the region the designated object occupies in the image, the face image may be found to occupy 30%, 10%, 0 or 50% of the image, and so on.
In some embodiments, the proportion of the designated object within the central area of the image may also be recognized. A region around the center point of the image may be defined as the central area; for example, the circular region centered on the image's center point with a radius of half the image width may be taken as the central area, and the proportion of the designated object within that area is then determined. It will be understood that the central area may also be defined in other ways.
In some embodiments, the position of the designated object in the image may also be recognized. The position may be: the center of the image, above center, below center, the left side, the right side, the upper left, the lower left, the upper right, the lower right, and so on.
The adjusting unit 703 is configured to adjust the front pickup directivity of the microphone 704 so that, according to a preset rule, it matches the scene.
In some possible implementations, the preset scenes may include one or more of the following: a first scene, a second scene, a third scene and a fourth scene.
The first scene includes: the image contains one or more designated objects, the designated objects are within the central area of the image, and their proportion within the central area exceeds a first threshold. For example, the designated object may be a face image. As shown in FIG. 3A, recognition determines that the image contains one designated object located in the middle of the image. The first threshold may be 30%, and the central area may lie within the square marked by the dotted line in FIG. 3A; for example, it may be the circular region centered on the center of the square with a radius of one quarter of the square's side length. It will be understood that the first threshold and the central area may be fixed, or may be set and adjusted as needed.
The second scene includes: the image contains one or more designated objects, the designated objects are not within the central area of the image, and their proportion of the image exceeds a second threshold. For example, the designated object may be a face image. As shown in FIG. 3B, recognition determines that the image contains two designated objects, neither of which is located in the central area of the image; the second threshold may be 28%, and the two designated objects in FIG. 3B occupy more than 28% of the image.
The third scene includes: the image contains one or more designated objects, the designated objects are not within the central area of the image, and their proportion of the image does not exceed a third threshold. For example, the designated object may be a face image. As shown in FIG. 3C, recognition determines that the image contains two designated objects, neither of which is located in the central area of the image; the third threshold may be 15%, and the two designated objects in FIG. 3C occupy less than 15% of the image.
The fourth scene includes: the image contains no designated object, as shown in FIG. 3D.
In some possible implementations, the preset rule may include: the front pickup directivity corresponding to the first scene, the second scene, the third scene and the fourth scene weakens in that order, with the pickup effect shown in FIG. 4A to FIG. 4D. In these figures the smartphone is placed at the center of the plot, perpendicular to the image; 0 degrees is the direction the rear camera faces and 180 degrees is the direction the front camera faces. The front pickup directivity shown in FIG. 4A to FIG. 4D weakens in that order. For ease of understanding, refer to FIG. 5A to FIG. 5D, in which black indicates the range where sound is picked up and gray indicates the range where it is not.
The sound pickup apparatus of this embodiment adjusts the microphone's front pickup directivity according to the scene corresponding to the image captured in real time. Determining the pickup scene and adjusting the directivity promptly helps improve the pickup result and the user experience.
Embodiment 5
Referring to FIG. 8, FIG. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 800 includes a radio frequency unit 810, a memory 820, an input unit 830, a camera 840, an audio circuit 850, a processor 860, an external interface 870 and a power supply 880. The input unit 830 includes a touch screen 831 and other input devices 832, and the audio circuit 850 includes a speaker 851, a microphone 852 and an earphone jack 853. There may be more than one microphone 852; for example, three microphones may be provided, located at the top, at the bottom and on the battery cover of the electronic device. The touch screen 831 may be a display with a touch function. In this embodiment, the user can switch the shooting mode of the electronic device 800 to the front mode or the rear mode by tapping a switch button displayed on the touch screen 831. Taking the front mode as an example, after switching to the front mode the electronic device may be recording video, in a video call, live streaming, and so on. When the shooting mode is the front mode, the processor 860 recognizes the image captured in real time and determines the scene corresponding to the image; the scene is one of the preset scenes, and as the captured video changes the processor 860 can recognize different scenes in real time. The processor 860 adjusts the front pickup directivity of the microphone 852 so that, according to a preset rule, it matches the scene.
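Because the processor may recognize a different scene as the video changes, one plausible control structure is to re-apply the directivity only when the recognized scene actually changes. The sketch below illustrates that idea; the class name `DirectivityController` and the callback `apply_directivity` are hypothetical, and the application does not describe how the processor schedules these updates.

```python
class DirectivityController:
    """Re-applies microphone directivity only when the recognized scene changes."""

    def __init__(self, apply_directivity):
        # apply_directivity is a caller-supplied function taking the scene number;
        # it stands in for whatever the device uses to configure its microphones.
        self._apply = apply_directivity
        self._scene = None

    def on_frame(self, scene):
        """Handle the scene recognized for one frame; return True if directivity changed."""
        if scene is None or scene == self._scene:
            return False  # nothing recognized, or no change since the last frame
        self._scene = scene
        self._apply(scene)
        return True
```

For example, feeding the scene sequence 1, 1, 2, 4 through `on_frame` calls the callback three times, once for each distinct scene in order.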
With the technical solutions provided by the embodiments of the present application, the pickup directivity of the microphone can be adjusted according to the scene corresponding to the image captured in real time. Determining the pickup scene and adjusting the directivity promptly helps improve the pickup result and the user experience.
An embodiment of the present application further provides an electronic device including a microphone, a camera, a memory and a processor. The memory stores the preset rule and computer program code, and the computer program code includes instructions; when the instructions are run by the processor, the electronic device performs some or all of the steps of the sound pickup method described in any of the foregoing method embodiments.
An embodiment of the present application further provides a computer program product which, when run on a computer, causes the computer to perform some or all of the steps of the sound pickup method described in any of the foregoing method embodiments.
An embodiment of the present application further provides a computer-readable storage medium storing program code for execution by an electronic device; when the program code is executed, the electronic device performs the method described in any of the foregoing method embodiments.
The explanations and descriptions of technical features in the specific method embodiments above, and their extensions to various implementation forms, also apply to the execution of the method by the apparatus and are not repeated in the apparatus embodiments.
It should be understood that the division of the above apparatus into modules is merely a division of logical functions; in an actual implementation the modules may be fully or partially integrated into one physical entity, or may be physically separate. For example, each of the above modules may be a separately provided processing element, or may be integrated into a chip of the terminal; alternatively, it may be stored as program code in a storage element of the controller and invoked by a processing element of the processor to perform the functions of the module. The modules may be integrated together or implemented independently. The processing element described here may be an integrated circuit chip with signal processing capability. In implementation, each step of the above method, or each of the above modules, may be carried out by an integrated logic circuit of hardware in the processing element or by instructions in the form of software. The processing element may be a general-purpose processor, such as a central processing unit (CPU), or may be one or more integrated circuits configured to implement the above method, for example one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), or one or more field-programmable gate arrays (FPGAs).
It should be understood that the terms "first", "second" and the like in the description, the claims and the above drawings of the present invention are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that terms so used are interchangeable where appropriate, so that the embodiments described herein can be practiced in orders other than those illustrated or described. Furthermore, the terms "include" and "have", and any variations of them, are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that includes a series of steps or modules is not necessarily limited to the steps or modules expressly listed, but may include other steps or modules that are not expressly listed or that are inherent to the process, method, product or device.
The above discloses only several preferred embodiments of the present invention and of course cannot be used to limit the scope of the rights of the present invention. A person of ordinary skill in the art will understand that implementations of all or part of the procedures of the above embodiments, and equivalent changes made in accordance with the claims of the present invention, still fall within the scope covered by the invention.

Claims (20)

  1. A sound pickup method, applied to an electronic device comprising a microphone, the method comprising:
    recognizing an image captured in real time and determining a scene corresponding to the image, wherein the scene is one of preset scenes; and
    adjusting the pickup directivity of the microphone so that, according to a preset rule, the pickup directivity of the microphone matches the scene.
  2. The method according to claim 1, wherein
    the recognizing an image captured in real time comprises: recognizing one or more of the following items of information in the image: the number of designated objects, the clarity of the designated object, the proportion of the image occupied by the designated object, and the position of the designated object in the image.
  3. The method according to claim 2, wherein
    the information used to determine the preset scene comprises one or more of the following: the number of designated objects in the image, the clarity of the designated object, the proportion of the image occupied by the designated object, or the position of the designated object in the image.
  4. The method according to claim 2, wherein
    the designated object comprises a face image; and
    the recognizing an image captured in real time comprises: performing face recognition on the image.
  5. The method according to any one of claims 1 to 4, wherein the preset scene comprises one or more of the following scenes: a first scene, a second scene, a third scene, and a fourth scene;
    the first scene comprises: the image includes one or more designated objects, the designated object is within the central area of the image, and its proportion within the central area of the image exceeds a first threshold;
    the second scene comprises: the image includes one or more designated objects, the designated object is not within the central area of the image, and its proportion in the image exceeds a second threshold;
    the third scene comprises: the image includes one or more designated objects, the designated object is not within the central area of the image, and its proportion in the image does not exceed a third threshold;
    the fourth scene comprises: the image includes no designated object;
    the preset rule comprises: the sound pickup directivity corresponding to the first scene, the second scene, the third scene, and the fourth scene, respectively, weakens in that order.
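The four scenes and the ordering rule of this claim can be sketched as a small classifier. This is a minimal sketch under stated assumptions: the threshold values `T1`, `T2`, `T3` and the numeric strengths in `DIRECTIVITY` are hypothetical, since the claim names the thresholds but gives no values, and the claim only requires that directivity weakens from the first scene to the fourth. Combinations the claim does not cover return `None` rather than a guessed scene.

```python
from enum import IntEnum
from typing import Optional


class Scene(IntEnum):
    FIRST = 1
    SECOND = 2
    THIRD = 3
    FOURTH = 4


# Hypothetical threshold values; the claim does not specify them.
T1, T2, T3 = 0.30, 0.20, 0.20

# Hypothetical strengths; only the decreasing order comes from the claim.
DIRECTIVITY = {Scene.FIRST: 1.0, Scene.SECOND: 0.7, Scene.THIRD: 0.4, Scene.FOURTH: 0.0}


def classify(count: int, in_center: bool, proportion: float) -> Optional[Scene]:
    """Map recognised image information to one of the four claimed scenes.

    `proportion` is the share of the central area when `in_center` is True,
    and the share of the whole image otherwise, mirroring the claim wording.
    """
    if count == 0:
        return Scene.FOURTH
    if in_center:
        # Centred but below the first threshold is not covered by the claim.
        return Scene.FIRST if proportion > T1 else None
    if proportion > T2:
        return Scene.SECOND
    if proportion <= T3:
        return Scene.THIRD
    return None  # off-centre, between T3 and T2: not covered by the claim
```

A caller would look up `DIRECTIVITY[scene]` for a classified scene and pass the result to whatever beamforming control the device exposes; that control interface is outside the scope of the claims and of this sketch.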
  6. The method according to any one of claims 1 to 5, wherein, before the recognizing of the image captured in real time, the method further comprises:
    switching the shooting mode of the electronic device to a front mode or a rear mode.
  7. The method according to claim 6, wherein:
    before the recognizing of the image captured in real time, the method further comprises: determining a designated object in the image captured in real time;
    the adjusting of the sound pickup directivity of the microphone comprises: when the shooting mode of the electronic device is the front mode, adjusting the front sound pickup directivity of the microphone; or, when the shooting mode of the electronic device is the rear mode, adjusting the rear sound pickup directivity of the microphone.
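The mode-dependent adjustment in this claim amounts to choosing which directivity (front or rear) receives the new setting. A minimal sketch, assuming a hypothetical `MicDirectivity` record with one numeric strength per side; the claim does not say how directivity is represented or applied.

```python
from dataclasses import dataclass, replace
from enum import Enum


class ShootingMode(Enum):
    FRONT = "front"
    REAR = "rear"


@dataclass(frozen=True)
class MicDirectivity:
    """Hypothetical state: one directivity strength per side of the device."""
    front: float = 0.0
    rear: float = 0.0


def adjust(mic: MicDirectivity, mode: ShootingMode, strength: float) -> MicDirectivity:
    """Return a new state with only the side matching the shooting mode changed."""
    if mode is ShootingMode.FRONT:
        return replace(mic, front=strength)
    return replace(mic, rear=strength)
```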
  8. The method according to claim 7, wherein the determining of a designated object in the image captured in real time comprises:
    acquiring a click operation performed by a user on the image, and determining the object of the click operation as the designated object.
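One way to resolve a click to an object is a hit test against detected bounding boxes. The following is a sketch under assumptions: the `(x, y, w, h)` box format, the `object_at` name, and the tie-break (prefer the smallest box when several contain the click) are all choices made for this illustration and are not stated in the claims.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h) in pixels, assumed format


def object_at(click: Tuple[float, float], boxes: List[Box]) -> Optional[int]:
    """Return the index of the box containing the click, or None if none does."""
    cx, cy = click
    hits = [
        i for i, (x, y, w, h) in enumerate(boxes)
        if x <= cx <= x + w and y <= cy <= y + h
    ]
    if not hits:
        return None
    # Assumed tie-break: the smallest enclosing box is the most specific target.
    return min(hits, key=lambda i: boxes[i][2] * boxes[i][3])
```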
  9. The method according to claim 7, wherein, before the determining of a designated object in the image captured in real time, the method comprises:
    taking a photograph to obtain a first picture, and determining the designated object from the first picture; or,
    obtaining a second picture from a picture library, and determining the designated object from the second picture; or,
    determining the designated object according to a previously acquired description of the designated object.
  10. A sound pickup apparatus, comprising a microphone, and further comprising:
    a first processing unit, configured to recognize an image captured in real time and determine a scene corresponding to the image, the scene being one of preset scenes;
    an adjustment unit, configured to adjust the sound pickup directivity of the microphone so that, according to a preset rule, the front sound pickup directivity of the microphone matches the scene.
  11. The apparatus according to claim 10, wherein:
    in recognizing the image captured in real time, the first processing unit is specifically configured to recognize one or more of the following items of information in the image: the number of designated objects, the clarity of the designated object, the proportion of the image occupied by the designated object, and the position of the designated object in the image.
  12. The apparatus according to claim 11, wherein:
    the information used to determine the preset scene comprises one or more of the following: the number of designated objects in the image, the clarity of the designated object, the proportion of the image occupied by the designated object, or the position of the designated object in the image.
  13. The apparatus according to claim 11, wherein:
    the designated object comprises a face image;
    the recognizing of the image captured in real time comprises: performing face recognition on the image.
  14. The apparatus according to any one of claims 10 to 13, wherein the preset scene comprises one or more of the following scenes: a first scene, a second scene, a third scene, and a fourth scene;
    the first scene comprises: the image includes one or more designated objects, the designated object is within the central area of the image, and its proportion within the central area of the image exceeds a first threshold;
    the second scene comprises: the image includes one or more designated objects, the designated object is not within the central area of the image, and its proportion in the image exceeds a second threshold;
    the third scene comprises: the image includes one or more designated objects, the designated object is not within the central area of the image, and its proportion in the image does not exceed a third threshold;
    the fourth scene comprises: the image includes no designated object;
    the preset rule comprises: the sound pickup directivity corresponding to the first scene, the second scene, the third scene, and the fourth scene, respectively, weakens in that order.
  15. The apparatus according to any one of claims 10 to 14, further comprising:
    a switching unit, configured to switch the shooting mode of the electronic device to a front mode or a rear mode before the first processing unit recognizes the image captured in real time.
  16. The apparatus according to claim 15, wherein:
    the first processing unit is further configured to, before recognizing the image captured in real time, determine a designated object in the image captured in real time;
    the adjustment unit is specifically configured to: when the shooting mode of the electronic device is the front mode, adjust the front sound pickup directivity of the microphone; or, when the shooting mode of the electronic device is the rear mode, adjust the rear sound pickup directivity of the microphone.
  17. The apparatus according to claim 16, wherein:
    in determining the designated object in the image captured in real time, the first processing unit is specifically configured to acquire a click operation performed by a user on the image, and determine the object of the click operation as the designated object.
  18. The apparatus according to claim 16, further comprising a second processing unit,
    configured to, before the first processing unit determines the designated object in the image captured in real time, take a photograph to obtain a first picture and determine the designated object from the first picture; or,
    configured to, before the first processing unit determines the designated object in the image captured in real time, obtain a second picture from a picture library and determine the designated object from the second picture; or,
    configured to determine the designated object according to a previously acquired description of the designated object.
  19. An electronic device, comprising a microphone, a memory, and a processor, wherein the memory is configured to store preset scenes, preset rules, and computer program code, the computer program code comprising instructions; and the instructions, when run by the processor, cause the electronic device to perform the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium, wherein the computer-readable storage medium stores program code for execution by an electronic device, and when the program code is executed, the electronic device performs the method according to any one of claims 1 to 9.
PCT/CN2022/085899 2021-04-29 2022-04-08 Method for audio reception, apparatus, and related electronic device WO2022228089A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110478055.3 2021-04-29
CN202110478055.3A CN115272839A (en) 2021-04-29 2021-04-29 Radio reception method and device and related electronic equipment

Publications (1)

Publication Number Publication Date
WO2022228089A1 (en) 2022-11-03

Family

ID=83745862

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/085899 WO2022228089A1 (en) 2021-04-29 2022-04-08 Method for audio reception, apparatus, and related electronic device

Country Status (2)

Country Link
CN (1) CN115272839A (en)
WO (1) WO2022228089A1 (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120050570A1 (en) * 2010-08-26 2012-03-01 Jasinski David W Audio processing based on scene type
JP2013003392A (en) * 2011-06-17 2013-01-07 Sanyo Electric Co Ltd Sound recording apparatus
US20150350769A1 (en) * 2014-06-03 2015-12-03 Cisco Technology, Inc. Determination, Display, and Adjustment of Best Sound Source Placement Region Relative to Microphone
CN107004426A (en) * 2014-11-28 2017-08-01 华为技术有限公司 The method and mobile terminal of the sound of admission video recording object
JP2016119620A (en) * 2014-12-22 2016-06-30 パナソニックIpマネジメント株式会社 Directivity control system and directivity control method
CN112165590A (en) * 2020-09-30 2021-01-01 联想(北京)有限公司 Video recording implementation method and device and electronic equipment

Also Published As

Publication number Publication date
CN115272839A (en) 2022-11-01

Similar Documents

Publication Publication Date Title
CN110072070B (en) Multi-channel video recording method, equipment and medium
CN111050269B (en) Audio processing method and electronic equipment
CN113112505B (en) Image processing method, device and equipment
JP7266672B2 (en) Image processing method, image processing apparatus, and device
CN108566516B (en) Image processing method, device, storage medium and mobile terminal
EP4192004A1 (en) Audio processing method and electronic device
TWI656509B (en) Image processing method and related products
US20220076006A1 (en) Method and device for image processing, electronic device and storage medium
WO2021051995A1 (en) Photographing method and terminal
JP2016531362A (en) Skin color adjustment method, skin color adjustment device, program, and recording medium
WO2018058899A1 (en) Sound volume adjusting method and apparatus of intelligent terminal
CN113365012A (en) Audio processing method and device
CN109714582B (en) White balance adjusting method, device, storage medium and terminal
CN112584251B (en) Display method and electronic equipment
TW202105239A (en) Image processing methods, electronic devices and storage medium
CN113132863B (en) Stereo pickup method, apparatus, terminal device, and computer-readable storage medium
WO2023231787A9 (en) Audio processing method and apparatus
WO2022228089A1 (en) Method for audio reception, apparatus, and related electronic device
WO2023016053A1 (en) Sound signal processing method and electronic device
US11902754B2 (en) Audio processing method, apparatus, electronic device and storage medium
CN115529431A (en) Video recording method and electronic equipment
WO2021159943A1 (en) Photographing control method and apparatus, and terminal device
WO2022161146A1 (en) Video recording method and electronic device
WO2023005450A1 (en) Image processing method and apparatus, and terminal and storage medium
WO2024078238A1 (en) Video-recording control method, electronic device and medium

Legal Events

Date Code Title Description
121 EP: The EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22794560; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 22794560; Country of ref document: EP; Kind code of ref document: A1)