CN118118775A - Scene perception method, equipment and storage medium - Google Patents

Scene perception method, equipment and storage medium

Info

Publication number
CN118118775A
CN118118775A (application CN202211521455.9A)
Authority
CN
China
Prior art keywords
camera
electronic device
scene
image
mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211521455.9A
Other languages
Chinese (zh)
Inventor
李经纬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Honor Device Co Ltd
Original Assignee
Honor Device Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Honor Device Co Ltd filed Critical Honor Device Co Ltd
Priority to CN202211521455.9A
Priority to PCT/CN2023/125982 (published as WO2024114170A1)
Publication of CN118118775A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00: Scenes; Scene-specific elements
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/18: Eye characteristics, e.g. of the iris
    • G06V 40/20: Movements or behaviour, e.g. gesture recognition
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04N: PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 23/00: Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N 23/60: Control of cameras or camera modules
    • H04N 23/61: Control of cameras or camera modules based on recognised objects
    • H04N 23/611: Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
    • H04N 23/667: Camera operation mode switching, e.g. between still and video, sport and normal or high- and low-resolution modes

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Ophthalmology & Optometry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Studio Devices (AREA)

Abstract

The application provides a scene perception method, device and storage medium, relating to the technical field of terminals and artificial intelligence, and also to the fields of intelligent perception, intelligent control, intelligent recommendation and the like. The method comprises the following steps: after the intelligent sensing function is started, the camera operates in a first mode. When the electronic device detects that a person's face is present within the camera's range, it controls the camera to operate in a second mode. The electronic device acquires detection data, which at least comprises image data captured by the camera in the second mode, determines the scene category in which the person is using the electronic device based on that image data, and executes a preset operation corresponding to the scene category, thereby realizing intelligent scene perception and scene-based control and improving the user experience.

Description

Scene perception method, equipment and storage medium
Technical Field
The present application relates to the field of terminal technologies, and in particular to a scene perception method, device and storage medium.
Background
With the popularization of intelligent terminals, users can use them anytime and anywhere. Users often have different requirements in different usage scenarios: for example, a user usually needs to mute the phone or turn down the volume in a meeting room or classroom, and usually needs to turn up the volume or vibration intensity in a station or subway station.
In the related art, the user must adjust system settings manually for each scenario, for example by opening a system settings interface and selecting and adjusting the relevant parameters there, so the user experience leaves room for improvement.
Disclosure of Invention
The embodiments of the present application provide a scene perception method, device and storage medium, which realize intelligent scene perception and scene-based control and improve the user experience.
In a first aspect, an embodiment of the present application provides a scene perception method applied to an electronic device, where a camera of the electronic device operates in a first mode; the electronic device acquires a first image captured by the camera in the first mode and detects whether a person's face is present in the first image; if the electronic device detects a person's face in the first image, the camera is controlled to switch from the first mode to a second mode; the electronic device acquires detection data, the detection data comprising a second image captured by the camera in the second mode; and the electronic device recognizes the second image, determines based on it the scene category in which the person is using the electronic device, and controls execution of a preset operation corresponding to the scene category.
In this scheme, the camera captures the first image in the first mode; if a person's face is detected in that image, the camera switches to the second mode and captures the second image. By recognizing the second image, the current scene category of the electronic device is determined and the operation corresponding to that category is executed, realizing intelligent scene perception and scene-based control and improving the user experience. Moreover, because the presence of a face is first checked on the first image captured in the low-cost first mode, and the camera only switches modes to capture the second image for scene recognition when a face is found, device power consumption can be reduced to a certain extent.
As an example, the resolution of the first image is less than the resolution of the second image and/or the frame rate of the first image is less than the frame rate of the second image.
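As a rough illustration only (the concrete values below are assumptions, not values given in this application), the relationship between the two modes can be expressed as configuration records, for example in Python:

```python
# Illustrative sketch: two camera modes with assumed example values.
from dataclasses import dataclass

@dataclass(frozen=True)
class CameraMode:
    width: int       # horizontal resolution in pixels
    height: int      # vertical resolution in pixels
    frame_rate: int  # frames per second (fps)

# First mode: low resolution and frame rate, enough for cheap face detection.
FIRST_MODE = CameraMode(width=320, height=240, frame_rate=5)
# Second mode: higher resolution and frame rate, for scene recognition.
SECOND_MODE = CameraMode(width=1280, height=720, frame_rate=30)

assert FIRST_MODE.width * FIRST_MODE.height < SECOND_MODE.width * SECOND_MODE.height
assert FIRST_MODE.frame_rate < SECOND_MODE.frame_rate
```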
In an alternative embodiment of the first aspect, before the camera of the electronic device operates in the first mode, the method further comprises: the electronic device, in response to a first operation, turning on the scene perception function.
In this scheme, the condition under which the device starts the camera is restricted, so that the device executes the intelligent scene-category perception scheme only when triggered to do so.
In an alternative embodiment of the first aspect, before the camera of the electronic device operates in the first mode, the method further comprises: detecting that the state of the electronic device meets a first condition; the first condition includes at least one of: the screen of the electronic device is on; the electronic device is unlocked; the time difference between an optical signal emitted by the proximity light sensor of the electronic device and its reflected signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the proximity light sensor does not receive the reflected signal; the reading of the ambient light sensor of the electronic device is greater than a third threshold; the screen of the electronic device faces a preset direction; the electronic device is in a moving state.
In this scheme, the conditions for starting the camera are further restricted: besides the user manually enabling the scene perception function, the added first condition prevents the camera from continuously capturing images when it is unnecessary, reducing device power consumption. It can be understood that "unnecessary" covers any situation in which the camera could not capture a person's face anyway.
In an optional embodiment of the first aspect, the electronic device is a foldable device comprising an inner screen correspondingly provided with a first camera and an outer screen correspondingly provided with a second camera; operating the camera of the electronic device in the first mode comprises: on detecting that the outer screen is on and the electronic device is in a folded state, controlling the second camera to operate in the first mode; or, on detecting that the inner screen is on and the electronic device is in an unfolded state, controlling the first camera to operate in the first mode.
This scheme can be applied to foldable devices: whichever screen (inner or outer) is lit in the folded or unfolded state, the camera on that screen can be started to detect whether a person's face is within its range, which in turn can trigger recognition of the scene of the electronic device.
In an alternative embodiment of the first aspect, before controlling the first camera or the second camera to operate in the first mode, the method further comprises: detecting that the state of the electronic device meets a second condition; the second condition includes at least one of: the electronic device is unlocked; the time difference between an optical signal emitted by the proximity light sensor of the electronic device and its reflected signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the proximity light sensor does not receive the reflected signal; the reading of the ambient light sensor of the electronic device is greater than a third threshold; the inner or outer screen of the electronic device faces a preset direction; the electronic device is in a moving state.
In this scheme, the conditions for the foldable device to start the camera are further restricted: besides the user enabling the scene perception function and the corresponding screen being lit in the folded or unfolded state, the added second condition prevents the camera from continuously capturing the first image when unnecessary, reducing device power consumption.
In an alternative embodiment of the first aspect, the method further comprises: while the second camera operates in the first mode, if the electronic device is detected switching from the folded state to the unfolded state, the electronic device controls the first camera to operate in the first mode and turns off the second camera, or simply controls the first camera to operate in the first mode; while the first camera operates in the first mode, if the electronic device is detected switching from the unfolded state to the folded state, the electronic device turns off the first camera and controls the second camera to operate in the first mode.
In this scheme, with other conditions unchanged, if the user changes the physical state of the device, for example from folded to unfolded or vice versa, the cameras are switched so that the first image continues to be captured and the device can keep intelligently perceiving the scene in its new physical state.
In an optional embodiment of the first aspect, the detection data further comprises time data, and the method further comprises: if the time data is determined to fall within a preset time period, taking the preset scene category corresponding to that time period as the scene category of the electronic device.
In this scheme, based on the clock information of the electronic device, the scene category the device is likely to be in during the current period can be obtained, assisting the device in perceiving the scene.
In an optional embodiment of the first aspect, the detection data further comprises location data, and the method further comprises: if the location data is determined to fall within a preset location range, taking the preset scene category corresponding to that range as the scene category of the electronic device.
In this scheme, based on the location information of the electronic device, the scene category the device is likely to be in at its current position can be obtained, assisting the device in perceiving the scene.
In an optional embodiment of the first aspect, the detection data further comprises voice data, and the method further comprises: if the voice data is recognized to contain a single sound source, or fewer than N sound sources, determining the scene category of the electronic device to be a first scene, where N is a positive integer greater than or equal to 2; if the voice data is recognized to contain more than M sound sources, determining the scene category to be a second scene, where M is a positive integer greater than N. In this embodiment, the first scene may be a quieter scene, such as a meeting room or classroom, and the second scene may be a noisy scene, such as a station or subway station.
In this scheme, based on the voice information of the environment in which the electronic device is located, the scene category the device is likely to be in can be obtained, assisting the device in perceiving the scene.
In an optional embodiment of the first aspect, the detection data further comprises data of a first sensor of the electronic device, the first sensor comprising a gyroscope sensor and an acceleration sensor; the method further comprises: the electronic device determining the scene category of the electronic device based on the second image in the detection data and the data of the first sensor.
In this scheme, based on the second image captured by the camera combined with the sensor data of the electronic device, the device's likely current scene category can be obtained, which can improve the accuracy of scene perception.
In an optional embodiment of the first aspect, the electronic device determining the scene category based on the second image in the detection data and the data of the first sensor comprises: if the user is determined to be in a motion state based on the data of the first sensor, and determined to be continuously gazing at the screen of the electronic device based on the second image, determining the scene category of the electronic device to be a third scene; the motion state includes walking or riding. In this embodiment, the third scene may be watching the screen while walking or riding, which is an unsafe way to use the electronic device.
In this scheme, the user's motion state and eye state (such as continuously gazing at the screen) are detected from the device's sensor data and image data respectively, so it can be determined whether the user is using the electronic device in an unsafe scene, realizing perception of such scenes.
In an alternative embodiment of the first aspect, the detection data further comprises data of a second sensor of the electronic device, the second sensor comprising an ambient light sensor; the method further comprises: if the data of the second sensor is smaller than a fourth threshold, determining the scene category of the electronic device to be a fourth scene. In this embodiment, the fourth scene may be a dark environment, such as a bedroom or sleeping scene.
In this scheme, by detecting the ambient light of the environment in which the electronic device is located, it can be determined whether the user is in a dark environment, assisting the device in perceiving the scene. This can be combined with the device's clock information, location information and the like to improve the accuracy of scene perception.
In an alternative embodiment of the first aspect, the preset operation comprises at least one of: adjusting the volume; adjusting the screen brightness; adjusting the screen's blue light; adjusting the vibration intensity; sending first information for reminding the user to stop using the electronic device; sending second information for recommending content corresponding to the scene category; and turning on the rear camera to detect obstacles.
In this scheme, different scene categories can correspond to different operations, realizing intelligent control of the device once the scene category is perceived.
In an optional embodiment of the first aspect, if the preset operation is turning on the rear camera, the method further comprises: the electronic device acquiring a third image captured by the rear camera in the second mode; and, if an obstacle is recognized in the third image, sending third information to remind the user to avoid the obstacle.
This scheme mainly targets the aforementioned third scene: by turning on the rear camera to detect whether obstacles exist in the device's surroundings, the user can be reminded in time to avoid them, improving the user experience.
In an alternative embodiment of the first aspect, the camera of the electronic device operating in the first mode comprises: the perception module of the electronic device sends a first instruction to the second processing module of the electronic device, the first instruction instructing the second processing module to detect whether a person's face is present within the camera's range; the second processing module sends a first shooting instruction to the camera; and the camera operates in the first mode in response to the first shooting instruction.
In an optional embodiment of the first aspect, the electronic device acquiring the first image captured by the camera in the first mode and detecting whether a person's face is present in it comprises: the second processing module of the electronic device acquires the first image captured by the camera in the first mode and detects whether a person's face is present in the first image.
In an optional embodiment of the first aspect, controlling the camera to switch from the first mode to the second mode if the electronic device detects a person's face in the first image comprises: if the second processing module of the electronic device detects a person's face in the first image, it sends a first message to the perception module of the electronic device, notifying the perception module that a person's face is present within the camera's range; the perception module sends a second instruction to the first processing module of the electronic device, instructing it to recognize the category of the scene within the camera's range; and the first processing module, in response to the second instruction, sends a second shooting instruction to the camera, instructing the camera to operate in the second mode.
In an optional embodiment of the first aspect, the electronic device recognizing the second image, determining the scene category based on it and controlling execution of the corresponding preset operation comprises: the first processing module of the electronic device recognizes the second image, determines the scene category of the electronic device based on it, and sends a second message to the perception module indicating that scene category; the perception module sends a third indication to a target application of the electronic device, indicating the scene category; and the target application controls execution of the preset operation corresponding to the scene category.
In an alternative embodiment of the first aspect, the second processing module of the electronic device detects a state of the electronic device.
The several optional embodiments above illustrate the interaction between the underlying modules of the electronic device that realizes intelligent scene perception and scene-based control.
In a second aspect, an embodiment of the present application provides an electronic device, comprising: a camera for capturing images at different frame rates and/or resolutions, a memory, and a processor for invoking a computer program in the memory to perform the method of any one of the first aspects.
In an alternative embodiment of the second aspect, the processor comprises a first processing module and a second processing module, the power consumption of the first processing module being higher than that of the second; the second processing module is used to detect whether a person's face is present in the first image captured by the camera in the first mode; the first processing module is used to recognize the second image captured by the camera in the second mode and determine the scene category in which the user is using the electronic device.
In this scheme, the presence of a person's face within the camera's range is detected by the lower-power second processing module, while the current scene category is recognized by the higher-power first processing module, optimizing the balance between power consumption and processing performance of the electronic device.
In a third aspect, an embodiment of the application provides an electronic device comprising means, modules or circuits for performing the method according to any of the first aspects.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing computer instructions that, when run on an electronic device, cause the electronic device to perform the method of any one of the first aspects.
In a fifth aspect, an embodiment of the present application provides a chip comprising a processor for invoking a computer program in memory to perform a method according to any of the first aspects.
In a sixth aspect, an embodiment of the present application provides a computer program product comprising a computer program which, when run, causes a computer to perform the method according to any of the first aspects.
It should be understood that the second to sixth aspects of the present application correspond to the technical solution of the first aspect; the benefits of each aspect and of its corresponding possible embodiments are similar and are not repeated here.
Drawings
Fig. 1 is a schematic view of a scene provided in an embodiment of the present application;
Fig. 2 is a schematic diagram of an interface according to an embodiment of the present application;
Fig. 3 is a schematic view of a scenario provided in an embodiment of the present application;
Fig. 4 is a schematic view of a scenario provided in an embodiment of the present application;
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of an SoC according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 9 is a schematic flow chart of a scene perception method according to an embodiment of the present application;
Fig. 10 is a schematic flow chart of a scene perception method according to an embodiment of the present application;
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 12 is a schematic structural diagram of a chip according to an embodiment of the present application;
Fig. 13 is a schematic structural diagram of a folding-screen mobile phone according to an embodiment of the present application;
Fig. 14 is a flow chart of a scene perception method according to an embodiment of the present application.
Detailed Description
In order to clearly describe the technical solutions of the embodiments of the present application, words such as "first" and "second" are used to distinguish identical or similar items having substantially the same function and effect. For example, the first image and the second image merely distinguish images of different frame rates and/or resolutions, with no implied order of precedence. Likewise, the first indication and the second indication merely distinguish different indications. Those skilled in the art will appreciate that "first", "second" and the like neither limit quantity or order of execution nor necessarily indicate a difference.
It should be noted that, in the embodiments of the present application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
In the embodiments of the present application, "at least one" means one or more, and "a plurality" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate: A alone, both A and B, or B alone, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the surrounding objects. "At least one of" a list of items means any combination of those items, including any combination of a single item or multiple items. For example, at least one of a, b or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b and c may each be single or multiple.
The following is a description of some of the terms involved in the present application to facilitate understanding by those skilled in the art.
Frame rate refers to the number of images captured or transmitted by a camera in one second, typically expressed in fps (frames per second). In the embodiments of the application, the camera captures/transmits images at a first frame rate in the first mode and at a second frame rate in the second mode, where the first frame rate is lower than the second frame rate. In some embodiments, the resolution of an image captured in the first mode is also lower than that of an image captured in the second mode.
Resolution, i.e. image resolution, refers to the amount of information stored in an image, commonly expressed as the number of pixels per inch; typical units are dpi (dots per inch) and ppi (pixels per inch).
Geofencing is an application of location-based services (LBS) in which a virtual fence encloses a virtual geographic boundary. When the electronic device enters, leaves or is active within the enclosed area, it can receive automatic notifications and warnings, and can also automatically set related system parameters such as volume and vibration intensity. Geofences may be named after their setting: a geofence near a subway station may be called a subway fence, one near an office building an office fence, and one near a teaching building a classroom fence. In embodiments of the application, a geofence may be a common fence for a public area, and before the electronic device detects whether it has entered a geofence and performs the related operations, authorization from the device's user is required.
A lightweight neural network is a smaller model whose performance is close to that of a heavier one, making the network hardware-friendly. "Weight" here generally refers to the scale or parameter count of the model. Common lightweighting techniques include distillation, pruning, quantization, weight sharing, low-rank decomposition, lightweight attention modules, dynamic network architectures/training schemes, lighter network architecture designs and the like; the embodiments of the present application are not limited in this respect.
At present, users use intelligent terminals in a wide variety of scenarios, and their requirements differ across those scenarios, so they usually have to adjust the terminal's parameters manually to fit the current scenario. For example, in quieter settings such as meeting rooms, classrooms and hospitals, the phone usually needs to be muted or its volume lowered; in noisy settings such as stations and subway stations, the volume or vibration intensity often needs to be raised. How to improve the scene perception capability of intelligent terminals is therefore a problem to be solved.
In view of the above, embodiments of the present application provide a scene perception method, an electronic device and a storage medium. When the conditions for starting the camera are met, the electronic device instructs the camera to operate in a first mode and acquires a first image captured in that mode; on detecting that the first image contains a person's face, it instructs the camera to switch to a second mode and acquires a second image, whose resolution and/or frame rate is greater than that of the first image. By analyzing the second image, the device determines the scene category in which the user is using it and executes the preset operation corresponding to that category, such as adjusting the volume or vibration intensity or pushing information, thereby automatically detecting and recognizing the usage scene and improving the user experience.
The following describes the technical scheme of the present application and how the technical scheme of the present application solves the above technical problems in detail with specific embodiments. The following embodiments may be implemented independently or combined with each other, and the same or similar concepts or processes may not be described in detail in some embodiments.
The following description will take an electronic device as an example of a mobile phone, and this example does not limit the embodiments of the present application.
Fig. 1 is a schematic view of a scenario provided in an embodiment of the present application. As shown in Fig. 1, if the conditions for starting the camera are met, the front camera of the mobile phone is triggered to capture a first image in the first mode; when the first image is detected to contain a person's face, the front and/or rear camera is triggered to capture a second image in the second mode, and image analysis of the second image determines the scene category in which the user is currently using the phone. Based on that category, the corresponding preset operation is performed, such as adjusting the volume, brightness or vibration intensity, or sending a prompt message.
In this embodiment, the camera's power consumption in the first mode is lower than in the second mode. The resolution of the first image is less than that of the second image, and the frame rate of the first image is less than that of the second image. There may be one or more first images captured by the front camera, and one or more second images captured by the front and/or rear camera; this embodiment is not limited in this respect.
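The overall flow can be sketched as follows; this is a simplified illustration rather than the application's implementation, and detect_face, classify_scene and apply_preset stand in for the face-detection, scene-recognition and system-setting logic:

```python
# Hypothetical sketch of the two-stage perception flow described above.
def perceive_scene(camera, detect_face, classify_scene, apply_preset):
    camera.set_mode("first")              # low resolution / low frame rate
    first_image = camera.capture()
    if not detect_face(first_image):      # no face: stay in the cheap mode
        return None
    camera.set_mode("second")             # higher resolution / frame rate
    second_image = camera.capture()
    scene = classify_scene(second_image)  # e.g. "meeting", "subway", ...
    apply_preset(scene)                   # adjust volume, brightness, etc.
    return scene
```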
In some embodiments, the conditions for turning on the camera include: the scene perception function has been turned on. In response to a first operation of turning on the scene perception function, the camera of the mobile phone operates in the first mode (running in the background; for example, the phone may currently have a third-party application open while the camera runs in the background) and captures the first image. The first operation may be a click by the user on the system settings interface, or a voice operation; the embodiments of the present application are not limited in this respect.
For example, Fig. 2 is a schematic diagram of an interface provided by an embodiment of the present application. As shown in Fig. 2, the user may turn the scene perception function on or off in the settings interface of a system application; when it is detected that the phone has enabled the function, the front camera is triggered to capture the first image.
In some embodiments, the user may instead turn the scene perception function on or off in the settings interface of a third-party application; when it is detected that the third-party application is open and the function is enabled within it, the front camera is triggered to capture the first image.
It should be noted that "the third-party application being open" covers both the user currently operating the application and the application running in the background after being opened.
In some embodiments, the conditions for turning on the camera further comprise a first condition comprising at least one of:
The screen of the mobile phone is on; the phone is unlocked; the time difference between an optical signal emitted by the phone's proximity light sensor and its reflected signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or the proximity light sensor does not receive the reflected signal; the reading of the phone's ambient light sensor is greater than a third threshold; the screen of the phone faces a preset direction; the phone is in a moving state.
In some embodiments, in response to a first operation of turning on a scene perception function and satisfying a first condition, a camera of the mobile phone operates in a first mode to capture the first image.
In this embodiment, if the mobile phone has turned on the scene sensing function and the first condition is satisfied, the camera of the mobile phone is triggered to continuously collect the image. By adding the first condition, the mobile phone camera is prevented from continuously collecting images when not necessary, and the power consumption of the equipment is further reduced.
Based on the foregoing embodiments, various possible implementations for triggering the front-facing camera to capture the first image are described below.
In one possible implementation, if it is detected that the user has turned on the scene sensing function, the front camera of the mobile phone is triggered to collect the first image.
In one possible implementation, if it is detected that the user has turned on the scene sensing function, the screen state of the phone is detected; if the screen is on, the front camera is triggered to capture the first image. Interfaces displayed in the screen-on state include, for example, the lock screen, the home screen and third-party application interfaces.
In one possible implementation manner, if it is detected that the user has turned on the scene sensing function, it is detected whether the mobile phone is unlocked, and if the mobile phone is unlocked, the front camera of the mobile phone is triggered to collect the first image.
In one possible implementation manner, if it is detected that the user has turned on the scene sensing function, it is detected whether an optical signal emitted by the proximity light sensor of the mobile phone is blocked, and if it is determined that the optical signal emitted by the proximity light sensor is not blocked, a front camera of the mobile phone is triggered to collect the first image.
As an example, if the time difference between an optical signal emitted by the phone's proximity light sensor and its reflected signal is greater than the first threshold, and/or the signal strength of the reflected signal is less than the second threshold, and/or the proximity light sensor does not receive the reflected signal, it may be determined that the emitted optical signal is not blocked.
It can be understood that if the user is making or answering a call with the phone held to the ear, or the phone is inside a handbag or pocket, the optical signal emitted by the proximity light sensor is blocked and the image captured by the front camera could not contain a person's face; continuous image capture can then be stopped, reducing device power consumption.
In one possible implementation, if it is detected that the user has turned on the scene sensing function, it is checked whether the reading of the phone's ambient light sensor is greater than the third threshold; if so, the front camera is triggered to capture the first image. The detection data here mainly refers to the ambient light brightness. It should be appreciated that a reading above the third threshold indicates that the device is not in a dark environment, such as inside a pocket or during the night.
In one possible implementation, if it is detected that the user has turned on the scene sensing function, the screen orientation of the phone is detected; if the screen faces a preset direction, the front camera is triggered to capture the first image. In this embodiment, the preset direction may be understood as a direction in which the user uses the phone, typically with the screen facing the user; it may be determined from the phone's attitude data, which includes the pitch, yaw and roll angles.
In one possible implementation, if it is detected that the user has turned on the scene sensing function, it is detected whether the phone is in a moving state; if so, the front camera is triggered to capture the first image. In the embodiments of the application, the phone being in a moving state covers, for example, the user walking or riding while carrying (including holding) the phone, or riding in a vehicle while carrying (including holding) it.
In one possible implementation, if it is detected that the user has turned on the scene perception function, the front camera of the phone is triggered to capture the first image once at least two of the following are satisfied: the screen of the phone is on, the phone is unlocked, the optical signal emitted by the phone's proximity light sensor is not blocked, the screen faces a preset direction, and the phone is in a moving state.
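The condition checks above can be sketched as follows; the sensor-reading helpers and thresholds are illustrative assumptions rather than the application's actual interfaces:

```python
# Hedged sketch of the "first condition" checks described above.
def first_condition_met(phone,
                        time_diff_threshold_s=0.001,    # "first threshold"
                        strength_threshold=0.2,         # "second threshold"
                        ambient_light_threshold=10.0):  # "third threshold"
    prox = phone.proximity_sensor.read()
    # Not blocked if the reflection is late, weak, or absent.
    not_blocked = (prox.reflection is None
                   or prox.time_diff_s > time_diff_threshold_s
                   or prox.reflection.strength < strength_threshold)
    checks = [
        phone.screen_is_on(),
        phone.is_unlocked(),
        not_blocked,
        phone.ambient_light_sensor.read() > ambient_light_threshold,
        phone.screen_faces_preset_direction(),
        phone.is_moving(),
    ]
    # Require at least two checks to hold, matching the combined
    # implementation described in the preceding paragraph.
    return sum(checks) >= 2
```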
The above embodiment shows a scene perception method: if the conditions for starting the camera are met, face detection is triggered, where those conditions at least include that the scene sensing function has been enabled. Face detection is performed on a lower-resolution first image; if a person's face is detected, a higher-resolution second image is captured, and analysis of the second image determines the scene category in which the user is currently using the phone, so that the corresponding preset operation can be executed. This realizes the phone's intelligent scene perception function: the user no longer needs to manually set system parameters such as volume and vibration intensity as the scene changes, improving the user experience.
In some embodiments, if it is detected that the user has turned on the scene sensing function, it is detected whether the phone has entered a preset geofence; if so, the phone can learn the scene corresponding to that geofence and execute the corresponding preset operation. Preset geofences may include, for example, subway fences, office fences and classroom fences. For example, when the phone is detected entering a subway fence, it can infer that the user is about to board a subway car and can turn up the volume or increase the vibration intensity; when it is detected entering an office fence, it can infer that the user is about to enter the office and can set itself to mute or increase the vibration intensity. In this embodiment, the face detection of the foregoing embodiments need not be performed: whether the phone has entered a preset geofence can be determined from its current location alone, and the corresponding preset operation set accordingly.
Fig. 3 is a schematic view of a scenario provided in an embodiment of the present application. As shown in Fig. 3, if the conditions for starting the camera described in the above embodiments are met, the front camera of the phone is triggered to capture the first image, and when the first image is detected to contain a person's face, detection data is obtained. In this embodiment, the detection data includes at least one of image data, time data, location data and voice data. The scene category in which the user is currently using the phone is determined from the detection data; for example, the current scene in Fig. 3 is identified as a meeting or class scene, so a corresponding preset operation may be performed, such as lowering the volume, setting the phone to mute, or increasing the vibration intensity.
In a possible implementation, the detection data comprises a second image captured by the front and/or rear camera, and the scene category in which the user is currently using the phone is determined based on that second image, i.e. by performing image analysis on it.
As an example, the second image is input into the scene detection model, and a first detection result output by the scene detection model is acquired, where the first detection result is used to indicate a scene category of a user using the mobile phone. In this example, the scene detection model may be trained using a lightweight neural network model.
As one example, the training process of the scene detection model includes:
Step a: construct a training set and a test set for the scene detection model. The training set and the test set each comprise sample images and the scene categories (sample labels) corresponding to the sample images; the sample images in the two sets are different.
Step b: train the scene detection model based on an initial scene detection model and the training set. Specifically, the sample images of the training set are used as the input of the initial scene detection model and their corresponding scene categories as its expected output.
Step c: verify the predictions of the model trained in step b on the test set, and stop training when the model's loss function converges.
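Steps a to c can be illustrated with the following minimal training sketch; PyTorch and the tiny network are our own assumptions (the application does not prescribe a framework), and the random tensors merely stand in for a real labelled training set:

```python
# Minimal sketch of training a lightweight scene classifier (steps a-c).
import torch
import torch.nn as nn

NUM_SCENES = 4  # e.g. meeting/classroom, station/subway, bedroom, other

model = nn.Sequential(                    # deliberately small ("lightweight")
    nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(8, 16, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, NUM_SCENES),
)
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Step a (stand-in): 32 random "sample images" with scene-category labels.
images = torch.randn(32, 3, 224, 224)
labels = torch.randint(0, NUM_SCENES, (32,))

for epoch in range(5):                    # step b: fit on the training set
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
# Step c would evaluate on a held-out test set and stop once the loss converges.
```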
In one possible implementation, the detection data includes time data, and the scene category of the mobile phone currently used by the user is determined based on the time data in the detection data. As an example, if it is determined that the time data is within the preset time period, the preset scene category corresponding to the preset time period is used as the scene category of the mobile phone used by the user.
For example, by acquiring a meeting notification, the phone can learn the meeting period and location; if the current time falls within the meeting period, the current scene can be determined to be a meeting scene. Similarly, by acquiring an electronic timetable, the phone can learn the class period and location; if the current time falls within the class period, the current scene can be determined to be a class scene.
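A minimal sketch of this time-data rule follows; the periods and categories are invented examples, not values from the application:

```python
# Hedged sketch: map the current time to a preset scene category, if any.
from datetime import datetime, time

PRESET_PERIODS = [  # (start, end, scene category) - assumed examples
    (time(9, 0), time(10, 0), "meeting"),  # e.g. from a meeting notification
    (time(14, 0), time(15, 40), "class"),  # e.g. from an electronic timetable
]

def scene_from_time(now: datetime):
    for start, end, scene in PRESET_PERIODS:
        if start <= now.time() <= end:
            return scene
    return None  # time data alone is inconclusive
```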
In one possible implementation, the detection data includes location data, and the scene category of the mobile phone currently used by the user is determined based on the location data in the detection data. As an example, if the location data is determined to be within the preset location range, the preset scene category corresponding to the preset location range is used as the scene category of the mobile phone used by the user.
In one possible implementation, the detection data includes voice data, and the scene category in which the user is currently using the phone is determined based on it. As an example, if the voice data is recognized to contain one sound source, or fewer than N sound sources, the scene category is determined to be a first scene, such as the meeting or classroom scene in Fig. 3. As another example, if more than M sound sources are recognized in the voice data, the scene category is determined to be a second scene, such as a station or subway station. In this example, N is a positive integer greater than or equal to 2, and M is a positive integer greater than N.
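The sound-source rule can be sketched as follows; count_sound_sources stands in for an unspecified source-counting step, and the default values of n and m are illustrative:

```python
# Hedged sketch of the voice-data rule with N and M as described above.
def scene_from_voice(voice_data, count_sound_sources, n=2, m=5):
    sources = count_sound_sources(voice_data)
    if sources < n:            # one source, or fewer than N: quiet scene
        return "first_scene"   # e.g. meeting room or classroom
    if sources > m:            # more than M sources: noisy scene
        return "second_scene"  # e.g. station or subway station
    return None                # in between: voice data alone is inconclusive
```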
In one possible implementation, the detection data includes image data (i.e. the second image), time data, location data and voice data, and these are analyzed together to determine the scene category in which the user is currently using the phone. The accuracy of the scene category determined in this way is higher than in the individual embodiments described above.
The above embodiment shows a scene perception method: if the conditions for starting the camera are met, face detection is triggered, where those conditions at least include that the scene sensing function has been enabled. If a person's face is detected, multiple kinds of detection data are acquired, including image data, time data, location data and voice data, and combined to determine the scene category in which the user is currently using the phone, after which the corresponding preset operation is executed. This realizes the phone's intelligent scene perception function: the user no longer needs to manually set parameters such as system volume and vibration intensity as the scene changes, improving the user experience.
Fig. 4 is a schematic view of a scenario provided in an embodiment of the present application. As shown in Fig. 4, if the conditions for starting the camera described in the above embodiments are met, the front camera of the phone is triggered to capture the first image, and when the first image is detected to contain a person's face, detection data is obtained. In this embodiment, the detection data includes image data and data of a first sensor comprising a gyroscope sensor and an acceleration sensor. The scene category in which the user is using the phone is determined based on the image data and the first-sensor data; for example, the current scene in Fig. 4 is identified as watching the screen while walking, so a corresponding preset operation may be performed, such as sending first information reminding or advising the user not to use the phone. In some embodiments, the first information is delivered via a pop-up window or voice.
In one possible implementation, the image data in the detection data includes a plurality of consecutive second images captured by the front camera. If the user is determined to be in a walking state based on the data of the first sensor, and determined to be continuously gazing at the phone screen based on the consecutive second images, the scene category in which the user is using the phone is determined to be a third scene: watching the screen while walking.
Likewise, if the user is determined to be in a riding state based on the data of the first sensor, and determined to be continuously gazing at the screen based on the consecutive second images, the scene category is also determined to be a third scene: watching the screen while riding.
As one example, determining the motion state of the user based on the data of the first sensor includes at least one of:
Acquiring the phone attitude data detected by the gyroscope sensor, determining the shift of the user's centre of gravity from the attitude data, and determining the user's motion state from that shift, where the attitude data comprises the phone's pitch, yaw and roll angles; and acquiring the acceleration detected by the acceleration sensor and determining the user's motion state from the acceleration.
In some embodiments, the user's motion state may also be determined in conjunction with a cell phone compass assist.
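A very rough sketch of the motion-state decision follows; the thresholds and the centre-of-gravity proxy are illustrative assumptions, not values from the application:

```python
# Hedged sketch: classify the user's motion state from attitude and
# acceleration data of the first sensor (gyroscope + accelerometer).
import math

def motion_state(pitch, yaw, roll, accel_xyz,
                 sway_threshold_deg=5.0, accel_threshold=1.5):
    # Pitch/roll sway approximates the shift of the user's centre of gravity.
    sway = max(abs(pitch), abs(roll))
    accel = math.sqrt(sum(a * a for a in accel_xyz))  # magnitude in m/s^2
    if sway > sway_threshold_deg and accel > accel_threshold:
        return "walking"
    if accel > accel_threshold:
        return "riding"
    return "static"
```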
As one example, determining that the user is continuously gazing at the phone screen based on the plurality of consecutive second images includes: sequentially inputting the consecutive second images into the gaze detection model and acquiring a second detection result output by the model, the result indicating whether the user is continuously gazing at the screen of the electronic device. In this example, the gaze detection model may be trained on a lightweight neural network using a deep learning approach.
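The continuous-gaze decision over a window of frames can be sketched as follows; gaze_model and its score threshold stand in for the trained gaze detection model:

```python
# Hedged sketch: the user counts as continuously gazing only if every frame
# in the window of consecutive second images is classified as "looking".
def continuously_gazing(second_images, gaze_model, threshold=0.5):
    return all(gaze_model(img) > threshold for img in second_images)
```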
In some embodiments, if it is determined that the scene in which the user is currently using the phone is watching the screen while walking, the following operation may further be performed: turning on the rear camera to detect obstacles. Specifically, the phone turns on the rear camera, which works in the second mode; a third image captured by the rear camera in the second mode is acquired, and if an obstacle, such as a step, a telegraph pole, a motor vehicle or a pothole, exists in the third image, third information is sent to remind the user to avoid the obstacle. The resolution of the third image is greater than that of the first image, and/or the frame rate of the third image is greater than that of the first image. In this embodiment, a target detection model may be used to determine whether an obstacle exists in the third image; the model may be obtained by training a lightweight neural network based on a deep learning approach.
As one example, the training process of the target detection model includes: Step a: construct a training set and a test set for the target detection model, each comprising sample images and their annotation information, the annotation indicating whether an obstacle exists in the sample image; the sample images in the two sets are different. Step b: train the target detection model based on an initial target detection model and the training set, using the training-set sample images as the input of the initial model and their annotation information as its expected output. Step c: verify the predictions of the model trained in step b on the test set, and stop training when the model's loss function converges.
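The rear-camera obstacle flow for this scene can be sketched as follows; detect_obstacles stands in for the trained target detection model, and notify_user for the third-information prompt:

```python
# Hedged sketch of the walking-safety check described above.
def walk_safety_check(rear_camera, detect_obstacles, notify_user):
    rear_camera.set_mode("second")             # higher resolution/frame rate
    third_image = rear_camera.capture()
    obstacles = detect_obstacles(third_image)  # e.g. steps, poles, vehicles
    if obstacles:
        # "Third information": remind the user to avoid the obstacle.
        notify_user("Obstacle ahead: " + ", ".join(obstacles))
```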
The above embodiment shows a scene sensing method: if the condition for starting the camera is met, face detection is triggered, where the condition for starting the camera at least includes that the scene sensing function is turned on. If a person's face is detected, various detection data, including image data, posture data, speed and acceleration data and the like, are obtained to sense whether the user is using the mobile phone in an unsafe scene, such as gazing at the screen while walking or riding; if so, the user can be reminded not to use the mobile phone or to pay attention to road safety, improving the user experience.
In some embodiments, the detection data includes data of a second sensor, where the second sensor includes an ambient light sensor. If the data of the second sensor is smaller than a fourth threshold, the scene category of the user using the mobile phone is determined to be a fourth scene. In this embodiment, the data of the second sensor indicates the ambient light data, such as illuminance, of the mobile phone in the current scene. If the ambient light data is smaller than the fourth threshold, the mobile phone is currently in a dark environment, i.e. the fourth scene (e.g. a bedroom/sleeping scene); at this time, the mobile phone can automatically lower the screen brightness or enable a low-blue-light mode.
In some embodiments, in identifying the bedroom/sleeping scene, in addition to analyzing the data of the second sensor described above, a comprehensive decision may be made in combination with the clock (time), user habits, geofencing and the like. Operations after entering the bedroom/sleeping scene include, for example, lowering the screen brightness, reducing blue light, recommending sleep-related content (i.e., content corresponding to the scene category), and the like.
Based on the above embodiments, the mobile phone can identify a scene category of a user using the mobile phone by combining at least one of image data, clock (time) data, position data, voice data and various sensor data, and further execute a preset operation corresponding to the scene category. Wherein the preset operation includes at least one of: adjusting the volume; adjusting the brightness of a screen; adjusting blue light of a screen; adjusting the vibration intensity; sending first information, wherein the first information is used for reminding a user to stop using the electronic equipment; sending second information, wherein the second information is used for recommending the content corresponding to the scene category; and opening the rear camera for detecting the obstacle.
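As an illustration of this category-to-operation mapping, the following sketch dispatches a preset operation once the scene category is known. The category names and device methods are assumptions standing in for the operations enumerated above.

```python
PRESET_OPERATIONS = {
    "meeting_room": lambda dev: dev.set_silent(True),
    "bedroom":      lambda dev: (dev.set_brightness(0.2),
                                 dev.enable_low_blue_light()),
    "walking_gaze": lambda dev: (dev.send_first_information(),
                                 dev.open_rear_camera_for_obstacles()),
    "riding_gaze":  lambda dev: dev.send_first_information(),
}

def apply_preset_operation(device, scene_category: str) -> None:
    handler = PRESET_OPERATIONS.get(scene_category)
    if handler is not None:
        handler(device)   # execute the preset operation for this category
```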
The scene sensing scheme provided by the embodiments of the application realizes intelligent detection of the scene in which the electronic device is located and automatic setting of system parameters based on the detection result, such as setting silent mode or increasing vibration intensity in meeting rooms and classrooms, lowering the volume in bedrooms, preventing peeping in public places, and avoiding collisions or stepping into potholes when using the mobile phone while walking, improving the user experience.
The scene perception method provided by the embodiments of the application can be applied to a folding-screen mobile phone in addition to a bar-type (non-folding) mobile phone.
In some embodiments, if the mobile phone is a folding screen mobile phone, the folding screen mobile phone includes an inner screen and an outer screen, the inner screen is correspondingly provided with a first camera, and the outer screen is correspondingly provided with a second camera, for example, the first camera is the camera 3 in fig. 13, and the second camera is the camera 1 in fig. 13.
As an example, if the outer screen of the mobile phone is detected to be in a bright-screen state and the mobile phone is in a folded state, the second camera is controlled to operate in the first mode. As another example, if the outer screen is detected to be in a bright-screen state and the mobile phone is in an unfolded state, the second camera is likewise controlled to operate in the first mode. Based on these two examples, in some embodiments, before controlling the second camera to operate in the first mode, the method further includes detecting that the state of the mobile phone meets at least one of the following: the mobile phone is unlocked; the time difference between the light signal emitted by the proximity light sensor (on the outer screen) and the reflected signal of that light signal is greater than a first threshold, and/or the signal intensity of the reflected signal is smaller than a second threshold, and/or no reflected signal is received by the proximity light sensor; the detection data of the ambient light sensor of the mobile phone is greater than a third threshold; the outer screen of the mobile phone faces a preset direction; the mobile phone is in a moving state.
As an example, if the inner screen of the mobile phone is detected to be in a bright-screen state and the mobile phone is in an unfolded state, the first camera is controlled to operate in the first mode. Before controlling the first camera to operate in the first mode, the method further includes detecting that the state of the mobile phone meets at least one of the following: the mobile phone is unlocked; the time difference between the light signal emitted by the proximity light sensor (on the inner screen) and the reflected signal of that light signal is greater than a first threshold, and/or the signal intensity of the reflected signal is smaller than a second threshold, and/or no reflected signal is received by the proximity light sensor; the detection data of the ambient light sensor of the mobile phone is greater than a third threshold; the inner screen of the mobile phone faces a preset direction; the mobile phone is in a moving state.
Generalizing the above examples, before controlling the first camera or the second camera to operate in the first mode, the method further includes detecting that the state of the electronic device meets a second condition, where the second condition includes at least one of: the electronic device is unlocked; the time difference between the light signal emitted by the proximity light sensor and the reflected signal of that light signal is greater than a first threshold, and/or the signal intensity of the reflected signal is smaller than a second threshold, and/or no reflected signal is received by the proximity light sensor; the detection data of the ambient light sensor of the electronic device is greater than a third threshold; the inner screen or outer screen of the electronic device faces a preset direction; the electronic device is in a moving state.
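A sketch of the second condition as a predicate follows; each field mirrors one clause above, and "at least one of" means any single true clause suffices. The field names and threshold values are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class FoldableState:
    unlocked: bool
    prox_time_diff: float           # emitted-to-reflected delay, proximity sensor
    prox_strength: Optional[float]  # reflected signal strength; None if not received
    ambient_light: float
    screen_faces_preset_dir: bool
    moving: bool

def second_condition_met(s: FoldableState,
                         t1=0.01, t2=0.5, t3=50.0) -> bool:
    # t1/t2/t3 stand for the first/second/third thresholds (values assumed).
    prox_clear = (s.prox_time_diff > t1
                  or s.prox_strength is None
                  or s.prox_strength < t2)   # nothing is covering the screen
    return any([s.unlocked, prox_clear, s.ambient_light > t3,
                s.screen_faces_preset_dir, s.moving])
```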
As an example, when the second camera is operating in the first mode, if the electronic device is detected to change from the folded state to the unfolded state, the electronic device controls the first camera to operate in the first mode and turns off the second camera.
As another example, when the second camera is operating in the first mode, if the electronic device is detected to change from the folded state to the unfolded state, the electronic device controls the first camera to operate in the first mode while the second camera keeps operating in the first mode, so that the cameras on the inner and outer screens are on simultaneously to widen the range within which the device can detect the scene.
As an example, when the first camera is operating in the first mode, if the electronic device is detected to change from the unfolded state to the folded state, the electronic device turns off the first camera and controls the second camera to operate in the first mode.
As another example, when the first camera is operating in the first mode, if the electronic device is detected to change from the unfolded state to the folded state, the electronic device only turns off the first camera.
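The four examples above amount to a small hand-over routine when the folding state changes; a sketch follows, with the camera interface and flag names assumed.

```python
def on_fold_state_changed(new_state: str, cam_inner, cam_outer,
                          keep_outer_on_unfold: bool = False,
                          start_outer_on_fold: bool = True) -> None:
    # keep_outer_on_unfold=True keeps both cameras scanning to widen the
    # detection range; start_outer_on_fold=False matches the last example,
    # which only turns the inner camera off.
    if new_state == "unfolded" and cam_outer.active:
        cam_inner.start(mode="first")      # the inner-screen camera takes over
        if not keep_outer_on_unfold:
            cam_outer.stop()
    elif new_state == "folded" and cam_inner.active:
        cam_inner.stop()
        if start_outer_on_fold:
            cam_outer.start(mode="first")  # hand the scan to the outer camera
```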
The method for performing scene sensing on the folding mobile phone is described in detail below with reference to fig. 13.
Fig. 13 is a schematic structural diagram of a folding-screen mobile phone according to an embodiment of the present application. As shown in fig. 13, the screen of the folding-screen mobile phone includes a first screen, a second screen and a third screen. The first screen is the outer screen; the second screen and the third screen together form the inner screen, i.e. the folding screen, which folds along the folding edge shown in (4) in fig. 13 to form the second screen and the third screen, with the virtual folding axis as their common axis. The inner screen is the screen located inside when the folding screen is in the folded state, and the outer screen is the screen located outside in that state. The included angle beta between the second screen and the third screen is the hinge angle of the folding-screen mobile phone, and the physical state of the folding screen can be determined from the hinge angle; the physical state includes a folded state as shown in (3) in fig. 13, an unfolded state as shown in (4) in fig. 13, or a stand state as shown in (2) in fig. 13. The folding-screen mobile phone shown in fig. 13 includes three cameras, denoted camera 1, camera 2 and camera 3. As shown in (1) in fig. 13, camera 1 is disposed at the middle of the upper portion of the first screen and camera 2 is disposed on the back plate; as shown in (4) in fig. 13, camera 3 is disposed at the middle of the upper portion of the third screen. In the folded state, camera 1 can be regarded as a front camera and camera 2 as a rear camera, as shown in (3) in fig. 13. In the unfolded state, camera 3 can be regarded as a front camera, and cameras 1 and 2 as rear cameras.
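Since the physical state is determined from the hinge angle beta, a sketch of that classification follows; the angle boundaries are illustrative assumptions.

```python
def physical_state(beta_degrees: float) -> str:
    # Classify the folding screen's physical state from the hinge angle.
    if beta_degrees < 30:       # nearly closed: (3) in fig. 13
        return "folded"
    if beta_degrees < 150:      # partially open: stand state, (2) in fig. 13
        return "stand"
    return "unfolded"           # fully open: (4) in fig. 13
```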
Taking the folding-screen mobile phone shown in fig. 13 as an example, the conditions under which the folding-screen mobile phone starts a camera, and which camera is started, are illustrated below.
In one possible implementation, if the folding-screen mobile phone has turned on the scene sensing function, when it is detected that the mobile phone is in the folded state and the outer screen (such as the first screen in fig. 13) is in a bright-screen state, camera 1 of the folding-screen mobile phone is triggered to continuously collect first images, so as to detect whether a person's face exists within the range of the camera.

In one possible implementation, if the folding-screen mobile phone has turned on the scene sensing function, when it is detected that the mobile phone is in the folded state, the outer screen is in a bright-screen state, and at least one clause of the above second condition is met, camera 1 of the folding-screen mobile phone is triggered to continuously collect first images, so as to detect whether a person's face exists within the range of the camera.

In one possible implementation, if the folding-screen mobile phone has turned on the scene sensing function, when it is detected that the mobile phone is in the unfolded state and the inner screen (such as the second screen and the third screen in fig. 13) is in a bright-screen state, camera 3 of the folding-screen mobile phone is triggered to continuously collect first images, so as to detect whether a person's face exists within the range of the camera.

In one possible implementation, if the folding-screen mobile phone has turned on the scene sensing function, when it is detected that the mobile phone is in the unfolded state, the inner screen is in a bright-screen state, and at least one clause of the above second condition is met, camera 3 of the folding-screen mobile phone is triggered to continuously collect first images, so as to detect whether a person's face exists within the range of the camera.
In one possible implementation, the folding-screen mobile phone has met the condition for opening a camera, the phone is currently in the folded state, and camera 1 has been opened. When it is detected that the mobile phone switches from the folded state to the unfolded state and the other conditions for opening the camera remain unchanged, camera 1 can be closed and camera 3 opened to detect whether a person's face exists within the range of camera 3.

In one possible implementation, the folding-screen mobile phone has met the condition for opening a camera, the phone is currently in the unfolded state, and camera 3 has been opened. When it is detected that the mobile phone switches from the unfolded state to the folded state and the other conditions for opening the camera remain unchanged, camera 3 can be closed and camera 1 opened to detect whether a person's face exists within the range of camera 1.
It should be noted that folding-screen mobile phones of other forms can implement scene perception with reference to the folding-screen mobile phone shown in fig. 13, with similar implementation principles and technical effects; the embodiments of the present application do not limit the structural form of the folding-screen mobile phone.
In order to better understand the embodiments of the present application, the structure of the electronic device according to the embodiments of the present application is described below. Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic device 100 may include: processor 110, external memory interface 120, internal memory 121, universal serial bus (universal serial bus, USB) interface 130, charge management module 140, power management module 141, battery 142, antenna 1, antenna 2, mobile communication module 150, wireless communication module 160, audio module 170, speaker 170A, receiver 170B, microphone 170C, headset interface 170D, sensor 180, keys 190, motor 191, indicator 192, camera 193, display 194, and subscriber identity module (subscriber identification module, SIM) card interface 195, etc. It is to be understood that the structure illustrated in the present embodiment does not constitute a specific limitation on the electronic apparatus 100. In other embodiments of the application, electronic device 100 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units, such as: the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), a baseband processor, a display processing unit (DPU), and/or a neural-network processing unit (NPU), etc. The different processing units may be separate devices or may be integrated in one or more processors.
In some embodiments, the electronic device 100 may also include one or more processors 110.
In some embodiments, the processor 110 may include one or more interfaces.
It should be understood that the interfacing relationship between the modules illustrated in the embodiments of the present application is only illustrative, and is not meant to limit the structure of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also employ different interfacing manners in the above embodiments, or a combination of multiple interfacing manners.
The charge management module 140 is configured to receive a charge input from a charger. The power management module 141 is used for connecting the battery 142, and the charge management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140 to power the processor 110, the internal memory 121, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be configured to monitor battery capacity, battery cycle times, battery health, and other parameters. In other embodiments, the power management module 141 may also be provided in the processor 110. In other embodiments, the power management module 141 and the charge management module 140 may be disposed in the same device.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like. The mobile communication module 150 may provide a solution for wireless communication including 2G/3G/4G/5G, etc., applied to the electronic device 100.
The wireless communication module 160 may provide solutions for wireless communication including wireless local area networks (WLAN), Bluetooth, global navigation satellite system (GNSS), frequency modulation (FM), NFC, infrared (IR), etc. applied to the electronic device 100.
The electronic device 100 may implement display functions through a GPU, a display screen 194, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute instructions to generate or change display information.
The display screen 194 is used to display images, videos, and the like. The display 194 includes a display panel. The display panel may employ a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, N being a positive integer greater than 1.
Electronic device 100 may implement shooting functionality through an ISP, one or more cameras 193, video codecs, a GPU, one or more display screens 194, an application processor, and the like.
The NPU is a neural-network (NN) computing processor; by drawing on the structure of biological neural networks, for example the transfer mode between human brain neurons, it can rapidly process input information and can also continuously self-learn. Applications such as intelligent cognition of the electronic device 100, for example image recognition, face recognition, speech recognition and text understanding, can be implemented through the NPU.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to enable expansion of the memory capabilities of the electronic device 100. The external memory card communicates with the processor 110 through an external memory interface 120 to implement data storage functions. For example, data files such as music, photos, videos, etc. are stored in an external memory card.
The internal memory 121 may be used to store one or more computer programs, including instructions. The processor 110 may cause the electronic device 100 to execute various functional applications, data processing, and the like by executing the above-described instructions stored in the internal memory 121.
The sensors 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
The gyro sensor 180B may be used to determine a motion gesture of the electronic device 100. In some embodiments, the angular velocity of electronic device 100 about three axes (i.e., x, y, and z axes) may be determined by gyro sensor 180B. The gyro sensor 180B may be used for photographing anti-shake. The gyro sensor 180B can also be used for navigation, somatosensory game scenes, and the like.
The magnetic sensor 180D is configured to detect the magnetic field strength of a magnet, obtaining magnetic force data, and the physical state of the folding screen of the electronic device 100 can be detected through the magnetic force data; the magnet is used to generate the magnetic field. In this embodiment of the present application, the magnetic sensor 180D may be disposed in the body corresponding to the back plate shown in (1) in fig. 13, and the magnet may be disposed in the body corresponding to the first screen shown in (1) in fig. 13. As the folding state of the folding screen changes, the distance between the magnetic sensor 180D and the magnet changes accordingly, and so does the magnetic field strength detected by the magnetic sensor 180D. The smart sensor hub can therefore determine the physical state of the folding screen, such as the unfolded state, the stand state or the folded (closed) state, from the magnetic force data obtained by the magnetic sensor 180D under the action of the magnet's magnetic field. In some embodiments, the sensor 180 may also include a Hall sensor, which can likewise detect the magnetic field strength of the magnet and output a high/low level from which the physical state of the folding screen of the electronic device 100 is determined.
The acceleration sensor 180E may detect the magnitude of acceleration of the electronic device 100 in various directions (typically along three axes), and may detect the magnitude and direction of gravity when the electronic device 100 is stationary. It can also be used to recognize the posture of the electronic device, and is applied to landscape/portrait switching, pedometers and other applications.
A distance sensor 180F for measuring a distance. The electronic device 100 may measure the distance by infrared or laser. In some embodiments, the electronic device 100 may range using the distance sensor 180F to achieve quick focus.
The proximity light sensor 180G may include, for example, a Light Emitting Diode (LED) and a light detector, such as a photodiode. The light emitting diode may be an infrared light emitting diode. The electronic device 100 emits infrared light outward through the light emitting diode. The electronic device 100 detects infrared reflected light from nearby objects using a photodiode. When sufficient reflected light is detected, it may be determined that there is an object in the vicinity of the electronic device 100. When insufficient reflected light is detected, the electronic device 100 may determine that there is no object in the vicinity of the electronic device 100. The electronic device 100 can detect that the user holds the electronic device 100 close to the ear by using the proximity light sensor 180G, so as to automatically extinguish the screen for the purpose of saving power. The proximity light sensor 180G may also be used in holster mode, pocket mode to automatically unlock and lock the screen.
The ambient light sensor 180L is used to sense ambient light level. The electronic device 100 may adaptively adjust the brightness of the display 194 based on the perceived ambient light level. The ambient light sensor 180L may also be used to automatically adjust white balance when taking a photograph. Ambient light sensor 180L may also cooperate with proximity light sensor 180G to detect whether electronic device 100 is in a pocket to prevent false touches.
The keys 190 include a power-on key, a volume key, etc. The keys 190 may be mechanical keys or touch keys. The electronic device 100 may receive key inputs, generating key signal inputs related to user settings and function controls of the electronic device 100.
The electronic device may also be referred to as a terminal device (terminal), user equipment (UE), a mobile station (MS), a mobile terminal (MT), or the like. The electronic device may be a mobile phone with a touch screen, a wearable device, a tablet (Pad), a computer with wireless transceiving capability, a virtual reality (VR) terminal device, an augmented reality (AR) terminal device, a wireless terminal in industrial control, a wireless terminal in self-driving, a wireless terminal in remote medical surgery, a wireless terminal in a smart grid, a wireless terminal in transportation safety, a wireless terminal in a smart city, a wireless terminal in a smart home, or the like. The embodiments of the application do not limit the specific technology and specific device form adopted by the electronic device.
In the embodiment of the application, in order to realize the intelligent scene sensing function of the electronic equipment, the camera and the processor of the electronic equipment are required to be improved in hardware and software. The hardware improvement of the electronic device will be first described below.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 6, the electronic device may include an improved camera 601 and an improved processor 602.
The improvement to the camera 601 is to add, to an existing camera module, a control circuit and a working circuit corresponding to a newly added shooting mode, so as to realize a low-power-consumption configuration. For example, if the shooting mode of the existing camera module is mode 1, the module includes a working circuit corresponding to mode 1. If a mode 2 is added, with switching between mode 1 and mode 2, the improved camera module includes the working circuit corresponding to mode 1, a working circuit corresponding to the newly added mode 2, and a control circuit serving both modes. It should be understood that, according to practical application requirements, more than two shooting modes may be set, which is not limited in any way by the embodiments of the present application.
As an example, the improved camera 601 includes two modes of operation: a first mode, which may be referred to as a low power consumption photographing mode, and a second mode, which may be referred to as a normal photographing mode. The resolution of the image captured by the camera 601 in the first mode is smaller than the resolution of the image captured in the second mode, and the frame rate of the image captured by the camera 601 in the first mode is smaller than the frame rate of the image captured in the second mode. The camera 601 can switch between these two modes.
For example, if the condition for starting the camera is satisfied, the camera 601 operates in the first mode and can perform resident scanning: it continuously collects first images with the first resolution at the first frame rate, so as to detect whether a person exists within the range of the camera 601, for example, whether a first image contains a person's face. If a person is detected, for example the first image contains a person's face, the camera 601 switches from the first mode to the second mode and continuously collects second images with the second resolution at the second frame rate, so as to detect the category of the current scene, or to detect whether the person is gazing at the screen. The first frame rate is smaller than the second frame rate, and the first resolution is smaller than the second resolution.
Based on the above examples, the camera 601 may dynamically adjust the frame rate and resolution of the captured images to accommodate different needs: it captures images at a lower frame rate and resolution to detect whether a person's face exists within its range, and at a higher frame rate and resolution to identify the scene category of the person using the electronic device. Illustratively, the lowest frame rate of the camera 601 may be 1 fps and the highest frame rate 30 fps; the lowest resolution may be 120×180 and the highest resolution 480×640. In some examples, the highest frame rate may also be 240 fps and the highest resolution 2736×3648.
It should be noted that, the frame rate range and the resolution range of the image collected by the camera are not limited in the embodiment of the application, that is, the lowest frame rate and the highest frame rate of the image collected by the camera are not limited, and the lowest resolution and the highest resolution of the image collected by the camera are not limited. In practical application, the frame rate range and the resolution range can be reasonably set according to requirements.
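A sketch of the resident-scan loop built on these two modes follows. The camera and model interfaces are assumptions; the mode parameters echo the illustrative values above.

```python
FIRST_MODE = {"fps": 1, "resolution": (120, 180)}     # low-power scanning
SECOND_MODE = {"fps": 30, "resolution": (480, 640)}   # normal shooting

def resident_scan(camera, detect_face, classify_scene):
    # Scan cheaply until a face appears, then raise fidelity for recognition.
    camera.configure(**FIRST_MODE)
    while camera.enabled:
        first_image = camera.capture()
        if detect_face(first_image):
            camera.configure(**SECOND_MODE)   # switch to the second mode
            return classify_scene(camera.capture())
    return None
```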
It should be noted that, the camera 601 may be a front camera of the electronic device or a rear camera of the electronic device, which is not limited in this embodiment.
The modified processor 602 may be a System on Chip (SoC). In the embodiment of the present application, when the electronic device triggers the resident scanning of the camera 601 to acquire an image under the current scene, the camera 601 may send the image to the SoC to perform image analysis, so as to detect whether there is a person face in the image, whether the person looks at the screen, the category of the current scene, and the like.
For the purpose of low power consumption, the SoC may support an AON ISP (always-on ISP). Referring to fig. 7, the camera 601 transmits an image to the AON ISP, which performs no image effect processing other than format conversion, and then stores the format-converted image in an on-chip static random-access memory (SRAM). The SoC may also support a very-low-power core, with computing, algorithm running and image storage all operating in low-power modes. Furthermore, the SoC may also support a low-power embedded neural network processor (eNPU).
The SoC according to the embodiment of the present application will be described in detail.
Fig. 7 is a schematic structural diagram of an SoC according to an embodiment of the present application. As shown in fig. 7, the SoC includes a first processing unit and a second processing unit. The first processing unit includes an image signal processor (ISP), a neural network processor (NPU) and a central processing unit (CPU); the second processing unit includes an I2C bus interface, an AON ISP, an on-chip SRAM, a digital signal processor (DSP) and an eNPU. In the SoC, the power consumption of the second processing unit is lower than that of the first processing unit; in particular, the power consumption of the eNPU in the second processing unit is lower than that of the NPU in the first processing unit, and the power consumption of the AON ISP in the second processing unit is lower than that of the ISP in the first processing unit.
As an example, the first processing unit may be configured to process the second images with the second resolution acquired by the camera 601 as described above. Illustratively, in the second mode, the camera 601 captures a second image at the second resolution; after the second image is processed by the ISP, the NPU performs detection on the processed second image, for example, detecting the category of the current scene or detecting whether a person is gazing at the screen. The first processing unit may perform security processing (e.g., encryption) before transmitting data (e.g., image data) to the memory, and store the processed data in a security buffer of the memory. The security processing is used to protect the user's private data.
As an example, the second processing unit may be configured to process the first images with the first resolution acquired by the camera 601 as described above. Illustratively, in the first mode, the camera 601 acquires a first image with the first resolution; the AON ISP obtains the first image through the I2C bus interface and processes it, and the eNPU then performs detection on the processed first image, for example, detecting whether it contains a person's face. The on-chip SRAM in the second processing unit may store the processed first image, and the DSP may notify the eNPU to perform image detection, receive the detection result reported by the eNPU, and report the detection result to the upper-layer application. The second processing unit adopts a low-power-consumption configuration to reduce the power consumption of the electronic device.
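Taken together, the two processing units split the work roughly as sketched below; every interface name here is an assumption used only to illustrate the data flow.

```python
def route_frame(frame, mode, low_power_unit, main_unit, secure_buffer):
    if mode == "first":
        # Low-power path: the AON ISP does format conversion only, the image
        # is held in on-chip SRAM, and the eNPU runs face detection (the DSP
        # would relay the result upward).
        raw = low_power_unit.aon_isp.convert_format(frame)
        low_power_unit.sram.store(raw)
        return low_power_unit.enpu.detect_face(raw)
    # Main path: full ISP processing, encryption before memory, NPU inference.
    image = main_unit.isp.process(frame)
    secure_buffer.store(main_unit.encrypt(image))   # protect private data
    return main_unit.npu.classify_scene(image)
```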
It should be noted that the embodiments of the present application do not limit the format of the image data transmitted between the first or second processing unit and the camera. By way of example, the image data may be mobile industry processor interface (MIPI) camera serial interface (CSI) data.
The software system of the electronic device may employ a layered architecture, an event driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. In the embodiment of the application, a software system with a layered architecture is taken as an Android system as an example, and the software structure of the electronic equipment is illustrated by an example. Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The layered architecture divides the software system of the electronic device into several layers, each of which has a distinct role and division of labor. The layers communicate with each other through a software interface.
Referring to fig. 8, an electronic device of an embodiment of the present application includes an application layer (Applications), an application framework layer (Application Framework), a hardware abstraction layer (Hardware Abstraction Layer, HAL), a Kernel layer (Kernel), a sensor control center (Sensorhub), and a hardware layer.
The application layer may include a series of applications that are run by calling an application program interface (application programming interface, API) provided by the application framework layer.
In the embodiment of the application, the application layer may include a scene sensing application and a sensing module. The scene sensing application is connected with the sensing module and registers with it, and the sensing module performs state management and transfers data. For example, when the sensing module learns from the second processing module in Sensorhub that a person's face exists within the range of the camera, it notifies the first processing module in the HAL, so that the first processing module identifies the scene category of the person using the electronic device based on the images acquired by the camera; finally, the sensing module reports the identification result to the scene sensing application.
In some embodiments, the application layer also includes other applications (not shown in fig. 8), such as a gaze-keeps-screen-on application and a gaze always-on-display (AOD) application. In one possible scenario, multiple applications correspond to the same algorithm; for example, the gaze-keeps-screen-on application and the gaze AOD application both correspond to the gaze detection algorithm, and the sensing module may be used to uniformly schedule and manage that algorithm. In another possible scenario, different applications correspond to different algorithms; for example, the scene sensing application corresponds to a scene recognition algorithm (and also to a face-presence detection algorithm), while the gaze-keeps-screen-on application corresponds to the gaze detection algorithm. Both involve acquiring image data from the underlying camera, so the sensing module may be used to schedule and manage priorities among the multiple algorithms. If the scene recognition algorithm and the gaze detection algorithm have the same priority, the sensing module can notify the underlying camera to report the image data to the scene sensing application and the gaze-keeps-screen-on application simultaneously. In this embodiment, the scene recognition algorithm may be deployed in the first processing module and the face detection algorithm in the second processing module. The algorithms shown in this embodiment are only examples.
In some embodiments, if the electronic device includes a folding screen, the application layer further includes a third processing module, which is configured to obtain the physical state of the folding screen reported by the second processing module and the states (on or off) of the inner- and outer-screen cameras. In addition, the third processing module notifies the first processing module of the states of the inner- and outer-screen cameras.
In some embodiments, the application layer may further include camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, short message and other applications, which may be system applications or third-party applications; this is not limited in the embodiments of the present application.
The application framework layer provides APIs and programming frameworks for application programs of the application layer. The application framework layer includes a number of predefined functions. As shown in fig. 8, the application framework layer may include a camera service (CameraService) for priority scheduling and management of all applications that need to use the camera.
In some embodiments, the application framework layer may also include, for example, a window manager, a content provider, a resource manager, a notification manager, a view system, etc., as embodiments of the application are not limited in this respect.
The hardware abstraction layer may include an AO (always-on) service and the first processing module. The AO service may be used to control the turning on or off of the scene recognition algorithm in the first processing module and of the face detection algorithm in the second processing module, as well as data transfer between the upper and lower layers. The first processing module may be configured to process images with higher resolution and/or higher frame rate, such as the second images described above, so as to detect the second images and identify the scene category of the user using the device. The first processing module is further configured to switch the camera's mode; for example, upon receiving the second instruction from the sensing module, it controls the camera to switch from the first mode to the second mode.
The kernel layer is a layer between hardware and software. The kernel layer is used for driving the hardware so that the hardware works. In the embodiment of the application, the kernel layer can comprise a camera driver, and the camera driver is used for driving a camera in the electronic equipment to work in a first mode or a second mode so as to acquire images with different frame rates and/or resolutions.
In addition, the kernel layer may further include a display driver, an audio driver, a sensor driver, a motor driver and the like, which are not limited in the embodiments of the present application. The sensor driver can drive, for example, the proximity light sensor to emit light signals to detect whether the user is currently holding the electronic device close to the ear for a call, drive the gyroscope sensor to detect posture data of the electronic device, and drive the ambient light sensor to detect the ambient light level, so as to detect whether the electronic device is in a dark environment, for example in a pocket.
Sensorhub is used to implement centralized control of the sensors to reduce the load on the CPU. Sensorhub corresponds to a microcontroller unit (MCU) on which programs driving multiple sensors can run; that is, Sensorhub can support mounting multiple sensors. It can be a separate chip placed between the CPU and the various sensors, or can be integrated into the application processor (AP) in the CPU.
In an embodiment of the present application, Sensorhub may include the second processing module, which may be configured to process images with lower resolution and/or lower frame rate, such as the first images described above, so as to detect whether a person's face exists in a first image. Compared with the first processing module, the second processing module is a low-power processing module, and it is resident or operates in a low-power mode. As an example, when the scene sensing function is turned on, the second processing module is further configured to obtain the data reported by various sensors and determine various states of the electronic device, such as the screen state, unlock state and use state, based on the sensor data; if the device state meets the first condition, the second processing module may send a first shooting instruction to the camera to instruct the camera to perform resident scanning in the low-power shooting mode (i.e. the first mode), so as to detect whether a person's face exists within the range of the camera. As an example, for a foldable device, the second processing module may determine whether to send the first shooting instruction to a camera (on the inner screen or the outer screen) by detecting whether the physical state of the folding screen, the screen state and the device state satisfy the second condition. As an example, the second processing module may control a camera of the folding-screen mobile phone (such as the outer-screen camera and/or the inner-screen camera) to turn on or off when detecting that the physical state of the screen changes, such as from the folded state to the unfolded state or vice versa.
The hardware layer may include, for example, cameras, various types of sensors, and AON ISPs, etc.
It will be appreciated that the layers in the hierarchy shown in fig. 8, as well as the modules or components contained in the layers, do not constitute a specific limitation of the electronic device. In other embodiments, the electronic device may include more or fewer layers than shown, and more or fewer components may be included in each layer, as the application is not limited. The modules included in the respective tiers shown in fig. 8 are modules involved in the embodiment of the present application, and do not constitute a limitation on the structure of the electronic device and the hierarchy (illustration) of the module arrangement. In some embodiments, the modules shown in fig. 8 may be deployed alone, or several modules may be deployed together, with the division of modules in fig. 8 being an example. In some embodiments, the names of the modules shown in fig. 8 are exemplary illustrations.
On the basis of the structure of the electronic device, the scene perception method provided by the embodiment of the application is described below with reference to specific embodiments. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
Fig. 9 is a schematic flow chart of a scene perception method according to an embodiment of the present application. As shown in fig. 9, the scene sensing method provided in this embodiment includes:
Step 901, the target application registers the scene perception function with the sensing module.
In this embodiment, the target application is the scene sensing application of the application layer. In some embodiments, the target application obtains information on the scene perception function from the server (or cloud), and after obtaining it may register the scene perception function with the sensing module, so that the sensing module is resident and handles matters related to scene perception. The information on the scene perception function includes the preset operations corresponding to different scene categories, and the like.
Step 902, the sensing module determines whether to activate a scene sensing function.
If the sensing module determines to activate the scene sensing function, step 903 is performed.
For example, referring to fig. 2, if the user selects to turn on the scene sensing function at the setting interface of the system application or the third party application, the system application or the third party application sends a notification to the sensing module to inform the sensing module that the user has turned on the scene sensing function.
In step 903, the sensing module sends a first instruction to the second processing module, where the first instruction is used to instruct the second processing module to detect whether a person's face exists within the range of a camera of the electronic device. By sending the first instruction, the sensing module can control the second processing module to start, including powering on the module, issuing the working scenario, preparing resources, and the like.
Step 904, in response to the first instruction, the second processing module sends a first shooting instruction to the camera, where the first shooting instruction is used to instruct the camera to operate in the first mode. By sending the first shooting instruction, the second processing module controls the starting of the camera, including powering on the camera, switching the mode, and setting the picture resolution and frame rate. The camera then continuously captures images, i.e. first images, in the first mode at a lower frame rate and/or resolution.
In step 905, the camera sends the first image to the second processing module.
The camera responds to a first shooting instruction and acquires a first image with a first resolution at a first frame rate in a first mode. Illustratively, the first frame rate may be set to 5fps and the first resolution may be set to 120×180 or 640×480.
Step 906, the second processing module identifies whether the first image contains a person's face.
If the second processing module recognizes a person's face in the first image, step 907 is performed; otherwise, the second processing module continues to perform step 906, unless the camera is controlled to turn off.
In some embodiments, the second processing module is preset with a face recognition model, and the face recognition model may be trained by using a lightweight neural network model and is used for recognizing whether the face of the person is contained in the image.
In some embodiments, the face recognition model may be deployed on eNPU of the second processing module with good real-time.
In step 907, the second processing module sends a first message to the sensing module, where the first message is used to notify the sensing module that the face of the person is recognized within the range of the camera of the electronic device.
Step 908, the sensing module sends a second instruction to the first processing module, where the second instruction is used to instruct the first processing module to identify the category of the scene within the range of the camera. By sending the second instruction, the sensing module can control the first processing module to start, including powering on the module, issuing the working scenario, preparing resources, and the like.
In step 909, the first processing module responds to the second instruction and sends a second shooting instruction to the camera, where the second shooting instruction is used to instruct the camera to work in the second mode.
By sending the second shooting instruction, the first processing module controls the camera to switch modes, i.e. from the first mode to the second mode. The camera then continuously captures images, i.e. second images, in the second mode at a higher frame rate and/or resolution.
Step 910, the camera sends the second image to the first processing module.
The camera responds to a second shooting instruction, acquires a second image with a second resolution at a second frame rate in a second mode, and sends the second image to the first processing module. Illustratively, the second frame rate may be set to 30fps and the second resolution may be set to 1920×1080.
In step 911, the first processing module identifies a scene category of the user using the electronic device based on the second image.
In some embodiments, the first processing module is preset with a scene detection model, where the scene detection model may be obtained by training with a lightweight neural network model, and is used to identify a scene category corresponding to the second image.
In some embodiments, the scene detection model may be deployed on the NPU of the first processing module, with good real-time.
Step 912, the first processing module sends a second message to the sensing module.
In this embodiment, the second message is used to indicate a scene category of the electronic device used by the user. In some embodiments, the second message includes an identification of a scene category.
Step 913, the sensing module sends a third indication to the target application, where the third indication is used to indicate the scene category of the user using the electronic device. In some embodiments, the third indication includes an identification of the scene category.
In step 914, in response to the third indication, the target application controls to execute a preset operation corresponding to the scene category.
In this embodiment, the target application pre-stores information of the scene perception function, including preset operations corresponding to different scene categories. The preset operation includes at least one of the following: adjusting the volume; adjusting the brightness of a screen; adjusting blue light of a screen; adjusting the vibration intensity; sending first information, wherein the first information is used for reminding a user to stop using the electronic equipment; sending second information, wherein the second information is used for recommending content corresponding to the scene category; and opening the rear camera for detecting the obstacle.
In some embodiments, the target application obtains a third indication from the sensing module, determines a preset operation corresponding to the scene category based on the identification of the scene category in the third indication and the pre-stored information of the scene sensing function, and further controls to execute the preset operation corresponding to the scene category.
It should be noted that, in the present embodiment, the first processing module may correspond to the first processing unit shown in fig. 7, and the second processing module may correspond to the second processing unit shown in fig. 7.
In the above embodiment, if the electronic device has registered the scene sensing function and the user has turned on the scene sensing function, the sensing module may send the first shooting instruction to the camera through the second processing module (low power consumption processing module), so that the camera captures the first image at a lower frame rate and/or resolution. If the second processing module recognizes that the first image contains the face of the person, the first processing module can be notified, so that the first processing module sends a second shooting instruction to the camera, and the camera can acquire the second image at a higher frame rate and/or resolution. The first processing module identifies a scene category corresponding to the second image and informs the application of the scene category so that the application executes a preset operation corresponding to the scene category.
According to the above scheme, after the user turns on the scene sensing function, the electronic device detects a person's face with low power consumption, performs scene detection when it is determined that the user is using the electronic device, and automatically sets system parameters or pushes notifications based on the scene detection result, thereby realizing the intelligent scene sensing function of the electronic device. Because both the camera and the second processing module adopt low-power-consumption configurations, the power consumption of executing the scheme is extremely low.
Fig. 10 is a flow chart of a scene perception method according to an embodiment of the present application. On the basis of the embodiment shown in fig. 9, as shown in fig. 10, the scene sensing method provided in this embodiment includes:
In step 1010, the target application registers the scene perception function with the sensing module.
In step 1011, the sensing module determines whether to activate the scene sensing function.
If the sensing module determines to activate the scene sensing function, step 1012 is performed.
Step 1012, the sensing module sends a first instruction to the second processing module, where the first instruction is used to instruct the second processing module to detect whether a person's face exists within a camera range of the electronic device.
Step 1013, the second processing module determines whether the first condition is satisfied.
If the second processing module determines that the first condition is satisfied, step 1014 is performed; otherwise, the second processing module continues low-power detection to determine whether the first condition is satisfied.
In this embodiment, the first condition includes at least one of:
The screen state of the electronic device is a bright-screen state; the electronic device is unlocked; the time difference between the light signal emitted by the proximity light sensor of the electronic device and the reflected signal of that light signal is greater than a first threshold, and/or the signal intensity of the reflected signal is smaller than a second threshold, and/or no reflected signal is received by the proximity light sensor; the detection data of the ambient light sensor of the electronic device is greater than a third threshold; the screen of the electronic device faces a preset direction; the electronic device is in a moving state.
In this embodiment, if the electronic device has turned on the scene sensing function and the first condition is satisfied, the camera of the electronic device is triggered to continuously collect first images. Adding the first condition prevents the electronic device from continuously collecting first images when unnecessary, further reducing the power consumption of the device. A sketch of this gating step is shown below.
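In the sketch, the second processing module polls the device state at low power and issues the first shooting instruction only once some clause of the first condition holds. The state fields and polling interval are assumptions.

```python
import time

def gate_first_shooting_instruction(read_state, send_first_shot,
                                    poll_seconds: float = 1.0) -> None:
    # read_state() is assumed to return an object aggregating the sensor
    # readings listed in the first condition above.
    while True:
        s = read_state()
        first_condition = any([s.screen_on, s.unlocked, s.prox_clear,
                               s.ambient_light_high,
                               s.screen_faces_preset_dir, s.moving])
        if first_condition:
            send_first_shot(mode="first")   # camera begins resident scanning
            return
        time.sleep(poll_seconds)            # remain in low-power polling
```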
Step 1014, the second processing module sends a first shooting instruction to the camera, where the first shooting instruction is used to instruct the camera to work in the first mode.
In some embodiments, if the electronic device is a foldable device, the foldable device includes an inner screen and an outer screen, the inner screen is correspondingly provided with a first camera, and the outer screen is correspondingly provided with a second camera.
In this case, step 1013 may be replaced with: the second processing module detects that the inner screen of the electronic device is in a bright screen state and that the electronic device is in the unfolded state. Accordingly, step 1014 may be: the second processing module sends the first shooting instruction to the first camera. In some embodiments, before the second processing module sends the first shooting instruction to the first camera, the method further includes: detecting that the state of the electronic device satisfies a second condition, the second condition comprising at least one of: the electronic device is unlocked; the time difference between an optical signal emitted by the proximity light sensor of the electronic device (on the inner screen) and the reflected signal of that optical signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or no reflected signal is received by the proximity light sensor; the detection data of the ambient light sensor of the electronic device is greater than a third threshold; the inner screen of the electronic device faces a preset direction; the electronic device is in a mobile state.
In this case, step 1013 may alternatively be replaced with: the second processing module detects that the outer screen of the electronic device is in a bright screen state and that the electronic device is in the folded state. Accordingly, step 1014 may be: the second processing module sends the first shooting instruction to the second camera. In some embodiments, before the second processing module sends the first shooting instruction to the second camera, the method further includes: detecting that the state of the electronic device satisfies a second condition, the second condition comprising at least one of: the electronic device is unlocked; the time difference between an optical signal emitted by the proximity light sensor of the electronic device (on the outer screen) and the reflected signal of that optical signal is greater than a first threshold, and/or the signal strength of the reflected signal is less than a second threshold, and/or no reflected signal is received by the proximity light sensor; the detection data of the ambient light sensor of the electronic device is greater than a third threshold; the outer screen of the electronic device faces a preset direction; the electronic device is in a mobile state.
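For the foldable case, the camera that should receive the first shooting instruction follows from the fold state and which screen is lit, roughly as in this sketch; the FoldState and FoldableState types are assumptions for illustration.

```kotlin
enum class FoldState { FOLDED, UNFOLDED }

data class FoldableState(
    val fold: FoldState,
    val innerScreenBright: Boolean,
    val outerScreenBright: Boolean
)

// Returns which camera should run in the first mode: the first (inner-screen)
// camera, the second (outer-screen) camera, or none.
fun cameraForFirstMode(s: FoldableState): String? = when {
    s.fold == FoldState.UNFOLDED && s.innerScreenBright -> "first camera"
    s.fold == FoldState.FOLDED && s.outerScreenBright -> "second camera"
    else -> null // neither screen is in a state that warrants scanning
}
```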
Step 1015, the camera sends the first image to the second processing module.
Step 1016, the second processing module identifies whether the first image contains a person's face.
If the second processing module identifies a person's face in the first image, step 1017 is performed;
if the second processing module does not identify a person's face in the first image, the flow returns to step 1013.
Step 1017, the second processing module sends a first message to the sensing module, where the first message is used to inform the sensing module that a person's face is recognized within the range of the camera of the electronic device.
Step 1018, the sensing module sends a second instruction to the first processing module, where the second instruction is used to instruct the first processing module to identify a category of a scene within the range of the camera.
In step 1019, the first processing module responds to the second instruction and sends a second shooting instruction to the camera, where the second shooting instruction is used to instruct the camera to work in the second mode.
Step 1020, the camera sends the second image to the first processing module.
In step 1021, the first processing module identifies a scene category of the user using the electronic device based on the second image.
Step 1022, the first processing module sends a second message to the sensing module, where the second message is used to indicate the scene category in which the user is using the electronic device.
Step 1023, the perception module sends a third indication to the target application, where the third indication is used to indicate the scene category in which the user is using the electronic device.
Step 1024, the target application controls execution of the preset operation corresponding to the scene category.
In the above embodiment, if the user has turned on the scene sensing function and the first condition is satisfied, the sensing module may send the first shooting instruction to the camera through the second processing module, so that the camera captures the first image at a lower frame rate and/or resolution. If the second processing module recognizes that the first image contains a person's face, it may notify the first processing module, which sends the second shooting instruction to the camera so that the camera collects the second image at a higher frame rate and/or resolution. The first processing module identifies the second image, determines the scene category corresponding to the second image, and informs the application of the scene category so that the application can execute the preset operation corresponding to the scene category, such as automatically setting system parameters or pushing a notification, thereby realizing the intelligent scene sensing function of the device. On the one hand, because the camera and the second processing module are both configured for low power consumption, the power consumption of executing this scheme is extremely low. On the other hand, the added first condition prevents the electronic device from performing unnecessary face detection, further reducing the power consumption of the device.
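The preset operation itself is application-defined. As a hedged illustration, a target application might map scene categories to system-parameter changes along these lines; the category names and the SystemSettings interface are invented for the sketch.

```kotlin
interface SystemSettings {
    fun setVolume(level: Int)
    fun setVibrationIntensity(level: Int)
    fun pushNotification(text: String)
}

// Illustrative mapping from a recognized scene category to preset operations.
fun applyPresetOperation(sceneCategory: String, settings: SystemSettings) {
    when (sceneCategory) {
        "classroom", "meeting room" -> {
            settings.setVolume(0)               // silence in quiet scenes
            settings.setVibrationIntensity(1)   // keep a gentle vibration
        }
        "subway station" -> {
            settings.setVolume(10)              // louder ring in noisy scenes
            settings.setVibrationIntensity(5)
        }
        else -> settings.pushNotification("Scene: $sceneCategory")
    }
}
```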
The embodiment of the present application also provides a scene perception method applied to an electronic device with a flexible screen. The scheme is explained by taking a folding-screen mobile phone as an example, in which a group of cameras is arranged on each of the inner screen and the outer screen for collecting image data. The scene perception method of this embodiment relates to the processing logic of the device's bottom-layer modules when the state of the folding screen changes, and is described below with reference to fig. 14.
Fig. 14 is a schematic flow chart of a scene perception method according to an embodiment of the present application. As shown in fig. 14, the scene perception method of the present embodiment may include the following steps:
In step 1401, the second processing module obtains sensor data to determine a change in physical state of the folding screen.
In this embodiment, the sensor data includes, for example, data from a magnetic sensor, a Hall sensor, or the like; by acquiring the sensor data, it is determined whether the physical state of the folding screen has changed. A change in the physical state of the folding screen includes a change from the folded state to the unfolded state, or a change from the unfolded state to the folded state.
Step 1402a, the second processing module controls to turn on or off the camera on the inner screen based on the physical state change of the folding screen.
Step 1402b, the second processing module controls to turn on or off the camera on the external screen based on the physical state change of the folding screen.
In one possible implementation, if the physical state of the folding screen changes from the folded state to the unfolded state, the second processing module may turn on the camera on the inner screen and/or the camera on the outer screen.
In one possible implementation, when the camera on the outer screen is on, if the physical state of the folding screen changes from the folded state to the unfolded state, the second processing module may turn off the camera on the outer screen and simultaneously turn on the camera on the inner screen.
In one possible implementation, when the camera on the inner screen is on, if the physical state of the folding screen changes from the unfolded state to the folded state, the second processing module may turn off the camera on the inner screen and turn on the camera on the outer screen.
Illustratively, the camera on the outer screen may be the camera 1 shown in fig. 13, and the camera on the inner screen may be the camera 3 shown in fig. 13.
Step 1402c, the second processing module reports the physical state of the folding screen and the states of the inner-screen and outer-screen cameras to the third processing module.
It should be noted that the execution sequence of the steps 1402a to 1402c is not limited in this embodiment.
Step 1403, the third processing module sends a notification to the first processing module, informing it of the states of the inner-screen and outer-screen cameras.
The above shows the interaction between the internal modules of the mobile phone when the physical state of the folding screen changes. Through this interaction, accurate control of the camera's low-power continuous image-scanning function is achieved, so that while the user is using the folding-screen phone, the phone can intelligently identify its current scene, such as a classroom, a conference room, or a subway station, and then automatically set system parameters (such as volume and vibration intensity), improving the user experience.
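A sketch of the fold-state handling in steps 1401 to 1402c might look as follows, reusing the FoldState enum from the earlier sketch; the camera-control and reporting calls are hypothetical names, not the actual module interfaces.

```kotlin
interface SwitchableCamera {
    fun turnOn()
    fun turnOff()
}

data class FoldableCameras(
    val inner: SwitchableCamera, // e.g., camera 3 in fig. 13
    val outer: SwitchableCamera  // e.g., camera 1 in fig. 13
)

// Placeholder for steps 1402c/1403: report states to the third processing module.
fun reportCameraStates(state: FoldState, cameras: FoldableCameras) { /* platform-specific */ }

fun onFoldStateChanged(newState: FoldState, cameras: FoldableCameras) {
    when (newState) {
        FoldState.UNFOLDED -> {      // folded -> unfolded
            cameras.outer.turnOff()
            cameras.inner.turnOn()
        }
        FoldState.FOLDED -> {        // unfolded -> folded
            cameras.inner.turnOff()
            cameras.outer.turnOn()
        }
    }
    reportCameraStates(newState, cameras)
}
```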
Fig. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present application, as shown in fig. 11, where the electronic device includes a camera 1106, a processor 1101, a communication line 1104, and at least one communication interface (illustrated in fig. 11 by taking the communication interface 1103 as an example).
The camera 1106 may be used to capture images at different frame rates and/or resolutions, and the processor 1101 may be used to detect whether a person's face is present in the images and to identify the scene.
The processor 1101 may be a general-purpose central processing unit (central processing unit, CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the present application. In some embodiments, the processor 1101 includes a first processing module and a second processing module, where the power consumption of the first processing module is higher than that of the second processing module; the second processing module may be used to detect whether a person's face exists in the first image acquired by the camera in the first mode, and the first processing module may be used to identify the second image acquired by the camera in the second mode and determine the scene category in which the user is using the electronic device.
Communication line 1104 may include circuitry for communicating information between the components described above.
Communication interface 1103 uses any transceiver-like device for communicating with other devices or communication networks, such as ethernet, wireless local area network (wireless local area networks, WLAN), etc.
In some embodiments, the electronic device may also include memory 1102.
The memory 1102 may be, but is not limited to, a read-only memory (read-only memory, ROM) or other type of static storage device that can store static information and instructions, a random access memory (random access memory, RAM) or other type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, digital versatile discs, Blu-ray discs, etc.), magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. The memory may be separate and coupled to the processor via the communication line 1104, or may be integrated with the processor.
The memory 1102 is used for storing computer-executable instructions for implementing the aspects of the present application, and is controlled by the processor 1101 for execution. The processor 1101 is configured to execute computer-executable instructions stored in the memory 1102, thereby implementing the scene awareness method provided by the embodiment of the present application.
In some embodiments, the electronic device further includes a display screen 1107, and the display screen 1107 may be a folding screen.
Computer-executable instructions in embodiments of the application may also be referred to as application code, and embodiments of the application are not limited in this regard.
As an example, the processor 1101 may include one or more CPUs, such as CPU0 and CPU1 in fig. 11.
As one example, an electronic device may include multiple processors, such as processor 1101 and processor 1105 in fig. 11. Each of these processors may be a single-core (single-CPU) processor or may be a multi-core (multi-CPU) processor. A processor herein may refer to one or more devices, circuits, and/or processing cores for processing data (e.g., computer program instructions).
Fig. 12 is a schematic structural diagram of a chip according to an embodiment of the present application. As shown in fig. 12, the chip 120 includes one or more (including two) processors 1220 and a communication interface 1230.
In some implementations, memory 1240 stores the following elements: an executable module or data structure, or a subset of an executable module or data structure, or an extended set of executable modules or data structures.
In an embodiment of the application, memory 1240 may include read-only memory and random access memory and provide instructions and data to processor 1220. A portion of the memory 1240 may also include non-volatile random access memory (non-volatile random access memory, NVRAM).
In an embodiment of the application, the processor 1220, the communication interface 1230, and the memory 1240 are coupled together by a bus system 1210. The bus system 1210 may include a power bus, a control bus, a status signal bus, and the like, in addition to a data bus. For ease of description, the various buses are labeled as the bus system 1210 in fig. 12.
The methods described above for embodiments of the present application may be applied to the processor 1220 or implemented by the processor 1220. The processor 1220 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the methods described above may be performed by integrated hardware logic circuitry in the processor 1220 or by instructions in software. The processor 1220 may be a general-purpose processor (e.g., a microprocessor or a conventional processor), a digital signal processor (digital signal processing, DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate, transistor logic, or a discrete hardware component, and the processor 1220 may implement or perform the methods, steps, and logic blocks disclosed in embodiments of the application.
In the above embodiments, the instructions stored in the memory for execution by the processor may be implemented in the form of a computer program product. The computer program product may be written into the memory in advance, or may be downloaded and installed in the memory in the form of software.
Embodiments of the present application also provide a computer program product comprising one or more computer instructions. When the computer instructions are loaded and executed on an electronic device, the electronic device executes the technical solution of the above embodiments; the implementation principle and technical effects are similar to those of the related embodiments above and are not repeated here.
The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium. For example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired (e.g., coaxial cable, optical fiber, digital subscriber line (digital subscriber line, DSL)) or wireless (e.g., infrared, radio, microwave) means.
The embodiment of the application also provides a computer readable storage medium, which stores computer instructions that, when executed on an electronic device, cause the electronic device to execute the technical scheme in the above embodiment, and the implementation principle and technical effects are similar to those of the above related embodiment, and are not repeated here.
Computer-readable storage media may include computer storage media and communication media, and may include any medium that can transfer a computer program from one place to another. The computer-readable storage medium may include: a compact disc read-only memory (CD-ROM), RAM, ROM, EEPROM, or other optical disc storage; the computer-readable storage medium may also include disk storage or other magnetic storage devices. Moreover, any connection is properly termed a computer-readable storage medium. For example, if the software is transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers.
In addition, it should be noted that, the user information (including, but not limited to, user equipment information, user personal information, user face information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.
The foregoing is merely an illustrative embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed by the present invention shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention is subject to the protection scope of the claims.

Claims (23)

1. A scene perception method, characterized by being applied to an electronic device, the method comprising:
the camera of the electronic device operates in a first mode;
the electronic device acquires a first image acquired by the camera in the first mode, and detects whether a person's face exists in the first image;
if the electronic device detects a person's face in the first image, controlling the camera to switch from the first mode to a second mode;
the electronic device acquires detection data, wherein the detection data comprises a second image acquired by the camera in the second mode;
the electronic device identifies the second image, determines, based on the second image, the scene category in which the person is using the electronic device, and controls execution of a preset operation corresponding to the scene category.
2. The method according to claim 1, wherein,
Before the camera of the electronic device operates in the first mode, the method further includes:
the electronic device, in response to a first operation, turns on the scene perception function.
3. A method according to claim 1 or 2, characterized in that,
Before the camera of the electronic device operates in the first mode, the method further includes:
Detecting that the state of the electronic equipment meets a first condition; the first condition includes at least one of:
The screen state of the electronic equipment is a bright screen state;
The electronic device is unlocked;
a time difference between an optical signal emitted by the proximity light sensor of the electronic device and a reflected signal of the optical signal is greater than a first threshold, and/or a signal strength of the reflected signal is less than a second threshold, and/or no reflected signal is received by the proximity light sensor;
The detection data of the ambient light sensor of the electronic device is greater than a third threshold;
The screen of the electronic equipment faces to a preset direction;
the electronic device is in a mobile state.
4. The method according to claim 1 or 2, wherein the electronic device is a foldable device, the foldable device comprising an inner screen and an outer screen, the inner screen being provided with a first camera and the outer screen being provided with a second camera; the camera of the electronic device operates in a first mode comprising:
detecting that the external screen of the electronic device is in a bright screen state and that the electronic device is in a folded state, and controlling the second camera to operate in a first mode; or
detecting that the internal screen of the electronic device is in a bright screen state and that the electronic device is in an unfolded state, and controlling the first camera to operate in a first mode.
5. The method of claim 4, wherein prior to controlling the first camera to operate in the first mode or controlling the second camera to operate in the first mode, the method further comprises:
detecting that the state of the electronic equipment meets a second condition; the second condition includes at least one of:
The electronic device is unlocked;
a time difference between an optical signal emitted by the proximity light sensor of the electronic device and a reflected signal of the optical signal is greater than a first threshold, and/or a signal strength of the reflected signal is less than a second threshold, and/or no reflected signal is received by the proximity light sensor;
The detection data of the ambient light sensor of the electronic device is greater than a third threshold;
the inner screen or the outer screen of the electronic equipment faces to a preset direction;
the electronic device is in a mobile state.
6. The method according to claim 4 or 5, characterized in that the method further comprises:
when the second camera operates in the first mode, if it is detected that the electronic device changes from the folded state to the unfolded state, the electronic device controls the first camera to operate in the first mode and turns off the second camera, or the electronic device controls the first camera to operate in the first mode;
when the first camera operates in the first mode, if it is detected that the electronic device changes from the unfolded state to the folded state, the electronic device turns off the first camera and controls the second camera to operate in the first mode.
7. The method of any one of claims 1 to 6, wherein the detection data further comprises time data, the method further comprising: and if the time data is determined to be in the preset time period, taking a preset scene category corresponding to the preset time period as the scene category of the electronic equipment.
8. The method of any one of claims 1 to 7, wherein the detection data further comprises location data, the method further comprising: and if the position data is determined to be in the preset position range, taking a preset scene category corresponding to the preset position range as the scene category of the electronic equipment.
9. The method of any one of claims 1 to 8, wherein the detection data further comprises voice data, the method further comprising: if it is recognized that the voice data contains one sound source or fewer than N sound sources, determining that the scene category of the electronic device is a first scene, wherein N is a positive integer greater than or equal to 2; if it is recognized that the voice data contains more than M sound sources, determining that the scene category of the electronic device is a second scene, wherein M is a positive integer greater than N.
10. The method according to any one of claims 1 to 9, wherein the detection data further comprises data of a first sensor of the electronic device, the first sensor comprising a gyro sensor and an acceleration sensor; the method further comprises the steps of:
The electronic device determines a scene category of the electronic device based on the second image in the detection data and the data of the first sensor.
11. The method of claim 10, wherein the electronic device determining a scene category of the electronic device based on the second image and the data of the first sensor in the detection data comprises:
if it is determined, based on the data of the first sensor, that the user is in a motion state, and it is determined, based on the second image, that the user is continuously watching the screen of the electronic device, determining that the scene category of the electronic device is a third scene;
the motion state includes a walking or riding state.
12. The method of any one of claims 1 to 11, wherein the detection data further comprises data of a second sensor of the electronic device, the second sensor comprising an ambient light sensor; the method further comprises the steps of:
And if the data of the second sensor is smaller than a fourth threshold value, determining that the scene category of the electronic equipment is a fourth scene.
13. The method according to any one of claims 1 to 12, wherein,
The preset operation includes at least one of:
Adjusting the volume;
Adjusting the brightness of a screen;
Adjusting blue light of a screen;
Adjusting the vibration intensity;
sending first information, wherein the first information is used for reminding a user to stop using the electronic equipment;
sending second information, wherein the second information is used for recommending the content corresponding to the scene category;
and turning on the rear camera for detecting obstacles.
14. The method of claim 13, wherein if the preset operation is to turn on the rear camera, the method further comprises:
The electronic equipment acquires a third image acquired by the rear camera in the second mode;
and if an obstacle is identified in the third image, sending third information, wherein the third information is used to remind the user to avoid the obstacle.
15. The method according to any one of claims 1 to 14, wherein,
The camera of the electronic device operates in a first mode comprising:
the sensing module of the electronic device sends a first instruction to the second processing module of the electronic device, wherein the first instruction is used to instruct the second processing module to detect whether a person's face exists within the range of the camera;
the second processing module sends a first shooting instruction to the camera;
The camera is operated in the first mode in response to the first shooting instruction.
16. The method of any one of claims 1 to 15, wherein the electronic device acquiring a first image acquired by the camera in the first mode, detecting whether there is a person's face in the first image, comprises:
and a second processing module of the electronic device acquires the first image acquired by the camera in the first mode and detects whether a person's face exists in the first image.
17. The method of any of claims 1-16, wherein controlling the camera to switch from the first mode to a second mode if the electronic device detects a person's face in the first image comprises:
if the second processing module of the electronic device detects that a person's face exists in the first image, the second processing module sends a first message to the sensing module of the electronic device, wherein the first message is used to notify the sensing module that a person's face exists within the range of the camera;
The sensing module sends a second instruction to a first processing module of the electronic device, wherein the second instruction is used for instructing the first processing module to identify the category of the scene in the range of the camera;
The first processing module responds to the second instruction and sends a second shooting instruction to the camera, wherein the second shooting instruction is used for indicating the camera to operate in the second mode.
18. The method of any of claims 1-17, wherein the electronic device identifies the second image, determines a scene category of the person using the electronic device based on the second image, and controls performing a preset operation corresponding to the scene category, comprising:
the first processing module of the electronic device identifies the second image, determines the scene category of the electronic device based on the second image, and sends a second message to the sensing module of the electronic device, wherein the second message is used for indicating the scene category of the electronic device;
The perception module sends a third indication to a target application of the electronic equipment, wherein the third indication is used for indicating the scene category of the electronic equipment;
and the target application controls execution of the preset operation corresponding to the scene category.
19. The method of any of claims 3 to 6, wherein the second processing module of the electronic device detects a status of the electronic device.
20. An electronic device, the electronic device comprising: the device comprises a camera, a memory and a processor;
the camera is used for collecting images at different frame rates and/or resolutions; the processor is configured to invoke a computer program in the memory to perform the scene-aware method of any of claims 1 to 19.
21. The electronic device of claim 20, wherein the processor comprises a first processing module and a second processing module, and the power consumption of the first processing module is higher than that of the second processing module;
the second processing module is used for detecting whether a person's face exists in a first image acquired by the camera in a first mode; the first processing module is used for identifying a second image acquired by the camera in a second mode and determining the scene category in which the user is using the electronic device.
22. A computer readable storage medium storing computer instructions which, when run on an electronic device, cause the electronic device to perform the scene-aware method of any of claims 1 to 19.
23. A chip comprising a processor for invoking a computer program in memory to perform the scene-aware method of any of claims 1-19.