WO2024125379A1 - Image processing method, head-mounted display device, and medium - Google Patents

Image processing method, head-mounted display device, and medium

Info

Publication number
WO2024125379A1
Authority
WO
WIPO (PCT)
Application number
PCT/CN2023/137011
Other languages
English (en)
French (fr)
Inventor
袁乐 (Yuan Le)
王鹏 (Wang Peng)
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2024125379A1

Description

  • the present application relates to the field of image processing technology, and in particular to an image processing method, a head-mounted display device and a medium.
  • the human eye has a certain limit to its ability to recognize the scale of tiny objects in space. For example, when the human eye wants to see distant objects or landscapes clearly, it often needs to use telescopes and other equipment to magnify the scale of the object of interest for easy recognition by the human eye. For another example, people with physiological deterioration such as presbyopia due to aging need to use optical instruments (magnifying glasses, reading glasses) to magnify tiny objects so that the human eye can recognize them.
  • Telescopes and similar devices must be held with both hands, and their magnification is too high for viewing nearby objects.
  • Magnifiers and reading glasses have fixed magnifications and cannot meet the needs of various application environments.
  • the embodiments of the present application provide an image processing method, a head-mounted display device, and a medium, which are used to meet the usage requirements of various application environments.
  • an embodiment of the present application provides a head-mounted display device, comprising one or two variable-focus cameras and a display screen.
  • the head-mounted display device includes two zoom cameras, namely a first zoom camera and a second zoom camera.
  • the first zoom camera is used to capture a first image viewed by a user's left eye in a target scene.
  • the second zoom camera is used to capture a second image viewed by a user's right eye in the target scene.
  • a display screen is used to display a left-eye target image on a left-eye display unit of the display screen, and to display a right-eye target image on a right-eye display unit of the display screen.
  • the left-eye target image is obtained after the user's region of interest (ROI) included in the first image is enlarged, and the right-eye target image is obtained after the ROI included in the second image is enlarged.
  • the enlargement process may be a super-resolution process.
  • the head-mounted display device includes a zoom camera.
  • the zoom camera is used to capture images viewed by a user in a target scene.
  • a display screen is used to display a left-eye target image on a left-eye display unit of the display screen, and to display a right-eye target image on a right-eye display unit of the display screen.
  • the left-eye target image and the right-eye target image are obtained by performing image processing on the image captured by the zoom camera.
  • the image processing includes performing binocular parallax adjustment on the image and magnifying the region of interest.
  • At least one variable-focus camera is added to the head-mounted display device, and the user can adjust the magnification, i.e., the zoom ratio, according to the needs.
  • when the user cannot see a distant object clearly, adjusting the magnification allows the distant object to be seen clearly without the help of external devices.
  • moreover, the user does not need to hold the device with both hands, which also improves portability.
  • the head-mounted display device can perform image processing on the scene image, so that the processed image fits the size of the display screen and has a high resolution, and can display the image clearly.
  • the magnification of the optical zoom camera is limited, the magnification can be increased through image processing, so that the head-mounted display device can achieve the magnification required by the user.
  • the zoom ratio used by the first zoom camera to capture the first image is the same as or different from the zoom ratio used by the second zoom camera to capture the second image; the zoom ratio used by the first zoom camera and the zoom ratio used by the second zoom camera are independently controlled.
  • the user can independently adjust the magnification of the image viewed by the left eye or the right eye.
  • the head-mounted display device also includes a processor; the processor is used to perform image processing on the first image and the second image respectively to obtain the left eye target image and the right eye target image; the image processing includes magnification processing performed on the area of interest in the first image and the second image.
  • the processor is further used to obtain the zoom ratio used by the first zoom camera and the second zoom camera, and determine the area of interest.
  • the zoom ratios of the first zoom camera and the second zoom camera are the same.
  • the processor is specifically configured to determine, according to the zoom ratio, a central picture area corresponding to the zoom ratio from the shooting range of the first zoom camera and/or the second zoom camera, and the central picture area is used as the region of interest.
  • the zoom magnifications of the first zoom camera and the second zoom camera are different.
  • the processor is specifically configured to determine a first central picture area corresponding to the zoom magnification of the first zoom camera from a shooting range of the first zoom camera, determine a second central picture area corresponding to the zoom magnification of the second zoom camera from a shooting range of the second zoom camera, and determine the ROI according to the first central picture area and the second central picture area.
  • the processor is specifically configured to determine the ROI from the shooting range of the first zoom camera and/or the second zoom camera based on an eye tracking algorithm.
  • the processor is specifically used to perform binocular parallax adjustment on the first image and the second image respectively according to the distance between the left eye pupil and the right eye pupil of the user and the positions of the first zoom camera and the second zoom camera on the head-mounted display device to obtain a left eye display view and a right eye display view; perform a magnification process on the ROI in the left eye display view to obtain the left eye target image, and perform a magnification process on the ROI in the right eye display view to obtain the right eye target image.
  • a head-mounted display device includes a zoom camera; and a processor, specifically used to: perform binocular parallax adjustment on an image captured by the zoom camera according to a distance between a left eye pupil and a right eye pupil of a user and a position of the zoom camera on the head-mounted display device to obtain a left eye display view and a right eye display view; perform a magnification process on an image of an area of interest in the left eye display view to obtain a left eye target image, and perform a magnification process on an image of an area of interest in the right eye display view to obtain a right eye target image.
  • the image processing further includes: image enhancement processing for a left-eye display view and image enhancement processing for a right-eye display view;
  • the image enhancement processing includes at least one of the following:
  • the head-mounted display device further includes an inertial measurement unit IMU;
  • the IMU is used to output IMU measurement data;
  • the processor is also used to de-jitter the left-eye display view and the right-eye display view respectively according to the IMU measurement data when the user's head is deflected.
  • the processor is further used to: determine whether the auxiliary vision function is turned on before obtaining the zoom ratio.
  • the head-mounted display device is a mixed reality MR helmet.
  • an embodiment of the present application provides an image processing method, which is applied to a head-mounted display device, and the head-mounted display device includes a display screen and two zoom cameras or one zoom camera. Take two zoom cameras as an example, which are a first zoom camera and a second zoom camera.
  • the method includes: obtaining a zoom ratio; determining a region of interest ROI in a target scene, and collecting a first image viewed by a user's left eye in the target scene through the first zoom camera, and collecting a second image viewed by the user's right eye in the target scene through the second zoom camera; performing image processing on the first image and the second image to obtain a left-eye target image and a right-eye target image respectively; the image processing includes a magnification process performed on the ROI in the first image and the second image; displaying the left-eye target image on the left-eye display unit of the display screen, and displaying the right-eye target image on the right-eye display unit of the display screen.
  • the image processing includes binocular parallax adjustment. In some possible embodiments, the image processing includes enlargement processing performed on images of the region of interest in the first image and the second image, such as super-resolution processing.
  • At least one variable-focus camera is added to the head-mounted display device, and the user can adjust the magnification, i.e., the zoom ratio, according to the needs.
  • when the user cannot see a distant object clearly, adjusting the magnification allows the distant object to be seen clearly without the help of external devices.
  • moreover, the user does not need to hold the device with both hands, which also improves portability.
  • the head-mounted display device can perform image processing on the scene image, so that the processed image fits the size of the display screen and has a high resolution, and can display the image clearly.
  • the magnification of the optical zoom camera is limited, the magnification can be increased through image processing, so that the head-mounted display device can achieve the magnification required by the user.
  • determining an area of interest within a target scene includes: obtaining a zoom ratio, determining a central picture area corresponding to the zoom ratio from a shooting range of the first zoom camera and/or the second zoom camera, and using the central picture area as the area of interest.
  • the association relationship between the zoom ratio and the central picture area in the shooting range can be preset, so that when a certain zoom ratio is determined, the area boundary of the central picture area in the shooting range can be determined according to the association relationship.
  • a region of interest within a target scene is identified, including:
  • an area of interest within a target scene viewed by a user is determined from a shooting range of the first zoom camera and/or the second zoom camera.
  • the area where the user focuses is determined by an eye tracking algorithm, and the area is the user's area of interest.
  • the variable-focus camera captures a scene image, and then performs a zoom operation on the user's area of interest.
  • performing image processing on the first image and the second image to obtain a left-eye target image and a right-eye target image respectively includes:
  • binocular parallax adjustment is performed on the first image and the second image to obtain a left eye display view and a right eye display view;
  • a magnification process is performed on the image of the region of interest in the left eye display view to obtain a left eye target image
  • a magnification process is performed on the image of the region of interest in the right eye display view to obtain a right eye target image.
  • the head-mounted display device includes a binocular variable-focus camera, and then performs binocular parallax adjustment on the scene image captured by the binocular variable-focus camera, so that the three-dimensional sense of objects in the image viewed by the user on the display screen is enhanced, the objects are more realistic, and the user experience is improved.
  • a head-mounted display device includes a zoom camera, and image processing is performed on images captured by the zoom camera to obtain a left-eye target image and a right-eye target image, including: performing binocular parallax adjustment on the images captured by the zoom camera according to the distance between the left eye pupil and the right eye pupil of the user and the position of the zoom camera on the head-mounted display device to obtain a left-eye display view and a right-eye display view; performing a magnification process on an image of an area of interest in the left-eye display view to obtain a left-eye target image, and performing a magnification process on an image of an area of interest in the right eye display view to obtain a right-eye target image.
  • the head-mounted display device includes a monocular zoom camera.
  • the head-mounted display device of the present application has the function of adjusting binocular parallax for the scene image captured by the zoom camera, so that the three-dimensional sense of objects in the image viewed by the user on the display screen is enhanced, the objects are more realistic, and the user experience is improved.
  • the image processing further includes: image enhancement processing of a left-eye display view and image enhancement processing for a right-eye display view; the image enhancement processing includes at least one of the following:
  • the embodiment of the present application performs additional image enhancement processing on the image that needs to be magnified, which can reduce the problem of blurred images caused by air scattering when users view distant objects, thereby improving the clarity of the image.
  • Deraining and defogging processing is performed on scene images collected in bad weather such as rain and fog, which can improve the clarity of the displayed image and enhance the user's viewing experience.
  • the head mounted display device further includes an inertial measurement unit (IMU). IMU measurement data output by the inertial measurement unit (IMU) is obtained; when the user's head is deflected, de-jitter processing is performed on the left eye display view and the right eye display view respectively according to the IMU measurement data.
  • the IMU already present in the head-mounted display device is reused: de-jitter processing can be implemented through the IMU measurement data, thereby further improving the imaging quality of the displayed image viewed by the user.
  • the method before obtaining the zoom ratio, the method also includes: determining whether the auxiliary visual function is turned on.
  • the auxiliary visual function may be in a low-power standby state.
  • the auxiliary visual function may be awakened, such as by a voice command or a button or knob set on the head-mounted display device.
  • the head-mounted display device is a mixed reality MR helmet.
  • conventional MR helmets do not provide an image magnification function.
  • a variable-focus camera is added to the MR helmet.
  • users with presbyopia or who need to realize the telescopic function can clearly see the distant scenery they need to see without wearing reading glasses or using a telescope.
  • an embodiment of the present application provides an image processing device, which is included in a head-mounted display device, and the head-mounted display device also includes a first variable-focus camera, a second variable-focus camera, and a display screen;
  • a processing module configured to determine a region of interest ROI in a target scene, and to collect a first image viewed by a left eye of a user in the target scene through the first variable-focus camera, and to collect a second image viewed by a right eye of the user in the target scene through the second variable-focus camera; and to perform image processing on the first image and the second image to obtain a left-eye target image and a right-eye target image respectively; and the image processing includes a magnification process performed on the ROI in the first image and the second image;
  • the display module is used to display a left-eye target image on a left-eye display unit of the display screen, and to display a right-eye target image on a right-eye display unit of the display screen.
  • the device further includes an acquisition module for acquiring the zoom ratio.
  • the processing module is specifically configured to determine a central image area corresponding to the zoom ratio from a shooting range of the first zoom camera and/or the second zoom camera, and the central image area is used as the ROI.
  • the processing module is specifically used to determine the area of interest from the shooting range of the first zoom camera and/or the second zoom camera based on an eye tracking algorithm.
  • the processing module is specifically used for:
  • binocular parallax adjustment is performed on the first image and the second image to obtain a left eye display view and a right eye display view;
  • a magnification process is performed on the image of the region of interest in the left eye display view to obtain a left eye target image
  • a magnification process is performed on the image of the region of interest in the right eye display view to obtain a right eye target image.
  • the image processing further includes: image enhancement processing for a left-eye display view and image enhancement processing for a right-eye display view;
  • the image enhancement processing includes at least one of the following:
  • the head mounted display device further includes an inertial measurement unit IMU.
  • the processing module is further configured to:
  • the left-eye display view and the right-eye display view are de-jittered respectively according to the IMU measurement data.
  • the processing module is also used to determine whether the auxiliary vision function is turned on before obtaining the zoom ratio.
  • the head-mounted display device is a mixed reality MR helmet.
  • an embodiment of the present application provides a computer-readable storage medium, in which a computer program or instruction is stored.
  • the computer program or instruction is executed by a head-mounted display device, the head-mounted display device executes the method described in the first aspect or any design of the first aspect.
  • an embodiment of the present application provides a computer program product, which includes a computer program or instructions.
  • the computer program or instructions are executed by a head-mounted display device, the method described in the first aspect or any design of the first aspect is implemented.
  • FIG1A is a schematic diagram of the structure of a head mounted display device provided in an embodiment of the present application.
  • FIG1B is a schematic diagram of the structure of another head mounted display device provided in an embodiment of the present application.
  • FIG1C is a schematic diagram of the configuration position of a variable-focus camera of a head-mounted display device provided in an embodiment of the present application;
  • FIG2 is a schematic diagram of the structure of another head mounted display device provided in an embodiment of the present application.
  • FIG3 is a schematic diagram of the structure of another head mounted display device provided in an embodiment of the present application.
  • FIG4A is a schematic diagram of a flow chart of an image processing method provided in an embodiment of the present application.
  • FIG4B is a schematic diagram of another image processing method provided in an embodiment of the present application.
  • FIG5 is a schematic diagram of a possible binocular parallax adjustment provided in an embodiment of the present application.
  • FIG6 is a schematic diagram of the projection relationship between left and right eye images provided in an embodiment of the present application.
  • FIG7 is a schematic diagram of viewpoint transformation of a left-eye image provided by an embodiment of the present application.
  • FIG8 is a schematic diagram of image processing of a scene provided in an embodiment of the present application.
  • FIG. 9 is a schematic diagram of the structure of an image processing device provided in an embodiment of the present application.
  • in the embodiments of the present application, "multiple" means two or more.
  • "/" indicates an "or" relationship between the associated objects; for example, A/B can represent A or B. "and/or" in this application merely describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B can represent: A exists alone, both A and B exist, or B exists alone, where A and B can be singular or plural.
  • the words "first" and "second” are used to distinguish the same items or similar items with basically the same functions and effects.
  • the embodiments of the present application can be applied to mixed reality (MR) scenes or virtual reality (VR) scenes.
  • mixed reality technology presents virtual scene information in real scenes and builds an interactive feedback loop among the real world, the virtual world, and the user, so as to enhance the realism of the user's experience.
  • the head-mounted display device in the embodiments of the present application can be MR glasses or MR helmets.
  • the head-mounted display device in the embodiments of the present application has the functions of magnification, telephoto, image processing and display.
  • FIGS. 1A and 1B are schematic diagrams of a system architecture of a head mounted display device provided in an embodiment of the present application.
  • the head mounted display device includes at least one variable focus camera 120 , a processor 140 , and a display screen 160 .
  • the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the head-mounted display device.
  • the head-mounted display device may include more or fewer components than those shown in FIG. 1A and FIG. 1B, or combine certain components, or split certain components, or arrange the components differently.
  • the components shown in FIG. 1A and FIG. 1B may be implemented in hardware, software, or a combination of software and hardware.
  • the processor 140 may include one or more processing units, for example, the processor 140 may include a central processing unit (CPU), an application processor (AP), a modem processor, a graphics processor (GPU), an image signal processor (ISP), a controller, a video codec, a digital signal processor (DSP), and/or a neural-network processing unit (NPU), etc.
  • Different processing units may be independent devices or integrated into one or more processors.
  • the processor 140 includes a CPU and an image processing module.
  • the image processing module may include one or more of an ISP, an NPU, a DSP, a GPU, and the like.
  • the processor 140 may also be provided with a memory for storing instructions and data.
  • the memory in the processor 140 is a cache memory.
  • the memory may store instructions or data that the processor 140 has just used or uses cyclically. If the processor 140 needs to use the instructions or data again, they can be retrieved directly from the memory. This avoids repeated accesses, reduces the waiting time of the processor 140, and thus improves the efficiency of the system.
  • a head-mounted display device includes a zoom camera 120 as an example.
  • a head-mounted display device includes two zoom cameras 120 (also referred to as binocular zoom cameras), which are respectively a first zoom camera 120-1 and a second zoom camera 120-2.
  • the zoom camera 120 (or the first zoom camera 120-1 and the second zoom camera 120-2) is connected to the processor 140 via a data interface and a control interface.
  • the data interface is used for the zoom camera 120 to transmit image data to the processor 140.
  • the control interface is used for the processor 140 to send a control signal to the zoom camera, such as a zoom control signal.
  • the data interface may be a mobile industry processor interface (MIPI), or another interface that can be used to transmit image data.
  • the control interface may be a serial peripheral interface (SPI) or an inter-integrated circuit (I2C) interface, or another interface that can be used to transmit control signals.
  • the head mounted display device may include multiple zoom cameras, and different cameras have different zoom ranges.
  • the zoom ratio of a telephoto camera is greater than 5X, such as 10X.
  • the zoom ratio of a medium-telephoto camera is less than that of a telephoto camera and greater than 1X, such as 2X, 3X, and so on.
  • the head-mounted display device may include multiple pairs of zoom cameras.
  • Each pair of zoom cameras has the same zoom range, such as a pair of telephoto cameras and a pair of medium-telephoto cameras.
  • the telephoto camera can achieve a telephoto function.
  • the medium-telephoto camera can be used in presbyopia scenarios.
  • the I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL).
  • the processor 140 may include multiple I2C buses.
  • the processor 140 may be coupled to different zoom cameras through different I2C bus interfaces.
  • MIPI interfaces include camera serial interface (CSI), display serial interface (DSI), etc.
  • the processor 140 and the zoom camera 120 communicate via the CSI interface to realize the acquisition function of the head-mounted display device.
  • the processor 140 and the display screen 160 can communicate via the DSI interface to realize the display function of the head-mounted display device.
  • a control signal may be sent to the first zoom camera 120-1 and the second zoom camera 120-2 through the processor 140, so that the first zoom camera 120-1 and the second zoom camera 120-2 maintain synchronous adjustment of the focal length.
  • the focus and exposure control of the binocular zoom camera may also be kept consistent.
  • the zoom magnifications of the first zoom camera 120-1 and the second zoom camera 120-2 may be controlled independently. For example, only the zoom magnification of the first zoom camera 120-1 may be adjusted. For another example, only the zoom magnification of the second zoom camera 120-2 may be adjusted. For another example, the zoom magnification of the first zoom camera 120-1 and the zoom magnification of the second zoom camera 120-2 may be different.
  • the display screen 160 is used to display images, videos, etc.
  • the display screen 160 may include a left-eye display unit and a right-eye display unit.
  • the left-eye display unit is used to display images, videos, etc. viewed by the left eye.
  • the right-eye display unit is used to display images, videos, etc. viewed by the right eye.
  • the display screen 160 includes a display panel.
  • the display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a Mini-LED, a Micro-LED, a Micro-OLED, a quantum dot light-emitting diode (QLED), etc.
  • the head mounted display device can realize image and video shooting functions through an ISP, at least one variable focus camera 120, a GPU, a display screen 160, and an application processor.
  • the ISP is used to process the data fed back by the zoom camera 120. For example, when taking a photo, the shutter is opened and light is transmitted through the lens to the camera's photosensitive element, which converts the light signal into an electrical signal and transmits it to the ISP, where it is processed into an image visible to the naked eye.
  • the ISP can also perform algorithm optimization on the noise, brightness, and skin color of the image. The ISP can also optimize the exposure, color temperature and other parameters of the shooting scene. In some embodiments, the ISP can be set in the zoom camera 120.
  • the zoom camera 120 is used to capture static images or videos.
  • the object generates an optical image through the lens and projects it onto the photosensitive element.
  • the photosensitive element may be a charge-coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the optical signal into an electrical signal, and then transmits the electrical signal to the ISP for conversion into a digital image signal.
  • the ISP converts the digital image signal into an image signal in a standard RGB, YUV or other format, and outputs it to the image processing module.
  • the processor 140 can trigger the zoom camera 120 according to the program or instruction in the memory, so that the zoom camera 120 captures at least one image, and performs corresponding processing on the at least one image according to the program or instruction, such as digital magnification processing, binocular parallax adjustment, image sharpening, image defogging, image deraining, image deblurring, image demosaicing, image contrast enhancement, image color enhancement, image detail enhancement or image brightness enhancement.
  • At least one zoom camera 120 may be deployed on an external panel of the head-mounted display device, facing the viewing direction, as shown in FIG1C .
  • at least one zoom camera 120 may be deployed on the head-mounted display device in a folded optical path manner.
  • the deployment method of at least one zoom camera 120 is not specifically limited in the present application. It should be understood that when the head-mounted display device includes a binocular zoom camera, one of the zoom cameras is located close to the left eye position in the head-mounted display device, and the other zoom camera is located close to the right eye position in the head-mounted display device.
  • the head-mounted display device may further include an inertial measurement unit (IMU) 180, as shown in FIG2 .
  • IMU 180 is used to output IMU measurement data.
  • IMU measurement data may include the three-axis attitude angle (or angular rate) and acceleration of the object.
  • the IMU 180 in the head-mounted display device can be used for posture positioning of the head-mounted display device.
  • in a zoomed-in or telescopic scene, the IMU measurement data can be used for anti-shake processing.
  • the head mounted display device may further include a charging management module 131, a power management module 132, a battery 133, an audio module 134, a speaker 135, a microphone 136, an earphone interface 137, a sensor module 138, a button 139, a receiver 151, etc.
  • the sensor module 138 may include a pressure sensor, a gyroscope sensor, an air pressure sensor, a magnetic sensor, an acceleration sensor, a temperature sensor, etc.
  • the charging management module 131 is used to receive charging input from a charger.
  • the charger can be a wireless charger or a wired charger.
  • the charging management module 131 can receive charging input from a wired charger through a USB interface.
  • the charging management module 131 can receive charging input from a wireless charger through a wireless charging coil of the head mounted display device. While the charging management module 131 is charging the battery 133 , it can also supply power to the head mounted display device through the power management module 132 .
  • the power management module 132 is used to connect the battery 133, the charging management module 131 and the processor 140.
  • the power management module 132 receives input from the battery 133 and/or the charging management module 131, and supplies power to the processor 140, the memory, the display screen 160, the zoom camera 120, etc.
  • the power management module 132 can also be used to monitor parameters such as battery capacity, battery cycle number, battery health status (leakage, impedance), etc.
  • the power management module 132 can also be set in the processor 140.
  • the power management module 132 and the charging management module 131 can also be set in the same device.
  • the audio module 134 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signals.
  • the audio module 134 can also be used to encode and decode audio signals.
  • the audio module 134 can be arranged in the processor 140, or some functional modules of the audio module 134 can be arranged in the processor 140.
  • the speaker 135, also called a "loudspeaker", is used to convert an audio electrical signal into a sound signal.
  • the head mounted display device can listen to music or listen to hands-free calls through the speaker 135 .
  • the receiver 151, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • the head mounted display device receives a call or voice message, the voice can be received by placing the receiver 151 close to the human ear.
  • the microphone 136, also called a "mic" or "mouthpiece", is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can speak close to the microphone 136, inputting the sound signal into the microphone 136.
  • the head-mounted display device can be provided with at least one microphone 136. In other embodiments, the head-mounted display device can be provided with two microphones 136, which can not only collect sound signals but also realize noise reduction function.
  • the earphone jack 137 is used to connect a wired earphone.
  • the earphone jack 137 may be a USB interface, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
  • the USB interface is an interface that complies with the USB standard specification, and can be a Mini USB interface, a Micro USB interface, a USB Type C interface, etc.
  • the head-mounted display device may also include one or more USB interfaces.
  • the USB interface can be used to connect a charger to charge the head-mounted display device, and can also be used to transfer data between the head-mounted display device and peripheral devices. It can also be used to connect headphones to play audio through the headphones.
  • the interface can also be used to connect other electronic devices, such as terminal devices, etc.
  • the interface connection relationship between the modules illustrated in the embodiments of the present application is only a schematic illustration and does not constitute a structural limitation on the head-mounted display device.
  • the head-mounted display device may also adopt different interface connection methods in the above embodiments, or a combination of multiple interface connection methods.
  • in one related approach, an optical lens device with a fixed focal length is installed on the outside of the AR device, and a photosensitive device is used to complete image acquisition.
  • this method can only realize a telephoto magnification function with a fixed magnification.
  • another approach is achieved by modifying a telescope: an electronic acquisition path is added, the two paths are imaged separately, and the results are superimposed at the position of the eyepiece.
  • this approach merely retrofits the telescope, so it can only realize the telescope function and has poor portability.
  • the head-mounted display device includes at least one variable-focus camera and a display screen, and the focal length of at least one variable-focus camera can be adjusted according to user needs to achieve digital magnification of the user's area of interest.
  • referring to FIG. 4A, a flow chart of a possible image processing method provided in an embodiment of the present application is shown.
  • the image processing method can be applied to the head-mounted display device shown in FIG. 1A to FIG. 3 above.
  • the head-mounted display device includes at least one variable-focus camera and a display screen.
  • the method steps in FIG. 4A can be executed by the head-mounted display device, for example, by a processor or processing module in the head-mounted display device.
  • At least one scene image is captured for a target scene within a shooting range through at least one variable-focus camera.
  • the head mounted display device includes a zoom camera. At least one scene image can be acquired by the zoom camera in one shot, that is, acquired in one exposure. Acquiring multiple scene images for subsequent image processing can improve the processing effect.
  • the head mounted display device includes a pair of zoom cameras, and each of the pair of zoom cameras can capture N scene images.
  • N is a positive integer.
  • the pair of zoom cameras capture scene images at the same zoom ratio.
  • the pair of zoom cameras can also capture scene images at different zoom ratios.
  • in some embodiments, the head-mounted display device includes multiple zoom cameras or multiple pairs of zoom cameras with different zoom ranges.
  • different zoom cameras (or different pairs of zoom cameras) can be activated according to the zoom ratio.
  • which zoom camera to start can be determined based on the comparison result of the acquired zoom magnification and the magnification threshold. For example, if the acquired zoom magnification is less than or equal to the first magnification threshold, the medium-telephoto camera can be started. If the zoom magnification is greater than the first magnification threshold, the telephoto camera can be started.
  • the value range of the first magnification threshold may be [A1, A2). For example, the value range of the first magnification threshold may be [5, 10). For example, the first magnification threshold may be 5, 6, 7 or 8, etc.
  • the magnification of the telephoto camera is greater than or equal to the first magnification threshold, for example, the magnification of the telephoto camera is 10X. In a possible implementation, when the zoom magnification is greater than 9.9, N scene images can be captured by the telephoto camera.
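  • As a non-limiting illustration of this selection logic, consider the Python sketch below; the concrete threshold (5X, from the assumed range [5, 10)) and the camera labels are assumptions chosen for the example, not values mandated by the embodiments.

```python
# Illustrative sketch: pick a zoom camera by comparing the acquired
# zoom magnification with the first magnification threshold.

FIRST_MAGNIFICATION_THRESHOLD = 5.0  # assumed value within [5, 10)

def select_camera(zoom_magnification: float) -> str:
    """Return which zoom camera to activate for the requested magnification."""
    if zoom_magnification <= FIRST_MAGNIFICATION_THRESHOLD:
        return "medium-telephoto"   # lower magnifications, e.g. 2X, 3X
    return "telephoto"              # higher magnifications, e.g. 10X

assert select_camera(3.0) == "medium-telephoto"
assert select_camera(9.9) == "telephoto"
```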
  • when performing magnification processing on the image of the area of interest in at least one scene image, the image of the area of interest can first be cropped out of the at least one scene image.
  • the image of the cropped portion is then magnified.
  • the portion corresponding to the area of interest in at least one scene image is enlarged to the size of a left-eye display unit and a right-eye display unit.
  • the image of the area of interest of the user is magnified and displayed on the display screen, so that the user can obtain a better visual experience through the wearable display device. For example, a wearer with presbyopia can use a wearable device to magnify the area that cannot be seen clearly.
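  • A minimal sketch of this crop-then-magnify step, assuming OpenCV, with a Lanczos resize standing in for the super-resolution processing mentioned elsewhere in the embodiments:

```python
import cv2
import numpy as np

def magnify_roi(scene: np.ndarray, roi: tuple[int, int, int, int],
                display_size: tuple[int, int]) -> np.ndarray:
    """Crop the area of interest out of a scene image and enlarge it to the
    size of the display unit. roi is (x, y, w, h) in pixels; display_size is
    (width, height). cv2.resize here is a stand-in for super-resolution."""
    x, y, w, h = roi
    crop = scene[y:y + h, x:x + w]
    return cv2.resize(crop, display_size, interpolation=cv2.INTER_LANCZOS4)
```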
  • before step 401, step 4011 may be further performed to obtain the zoom ratio.
  • the zoom ratio may also be a default zoom ratio of the wearable display device in certain shooting modes, such as presbyopia mode or telephoto mode, etc.
  • the zoom ratio may also be a zoom ratio selected by the user on the wearable display device.
  • the head-mounted display device is configured with a user interface for obtaining the user's input on the magnification.
  • the zoom ratio can also be understood as the magnification.
  • the user interface can be a button or knob type, or can also include a voice recognition module, for example, to determine the magnification by recognizing the user's voice command.
  • the user interface can also be an interface for an external control device such as an external handle. The user can operate the external control device to send a control signal to the head-mounted display device to indicate the magnification.
  • the head-mounted display device may have a menu bar for the user to select the magnification.
  • the magnification factor may be an integer, such as 1, 2, 3, 10, 50 or more, or a non-integer, such as 1.5, etc. All of these are within the protection scope of the present application.
  • the user interface can be implemented in any known and common form, such as being set in a controller connected to the product body by wire or wirelessly, or being directly set in the product body, etc., and the present application does not impose any specific restrictions on this.
  • the user may set different zoom ratios for the first zoom camera and the second zoom camera.
  • any of the following possible methods can be used to implement the method.
  • the following are only three possible implementation methods, and other methods capable of determining the ROI area are applicable to the present application.
  • a central picture area corresponding to the zoom ratio is determined from a shooting range of at least one zoom camera, and the central picture area is used as the region of interest.
  • the association relationship between the zoom ratio and the central picture area in the shooting range can be preset, so that when a certain zoom ratio is determined, the area boundary of the central picture area in the shooting range can be determined according to the association relationship.
  • the center screen area can be understood as a center area of the scene image captured by the zoom camera within the shooting range.
  • the zoom camera can capture a certain shooting range at the focal length corresponding to the zoom ratio. When the shooting range is fixed, for the same scene image, the size of the displayed part (center screen area) is different at different zoom ratios.
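  • One plausible preset association is a centered crop whose side lengths shrink linearly with the zoom ratio; the embodiments leave the exact mapping open, so the sketch below is an assumption:

```python
def central_picture_area(img_w: int, img_h: int, zoom_ratio: float):
    """Return (x, y, w, h) of the central picture area for a zoom ratio,
    assuming a centered crop that shrinks linearly with the ratio."""
    w, h = int(img_w / zoom_ratio), int(img_h / zoom_ratio)
    return (img_w - w) // 2, (img_h - h) // 2, w, h

# Example: at 4X zoom on a 4000x3000 image, the central area is 1000x750.
print(central_picture_area(4000, 3000, 4.0))  # (1500, 1125, 1000, 750)
```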
  • the region of interest in the target scene viewed by the user is determined from the shooting range of at least one variable-focus camera. It can be understood that the size and position of the region viewed by the user's left and right eyes are generally fixed, and the length and width of the image can also be determined, so the region of interest of the user's eyes can be determined according to the size and position of the region viewed by the left and right eyes.
  • the area focused by the user is determined by an eye tracking algorithm, and the area is the user's area of interest.
  • at least one variable-focus camera is used to capture a scene image, and then a zoom operation is performed on the user's area of interest.
  • a preview frame is displayed on the display screen, and a preview image is displayed in the preview frame; the preview image is obtained by capturing the target scene with at least one zoom camera, for example at 1X zoom.
  • in response to the user selecting a first area in the preview image, the first area is determined to be the area of interest in the target scene viewed by the user.
  • the image processing in the embodiments of the present application may include binocular parallax adjustment and/or anti-shake processing in addition to the magnification processing.
  • a wearable display device includes a first zoom camera and a second zoom camera.
  • the first zoom camera is used to capture a first image viewed by the left eye at a zoom ratio.
  • the first image may also be referred to as a left-eye scene image in the embodiment of the present application
  • the second zoom camera is used to capture a second image viewed by the right eye at a zoom ratio.
  • the second image may also be referred to as a right-eye scene image in the embodiment of the present application.
  • binocular parallax adjustment may be performed on the left-eye scene image and the right-eye scene image first, and then the region of interest after binocular parallax adjustment may be magnified.
  • the region of interest may be magnified to the size of the left-eye display unit and the right-eye display unit.
  • the magnification processing in the embodiment of the present application includes super-resolution processing of the size-magnified image.
  • for ease of description, the first zoom camera is hereinafter referred to as the left-eye zoom camera, and the second zoom camera is referred to as the right-eye zoom camera.
  • when performing the above step 403, that is, performing image processing on at least one scene image to obtain a left-eye target image and a right-eye target image, this can be implemented in the following manner:
  • binocular parallax adjustment is performed on the left eye scene image and the right eye scene image to obtain the left eye display view and the right eye display view. Then, the image of the area of interest in the left eye display view is magnified to obtain the left eye target image, and the image of the area of interest in the right eye display view is magnified to obtain the right eye target image.
  • specifically, with reference to the distance between the user's left-eye pupil and right-eye pupil and the set left/right-eye parallax value, the left-eye and right-eye scene images collected by the left-eye and right-eye zoom cameras at the set focal length can be transformed and reprojected to the left-eye viewing position and the right-eye viewing position on the display screen.
  • image processing can be performed according to the user's needs and settings to generate left and right eye display views generated by the left and right eye scene images.
  • the binocular parallax adjustment process can be shown in Figure 5. Specifically, it can include the steps of left and right eye zoom camera parameter correction, left and right eye projection parameter correction, stereo matching, triangulation processing, viewpoint transformation projection, texture mapping, etc.
  • the left and right eye zoom camera parameter calibration includes:
  • the pose matrix (also called external parameter matrix) and internal parameter matrix of the left/right eye variable zoom camera can be generated according to the set focal length of the left/right eye variable zoom camera and its camera parameter calibration.
  • the pose matrix of the left/right-eye zoom camera can be expressed as $T_{L/R} = \begin{bmatrix} R_{L/R} & t_{L/R} \\ 0 & 1 \end{bmatrix}$, where $R_{L/R}$ is a 3×3 camera rotation matrix (the subscript L denotes the left eye and R the right eye), $t_{L/R}$ is a 3×1 camera translation vector, and $T_{L/R}$ is the resulting 4×4 pose matrix.
  • the intrinsic parameter matrix of the left/right-eye zoom camera can be expressed as $K_{L/R} = \begin{bmatrix} f/dx & 0 & u_0 \\ 0 & f/dy & v_0 \\ 0 & 0 & 1 \end{bmatrix}$, where f is the focal length of the zoom camera.
  • dx and dy represent the physical size of a pixel in the x and y directions, that is, the actual physical length represented by one pixel.
  • $(u_0, v_0)$ represents the center point of the pixel plane.
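  • These two matrices can be assembled directly; the following NumPy sketch mirrors the standard pinhole-camera formulation used above:

```python
import numpy as np

def pose_matrix(R: np.ndarray, t: np.ndarray) -> np.ndarray:
    """4x4 pose (extrinsic) matrix [R t; 0 1] from a 3x3 rotation matrix R
    and a 3x1 translation vector t."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t.ravel()
    return T

def intrinsic_matrix(f: float, dx: float, dy: float,
                     u0: float, v0: float) -> np.ndarray:
    """3x3 intrinsic matrix from focal length f, pixel sizes dx, dy, and the
    pixel-plane center (u0, v0)."""
    return np.array([[f / dx, 0.0,    u0],
                     [0.0,    f / dy, v0],
                     [0.0,    0.0,    1.0]])
```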
  • Left/right eye projection parameter correction includes:
  • the left/right eye projection matrix can be generated according to the user's left/right eye pupil distance and parallax measured or set by the wearable display device.
  • the left/right-eye projection matrix can be expressed as $P_{L/R} = \begin{bmatrix} R_e & t_e \end{bmatrix}$, where $R_e$ is a 3×3 binocular rotation matrix indicating the current gaze direction of both eyes, and $t_e$ is a 3×1 binocular translation vector whose components x, y, and z describe the pupil distance of the user's eyes and the set parallax.
  • the scene images of the left/right eye are processed by stereo matching and triangulation to generate the depth information of the captured scene.
  • the stereo matching operation can be implemented by, but not limited to, block matching, optical flow, deep learning matching and other matching algorithms.
  • the depth estimation of the captured scene can be achieved by combining the matching points in the left/right scene images, the left and right camera pose matrices and the intrinsic parameter matrix.
  • $p_1$ and $p_2$ are the projection points of the spatial point $p_w$ on the left-eye scene image and the right-eye scene image, respectively, and the two projection points can be associated by matching.
  • the relationship between $p_1$, $p_2$ and the spatial point $p_w$ satisfies the conditions shown in formula (1) and formula (2), i.e., the pinhole projections $z_1 p_1 = K_L T_L p_w$ (1) and $z_2 p_2 = K_R T_R p_w$ (2).
  • the coordinates of $p_1$ and $p_2$ can be obtained through the stereo matching algorithm, and the intrinsic and extrinsic matrices of the cameras are known.
  • $z_1$ and $z_2$ are the depths of $p_w$ in the camera coordinate systems of the left and right zoom cameras.
  • the coordinates of $p_w$ and the values of $z_1$ and $z_2$ can then be solved by triangulation.
  • the scene depth map is obtained by computing the coordinates of $p_w$ and the values of $z_1$ and $z_2$ for each matched point pair in the left and right scene images.
  • Triangulation uses the triangle formed by the two zoom cameras and the observed target, the distance between the zoom cameras (generally called the baseline), and the focal length of the lens, combined with the principle of similar triangles, to calculate the size and distance of the observed object.
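  • Assuming OpenCV and calibrated 3×4 projection matrices $P = K\,[R \mid t]$ for the two cameras, this triangulation step can be sketched as:

```python
import cv2
import numpy as np

def triangulate_depth(P_left: np.ndarray, P_right: np.ndarray,
                      pts_left: np.ndarray, pts_right: np.ndarray) -> np.ndarray:
    """Recover 3D points p_w (whose z components give the scene depth) from
    matched pixel coordinates. pts_left and pts_right are 2xN float arrays of
    matched points produced by the stereo-matching step."""
    pts_h = cv2.triangulatePoints(P_left, P_right, pts_left, pts_right)
    return (pts_h[:3] / pts_h[3]).T  # Nx3 array of p_w coordinates
```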
  • Viewpoint transformation projection includes:
  • the scene depth map is reprojected according to the left/right eye projection matrix to generate scene geometry information corresponding to the left/right view.
  • the scene geometry information can describe the depth, perspective relationship and other information of the scene, and match the left/right eye.
  • the scene geometry information can be represented by the projection matrix of the viewpoint-transformation projection. As shown in FIG. 7, this projection matrix can be expressed as formula (3), in which the matrix representing the viewpoint-transformation projection for the left/right eye is the 3×4 display projection matrix of the left and right display units, $s/dx$ and $s/dy$ represent the pixel projection scaling factors, and $u'_0$ and $v'_0$ represent the projection offsets, which satisfy the conditions shown in formula (4).
  • Texture mapping includes:
  • the left/right eye scene images can be combined to perform texture shading, view warping, depth-image-based rendering (DIBR) and other operations to generate left/right eye display views that match the left/right eye.
  • a wearable display device includes a zoom camera.
  • a left-eye display view and a right-eye display view are generated based on the scene image captured by the zoom camera.
  • the regions of interest of the left-eye display view and the right-eye display view are enlarged.
  • the region of interest can be enlarged to the size of the left-eye display unit and the right-eye display unit.
  • the enlargement process in the embodiment of the present application includes super-resolution processing for the enlarged image.
  • when performing the above step 403, that is, performing image processing on at least one scene image to obtain a left-eye target image and a right-eye target image, this can be implemented in the following manner:
  • binocular parallax adjustment is performed on the image captured by the zoom camera to obtain a left eye display view and a right eye display view;
  • a magnification process is performed on the image of the region of interest in the left eye display view to obtain a left eye target image
  • a magnification process is performed on the image of the region of interest in the right eye display view to obtain a right eye target image.
  • the image processing in the embodiments of the present application also includes anti-shake processing.
  • the head-mounted display device may also include an inertial measurement unit (IMU).
  • the IMU is used to output IMU measurement data.
  • the left-eye display view and the right-eye display view may first be anti-shake processed respectively according to the IMU measurement data, and the magnification processing is then performed.
  • an IMU is a sensor used to measure acceleration and rotational motion, usually along three axes (X, Y, and Z in the IMU coordinate system). The IMU coordinate system takes the center of the IMU as the origin, with the X axis pointing in the left-right direction of the IMU, the Y axis in the front-back direction, and the Z axis in the up-down direction.
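  • One simple rotational de-jitter model, offered here as an assumption (the embodiments do not fix a particular anti-shake algorithm), undoes the attitude change measured by the IMU with a homography warp:

```python
import cv2
import numpy as np

def dejitter(view: np.ndarray, K: np.ndarray, R_jitter: np.ndarray) -> np.ndarray:
    """Warp a display view by H = K * R_jitter^-1 * K^-1 to compensate the
    small head rotation R_jitter (3x3) reported by the IMU; K is the 3x3
    intrinsic matrix of the corresponding zoom camera."""
    H = K @ np.linalg.inv(R_jitter) @ np.linalg.inv(K)
    h, w = view.shape[:2]
    return cv2.warpPerspective(view, H, (w, h))
```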
  • the image processing mentioned in the embodiments of the present application may also include image enhancement processing.
  • the image enhancement processing may include at least one of the following:
  • the image enhancement processing may be performed separately for the left-eye display view and the right-eye display view.
  • image sharpening is used to compensate the contours of the left-eye display view and the right-eye display view, enhancing the edges and grayscale transitions of the image to make it clear. It can include two types of processing: spatial-domain processing and frequency-domain processing.
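  • A common spatial-domain sharpening technique consistent with this description is unsharp masking; a minimal sketch:

```python
import cv2
import numpy as np

def sharpen(view: np.ndarray, amount: float = 1.0, sigma: float = 3.0) -> np.ndarray:
    """Unsharp masking: subtract a Gaussian-blurred copy to isolate edges and
    grayscale jumps, then add them back to enhance the contours."""
    blurred = cv2.GaussianBlur(view, (0, 0), sigma)
    return cv2.addWeighted(view, 1.0 + amount, blurred, -amount, 0)
```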
  • the defogging in the embodiments of the present application may adopt a defogging algorithm based on image enhancement, a defogging algorithm based on image restoration, or a defogging algorithm based on deep learning.
  • the starting point of the dehazing algorithm based on image enhancement is to remove image noise as much as possible and improve image contrast, so as to restore a haze-free and clear image.
  • the dehazing algorithm based on image enhancement may include one or more of: histogram equalization (HE), adaptive histogram equalization (AHE), contrast-limited adaptive histogram equalization (CLAHE), the Retinex algorithm, wavelet transform, homomorphic filtering, etc.
  • the dehazing algorithm based on image restoration is mainly based on the atmospheric degradation model to perform corresponding dehazing processing.
  • the dehazing algorithm based on image restoration can include one or more of Kaiming He's dark channel prior dehazing algorithm, Fattal's "Single Image Dehazing", Tan's "Visibility in Bad Weather from a Single Image", etc.
  • the dehazing effect based on the atmospheric degradation model is generally better than the dehazing algorithm based on image enhancement.
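  • A minimal sketch of dark-channel-prior dehazing under the atmospheric degradation model $I = J \cdot t + A \cdot (1 - t)$; the guided-filter refinement of the transmission map used in practice is omitted for brevity:

```python
import cv2
import numpy as np

def dark_channel_dehaze(img: np.ndarray, patch: int = 15, omega: float = 0.95,
                        t_min: float = 0.1) -> np.ndarray:
    """Dehaze a float32 BGR image with values in [0, 1] using the dark
    channel prior (after Kaiming He)."""
    kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (patch, patch))
    dark = cv2.erode(img.min(axis=2), kernel)  # per-patch minimum = dark channel
    # Atmospheric light A: mean color of the brightest 0.1% dark-channel pixels.
    idx = dark.ravel().argsort()[-max(1, dark.size // 1000):]
    A = img.reshape(-1, 3)[idx].mean(axis=0)
    # Transmission map t estimated from the normalized dark channel.
    t = 1.0 - omega * cv2.erode((img / A).min(axis=2), kernel)
    t = np.clip(t, t_min, 1.0)[..., None]
    return np.clip((img - A) / t + A, 0.0, 1.0)
```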
  • there are two main types of defogging algorithms based on deep learning.
  • the first type of defogging algorithm based on deep learning can use the atmospheric degradation model and use a neural network to estimate the parameters in the atmospheric degradation model.
  • the second type of defogging algorithm based on deep learning can use the input foggy image and the defogged image for end-to-end training to obtain a neural network model, and then use the foggy image as input for inference to obtain the defogged image.
  • the rain removal algorithm can adopt a rain removal method based on filtering or a rain removal method based on deep learning.
  • the filtering-based methods treat raindrops as image noise and filter them out (a sketch of this idea follows this list).
  • the rain removal methods based on deep learning mainly realize image deraining by constructing a neural network model and performing supervised, semi-supervised, or unsupervised training on rainy/derained data sets.
  • for example, a deep-learning rain removal method may adopt "Self-Aligned Video Deraining with Transmission-Depth Consistency" or the dynamic-generator-based semi-supervised method "Semi-Supervised Video Deraining with Dynamical Rain Generator", etc.
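  • of the two, the filtering-based idea is simple enough to sketch: treating rain streaks as transient noise, a pixel-wise temporal median over a short window of frames suppresses them while static scene content survives (a minimal illustration, assuming the frames are already aligned; it is not the specific filter used by this application):

```python
import numpy as np

def median_derain(frames):
    """Filtering-based rain removal sketch: pixel-wise temporal median
    over a list of aligned H x W x 3 frames of the same scene."""
    return np.median(np.stack(frames, axis=0), axis=0).astype(frames[0].dtype)
```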
  • the deblurring processing can be implemented with a neural network model: the model is obtained by end-to-end training on pairs of input blurred images and deblurred images, and a blurred image is then used as input for inference to obtain the deblurred image.
  • the demosaicing processing can likewise be implemented with a neural network model: the model is obtained by end-to-end training on pairs of mosaic images and demosaiced images, and a mosaic image is then used as input for inference to obtain the demosaiced image.
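  • the end-to-end scheme shared by these restoration tasks can be sketched as follows (PyTorch); the tiny residual network and L1 loss are illustrative assumptions, not the architecture of this application, and any real model would be much deeper:

```python
import torch
import torch.nn as nn

class RestorationNet(nn.Module):
    """Deliberately small encoder-free residual CNN used only to
    illustrate end-to-end training: degraded image in, clean image out."""
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 3, 3, padding=1),
        )

    def forward(self, x):
        # Predict a residual correction and add it to the degraded input.
        return x + self.body(x)

def train_step(model, optimizer, degraded, clean):
    """One supervised step on a batch of (degraded, clean) image pairs."""
    optimizer.zero_grad()
    loss = nn.functional.l1_loss(model(degraded), clean)
    loss.backward()
    optimizer.step()
    return loss.item()

# At inference time the degraded image alone is the input:
# restored = model(blurred_batch)
```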
  • in some scenarios, one or more neural network models can be constructed to implement image dehazing, image deraining, image deblurring, image demosaicing, and other processing.
  • the magnification-based image processing method provided in the embodiments of the present application can be used to assist visually impaired users in seeing distant objects clearly. For example, before step 4011 is executed, it can first be determined that the auxiliary vision function is on. If the auxiliary vision function is off, steps 4011-404 are not executed.
  • if the user adjusts the magnification, steps 4011-404 are re-executed. If the user re-selects the ROI, steps 401-404 are executed. If the user neither adjusts the magnification nor re-selects the ROI (for example, continues watching), steps 4011-401 need not be re-executed, and only steps 402-404 continue to be executed.
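  • the step sequencing just described can be summarized as a control loop; all device methods below (auxiliary_vision_enabled, get_zoom_ratio, determine_roi, zoom_changed, roi_reselected, capture, process, display) are hypothetical names used only to make the step ordering concrete:

```python
def assisted_viewing_loop(device):
    """Control-flow sketch of steps 4011 (get zoom ratio), 401 (determine
    ROI), 402 (capture), 403 (process), and 404 (display)."""
    if not device.auxiliary_vision_enabled():
        return                         # steps 4011-404 are skipped
    zoom = device.get_zoom_ratio()     # step 4011
    roi = device.determine_roi(zoom)   # step 401
    while device.auxiliary_vision_enabled():
        if device.zoom_changed():      # user adjusted the magnification
            zoom = device.get_zoom_ratio()
            roi = device.determine_roi(zoom)
        elif device.roi_reselected():  # user re-selected the ROI
            roi = device.determine_roi(zoom)
        frames = device.capture(zoom)              # step 402
        left, right = device.process(frames, roi)  # step 403
        device.display(left, right)                # step 404
```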
  • the solution of the embodiments of the present application is described below with reference to a scenario: a user with presbyopia uses the head-mounted display device of the present application to view an object.
  • the user finds that he cannot clearly see an object in the target scene, as shown in Figure 8.
  • the user can then activate the auxiliary vision function and send an input about the magnification to the processor through the user interface, for example by operating a knob or button or performing another operation. Suppose the zoom ratio is 4X.
  • according to the 4X zoom ratio, the processor determines the central picture area corresponding to the zoom ratio within the shooting range of the first zoom camera, and likewise the central picture area corresponding to the zoom ratio within the shooting range of the second zoom camera.
  • after the left-eye scene image is captured by the first zoom camera and the right-eye scene image is captured by the second zoom camera, the two images are respectively subjected to parallax adjustment processing, image enhancement processing, etc., to obtain the processed left-eye display view and right-eye display view. Then the regions of interest in the left-eye display view and the right-eye display view are magnified, for example by super-resolution processing, to obtain the left-eye target image and the right-eye target image.
  • the left-eye target image and the right-eye target image are displayed on the left-eye display unit and the right-eye display unit of the display screen, respectively.
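  • putting the pieces of this scenario together, a minimal per-eye sketch (reusing magnify_roi from the earlier sketch; the parallax and enhancement steps are elided, and the linear zoom-to-area mapping below is an assumption, since the actual zoom/central-area relation may be a preset lookup):

```python
def central_picture_area(frame_w, frame_h, zoom):
    """Centered window whose sides are 1/zoom of the shooting range,
    so a 4X zoom keeps the middle quarter of each dimension."""
    w, h = int(frame_w / zoom), int(frame_h / zoom)
    x, y = (frame_w - w) // 2, (frame_h - h) // 2
    return x, y, w, h

# roi = central_picture_area(4000, 3000, 4.0)  # -> centered 1000 x 750 window
# left_target  = magnify_roi(left_view,  roi, display_size)
# right_target = magnify_roi(right_view, roi, display_size)
```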
  • the embodiment of the present application also provides an image processing device.
  • the image processing device is included in a head-mounted display device.
  • the head-mounted display device includes at least one variable-focus camera and a display screen.
  • the image processing device includes an acquisition module 901, a processing module 902, and a display module 903.
  • in some possible embodiments, the functions of the acquisition module 901, the processing module 902, and the display module 903 can all be implemented by a processor.
  • in other possible embodiments, the function of the acquisition module 901 can be implemented by a user interface, the function of the processing module 902 by a processor, and the function of the display module 903 by a display driver.
  • the acquisition module 901 is used to acquire a zoom ratio;
  • the processing module 902 is used to determine a region of interest in a target scene, capture at least one scene image of the target scene through the at least one variable-focus camera, and perform image processing on the at least one scene image to obtain a left-eye target image and a right-eye target image; the image processing includes magnification processing performed on the image of the region of interest in the at least one scene image;
  • the display module 903 is used to display the left-eye target image on the left-eye display unit of the display screen, and to display the right-eye target image on the right-eye display unit of the display screen.
  • in one possible implementation, the processing module 902 is specifically configured to determine, according to the zoom ratio, a central picture area corresponding to the zoom ratio from the shooting range of the at least one zoom camera, the central picture area serving as the region of interest.
  • in another possible implementation, the processing module 902 is specifically configured to determine the region of interest within the target scene from the shooting range of the at least one variable-focus camera based on an eye-tracking algorithm.
  • in a case where the at least one zoom camera includes a first zoom camera and a second zoom camera, the first zoom camera is used to capture a first image viewed by the left eye within the shooting range, and the second zoom camera is used to capture a second image viewed by the right eye within the shooting range;
  • the processing module 902 is specifically used for:
  • performing binocular parallax adjustment on the first image and the second image according to the distance between the user's left-eye pupil and right-eye pupil and the positions of the first and second zoom cameras on the head-mounted display device, to obtain a left-eye display view and a right-eye display view;
  • performing magnification processing on the image of the region of interest in the left-eye display view to obtain a left-eye target image;
  • performing magnification processing on the image of the region of interest in the right-eye display view to obtain a right-eye target image.
  • in a case where the head-mounted display device includes one variable-focus camera, the processing module 902 is specifically configured to:
  • perform binocular parallax adjustment on the image captured by the zoom camera according to the distance between the user's left-eye pupil and right-eye pupil and the position of the zoom camera on the head-mounted display device, to obtain a left-eye display view and a right-eye display view;
  • perform magnification processing on the image of the region of interest in the left-eye display view to obtain a left-eye target image;
  • perform magnification processing on the image of the region of interest in the right-eye display view to obtain a right-eye target image.
  • the image processing further includes image enhancement processing for the left-eye display view and image enhancement processing for the right-eye display view;
  • the image enhancement processing includes at least one of the following: image sharpening, image dehazing, image deraining, image deblurring, image demosaicing, image contrast enhancement, image color enhancement, image detail enhancement, or image brightness enhancement.
  • the head-mounted display device further includes an inertial measurement unit (IMU). The processing module 902 is further configured to:
  • acquire the IMU measurement data output by the IMU, and, when the user's head rotates, de-shake the left-eye display view and the right-eye display view respectively according to the IMU measurement data.
  • the processing module 902 is further configured to determine, before the zoom ratio is acquired, that the auxiliary vision function is on.
  • the head-mounted display device is a mixed reality MR helmet.
  • an embodiment of the present application also provides a computer-readable storage medium in which a computer-readable program is stored.
  • when the computer-readable program is run on a computer, the computer is caused to execute the image processing method applied to a head-mounted display device provided in the above embodiments.
  • an embodiment of the present application further provides a computer program product. When the computer program product runs on a computer, the computer is caused to execute the image processing method applied to a head-mounted display device provided in the above embodiments.
  • an embodiment of the present application further provides a chip, which is used to read a computer program stored in a memory and execute the image processing method applied to a head-mounted display device provided in the above embodiments.
  • an embodiment of the present application further provides a chip system, the chip system including a processor for supporting a display apparatus in implementing the image processing method applied to a head-mounted display device provided in the above embodiments.
  • in a possible design, the chip system also includes a memory, which is used to store the programs and data necessary for the computer apparatus.
  • a chip system consists of a chip, or includes a chip and other discrete devices.
  • the method provided in the embodiments of the present application can be implemented in whole or in part by software, hardware, firmware, or any combination thereof.
  • when implemented using software, the method can be implemented in whole or in part in the form of a computer program product, which includes one or more computer instructions.
  • when the computer instructions are loaded and executed on a computer, the processes or functions according to the embodiments of the present invention are generated in whole or in part.
  • the computer can be a general-purpose computer, a special-purpose computer, a computer network, a network device, a user device or other programmable device.
  • Computer instructions can be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium.
  • Computer instructions can be transmitted from one website, computer, server or data center to another website, computer, server or data center via wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.) means.
  • computer-readable storage media can be any available medium that a computer can access, or a data storage device such as a server or data center integrating one or more available media. Available media can be magnetic media (e.g., floppy disks, hard disks, magnetic tapes), optical media (e.g., digital video discs (DVD)), or semiconductor media (e.g., solid state drives (SSD)), etc.
  • these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction apparatus that implements the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.
  • these computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are executed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more processes of the flowchart and/or one or more blocks of the block diagram.

Abstract

An image processing method, a head-mounted display device, and a medium, relating to the field of image processing technology. The head-mounted display device includes one or two variable-focus cameras (120). A user can adjust the magnification, i.e., the zoom ratio, as needed. When the user cannot see a distant object clearly, the magnification is adjusted by means of the variable-focus camera (120), and image processing, such as super-resolution processing, image enhancement processing, and anti-shake processing, is then performed on the magnified portion, so that the distant object can be seen without resorting to external equipment. In addition, the user does not need to hold the device with both hands, which also improves portability. Moreover, the IMU (180) in the head-mounted display device is used to perform anti-shake processing when the image is magnified, without adding extra components.

Claims (19)

  1. A head-mounted display device, characterized by comprising a first variable-focus camera, a second variable-focus camera, and a display screen, wherein:
    the first variable-focus camera is configured to capture a first image viewed by a user's left eye in a target scene;
    the second variable-focus camera is configured to capture a second image viewed by the user's right eye in the target scene;
    the display screen is configured to display a left-eye target image on a left-eye display unit of the display screen and to display a right-eye target image on a right-eye display unit of the display screen;
    wherein the left-eye target image is obtained after magnification processing is performed on a region of interest (ROI) of the user included in the first image, and the right-eye target image is obtained after magnification processing is performed on the ROI included in the second image.
  2. The head-mounted display device according to claim 1, wherein the zoom ratio used by the first variable-focus camera to capture the first image and the zoom ratio used by the second variable-focus camera to capture the second image are the same or different;
    the zoom ratio used by the first variable-focus camera and the zoom ratio used by the second variable-focus camera are controlled independently of each other.
  3. The head-mounted display device according to claim 1 or 2, wherein the head-mounted display device further comprises a processor;
    the processor is configured to perform image processing on the first image and the second image respectively to obtain the left-eye target image and the right-eye target image;
    the image processing includes magnification processing performed on the region of interest in the first image and the second image.
  4. The head-mounted display device according to claim 3, wherein the processor is further configured to:
    acquire the zoom ratios used by the first variable-focus camera and the second variable-focus camera, and determine the region of interest.
  5. The head-mounted display device according to claim 4, wherein the processor is specifically configured to:
    determine, according to the zoom ratio, a central picture area corresponding to the zoom ratio from the shooting range of the first variable-focus camera and/or the second variable-focus camera, the central picture area serving as the region of interest.
  6. The head-mounted display device according to claim 4, wherein the processor is specifically configured to:
    determine the region of interest from the shooting range of the first variable-focus camera and/or the second variable-focus camera based on an eye-tracking algorithm.
  7. The head-mounted display device according to any one of claims 3-6, wherein the processor is specifically configured to:
    perform binocular parallax adjustment on the first image and the second image respectively according to the distance between the user's left-eye pupil and right-eye pupil and the positions of the first variable-focus camera and the second variable-focus camera on the head-mounted display device, to obtain a left-eye display view and a right-eye display view; perform magnification processing on the ROI in the left-eye display view to obtain the left-eye target image, and perform magnification processing on the ROI in the right-eye display view to obtain the right-eye target image.
  8. The head-mounted display device according to claim 7, wherein the image processing further includes image enhancement processing for the left-eye display view and the image enhancement processing for the right-eye display view;
    the image enhancement processing includes at least one of the following:
    image sharpening, image dehazing, image deraining, image deblurring, image demosaicing, image contrast enhancement, image color enhancement, image detail enhancement, or image brightness enhancement.
  9. The head-mounted display device according to claim 7 or 8, wherein the head-mounted display device further comprises an inertial measurement unit (IMU);
    the inertial measurement unit IMU is configured to output IMU measurement data;
    the processor is further configured to perform de-shake processing on the left-eye display view and the right-eye display view respectively according to the IMU measurement data when the user's head rotates.
  10. The head-mounted display device according to any one of claims 1-9, wherein the head-mounted display device is a mixed reality (MR) helmet.
  11. An image processing method, applied to a head-mounted display device, the head-mounted display device comprising a first variable-focus camera, a second variable-focus camera, and a display screen, the method comprising:
    determining a region of interest (ROI) in a target scene, capturing, through the first variable-focus camera, a first image viewed by a user's left eye in the target scene, and capturing, through the second variable-focus camera, a second image viewed by the user's right eye in the target scene;
    performing image processing on the first image and the second image respectively to obtain a left-eye target image and a right-eye target image, the image processing including magnification processing performed on the ROI in the first image and the second image; and
    displaying the left-eye target image on a left-eye display unit of the display screen, and displaying the right-eye target image on a right-eye display unit of the display screen.
  12. The method according to claim 11, wherein determining the ROI in the target scene comprises:
    acquiring the zoom ratio, and determining a central picture area corresponding to the zoom ratio from the shooting range of the first variable-focus camera and/or the second variable-focus camera, the central picture area serving as the ROI.
  13. The method according to claim 11, wherein determining the ROI in the target scene comprises:
    determining the ROI from the shooting range of the first variable-focus camera and/or the second variable-focus camera based on an eye-tracking algorithm.
  14. The method according to any one of claims 11-13, wherein performing image processing on the first image and the second image respectively to obtain the left-eye target image and the right-eye target image comprises:
    performing binocular parallax adjustment on the first image and the second image respectively according to the distance between the user's left-eye pupil and right-eye pupil and the positions of the first variable-focus camera and the second variable-focus camera on the head-mounted display device, to obtain a left-eye display view and a right-eye display view;
    performing magnification processing on the ROI in the left-eye display view to obtain the left-eye target image, and performing magnification processing on the ROI in the right-eye display view to obtain the right-eye target image.
  15. The method according to claim 14, wherein the image processing further includes image enhancement processing for the left-eye display view and the image enhancement processing for the right-eye display view;
    the image enhancement processing includes at least one of the following:
    image sharpening, image dehazing, image deraining, image deblurring, image demosaicing, image contrast enhancement, image color enhancement, image detail enhancement, or image brightness enhancement.
  16. The method according to claim 14 or 15, wherein the head-mounted display device further comprises an inertial measurement unit (IMU), and the method further comprises:
    acquiring IMU measurement data output by the inertial measurement unit IMU;
    performing de-shake processing on the left-eye display view and the right-eye display view respectively according to the IMU measurement data when the user's head rotates.
  17. The method according to any one of claims 11-16, wherein the head-mounted display device is a mixed reality (MR) helmet.
  18. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program or instructions which, when executed by a head-mounted display device, cause the head-mounted display device to perform the method according to any one of claims 11-17.
  19. A computer program product, comprising a computer program or instructions which, when executed by a head-mounted display device, implement the method according to any one of claims 11-17.
PCT/CN2023/137011 2022-12-14 2023-12-07 一种图像处理方法、头戴式显示设备及介质 WO2024125379A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211606269.5A CN118192083A (zh) 2022-12-14 2022-12-14 一种图像处理方法、头戴式显示设备及介质
CN202211606269.5 2022-12-14

Publications (1)

Publication Number Publication Date
WO2024125379A1 true WO2024125379A1 (zh) 2024-06-20

Family

ID=91401295

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/137011 WO2024125379A1 (zh) 2022-12-14 2023-12-07 一种图像处理方法、头戴式显示设备及介质

Country Status (2)

Country Link
CN (1) CN118192083A (zh)
WO (1) WO2024125379A1 (zh)

Also Published As

Publication number Publication date
CN118192083A (zh) 2024-06-14
