CN115686181A - Display method and electronic equipment


Info

Publication number: CN115686181A
Authority: CN (China)
Prior art keywords: image, depth, user, field, frame
Legal status: Pending
Application number: CN202110824187.7A
Other languages: Chinese (zh)
Inventors: 王松, 沈钢
Current Assignee: Huawei Technologies Co Ltd
Original Assignee: Huawei Technologies Co Ltd
Application filed by Huawei Technologies Co Ltd
Priority to CN202110824187.7A
Priority to PCT/CN2022/106280 (WO2023001113A1)
Publication of CN115686181A

Classifications

    • G: PHYSICS
    • G02: OPTICS
    • G02B: OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
    • G02B27/00: Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
    • G02B27/01: Head-up displays
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01: Input arrangements or combined input and output arrangements for interaction between user and computer

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Optics & Photonics (AREA)
  • Human Computer Interaction (AREA)
  • Processing Or Creating Images (AREA)

Abstract

A display method and an electronic device are provided for relieving the brain fatigue of a user who wears VR glasses to view a VR scene. The method comprises: displaying N frames of images to a user through a display device; on the i-th frame image of the N frames of images, the definition (i.e., sharpness) of a first object at a first depth of field is a first definition; on the j-th frame image of the N frames of images, the definition of the first object at the first depth of field is a second definition; on the k-th frame image of the N frames of images, the definition of the first object at the first depth of field is a third definition; the first definition is less than the second definition, the second definition is greater than the third definition, i, j and k are positive integers less than N, i is less than j, and j is less than k; and the first depth of field is greater than a second depth of field, or the distance between the first depth of field and the depth of field where the user's gaze point is located is greater than a first distance.

Description

Display method and electronic equipment
Technical Field
The present application relates to the field of electronic technologies, and in particular, to a display method and an electronic device.
Background
Virtual reality (VR) technology is a human-computer interaction technique built on computer and sensor technologies. It integrates computer graphics, computer simulation, sensor, display, and other technologies, and can create a virtual environment in which a user is immersed when wearing VR glasses. The virtual environment is presented by continuously refreshing a plurality of three-dimensional images that contain objects at different depths of field, thereby giving the user a stereoscopic impression.
In real life, when a person watches an object, convergence adjustment (both the left-eye and right-eye lines of sight point at the observed object) and focus adjustment (the crystalline lens is adjusted to focus light on the retina) are performed simultaneously, as shown in fig. 1. Generally, convergence adjustment and focus adjustment are consistent, which is the normal physiological mechanism for humans.
However, in the virtual environment presented by VR glasses, the scene seen by the user is displayed by the display screen of the VR glasses. The light emitted by the screen has no depth difference, so focus adjustment places the focal point of the eyes on the screen. However, the depth of an object in the virtual environment actually seen by the user is not the same as the distance from the display screen to the user, so convergence adjustment is inconsistent with focus adjustment; this is called vergence-accommodation conflict (VAC). VAC causes fatigue and dizziness because the human brain cannot accurately judge the true depth of the object, and over a long time it can also seriously affect the user's eyesight.
Disclosure of Invention
The application aims to provide a display method and electronic equipment for improving VR experience.
In a first aspect, a display method is provided. The method comprises: displaying N frames of images to a user through a display device, where N is a positive integer; on the i-th frame image of the N frames of images, the definition (i.e., sharpness) of a first object at a first depth of field is a first definition; on the j-th frame image of the N frames of images, the definition of the first object at the first depth of field is a second definition; on the k-th frame image of the N frames of images, the definition of the first object at the first depth of field is a third definition; the first definition is smaller than the second definition, the second definition is larger than the third definition, i, j and k are positive integers smaller than N, i is smaller than j, and j is smaller than k; and the first depth of field is greater than a second depth of field, or the distance between the first depth of field and the depth of field where the user's gaze point is located is greater than a first distance.
It should be noted that, taking a VR display device as an example, if the definition of all objects in every frame of image displayed by the VR display device were the same, the brain could not accurately determine the depths of different objects and would feel fatigued. In an embodiment of the application, the definition of a first object at a first depth of field in the image stream (i.e., the N frames of images) presented to the user by the VR display device (e.g., VR glasses) alternates between high and low. For example, the definition of a first object at a large depth of field, or of a first object far from the user's gaze point, alternates between high and low. When the VR display device displays the i-th frame or the k-th frame, the definition of the first object is low, which relieves the fatigue of the human brain. When the VR display device displays the j-th frame image, the definition of the first object is high, so the brain can acquire the details of the first object; by comparison, those details are not lost. Therefore, in this way, brain fatigue can be relieved while the brain is still guaranteed to acquire enough details of the object, and the user experience is better.
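As an illustration only (the application does not recite any code), the alternation of the first object's definition across a frame stream could be sketched as follows in Python, assuming each frame is a NumPy image and a binary mask marking the first object's region is available; the function name, the even/odd schedule, and the blur parameters are assumptions for illustration, not part of the claimed method.

    import cv2
    import numpy as np

    def apply_definition_schedule(frames, first_object_mask, blur_sigma=3.0):
        """Alternate the first object's definition across the frame stream:
        blur its region on even-indexed frames (e.g. the i-th and k-th frames)
        and leave it untouched on odd-indexed frames (e.g. the j-th frame)."""
        output = []
        for index, frame in enumerate(frames):
            if index % 2 == 0:  # frames on which the first object is shown at low definition
                blurred = cv2.GaussianBlur(frame, (0, 0), blur_sigma)
                # replace only the pixels belonging to the first object
                frame = np.where(first_object_mask[..., None], blurred, frame)
            output.append(frame)
        return output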
In one possible design, the user's gaze point is unchanged while the i-th frame image, the j-th frame image, and the k-th frame image are displayed. That is, when the user wears VR glasses to view the virtual environment, if the user's gaze point does not change, the definition of the first object far from the user's gaze point alternates between high and low (i.e., rises and falls). For example, when the VR glasses display the i-th frame or the k-th frame image, the definition of the first object is low, which relieves the fatigue of the human brain. When the j-th frame image is displayed, the first object on the j-th frame image is sharp, so the details of the first object are not lost and the brain can acquire enough of them. In this way, brain fatigue can be relieved while the brain is guaranteed to acquire enough details of the object, and the user experience is good.
It is understood that when the user's gaze point changes, the first depth of field changes accordingly. For example, when the user's gaze point is on an object A whose depth of field is 0.5 m, and the distance between the first depth of field and the depth of field of object A must be greater than a first distance (e.g., 0.5 m), the first depth of field is greater than or equal to 1 m. When the user's gaze point changes to an object B whose depth of field is 0.8 m, the distance between the first depth of field and the depth of field of object B must be greater than the first distance (e.g., 0.5 m), so the first depth of field becomes greater than or equal to 1.3 m. Therefore, the first depth of field changes as the user's gaze point changes.
In one possible design, the second depth of field includes at least one of: a depth of field specified by the user, the depth of field where the user's gaze point is located, a system default depth of field, the depth of field where the virtual image plane is located, a depth of field corresponding to the virtual scene, and the depth of field where the main subject is located on the i-th frame image. That is, in the image stream displayed by the display device, the first object has a depth of field greater than the second depth of field. The second depth of field may be determined in a number of ways. In a first way, the second depth of field may be determined according to the VR scene, and the preset depth of field differs for different VR scenes, where the VR scene includes, but is not limited to, at least one of VR games, VR movies, VR teaching, and the like. In a second way, the second depth of field may be set by the user; it is understood that different VR applications may set different second depths of field. In a third way, the second depth of field may be a system default depth of field. In a fourth way, the second depth of field may be the depth of field of the virtual image plane. In a fifth way, the second depth of field may be the depth of field of the main subject in the picture currently displayed by the VR display device. In a sixth way, the second depth of field is the depth of field where the user's gaze point is located. Several ways of determining the second depth of field are listed above, but the embodiments of the present application are not limited to them, and other ways of determining the second depth of field are also possible.
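Purely as an illustrative sketch (none of these identifiers appear in the application), the choice among the ways listed above could be expressed as a simple lookup; the context keys and the 2.0 m default are assumed values.

    from enum import Enum, auto

    class SecondDepthSource(Enum):
        USER_SPECIFIED = auto()
        GAZE_POINT = auto()
        SYSTEM_DEFAULT = auto()
        VIRTUAL_IMAGE_PLANE = auto()
        SCENE_PRESET = auto()
        MAIN_SUBJECT = auto()

    def second_depth_of_field(source, context):
        """Pick the second depth of field (in metres) according to the chosen
        strategy; `context` is a hypothetical dictionary of per-frame data."""
        selectors = {
            SecondDepthSource.USER_SPECIFIED: lambda: context["user_specified_depth"],
            SecondDepthSource.GAZE_POINT: lambda: context["gaze_depth"],
            SecondDepthSource.SYSTEM_DEFAULT: lambda: context.get("default_depth", 2.0),
            SecondDepthSource.VIRTUAL_IMAGE_PLANE: lambda: context["virtual_image_depth"],
            SecondDepthSource.SCENE_PRESET: lambda: context["scene_preset_depth"],
            SecondDepthSource.MAIN_SUBJECT: lambda: context["main_subject_depth"],
        }
        return selectors[source]()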
In one possible design, before the N frames of images are displayed, the method further includes: detecting at least one of the following: an operation by the user for starting an eye-protection mode; the user's viewing duration being greater than a first duration; or the number of times the user's eyes blink/squint within a period of time being greater than a first number of times. For example, when the display device has just started, it displays with uniform definition, i.e., the definition of all objects in the displayed image stream is the same. When at least one of the following is detected: an operation for starting the eye-protection mode, the user's viewing duration exceeding the first duration, or the number of blinks/squints of the user's eyes within a second duration exceeding the first number of times, the definition of the first object at the first depth of field in the displayed image stream begins to alternate between high and low. In this way, the fatigue of the human brain can be relieved while the human brain is still guaranteed to acquire sufficient details of the object. Moreover, the technical solution of the application (alternating the definition of the first object at the first depth of field in the image stream) is started only when the user needs it (for example, when the number of blinks/squints of the user's eyes increases), which saves the power consumption caused by image processing (for example, blurring the first object on the i-th frame image and the k-th frame image).
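A minimal sketch of such a trigger check is shown below; the threshold values (a 30-minute first duration and 20 blinks/squints as the first number of times) are illustrative assumptions, not values stated in the application.

    def should_enable_alternating_definition(user_action, watch_seconds, blink_squint_count,
                                             first_duration_s=1800, first_times=20):
        """Return True when the alternating-definition rendering should start."""
        return (user_action == "start_eye_protection_mode"
                or watch_seconds > first_duration_s
                or blink_squint_count > first_times)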
In one possible design, the definition of a second object at a third depth of field is the same across the N frames of images. That is, while the definition of the first object at the first depth of field alternates between high and low in the image stream, the definition of the second object at the third depth of field remains unchanged.
For example, the third depth of field is smaller than the first depth of field. That is, the definition of the first object at the larger depth of field alternates between high and low, while the definition of the second object at the smaller depth of field stays constant. Generally, the human eye sees nearby objects in more detail and more clearly, and sees distant objects in less detail and more blurred. Therefore, when distant objects are blurred and nearby objects are sharp on the image displayed by the display device, the virtual environment perceived by the user is closer to the real situation. Moreover, the definition of the distant object in the displayed image stream alternates between high and low rather than staying blurred, so the human brain is still guaranteed to acquire enough details of the distant object.
For another example, the distance between the third depth of field and the depth of field where the user's gaze point is located is smaller than the distance between the first depth of field and that depth of field. That is, the definition of the first object far from the user's gaze point alternates between high and low, while the definition of the second object near the user's gaze point is unchanged. It can be understood that in a real environment, when the human eye gazes at an object, that object is seen clearly, while other objects farther from it appear more blurred. Therefore, when objects far from the user's gaze point on the displayed image are blurred and objects near the gaze point are sharp, the virtual environment seen by the human eye better matches the real situation. In addition, the definition of the object far from the user's gaze point in the displayed image stream alternates between high and low rather than staying blurred, so the human brain is still guaranteed to acquire enough details of that object.
In one embodiment, the time interval between the display time of the j-th frame image and the display time of the i-th frame image is less than or equal to the visual dwell duration of the user; and/or the time interval between the display time of the k-th frame image and the display time of the j-th frame image is less than or equal to the visual dwell duration. It should be noted that the definition of the first object in the i-th frame image is low and the definition of the first object in the j-th frame image is high; to ensure that the human brain obtains the details of the first object, the i-th frame image and the j-th frame image may be displayed within the visual dwell duration of the user. In this way, the human brain fuses the i-th frame image and the j-th frame image, which ensures that it obtains enough details of the first object.
In another possible embodiment, j = i + n, where n is greater than or equal to 1, or n varies with the user's visual dwell duration and the image refresh rate of the display device; and/or k = j + m, where m is greater than or equal to 1, or m varies with the user's visual dwell duration and the image refresh rate of the display device.
Illustratively, j = i + 1 and k = j + 1, i.e., the first object is blurred in one frame of the image stream, sharp in the next frame, and blurred again in the frame after that. Alternatively, n and m may be determined according to the visual dwell duration of the user and the image refresh timing of the display device. Suppose the user's visual dwell duration is T and the image refresh interval (the time per frame, i.e., the reciprocal of the refresh frame rate) is P. Then T/P frames can be displayed within the duration T, so n is less than or equal to T/P and m is less than or equal to T/P. In this way, the i-th frame image and the j-th frame image are displayed within the visual dwell duration of the user, so the human brain can fuse the i-th frame image and the j-th frame image, which guarantees that it acquires enough details of the first object.
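As a purely numerical illustration (these values are assumptions, not taken from the application), with a visual dwell duration of T = 0.1 s and a 90 Hz display, i.e., a refresh interval of P = 1/90 s:

    n \le \frac{T}{P} = \frac{0.1\ \mathrm{s}}{(1/90)\ \mathrm{s}} = 9,
    \qquad m \le \frac{T}{P} = 9 .

Under these assumed numbers, at most nine frames may separate the i-th frame from the j-th frame (and the j-th frame from the k-th frame).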
In one possible design, the display device includes a first display screen for presenting images to the left eye of the user and a second display screen for presenting images to the right eye of the user, and the images displayed on the first display screen and the second display screen are synchronized. Synchronization can be understood as follows: when the first display screen displays the i-th frame image, the second display screen also displays the i-th frame image, so that the order of the images displayed by the two display screens remains consistent.
In a first mode, the first display screen and the second display screen each display the N frames of images. It is to be understood that the two display screens display the same image stream, in which the definition of the first object alternates between high and low. Since the image streams displayed by the two display screens are the same and the displayed images are synchronized (for example, the i-th frame is displayed at the same time, the j-th frame is displayed at the same time, and so on), when the first object on the first display screen is blurred, the first object on the second display screen is also blurred. That is, the definition of the first object follows the same trend in the image stream displayed by the first display screen and in the image stream displayed by the second display screen, such as "blurred - sharp - blurred - sharp".
In a second mode, the first display screen displays the N frames of images, and the second display screen displays another N frames of images whose image content is the same as that of the N frames of images. On the i-th frame image of the other N frames of images, the definition of the first object at the first depth of field is a fourth definition; on the j-th frame image of the other N frames of images, the definition of the first object at the first depth of field is a fifth definition; and on the k-th frame image of the other N frames of images, the definition of the first object at the first depth of field is a sixth definition, where the fourth definition is greater than the fifth definition and the fourth definition is less than the sixth definition. Since the images displayed by the two display screens are synchronized (for example, the i-th frame is displayed at the same time, the j-th frame is displayed at the same time, and so on), when the first object on the first display screen is blurred, the first object on the second display screen is sharp. That is, the definition of the first object may follow opposite trends in the image stream displayed by the first display screen and in the image stream displayed by the second display screen. For example, the first display screen displays an image stream in which the first object alternates "blurred - sharp - blurred - sharp", while the second display screen displays an image stream in which the first object alternates "sharp - blurred - sharp - blurred".
Illustratively, the fourth definition is greater than the first definition; and/or the fifth definition is less than the second definition; and/or the sixth definition is greater than the third definition.
For example, when the display screens corresponding to the left and right eyes synchronously display images (for example, ith frame images), the first object on the left-eye image is blurred, and the first object on the right-eye image is clear. For another example, when the display screens corresponding to the left and right eyes synchronously display images (for example, the jth frame image), the first object on the left-eye image is clear, and the first object on the right-eye image is blurred. In this way, the fatigue of the human brain can be relieved to a certain extent, and the first object on the image obtained by superposing the left eye image and the right eye image in the human brain is not too blurred, so that the loss of too many details of the object is avoided.
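For illustration only, the two modes above could be driven by complementary per-frame schedules such as the following sketch; the even/odd pattern is an assumption made for the example, not a limitation of the application.

    def stereo_blur_schedules(num_frames, opposite=True):
        """Build per-frame 'blur the first object?' flags for the two screens.
        With opposite=True the left screen runs blurred-sharp-blurred-... while
        the right screen runs sharp-blurred-sharp-... (the second mode); with
        opposite=False both screens share one schedule (the first mode)."""
        left = [index % 2 == 0 for index in range(num_frames)]
        right = [not flag for flag in left] if opposite else list(left)
        return left, right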
The display method provided by the embodiment of the application can be suitable for various application scenes. Such as a gaming application (e.g., VR gaming application), simulated driving (e.g., VR driving), simulated teaching (e.g., VR teaching), and so forth. In the following, VR games and VR driving are taken as examples.
In a first application scenario, the N frames of images are images related to a game, for example a VR game; e.g., the N frames of images are generated by a VR game application. The second depth of field includes: in the game scene, the depth of field where the game character corresponding to the user is located, or the depth of field where an upper body part (such as an arm) of that game character is located, or the depth of field where the game device (such as a gun) currently held by that game character is located; and/or the depth of field where the user's gaze point is located includes: in the game scene, the depth of field where the game character corresponding to the opposing side is located, or the depth of field where a building is located, or the depth of field where the body part of the game character corresponding to the user is located, or the depth of field where the game device currently held by that game character is located. Therefore, in the game scene, the definition of a first object whose depth of field is greater than the second depth of field, or of a first object far from the user's gaze point, can alternate between high and low, which not only relieves fatigue but also lets the human brain acquire enough details of the first object, thereby ensuring the game (e.g., VR game) experience.
In a second application scenario, the N frames of images are images related to vehicle driving; for example, the N frames of images are generated by a VR driving application. The second depth of field includes: in the vehicle driving scene, the depth of field where the vehicle currently driven by the user is located, or the depth of field where the steering wheel of that vehicle is located, or the depth of field where the windshield of that vehicle is located; and/or the depth of field where the user's gaze point is located includes: in the vehicle driving scene, the depth of field where vehicles driven by other users on the road are located (e.g., a vehicle driving in front of the user's vehicle), or the depth of field where roadside objects are located (e.g., trees, signs, and the like at the side of the road). Therefore, in the vehicle driving scene, the definition of a first object whose depth of field is greater than the second depth of field, or of a first object far from the user's gaze point, can alternate between high and low, which not only relieves fatigue but also lets the human brain acquire enough details of the first object, thereby ensuring the vehicle driving (e.g., VR driving) experience.
In a possible design, the i-th frame image is an image obtained by blurring the first object on the i-th original image; the j-th frame image is the j-th original image, or an image obtained by sharpening the first object on the j-th original image; the k-th frame image is an image obtained by blurring the first object on the k-th original image; and all objects on the i-th, j-th and k-th original images have the same definition. The case in which the j-th frame image is obtained by sharpening the first object on the j-th original image includes: the j-th frame image is an image obtained by fusing the i-th frame image with the j-th original image; or the j-th frame image is an image obtained by fusing, with the j-th original image, the image information lost during the blurring of the i-th frame image. In this way, the definition of the first object on the j-th frame image is high, which guarantees that the brain acquires enough details of the first object without affecting the user experience.
In one possible design, the case in which the j-th frame image is obtained by fusing the i-th frame image with the j-th original image includes: the image block in the region of the first object on the j-th frame image is obtained by fusing a first image block and a second image block, where the first image block is the image block in the region of the first object on the i-th frame image, and the second image block is the image block in the region of the first object on the j-th original image. In this way, when the i-th frame image and the j-th original image are fused, only the image blocks in the region of the first object on the two images need to be fused; the i-th frame image does not need to be fused with the whole j-th original image, which improves efficiency.
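The following Python sketch illustrates one plausible reading of these two steps, assuming NumPy images and a binary mask of the first object's region; the Gaussian blur, the fusion weight alpha, and the helper names are assumptions for illustration, not operations recited by the application.

    import cv2
    import numpy as np

    def blur_first_object(original, mask, sigma=3.0):
        """i-th / k-th frame: blur only the first object's region of the original image."""
        blurred = cv2.GaussianBlur(original, (0, 0), sigma)
        return np.where(mask[..., None], blurred, original)

    def fuse_jth_frame(ith_image, jth_original, mask, alpha=0.3):
        """j-th frame: compute a weighted fusion of the i-th frame and the j-th
        original, then keep the fused pixels only inside the first object's
        region; everything outside that region stays identical to the j-th original."""
        fused = cv2.addWeighted(ith_image, alpha, jth_original, 1.0 - alpha, 0)
        return np.where(mask[..., None], fused, jth_original)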
In a second aspect, an electronic device is further provided, including:
a processor, a memory, and one or more programs;
wherein the one or more programs are stored in the memory, the one or more programs comprising instructions which, when executed by the processor, cause the electronic device to perform the method steps as provided above in the first aspect.
In a third aspect, there is provided a computer readable storage medium for storing a computer program which, when run on a computer, causes the computer to perform the method as provided in the first aspect above.
In a fourth aspect, there is provided a computer program product comprising a computer program which, when run on a computer, causes the computer to perform the method as provided in the first aspect above.
A fifth aspect provides a graphical user interface on an electronic device, the electronic device having a display screen, a memory, and a processor configured to execute one or more computer programs stored in the memory, the graphical user interface comprising a graphical user interface displayed when the electronic device performs the method provided by the first aspect.
In a sixth aspect, an embodiment of the present application further provides a chip system, where the chip system is coupled to a memory in an electronic device, and is used to call a computer program stored in the memory and execute the technical solution of the first aspect of the embodiment of the present application, and "coupling" in the embodiment of the present application means that two components are directly or indirectly combined with each other.
For the beneficial effects of the second to sixth aspects, refer to the beneficial effects of the first aspect; details are not repeated here.
Drawings
FIG. 1 is a schematic view of a vergence accommodation conflict according to an embodiment of the present application;
fig. 2 is a schematic diagram of a VR system provided by an embodiment of the present application;
FIG. 3A is a schematic diagram illustrating the convergence of human eyes according to an embodiment of the present application;
FIG. 3B is a schematic diagram of a human eye structure provided by an embodiment of the present application;
FIG. 4 is a schematic representation of ciliary muscle accommodation of a human eye provided by an embodiment of the present application;
fig. 5 is a schematic diagram of VR glasses according to an embodiment of the present application;
fig. 6A to 6C are schematic diagrams of virtual image planes corresponding to images displayed by VR glasses according to an embodiment of the present application;
fig. 7A to 7B are schematic diagrams of a first application scenario provided in an embodiment of the present application;
fig. 8A to 8B are schematic diagrams of a second application scenario provided in an embodiment of the present application;
fig. 9 is a schematic structural diagram of a VR wearable device provided in an embodiment of the application;
fig. 10 is a schematic diagram illustrating that, in a VR virtual environment according to an embodiment of the present application, the region at the user's gaze point is sharp and regions away from the gaze point are blurred;
FIG. 11 is a schematic flow chart illustrating the principles of image generation provided by an embodiment of the present application;
FIGS. 12-13 are schematic diagrams of an image stream provided in an embodiment of the present application;
FIG. 14 is a schematic diagram illustrating a user's brain acquiring an image according to an embodiment of the present application;
FIG. 15 is a schematic diagram of an image stream generation process provided in an embodiment of the present application;
fig. 16 is a schematic diagram of a VR virtual environment with blurred distant objects and sharp close objects according to an embodiment of the present disclosure;
FIG. 17 is another schematic flow chart diagram illustrating the image generation principle provided by an embodiment of the present application;
FIGS. 18 to 19 are further schematic diagrams of image streams provided by an embodiment of the present application;
fig. 20 to 23 are schematic diagrams illustrating a left-eye display screen and a right-eye display screen on a display device according to an embodiment of the present application to display image streams;
fig. 24 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Hereinafter, some terms in the embodiments of the present application are explained so as to be easily understood by those skilled in the art.
(1) In the embodiments of the present application, "at least one" includes one or more, where "a plurality" means two or more. In addition, it should be understood that the terms "first", "second" and the like in the description of the present application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or order. For example, "first object" and "second object" are merely used to distinguish different objects and do not represent their importance or their order.
In the embodiment of the present application, "and/or" is an association relationship describing an association object, and indicates that three relationships may exist, for example, a and/or B may indicate: a exists alone, A and B exist simultaneously, and B exists alone. In addition, the character "/" herein generally indicates that the former and latter associated objects are in an "or" relationship.
(2) Virtual reality (VR) technology is a human-computer interaction technique built on computer and sensor technologies. VR technology combines computer graphics, computer simulation, sensor, display, and other technologies, and can create virtual environments. The virtual environment includes three-dimensional, lifelike images generated by a computer and played back dynamically in real time, which provide visual perception to the user. Besides the visual perception generated by computer graphics technology, there are also auditory, tactile, force and motion perceptions, and even smell and taste, which is also called multi-perception. In addition, the user's head rotation, eye movement, gestures, or other body actions can be detected, and the computer processes data adapted to the user's actions, responds to them in real time, and feeds the results back to the user's senses, thereby forming the virtual environment. For example, a user wearing a VR wearable device sees a VR game interface and can interact with it through gestures, a handle (controller), and other operations, as if the user were in the game.
(3) Augmented Reality (AR) technology refers to superimposing a computer-generated virtual object on a scene of the real world, thereby implementing an enhancement to the real world. That is, in the AR technology, a real-world scene needs to be collected, and then a virtual environment is added to the real world.
Therefore, VR technology differs from AR technology in that VR technology creates a complete virtual environment, and everything the user sees is a virtual object, whereas AR technology superimposes virtual objects on the real world, i.e., the view includes both real-world objects and virtual objects. For example, a user wears transparent glasses through which the surrounding real environment can be seen, and virtual objects can also be displayed on the glasses, so the user sees both real objects and virtual objects.
(4) Mixed reality (MR) technology builds a bridge of interactive feedback information among the virtual environment, the real world, and the user by introducing real-scene information into the virtual environment, thereby enhancing the realism of the user experience. Specifically, a real object is virtualized (for example, a camera scans the real object for three-dimensional reconstruction to generate a virtual object), and the virtualized real object is introduced into the virtual environment, so that the user can view the real object in the virtual environment.
It should be noted that the technical solution provided in the embodiments of the present application may be applied to VR scenes, AR scenes, or MR scenes, and of course to other scenarios besides VR, AR and MR, for example naked-eye 3D scenes (naked-eye 3D display screens, naked-eye 3D projection, etc.), cinemas (such as 3D movies), and VR software in electronic devices. In short, the method can be applied to any scenario that requires three-dimensional image display.
For convenience of description, the following description mainly takes VR scenes as an example.
For example, please refer to fig. 2, which is a schematic diagram of a VR system according to an embodiment of the present application. The VR system includes a VR wearable device 100 and an image generation device 200. The image generation apparatus 200 includes, among other things, a host (e.g., VR host) or a server (e.g., VR server). VR wearable device 100 is connected (wired or wireless) to a VR host or VR server. The VR host or VR server may be a device with greater computing power. For example, the VR host may be a device such as a mobile phone, a tablet computer, and a notebook computer, and the VR server may be a cloud server. The VR host or the VR server is responsible for generating images and the like, then the images are sent to the VR wearable device 100 to be displayed, and the user wears the VR wearable device 100 to see the images. For example, the VR-worn device 100 may be a Head Mounted Device (HMD), such as glasses, a helmet, and the like. Optionally, the VR system in fig. 2 may not include the image generating apparatus 200. For example, the VR-worn device 100 has image generation capability locally, and does not need to acquire an image from the image generation device 200 (VR host or VR server) for display. In summary, the VR-worn device 100 may display a three-dimensional image through which a virtual environment may be presented to a user due to the different depths of field (see description below) of different objects on the three-dimensional image.
(5) Depth of field (DOF)
A three-dimensional image includes objects at different image depths. For example, when the VR wearable device displays a three-dimensional image, what the user wearing the device sees is a three-dimensional scene (i.e., a virtual environment), and different objects in that scene appear at different distances from the user's eyes, producing a stereoscopic effect. Thus, the image depth can be understood as the distance between an object on the three-dimensional image and the user's eyes: the greater the image depth, the farther the object appears from the user's eyes; the smaller the image depth, the closer it appears. The image depth may also be referred to as "depth of field".
To clearly illustrate the virtual environment presented to the user by the VR-worn device, a brief description of the human-eye vision generation mechanism is provided below.
It can be understood that, in an actual scene, when a user views an object, human eyes can realize visual perception by acquiring a light signal in the actual scene and processing the light signal in a brain. The light signals in the actual scene may include reflected light from different objects and/or light signals directly emitted by the light source. Because the optical signal of the actual scene may carry the relevant information (such as size, position, color, etc.) of each object in the actual scene, the brain can acquire the information of the object in the actual scene by processing the optical signal, that is, acquire the visual sensation.
It should be noted that when the left eye and the right eye view the same object, their viewing angles are slightly different, so the left and right eyes actually see slightly different scenes. For example, the left eye acquires the optical signal of a two-dimensional image (hereinafter referred to as the left-eye image) on the plane that contains the focus of the human eye and is perpendicular to the left-eye line of sight. Similarly, the right eye acquires the optical signal of a two-dimensional image (hereinafter referred to as the right-eye image) on the plane that contains the focus of the human eye and is perpendicular to the right-eye line of sight. The left-eye image is slightly different from the right-eye image. The brain can process the optical signals of the left-eye image and the right-eye image to acquire information about different objects in the current scene.
In addition, the user can also obtain stereoscopic vision feeling by obtaining the depth of different objects in the actual scene. The stereoscopic perception may also be referred to as binocular stereo vision.
Illustratively, a user undergoes two processes, convergence (vergence) and zooming (accommodation), while viewing objects in an actual scene. Herein, convergence may also be referred to as vergence; the name is not limited in the present application.
1. Convergence
Convergence is understood to mean the adjustment of the line of sight of the human eye to point at an object. For example, as shown in fig. 3A, when observing an object in an actual scene (in fig. 3A, the object is a triangle, for example), the eyes of both eyes can be controlled to rotate towards the object (point to the object) by controlling muscles near the eyes of the human.
Vergence angle and vergence depth (vergence distance)
Continuing with FIG. 3A, the angle formed by the lines of sight of the left and right eyes when viewing an object is referred to as the vergence angle θ. The brain can judge the depth of the object, i.e., the convergence depth, from the convergence angle θ of the two eyes. It can be understood that the closer the observed object is to the eyes, the larger the convergence angle θ and the smaller the convergence depth; correspondingly, the farther the observed object is from the eyes, the smaller the convergence angle θ and the greater the convergence depth.
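This inverse relation can be made explicit with a standard geometric formula (not recited in the application): assuming an interpupillary distance d_IPD and an object lying on the median plane between the eyes, the convergence depth satisfies

    \tan\!\left(\frac{\theta}{2}\right) = \frac{d_{\mathrm{IPD}}/2}{d_{\mathrm{verg}}},
    \qquad\text{i.e.}\qquad
    d_{\mathrm{verg}} = \frac{d_{\mathrm{IPD}}}{2\tan(\theta/2)} ,

so a larger vergence angle θ indeed corresponds to a smaller convergence depth, consistent with the description above.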
2. Zooming (accommodation)
Zooming is understood to mean that the human eye adjusts to the correct focal distance when viewing an object. Typically, the brain controls the crystalline lens through the ciliary muscle to adjust to the correct focal distance; this is described with reference to fig. 3B and fig. 4. Fig. 3B is a schematic diagram of the composition of the human eye. As shown in fig. 3B, the human eye includes a crystalline lens, ciliary muscles, and a retina located at the fundus. The crystalline lens acts as a zoom lens and converges the light entering the eye, so that the incident light is focused on the retina at the fundus and objects in the actual scene form a clear image on the retina. The ciliary muscle adjusts the shape of the crystalline lens: by contracting or relaxing, it changes the diopter of the lens and thereby adjusts its focal distance, so that objects at different distances in the actual scene can be clearly imaged on the retina through the lens. As an example, fig. 4 illustrates the accommodation of the crystalline lens by the ciliary muscle when the human eye views objects at different distances. As shown in (a) of fig. 4, when the human eye observes a distant object (taking an object that is not a light source as an example), the light reflected from the object's surface is nearly parallel. In this case the ciliary muscle brings the crystalline lens into the state shown in (a) of fig. 4: the ciliary muscle relaxes, the lens becomes flat, and its diopter becomes small, so that the nearly parallel incident light passes through the lens and converges on the retina at the fundus. When the human eye observes a relatively close object (again taking an object that is not a light source as an example, in conjunction with (b) of fig. 4), the light reflected from the object's surface enters the eye along the optical path shown in (b) of fig. 4. In this case the ciliary muscle brings the crystalline lens into the state shown in (b) of fig. 4: the ciliary muscle contracts, the lens bulges, and its diopter becomes large, so that the incident light shown in (b) of fig. 4 passes through the lens and converges on the retina at the fundus. That is, the contraction or relaxation state of the ciliary muscle differs when the human eye views objects at different distances.
Zoom depth (Accommodation distance)
As mentioned above, when the human eye observes objects at different distances, the contraction or relaxation state of the ciliary muscle is different, so the brain can judge the depth of the object by the current contraction or relaxation state of the ciliary muscle when the human eye observes the object. This depth may be referred to as zoom depth.
Therefore, during convergence the human brain can determine the convergence depth from the convergence angle θ, and during zooming it can determine the zoom depth from the contraction or relaxation state of the ciliary muscle; both the convergence depth and the zoom depth represent the distance between the object and the user's eyes. In general, in a real scene, the convergence depth and the zoom depth are coordinated, or consistent, when the user observes an object.
If the object depths indicated by the convergence depth and the zoom depth are inconsistent, the brain cannot accurately judge the depth of the object, fatigue results, and the user experience is affected. Generally, this inconsistency between the depths indicated by the vergence depth and the zoom depth is referred to as vergence-accommodation conflict (VAC).
Currently, when most VR products (for example, those that do not use varifocal-plane or multi-focal-plane technology) present a virtual environment to a user, VAC occurs.
Illustratively, the electronic device that presents the virtual environment to the user through VR technology is a pair of VR glasses. Fig. 5 is a schematic diagram of VR glasses. As shown in fig. 5, two display screens may be provided in the VR glasses (for example, the display screen 501 and the display screen 502 may be separate display screens, or they may be different display areas of the same display screen), and each has a display function. Each display screen presents the corresponding content to one of the user's eyes (e.g., the left eye or the right eye) through a corresponding eyepiece: the display screen 501 corresponds to the eyepiece 503, and the display screen 502 corresponds to the eyepiece 504. For example, the display screen 501 may display the left-eye image corresponding to the virtual environment; the light of the left-eye image is converged at the left eye through the eyepiece 503, so that the left eye sees the left-eye image. Similarly, the display screen 502 may display the right-eye image corresponding to the virtual environment; the light of the right-eye image is converged at the right eye through the eyepiece 504, so that the right eye sees the right-eye image. In this way, by fusing the left-eye image and the right-eye image, the brain lets the user see the objects in the virtual environment corresponding to the two images.
It should be noted that, because of the converging effect of the eyepieces, what the human eye actually sees is the virtual image, on the virtual image plane 600 shown in fig. 6A, of the image displayed on the display screen. For example, the left-eye image seen by the left eye is the virtual image of the left-eye image on the virtual image plane 600; likewise, the right-eye image seen by the right eye is the virtual image of the right-eye image on the virtual image plane 600.
Exemplarily, in connection with fig. 6B, take as an example an image whose object is a triangle, displayed on the display screen. When the human eye observes the corresponding display screen, in order to see the image clearly, the ciliary muscles adjust the crystalline lenses of both eyes so that the image on the virtual image plane 600 is converged onto the retina through the lenses. Accordingly, the zoom depth is the distance from the virtual image plane 600 to the human eye (depth 1 in fig. 6B). However, the objects in the virtual environment presented to the user by the VR glasses are often not on the virtual image plane 600. For example, the observed triangle in the virtual environment of fig. 6B (drawn with a dashed line because it belongs to the virtual environment) is not on the virtual image plane 600. The user converges the lines of sight of both eyes on the triangle in the virtual environment by rotating the eyeballs; the convergence angle θ is marked in fig. 6B. Thus, the convergence depth is the depth of the observed object (i.e., the triangle) in the virtual environment, for example depth 2 in fig. 6B.
It can be seen that depth 1 and depth 2 are not the same here. As a result, the brain cannot accurately judge the depth of the observed object, which causes brain fatigue and affects the user experience.
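As a purely numerical illustration with assumed values (not taken from the application), suppose the virtual image plane lies 1.5 m from the eyes while the triangle is rendered 3 m away, with an interpupillary distance of 63 mm:

    \text{depth 1 (zoom depth)} = 1.5\ \mathrm{m}, \qquad
    \theta = 2\arctan\!\left(\frac{0.0315\ \mathrm{m}}{3\ \mathrm{m}}\right) \approx 1.2^{\circ}
    \;\Rightarrow\; \text{depth 2 (convergence depth)} = 3\ \mathrm{m} \neq \text{depth 1},

which is exactly the kind of mismatch that gives rise to VAC.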
The above takes a single observed object as an example; generally, the virtual environment includes a plurality of observed objects. As shown in fig. 6C, there are two observed objects: observed object 1 is a triangle (dashed line) and observed object 2 is a sphere (dashed line). For each observed object, the convergence depth differs from the zoom depth.
In addition, in current VR technology, all virtual objects (i.e., observed objects) on one image have the same definition. Continuing with fig. 6C, the observed objects are the triangle and the sphere. They are displayed on the same image with the same definition, and their corresponding virtual image planes 600 are at the same depth (i.e., depth 1), so, based on the same virtual image plane, the human brain considers that the zoom depths of the two observed objects should be the same. However, the convergence depths of the two observed objects are different: the convergence depth of the triangle is depth 2 and that of the sphere is depth 3, so, based on the different convergence depths, the human brain considers that the zoom depths of the two observed objects should be different. This conflicts with the judgment, based on the same virtual image plane, that the zoom depths are the same, so the human brain cannot accurately judge the object depths and fatigue occurs. Therefore, when the virtual environment contains a plurality of objects and different objects have the same definition, the user's fatigue is aggravated and the user experience is affected.
To address VAC in VR technology, one existing solution adjusts the position of the virtual image plane 600 so that the zoom depth and the convergence depth become the same, thereby resolving the brain fatigue caused by their inconsistency. Taking fig. 6B as an example, this technique adjusts the virtual image plane 600 to the depth of the observed object (e.g., the triangle) in the virtual environment, so that the zoom depth and the convergence depth are the same. Taking fig. 6C as an example, this technique adjusts the virtual image plane corresponding to the triangle to the depth of the triangle and the virtual image plane corresponding to the sphere to the depth of the sphere, so that the zoom depth and convergence depth corresponding to the triangle are the same and those corresponding to the sphere are also the same, thereby overcoming VAC. However, adjusting the virtual image plane position requires the support of additional optical hardware, such as a stepping motor. On the one hand, the extra optical hardware increases cost; on the other hand, it increases the volume of the VR glasses, so the technique is difficult to apply to light and compact VR wearable devices.
To solve the above technical problem, an embodiment of the present application provides a display method in which, in the images presented to the user by a VR display device (e.g., VR glasses), different objects have different degrees of definition: some objects are sharp and some are blurred. Taking fig. 6C as an example, in the display method provided by the embodiment of the present application, the sphere and the triangle may be given different definitions, for example the sphere is blurred and the triangle is sharp. Based on the same virtual image plane, the human brain would consider the zoom depths of the triangle and the sphere to be the same, namely depth 1. Since the triangle is sharp, the brain regards depth 1 as accurate for the triangle; since the sphere is blurred, the brain may regard depth 1 as inaccurate, or not yet adjusted, for the sphere, and may try to adjust the ciliary muscle to see the sphere clearly. Therefore, the human brain no longer judges the zoom depths of the triangle and the sphere to be the same, which no longer conflicts with its judgment, based on their different convergence depths, that their zoom depths should be different; brain fatigue is thereby relieved.
Further, in general, when the human eye views a near object it sees more detail and sees it more clearly, and when it views a distant object it sees less detail and sees it more blurred (this is also referred to as the pyramid principle of human-eye imaging). Continuing with fig. 6C, in the virtual environment shown by the VR glasses the distant sphere is blurred and the near triangle is sharp, so the virtual environment perceived by the user is more consistent with the real situation (near view sharp, distant view blurred).
Therefore, the display method provided by the embodiments of the present application can relieve the user's fatigue when viewing a virtual environment through VR glasses and improve the user experience, without relying on special optical hardware such as a stepping motor; it is low in cost and helps keep the device light and compact.
Fig. 7A and 7B are schematic diagrams of a first application scenario provided in an embodiment of the present application. This application scenario takes a user wearing VR glasses to play a VR game as an example.
As shown in fig. 7A, the VR glasses display an image 701. Image 701 may be an image generated by a VR game application and includes objects such as a gun, a container, and a tree. Assume the gun is at depth of field 1, the container at depth of field 2, and the tree at depth of field 3, with depth of field 3 > depth of field 2 > depth of field 1. Assuming the user's gaze point is on the gun (e.g., its sighting scope) in image 701, the tree is less sharp than the container in image 701, because the tree is at depth of field 3, which is farther from depth of field 1, while the container is at depth of field 2, which is closer to depth of field 1. Therefore, in the VR game scene seen by the user wearing the VR glasses, the tree is blurred and the container is sharp. That is, in this application scenario, different objects on the image have different definitions: objects far from the user's gaze point (e.g., the tree) are blurred, and objects closer to the user's gaze point (e.g., the container) are sharper.
Because the depths of field of the tree and the container are different, the human brain considers their convergence depths to be different; in addition, because the definitions of the tree and the container are different, the human brain considers their zoom depths to be different. The judgment of different zoom depths no longer conflicts with the judgment of different convergence depths, so brain fatigue can be relieved. Further, in a real environment, when the human eye gazes at an object, that object is seen relatively clearly and other objects far from it appear relatively blurred. In this application scenario, the tree far from the user's gaze point is blurred and the container near the gaze point is sharp, so the virtual environment seen by the human eye better matches the real situation.
Fig. 7A takes the display of one frame of image (i.e., image 701) as an example, but it should be noted that VR glasses generally display an image stream (e.g., an image stream generated by a VR game application) that includes multiple frames of images. While the multiple frames are displayed, the user's gaze point may change. The VR glasses can detect the user's gaze point in real time through an eye-tracking module, and when the gaze point changes, the definition of each object is determined based on the new gaze point: objects far from the new gaze point are blurred, and objects near the new gaze point are sharp.
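A minimal sketch of such a gaze-driven update is shown below; it reuses the blur_first_object helper sketched earlier, and the eye_tracker object, depth_map_fn and make_mask_fn are hypothetical placeholders for the device's eye-tracking and scene-depth interfaces, not APIs named by the application.

    def render_with_gaze_tracking(frames, eye_tracker, depth_map_fn, make_mask_fn):
        """Recompute which pixels to blur whenever the tracked gaze point moves,
        then alternate low/high definition for those pixels across frames."""
        last_gaze, mask = None, None
        for index, frame in enumerate(frames):
            gaze = eye_tracker.current_gaze()
            if gaze != last_gaze or mask is None:
                depth_map = depth_map_fn(frame)        # per-pixel depth of field
                mask = make_mask_fn(depth_map, gaze)   # pixels far from the gaze depth
                last_gaze = gaze
            yield blur_first_object(frame, mask) if index % 2 == 0 else frame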
For convenience of description, the process of displaying the image stream by the VR glasses is described below taking the example in which the user's gaze point remains on the gun in this application scenario.
One possible way is that, while the user's gaze point remains on the gun, the VR glasses display the image stream with the tree (an object far from the user's gaze point) blurred on every frame of the stream. That is, objects far from the user's gaze point stay blurred for as long as the gaze point remains on the gun. This can relieve fatigue, but details of such objects (e.g., the tree) are lost, so the user may miss some details and, for example, lose the game.
In order to relieve fatigue while still letting the human brain take in the details of an object (such as the tree), while the user's gaze point remains on the gun and the VR glasses display the image stream, the sharpness of objects far from the user's gaze point (such as the tree) may rise and fall across frames instead of staying blurred all the time.
Illustratively, as shown in fig. 7B, the VR glasses display an image stream including an i-th frame, a j-th frame, and a k-th frame. The tree (the object far from the user's gaze point) is blurred in the i-th frame, sharp in the j-th frame, and blurred again in the k-th frame. When the VR glasses display the i-th frame, the tree seen by the user is blurred, which relieves fatigue; when the VR glasses display the j-th frame, the tree seen by the user is sharp, and owing to the persistence-of-vision effect, the tree in the i-th frame and the tree in the j-th frame are fused in the human brain, so that although the tree is blurred in the i-th frame, its details are not lost in the human brain. In other words, in fig. 7B, when the VR glasses display the image stream, the sharpness of an object far from the user's gaze point (such as the tree) rises and falls, and the object does not have to remain blurred all the time.
Optionally, in fig. 7B, the sharpness of objects near the user's gaze point (such as the container) may remain unchanged. Illustratively, the sharpness of the tree varies from high to low but never exceeds the sharpness of the container. For example, in the j-th frame of fig. 7B both the tree and the container are sharp, but the tree's sharpness is lower than or equal to the container's. Alternatively, the sharpness of objects near the user's gaze point (such as the container) may also rise and fall, as long as, on the same image, an object near the gaze point is at least as sharp as an object far from it. For example, in the i-th frame of fig. 7B the container may also be blurred, but to a lesser degree than the tree, so the container remains sharper than the tree. In short, the farther an object is from the user's gaze point, the higher its degree of blurring and the lower its sharpness; the closer an object is to the user's gaze point, the lower its degree of blurring and the higher its sharpness.
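The frame-by-frame scheme above can be illustrated with a minimal sketch (Python is used purely for illustration; the period length and blur scale are assumptions, not values given in this application): within each period the blur level of objects far from the gaze point falls to zero and then rises again, while objects near the gaze point stay sharp on every frame.

```python
# Minimal sketch of a periodic sharpness schedule for objects far from the user's
# gaze point: within each period the blur level falls and then rises again
# ("blurred - sharp - blurred"), while near objects stay sharp on every frame.
# PERIOD and MAX_BLUR are illustrative assumptions.

PERIOD = 4     # frames per period (assumed)
MAX_BLUR = 3   # blur level applied to far objects at the start/end of a period (assumed)

def far_object_blur_level(frame_index: int) -> int:
    """Blur level for objects far from the gaze point on a given frame."""
    phase = frame_index % PERIOD
    mid = PERIOD // 2
    # Most blurred at the edges of the period, fully sharp at the middle frame.
    return MAX_BLUR * abs(phase - mid) // mid

def near_object_blur_level(frame_index: int) -> int:
    """Objects near the gaze point are never blurred more than far ones; here they stay sharp."""
    return 0

if __name__ == "__main__":
    for f in range(2 * PERIOD):
        print(f, far_object_blur_level(f), near_object_blur_level(f))
```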
It should be noted that fig. 7A and fig. 7B take the user's gaze point being on the gun as an example; in other words, the user's gaze point may be on the game equipment (such as a gun) currently held by the game character corresponding to the user. It is understood that the user's gaze point may also be on the game character of the opposing party, on a building, on the upper body of the game character corresponding to the user, and so on; these cases are not enumerated one by one in the present application.
Fig. 8A and 8B are exemplary diagrams of a second application scenario provided in an embodiment of the present application. This application scenario takes a user wearing VR glasses for VR driving as an example.
As shown in fig. 8A, the VR glasses display an image 801. Illustratively, the image 801 may be an image generated by a VR driving application. The image 801 includes vehicle information such as a steering wheel and a display, as well as a road, a tree and a preceding vehicle on the road, and the like. The depth of field 2 of the tree is greater than the depth of field 1 of the preceding vehicle, so the sharpness of the tree in the image 801 is lower than that of the preceding vehicle. That is, objects with a large depth of field are blurred and objects with a small depth of field are sharp. This application scenario differs from the one shown in fig. 7A and 7B: in fig. 7A and 7B the user's gaze point is the reference, so objects far from the gaze point are blurred and objects near it are sharp, whereas in fig. 8A and 8B distant objects are blurred and near objects are sharp, regardless of the user's gaze point.
With continued reference to fig. 8A, because the depths of field of the tree and the preceding vehicle are different, the human brain considers their convergence depths to be different; in addition, because the sharpness of the tree differs from that of the preceding vehicle, the human brain considers their zoom depths to be different, and these zoom depths match the convergence depths that the brain attributes to the tree and the preceding vehicle, so brain fatigue can be relieved. Further, in a real environment the human eye sees more and sharper detail when looking at near objects and less, more blurred detail when looking at distant objects. Therefore, in the virtual environment of fig. 8A, near objects are sharp and distant objects are blurred, so the virtual environment is more realistic.
It should be noted that fig. 8A takes the VR glasses displaying a single frame (image 801) as an example; it is understood that the VR glasses may display an image stream (e.g., an image stream generated by a VR driving application) that includes multiple frames of images. One possible implementation is that distant objects are blurred on every frame of the image stream. This approach relieves fatigue, but the details of distant objects are lost.
In order to relieve fatigue while still letting the human brain take in the details of distant objects, when the VR glasses display the image stream, the sharpness of distant objects (such as the tree) may rise and fall across frames instead of staying blurred all the time.
Illustratively, as shown in fig. 8B, the VR glasses display an image stream including an i-th frame, a j-th frame, and a k-th frame. The tree is blurred in the i-th frame, sharp in the j-th frame, and blurred again in the k-th frame. When the VR glasses display the i-th frame, the tree seen by the user is blurred, which relieves fatigue; when the VR glasses display the j-th frame, the tree seen by the user is sharp, and owing to the persistence-of-vision effect, the tree in the i-th frame and the tree in the j-th frame are fused in the human brain, so that although the tree is blurred in the i-th frame, its details are not lost in the human brain. That is, in fig. 8B, the sharpness of objects with a large depth of field in the image stream displayed by the VR glasses rises and falls, and such objects do not have to remain blurred all the time. In this way, fatigue is relieved and the details of distant objects are still taken in by the human brain, giving a better user experience.
Optionally, in fig. 8B, the sharpness of near objects (such as the preceding vehicle) may remain unchanged. Illustratively, the sharpness of the distant object rises and falls but never exceeds that of the near object. For example, in the j-th frame of fig. 8B both the tree and the preceding vehicle are sharp, but the tree's sharpness is lower than or equal to that of the preceding vehicle. Alternatively, the sharpness of near objects (such as the preceding vehicle) may also rise and fall, as long as, on the same image, a near object is at least as sharp as a distant object. For example, in the i-th frame of fig. 8B the preceding vehicle may also be blurred, but to a lesser degree than the tree, so the tree remains less sharp than the preceding vehicle. In short, the larger an object's depth of field, the higher its degree of blurring and the lower its sharpness; the smaller an object's depth of field, the lower its degree of blurring and the higher its sharpness.
It should be noted that the above embodiment blurs the tree and keeps the preceding vehicle sharp (the depth of field 2 of the tree being greater than the depth of field 1 of the preceding vehicle); in other words, taking depth of field 1 as the reference, objects at depths of field greater than depth of field 1 (such as depth of field 2) are blurred. In other embodiments, as shown in fig. 8A, in the driving scenario, depth of field 3 may be taken as the reference instead, and objects at depth of field 1 and depth of field 2, both of which are greater than depth of field 3, may be blurred. In fig. 8A, depth of field 3 is taken as the depth of field of the vehicle the user is currently driving as an example; it is understood that depth of field 3 may also be the depth of field of the steering wheel of that vehicle, or of its windshield, and so on.
In the first application scenario (fig. 7A and 7B), taking the VR game application as an example, the user's gaze point is the reference: objects farther from the gaze point are blurred, objects closer to it are sharp, and when the image stream is displayed the sharpness of objects far from the gaze point rises and falls. In the second application scenario (fig. 8A and 8B), taking the VR driving application as an example, distant objects are blurred, near objects are sharp, and when the image stream is displayed the sharpness of distant objects rises and falls. For convenience of description, the mode of the first application scenario, in which objects far from the user's gaze point are blurred and objects near it are sharp, is referred to as the first eye-protection mode, and the mode of the second application scenario, in which distant objects are blurred and near objects are sharp, is referred to as the second eye-protection mode.
The above are two application scenarios listed in the present application, mainly taking the VR game and VR driving as examples. It should be noted that the two application scenarios may be combined: for example, the first application scenario (e.g., the VR game application) may also apply the scheme of the second application scenario, blurring distant objects and keeping near objects sharp, and the second application scenario (e.g., the VR driving application) may also apply the scheme of the first application scenario, blurring objects farther from the user's gaze point and keeping objects closer to the gaze point sharp, with the gaze point as the reference. In addition, the technical solution provided by the embodiments of the application is applicable to other scenarios, i.e., any scenario that needs to present a virtual environment to a user, such as VR car viewing, VR house viewing, VR chatting, VR teaching, VR cinema, and the like.
The apparatus associated with the present application is described below.
For example, please refer to fig. 9, which shows a schematic structural diagram of a VR wearable device (such as VR glasses) provided in an embodiment of the present application. As shown in fig. 9, the VR-worn device 100 may include a processor 110, a memory 120, a sensor module 130 (which may be used to obtain the gesture of the user), a microphone 140, a key 150, an input/output interface 160, a communication module 170, a camera 180, a battery 190, an optical display module 1100, an eye tracking module 1200, and the like.
It is to be understood that the illustrated structure of the embodiment of the present application does not constitute a specific limitation to the VR-worn device 100. In other embodiments of the present application, the VR-worn device 100 can include more or fewer components than shown, or combine certain components, or split certain components, or a different arrangement of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110, which is generally used to control the overall operation of the VR-worn device 100, may include one or more processing units, such as: the processor 110 may include an Application Processor (AP), a modem processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a Video Processing Unit (VPU) controller, a memory, a video codec, a Digital Signal Processor (DSP), a baseband processor, and/or a neural-Network Processing Unit (NPU), etc. The different processing units may be separate devices or may be integrated into one or more processors.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that have just been used or recycled by the processor 110. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Avoiding repeated accesses reduces the latency of the processor 110, thereby increasing the efficiency of the system.
In some embodiments of the present application, the processor 110 may be used to control the optical power of the VR wearable device 100. For example, the processor 110 may be configured to control the optical power of the optical display module 1100, thereby implementing the function of adjusting the optical power of the wearable device 100. For example, the processor 110 may adjust the optical power of the optical display module 1100 by adjusting the relative positions of the optical devices (e.g., lenses) in the optical display module 1100, so that the position of the corresponding virtual image plane can be adjusted when the optical display module 1100 forms an image for the human eye, thereby achieving the effect of controlling the optical power of the wearable device 100.
In some embodiments, processor 110 may include one or more interfaces. The interface may include an integrated circuit (I2C) interface, a universal asynchronous receiver/transmitter (UART) interface, a Mobile Industry Processor Interface (MIPI), a general-purpose input/output (GPIO) interface, a Subscriber Identity Module (SIM) interface, and/or a Universal Serial Bus (USB) interface, a Serial Peripheral Interface (SPI) interface, and/or the like.
In some embodiments, the processor 110 may blur objects at different depths to provide different degrees of sharpness for objects at different depths.
The I2C interface is a bidirectional synchronous serial bus including a serial data line (SDA) and a Serial Clock Line (SCL). In some embodiments, the processor 110 may include multiple sets of I2C buses.
The UART interface is a universal serial data bus used for asynchronous communications. The bus may be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication. In some embodiments, a UART interface is generally used to connect the processor 110 and the communication module 170. For example: the processor 110 communicates with a bluetooth module in the communication module 170 through a UART interface to implement a bluetooth function.
The MIPI interface may be used to connect the processor 110 with peripheral devices such as a display screen and a camera 180 in the optical display module 1100.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal or as a data signal. In some embodiments, a GPIO interface may be used to connect the processor 110 with the camera 180, the display screen in the optical display module 1100, the communication module 170, the sensor module 130, the microphone 140, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and the like. Optionally, the camera 180 may collect an image including a real object, the processor 110 may fuse the image collected by the camera with a virtual object, and the fused image is displayed by the optical display module 1100. Optionally, the camera 180 may also capture images including the human eye, and the processor 110 performs eye tracking based on these images.
The USB interface is an interface which accords with USB standard specifications, and specifically can be a Mini USB interface, a Micro USB interface, a USB Type C interface and the like. The USB interface can be used to connect the charger to charge the VR wearable device 100, and also can be used to transmit data between the VR wearable device 100 and the peripheral device. And the method can also be used for connecting a headset and playing audio through the headset. The interface can also be used for connecting other electronic equipment, such as mobile phones and the like. The USB interface may be USB3.0, and is configured to be compatible with Display Port (DP) signaling, and may transmit video and audio high-speed data.
It should be understood that the connection relationship between the modules illustrated in the embodiment of the present application is only an illustrative example, and does not constitute a structural limitation on the wearable device 100. In other embodiments of the present application, the wearable device 100 may also adopt different interface connection manners or a combination of multiple interface connection manners in the above embodiments.
Additionally, the VR-worn device 100 may include wireless communication functionality, for example, the VR-worn device 100 may receive images from other electronic devices (e.g., VR host) for display. The communication module 170 may include a wireless communication module and a mobile communication module. The wireless communication function may be implemented by an antenna (not shown), a mobile communication module (not shown), a modem processor (not shown), a baseband processor (not shown), and the like. The antenna is used for transmitting and receiving electromagnetic wave signals. Multiple antennas may be included in VR-worn device 100, each antenna operable to cover a single or multiple communication bands. Different antennas can also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed as a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module may provide a solution for wireless communication applied to the VR wearable device 100, including second generation (2G)/third generation (3G)/fourth generation (4G)/fifth generation (5G) networks. The mobile communication module may include at least one filter, a switch, a power amplifier, a Low Noise Amplifier (LNA), and the like. The mobile communication module can receive electromagnetic waves by the antenna, filter and amplify the received electromagnetic waves, and transmit the electromagnetic waves to the modulation and demodulation processor for demodulation. The mobile communication module can also amplify the signal modulated by the modulation and demodulation processor and convert the signal into electromagnetic wave to radiate the electromagnetic wave through the antenna. In some embodiments, at least part of the functional modules of the mobile communication module may be provided in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module may be provided in the same device as at least some of the modules of the processor 110.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating a low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then passes the demodulated low frequency baseband signal to a baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs sound signals through an audio device (not limited to a speaker, etc.) or displays images or videos through a display screen in the optical display module 1100. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be independent of the processor 110, and may be disposed in the same device as the mobile communication module or other functional modules.
The wireless communication module may provide solutions for wireless communication applied on the VR wearable device 100, including Wireless Local Area Networks (WLAN) (e.g., wireless fidelity (Wi-Fi) networks), Bluetooth (BT), Global Navigation Satellite System (GNSS), Frequency Modulation (FM), Near Field Communication (NFC), Infrared (IR), and the like. The wireless communication module may be one or more devices integrating at least one communication processing module. The wireless communication module receives electromagnetic waves via the antenna, performs frequency modulation and filtering on electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module may also receive a signal to be transmitted from the processor 110, frequency modulate it, amplify it, and convert it into electromagnetic waves for radiation via the antenna.
In some embodiments, the antenna and the mobile communication module of the VR wearable device 100 are coupled such that the VR wearable device 100 can communicate with networks and other devices through wireless communication techniques. The wireless communication technology may include Global System for Mobile communications (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Time-Division Code Division Multiple Access (TD-SCDMA), Long Term Evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies, etc. The GNSS may include the Global Positioning System (GPS), the Global Navigation Satellite System (GLONASS), the BeiDou Navigation Satellite System (BDS), the Quasi-Zenith Satellite System (QZSS), and/or the Satellite Based Augmentation System (SBAS).
The VR wearable device 100 implements a display function through the GPU, the optical display module 1100, and the application processor. The GPU is a microprocessor for image processing and is connected to the optical display module 1100 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or alter display information.
Memory 120 may be used to store computer-executable program code, which includes instructions. The processor 110 executes various functional applications of the VR-worn device 100 and data processing by executing instructions stored in the memory 120. The memory 120 may include a program storage area and a data storage area. The storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like. The storage data area may store data (such as audio data, a phone book, etc.) created during use of the wearable device 100, and the like. In addition, the memory 120 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, a Universal Flash Storage (UFS), and the like.
VR wearable device 100 may implement audio functionality via an audio module, speaker, microphone 140, headphone interface, and application processor, among other things. Such as music playing, recording, etc. The audio module is used for converting digital audio information into analog audio signals to be output and converting analog audio input into digital audio signals. The audio module may also be used to encode and decode audio signals. In some embodiments, the audio module may be disposed in the processor 110, or a portion of the functional modules of the audio module may be disposed in the processor 110. Loudspeakers, also known as "horns," are used to convert electrical audio signals into sound signals. The wearable device 100 can listen to music through a speaker or listen to a hands-free call.
The microphone 140, also known as a "microphone", is used to convert sound signals into electrical signals. The VR-worn device 100 can be provided with at least one microphone 140. In other embodiments, the VR-worn device 100 may be provided with two microphones 140 to achieve a noise reduction function in addition to collecting sound signals. In other embodiments, three, four, or more microphones 140 may be further disposed on the VR wearable device 100 to collect sound signals, reduce noise, identify sound sources, perform directional recording, and so on.
The earphone interface is used for connecting a wired earphone. The earphone interface may be a USB interface, or may be a 3.5 millimeter (mm) Open Mobile Terminal Platform (OMTP) standard interface or a Cellular Telecommunications Industry Association (CTIA) standard interface.
In some embodiments, the VR wearable device 100 may include one or more keys 150 that control the VR wearable device and provide the user with access to functions on the VR wearable device 100. The keys 150 may be in the form of buttons, switches, dials, and touch or near-touch sensing devices (e.g., touch sensors). Specifically, for example, the user may turn on the optical display module 1100 of the VR wearable device 100 by pressing a button. The keys 150 include a power key, a volume key, and the like. The keys 150 may be mechanical keys or touch keys. The wearable device 100 may receive key inputs and generate key signal inputs related to user settings and function control of the wearable device 100.
In some embodiments, the VR wearable device 100 may include an input-output interface 160, and the input-output interface 160 may connect other devices to the VR wearable device 100 through suitable components. The components may include, for example, audio/video jacks, data connectors, and the like.
The optical display module 1100 is used for presenting images to a user under the control of the processor 110. The optical display module 1100 may convert the display of real pixel images into a near-eye projected virtual-image display through one or more optical devices such as reflectors, transmissive lenses, or optical waveguides, so as to provide a virtual interactive experience or an interactive experience combining virtuality and reality. For example, the optical display module 1100 receives image data information sent by the processor 110 and presents the corresponding image to the user.
In some embodiments, the VR wearable device 100 may further include an eye tracking module 1200, which is configured to track the movement of the human eye and thereby determine the gaze point of the human eye. For example, the pupil position can be located by image processing, the coordinates of the pupil center obtained, and the gaze point of the human eye calculated from them.
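As a minimal sketch of one way such an eye tracking module could locate the pupil center (the application does not prescribe a particular algorithm; OpenCV, the threshold value, and the omitted mapping from pupil center to gaze point are all assumptions here):

```python
# Minimal sketch of pupil-center estimation from a grayscale eye image.
# OpenCV and the threshold value are assumptions; mapping the pupil center to an
# actual gaze point would additionally require a per-user calibration not shown here.
import cv2
import numpy as np

def pupil_center(eye_image_gray: np.ndarray):
    """Return the (x, y) pupil center in image coordinates, or None if not found."""
    # The pupil is the darkest region of the eye image; threshold it out.
    _, mask = cv2.threshold(eye_image_gray, 40, 255, cv2.THRESH_BINARY_INV)
    mask = cv2.medianBlur(mask, 5)           # suppress isolated dark pixels
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] == 0:
        return None
    return m["m10"] / m["m00"], m["m01"] / m["m00"]   # centroid of the dark blob
```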
For convenience of understanding, the display method according to the embodiment of the present application will be described below by taking the VR wearable device shown in fig. 9 as VR glasses as an example.
Example one
The VR glasses display an image to the user, and different objects on the image have different sharpness. For example, an object far from the user's gaze point on the image (referred to as a first object for convenience of description) is blurred, and an object near the user's gaze point (referred to as a second object for convenience of description) is sharp. This first embodiment can be applied to the application scenarios shown in fig. 7A and 7B above.
For example, referring to fig. 10, assume that the user's gaze point is on a house, and that the distance L1 between the depth of field of the mountain and the depth of field of the house is smaller than the distance L2 between the depth of field of the tree and the depth of field of the house; the sharpness of the mountain is therefore higher than that of the tree. That is, when the user's gaze point is on the house, the mountain, being closer to the house, is sharp, and the tree, being farther from the house, is blurred. Sharpness here may be measured by image resolution, scanning resolution, and the like; in the following embodiments, sharpness is described taking the resolution of the displayed image as an example.
Exemplarily, fig. 11 is a schematic flowchart of an image generation method according to a first embodiment. As shown in fig. 11, the process includes:
S1101, determining the user's gaze point.
The process for determining the user's gaze point has been described above, and is not repeated herein.
S1102, determining depths of different objects to be drawn.
The depth of each object may be stored automatically when the rendering pipeline runs, or may be calculated by means of binocular vision; the embodiment of the application does not limit this.
S1103, determining the degree of blurring of an object according to the distance between the user's gaze point and the object.
Illustratively, when the distance between the depth of field of an object and the depth of field of the user's gaze point is smaller than a preset distance, the object may be left unblurred; when that distance is greater than the preset distance, the object is blurred. The specific value of the preset distance is not limited in the embodiments of the present application.
Alternatively, the degree of blurring of different objects on the image may increase with the distance between the object's depth of field and the depth of field of the user's gaze point. For example, assume the user's gaze point is at depth of field 1, the distance between the depth of field of object 1 and depth of field 1 is distance 1, the distance for object 2 is distance 2, and the distance for object 3 is distance 3. If distance 1 < distance 2 < distance 3, then the degree of blurring of object 1 < that of object 2 < that of object 3, so the sharpness of object 1 > that of object 2 > that of object 3. That is, in the virtual environment seen by the user's eyes, objects farther from the user's gaze point are more blurred, and objects closer to the gaze point are sharper.
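A minimal sketch of this step follows; the preset distance, scale factor and cap are illustrative assumptions rather than values specified in this application.

```python
# Minimal sketch of S1103: derive a blur level for each object from the distance
# between its depth of field and the depth of field of the user's gaze point.
# PRESET_DISTANCE, BLUR_PER_METRE and MAX_BLUR are illustrative assumptions.

PRESET_DISTANCE = 0.5   # metres; objects within this distance of the gaze depth stay sharp
BLUR_PER_METRE = 2.0    # how quickly the blur level grows with distance (assumed)
MAX_BLUR = 8            # upper bound on the blur level (assumed)

def blur_level(object_depth: float, gaze_depth: float) -> int:
    """Larger distance from the gaze-point depth -> larger blur level -> lower sharpness."""
    distance = abs(object_depth - gaze_depth)
    if distance <= PRESET_DISTANCE:
        return 0                                   # no blurring within the preset distance
    return min(MAX_BLUR, int(round((distance - PRESET_DISTANCE) * BLUR_PER_METRE)))
```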
S1104, generating an image and blurring objects on the image.
In one possible implementation, the VR device may first generate an image on which all objects have the same sharpness, and then use an image blurring algorithm to blur different objects on the image to different degrees. The image blurring algorithm includes at least one of Gaussian blur, image down-sampling, a defocus blur algorithm based on deep learning, a level of detail (LOD) data structure, and the like, which is not described in detail herein.
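A minimal sketch of S1104 using Gaussian blur, one of the algorithms named above; OpenCV is assumed here, and the per-object masks and blur levels are illustrative inputs rather than anything prescribed by this application.

```python
# Minimal sketch of S1104: blur each object's region according to its blur level,
# using Gaussian blur (one of the algorithms named above). OpenCV is an assumption;
# `object_masks` maps object names to binary masks of the regions they occupy in the
# rendered frame, and `levels` maps them to the blur levels from the previous step.
import cv2
import numpy as np

def blur_objects(frame: np.ndarray, object_masks: dict, levels: dict) -> np.ndarray:
    """Return a copy of `frame` with each object's region blurred to its level."""
    out = frame.copy()
    for name, mask in object_masks.items():
        level = levels.get(name, 0)
        if level <= 0:
            continue                               # objects near the gaze point stay sharp
        k = 2 * level + 1                          # odd Gaussian kernel size
        blurred = cv2.GaussianBlur(frame, (k, k), 0)
        out[mask > 0] = blurred[mask > 0]          # replace only this object's pixels
    return out
```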
It is understood that when the VR glasses display an image, the user's gaze point may change. When the user's fixation point changes, the definition of the object on the image is adjusted accordingly. For example, continuing with fig. 10 as an example, when the user's gaze point changes from a house to a tree, the definition of each object is re-determined based on the new gaze point (i.e., the tree), i.e., objects closer to the new gaze point are clear, and objects farther from the new gaze point are blurred.
Fig. 10 illustrates an example of VR glasses displaying one frame of image. It is understood that, in general, VR glasses display an image stream including a plurality of frames of images.
In the first mode, the VR image generation device (such as the image generation device 200 in fig. 2) generates the image stream by default using the flow shown in fig. 11 (i.e., objects far from the user's gaze point are blurred on every frame). For example, taking a mobile phone as the VR image generation device, when the phone detects at least one of the VR glasses being connected, the VR glasses being turned on, or a VR application (such as a VR game) being started, it starts generating the image stream using the flow shown in fig. 11 by default, and the image stream is then displayed through the VR glasses.
In the second mode, the VR image generation device generates images by default in the existing way (i.e., all objects on the image have the same sharpness), and switches to generating images using the flow shown in fig. 11 when an indication for activating the first eye-protection mode is detected. For the first eye-protection mode, please refer to the description above. That is, all objects on the images that the VR glasses initially display have the same sharpness, and after the indication for activating the first eye-protection mode is detected, objects far from the user's gaze point on the displayed images are blurred. Illustratively, referring to fig. 12, before the (i+1)-th frame the sharpness of all objects on the image is the same; when the VR glasses detect the indication to activate the first eye-protection mode, objects far from the user's gaze point (e.g., the tree) are blurred on the (i+1)-th to N-th frames. The indication for activating the first eye-protection mode includes, but is not limited to, at least one of the following: detecting a user operation that triggers the first eye-protection mode (for example, the VR application includes a button for starting the first eye-protection mode, and the operation may be clicking that button); the user's viewing time exceeding a preset duration; or the number of times the user blinks or squints within a preset time period exceeding a preset count. For example, if the user blinks or squints more than the preset count within the preset time period, the user may be blinking or squinting repeatedly to relieve eye strain; therefore, when frequent blinking or squinting is detected, the first eye-protection mode is activated to relieve the user's fatigue.
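A minimal sketch of this decision logic follows; all thresholds are assumptions, since the application only requires that at least one of the listed conditions be met.

```python
# Minimal sketch of the decision to activate the first eye-protection mode,
# based on the trigger conditions listed above. The thresholds are assumptions.

WATCH_TIME_THRESHOLD_S = 30 * 60   # preset viewing duration (assumed: 30 minutes)
BLINK_COUNT_THRESHOLD = 20         # preset blink/squint count within the window (assumed)

def should_enable_first_eye_protection(user_tapped_button: bool,
                                       watch_time_s: float,
                                       blinks_in_window: int) -> bool:
    """Activate the mode if at least one trigger condition is met."""
    return (user_tapped_button
            or watch_time_s > WATCH_TIME_THRESHOLD_S
            or blinks_in_window > BLINK_COUNT_THRESHOLD)
```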
After the first eye-protection mode is started, objects far from the user's gaze point (such as the tree) on the image are blurred, which relieves brain fatigue and improves user experience. Optionally, before the first eye-protection mode is started, prompt information may be output to ask the user whether to switch to the first eye-protection mode, and the switch is made after an instruction confirming the switch is detected from the user.
In both the first and the second mode, objects far from the user's gaze point in the image stream generated by the VR image generation device are blurred, which relieves brain fatigue but easily loses the details of those objects. For example, in the first mode an object far from the user's gaze point is always blurred, so the user cannot obtain its details; in the second mode, after the indication to activate the first eye-protection mode is detected, an object far from the user's gaze point is always blurred, and the user likewise cannot obtain its details.
In order to both relieve fatigue and obtain the details of objects far from the user's gaze point, in the first or second mode the sharpness of such objects in the image stream generated by the VR image generation device may rise and fall (see fig. 15 for the specific generation process). For example, the image stream includes a plurality of periods, each period includes a plurality of frames, and within each period the sharpness of objects far from the user's gaze point first increases and then decreases.
Illustratively, referring to fig. 13, within one period the sharpness of the tree (an object far from the user's gaze point) in the i-th frame is lower than in the j-th frame, and its sharpness in the j-th frame is higher than in the k-th frame. That is, within one period the sharpness of the object far from the user's gaze point (i.e., the tree) follows a "blurred - sharp - blurred" trend. In the next period, the sharpness of the tree in the k-th frame is lower than in the p-th frame, and its sharpness in the p-th frame is higher than in the q-th frame; that is, in the next period the sharpness of the object far from the user's gaze point again follows a "blurred - sharp - blurred" trend. The two periods in fig. 13 may be the same or different, which is not limited. This sharpness variation relieves brain fatigue on the one hand, and on the other hand prevents the user from losing the image details of objects far from the gaze point.
Illustratively, continuing with fig. 13, i, j, k, p and q satisfy: j = i + n, k = j + m, p = k + w, q = p + s; that is, the j-th frame is n frames after the i-th frame, the k-th frame is m frames after the j-th frame, the p-th frame is w frames after the k-th frame, and the q-th frame is s frames after the p-th frame, where n, m, w and s are integers greater than or equal to 1.
An example is that j = i +1, k = j +1, p = k +1, q = p +1, i.e., the j frame image is the next frame of the i frame image, the k frame image is the next frame of the j frame image, the p frame is the next frame of the k frame, and the q frame is the next frame of the p frame.
As another example, n, m, w and s may be determined according to the user's visual dwell time and the image refresh interval. Suppose the user's visual dwell time is T and the interval at which the image is refreshed (the display time of one frame) is P. The visual dwell time T may be any value in the range from 0.1 s to 3 s, or may be set by the user, which is not limited in the embodiment of the present application. Then T/P frames can be displayed within time T, so n ≤ T/P, m ≤ T/P, w ≤ T/P and s ≤ T/P. In other words, the time difference between the display time of the j-th frame and that of the i-th frame is less than the user's visual dwell time, so the image information of the j-th frame and the i-th frame can be superimposed in the human brain; because the object far from the user's gaze point is blurred in the i-th frame but sharp in the j-th frame, superimposing the two frames ensures that the human brain retains sufficient detail of that object. Similarly, the time difference between the display times of the j-th and k-th frames is less than the user's visual dwell time, and this is not repeated here. For ease of understanding, a specific example is given below, taking a lighthouse as the object far from the user's gaze point. Referring to fig. 14, the lighthouse (the object far from the user's gaze point) is blurred in the i-th frame and sharp in the j-th frame; when the user's eyes watch the i-th and j-th frames, owing to the visual dwell effect, the sharpness of the lighthouse formed in the human brain is higher than that of the lighthouse in the i-th frame, so no detail is lost even though the lighthouse is blurred in the i-th frame.
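As a purely illustrative numeric example of the constraint above (the figures are assumptions, not values given in this application): if the visual dwell time T is taken as 0.1 s and the refresh interval P is about 11.1 ms (a 90 Hz display), then T/P ≈ 9, so n, m, w and s may each be chosen to be at most 9; the j-th frame then appears within the visual dwell time of the i-th frame, and the two frames are superimposed in the human brain.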
Optionally, continuing with the example of fig. 13, the definition of the object near the user's gaze point in the image stream may be unchanged. For example, in fig. 13, the sharpness of an object (i.e., a mountain) near the user's gaze point remains unchanged.
Illustratively, the generation process of the image stream shown in fig. 13 is described below.
The GPU outputs N frames of images (for convenience of description, the images output by the GPU are referred to as original images), and all objects on these N original frames have the same sharpness. From the N original frames, N new frames are generated in which the sharpness of the first object (i.e., the object far from the user's gaze point, such as the tree in fig. 13) alternates. The process of generating the i-th, j-th, k-th and p-th new frames from the i-th, j-th, k-th and p-th original frames (for example, j = i + 1, k = j + 1, p = k + 1) is described below.
In short, the i-th new frame is obtained by blurring the first object on the i-th original frame; the j-th new frame is the j-th original frame itself, or an image obtained by sharpening the first object on the j-th original frame; the k-th new frame is obtained by blurring the first object on the k-th original frame; and the p-th new frame is the p-th original frame itself, or an image obtained by sharpening the first object on the p-th original frame.
For example, as shown in fig. 15, the first object on the i-th original frame output by the GPU is blurred to obtain the i-th new frame. To avoid losing the details of the first object, the j-th new frame may contain more detail of the first object. In one possible way, the j-th new frame is obtained by superimposing (or fusing) the i-th new frame with the j-th original frame output by the GPU, so that the sharpness of the first object on the j-th new frame is higher than on the j-th original frame and on the i-th new frame. Another possible way is to superimpose the image information lost when the i-th original frame was blurred onto the j-th original frame output by the GPU to obtain the j-th new frame. If j = i + 1, the image information lost by blurring the previous frame is compensated into the next frame. Optionally, to improve efficiency, when the i-th new frame and the j-th original frame are superimposed, only the image block of the region where the first object is located on the i-th new frame may be superimposed on the image block of the region where the first object is located on the j-th original frame.
Similarly, with continued reference to fig. 15, the first object on the k-th frame output by the GPU is blurred to obtain the k-th new frame. To avoid losing the details of the first object, the k-th new frame is superimposed with the p-th original frame output by the GPU to obtain the p-th new frame, or the image information lost when the k-th original frame was blurred is superimposed onto the p-th original frame to obtain the p-th new frame. Optionally, to improve efficiency, when the k-th new frame and the p-th original frame are superimposed, only the image block of the region where the first object is located on the k-th new frame may be superimposed on the image block of the region where the first object is located on the p-th original frame.
Continuing with the example of fig. 15, and taking as an example the superimposition of the blurred region (i.e., the region where the first object is located) on the i-th new frame with the region where the first object is located on the j-th original frame, it should be understood that the position of that region may change between the two frames. In that case, the correspondence between pixels may be determined from the optical flow between the two frames, the corresponding region on the j-th original frame may be determined from the blurred region on the i-th new frame and this correspondence, and the blurred region on the i-th new frame and the determined region on the j-th original frame may then be superimposed to obtain the j-th new frame.
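A minimal sketch of this superimposition follows; OpenCV's Farneback dense optical flow is used here as an assumed way to obtain the pixel correspondence, and a simple weighted fusion stands in for the superposition, neither of which is prescribed by this application.

```python
# Minimal sketch: superimpose the blurred region of the i-th new frame onto the
# corresponding region of the j-th original frame, using dense optical flow to find
# where that region has moved between the two frames. The Farneback parameters and
# the blend weight are assumptions.
import cv2
import numpy as np

def superimpose_region(new_i: np.ndarray, orig_j: np.ndarray,
                       region_mask_i: np.ndarray, weight: float = 0.5) -> np.ndarray:
    """Return the j-th new frame: orig_j with the first object's region fused with new_i."""
    gray_i = cv2.cvtColor(new_i, cv2.COLOR_BGR2GRAY)
    gray_j = cv2.cvtColor(orig_j, cv2.COLOR_BGR2GRAY)
    flow = cv2.calcOpticalFlowFarneback(gray_i, gray_j, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    h, w = region_mask_i.shape
    ys, xs = np.nonzero(region_mask_i)
    # Map each pixel of the region from frame i to frame j using the flow vectors.
    xs_j = np.clip((xs + flow[ys, xs, 0]).round().astype(int), 0, w - 1)
    ys_j = np.clip((ys + flow[ys, xs, 1]).round().astype(int), 0, h - 1)
    new_j = orig_j.copy()
    # Weighted fusion of the two image blocks of the first object's region.
    new_j[ys_j, xs_j] = (weight * new_i[ys, xs]
                         + (1 - weight) * orig_j[ys_j, xs_j]).astype(orig_j.dtype)
    return new_j
```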
Example two
The VR glasses display an image to the user, and different objects on the image have different sharpness. For example, an object with a large depth of field (referred to as a first object for convenience of description) is blurred on the image, and an object with a small depth of field (referred to as a second object for convenience of description) is sharp. The second embodiment can be applied to the application scenarios shown in fig. 8A and 8B.
For example, as shown in fig. 16, the second depth of field of the mountain is greater than the first depth of field of the tree, so the mountain is blurred and the tree is sharp, and the user accordingly sees a virtual environment in which the mountain is blurred and the tree is sharp.
For example, please refer to fig. 17, which is a schematic flow chart of an image generating method according to the second embodiment, and as shown in fig. 17, the flow of the method includes:
S1701, determining a preset depth of field.
The preset depth of field may be a specific depth value or a depth range, which is not limited in the embodiment of the application. The preset depth of field is used to determine which objects are distant objects and which are near objects: for example, an object whose depth of field is greater than the preset depth of field is a distant object, and an object whose depth of field is less than the preset depth of field is a near object. Distant objects may then be blurred, and near objects may be left unblurred. The preset depth of field may be determined in a variety of ways, including but not limited to at least one of the following.
In a first manner, the preset depth of field may be determined according to the VR scene, and the preset depth of field is different according to different VR scenes. Wherein the VR scene includes, but is not limited to, at least one of VR games, VR movies, VR instruction, etc.
Taking a VR game as an example, the VR game includes game characters, and the preset depth of field may be determined according to a game character. For example, in a first-person game, the preset depth of field may be the depth of field of the game character corresponding to the user in the game scene, or of that character's upper body, or of the game equipment currently held by that character. Taking fig. 7A as an example, the character's arm holds the gun, and the depth of field of the arm or the gun may be determined as the preset depth of field. For another example, in a third-person game (e.g., a fighting scene), the preset depth of field may be determined according to the depth of field of the game character being controlled. For VR movie watching, which includes a virtual display screen, the depth of field of the display screen can be determined as the preset depth of field. For VR teaching, which includes teaching equipment such as a blackboard, a display screen or a projection, the depth of field of the teaching equipment can be determined as the preset depth of field.
In a second mode, the preset depth of field may be set by the user, for example, the user may set the preset depth of field on the VR glasses or an electronic device (e.g., a mobile phone) connected to the VR glasses. It should be understood that various VR applications are included on the electronic device, and different VR applications may set different preset depths of field. Optionally, the user may set the preset depth of view of the VR applications on the electronic device in batch, or may set each VR application individually.
In a third way, the preset depth of field may also be a default depth of field, where the default depth of field may be understood as a default setting of the VR glasses, or a default setting of an electronic device (e.g., a mobile phone) connected with the VR glasses, or a default setting of a VR application currently running on the electronic device (e.g., a mobile phone) connected with the VR glasses, and the like, and the embodiment of the present application is not limited thereto.
In a fourth mode, the preset depth of field may also be the depth of field of the virtual image plane. Taking fig. 6C as an example, the depth of field of the dashed line is depth 1, and the preset depth of field may be depth 1.
In a fifth way, the preset depth of field may be the depth of field of the subject object in the picture currently displayed by the VR glasses. The subject object may be the object occupying the largest area in the picture, the object located at the center of the picture, a virtual object (such as a UI interface) in the picture, and the like. For example, in fig. 16 the image includes a tree, a house and the sun; assuming the house is at the center, the house is determined as the subject object, and the preset depth of field may be the depth of field of the house. The depth of field of the mountain is greater than that of the house, so the mountain is blurred, and the depth of field of the tree is smaller than that of the house, so the tree is sharp.
In a sixth way, the preset depth of field is the depth of field of the user's gaze point. The VR glasses may include an eye tracking module, through which the user's gaze point is determined, and the depth of field of the gaze point is determined as the preset depth of field. Illustratively, continuing with fig. 16 as an example, the image includes a tree, a house and the sun; assuming the user's gaze point is on the house, the preset depth of field may be the depth of field of the house. The depth of field of the mountain is greater than that of the house, so the mountain is blurred, and the depth of field of the tree is smaller than that of the house, so the tree is sharp.
The above are some determination manners of the preset depth of field, and other manners are also possible, and the embodiment of the present application is not limited.
S1702, determining the depth of different objects to be drawn.
The depth of each object may be stored automatically when the rendering pipeline runs, or may be calculated by means of binocular vision.
S1703, determining the degree of blurring of an object according to the distance between the depth of the object and the preset depth of field.
For example, when the depth of the object is less than or equal to the preset depth of field, the object may be left unblurred; when the depth of the object is greater than the preset depth of field, the object is blurred.
Alternatively, the degree of blurring of different objects on the image may increase with depth of field. For example, if depth of field 1 of object 1 < depth of field 2 of object 2 < depth of field 3 of object 3, then the degree of blurring of object 1 < that of object 2 < that of object 3, and accordingly the sharpness of object 1 > that of object 2 > that of object 3. That is, in the virtual environment seen by the user's eyes, the sharpness of near-view, middle-view and far-view objects decreases in that order.
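A minimal sketch of S1703 follows, with illustrative assumptions for the preset depth of field, blur scale and cap (none of these values are specified in this application).

```python
# Minimal sketch of S1703: objects at or within the preset depth of field stay sharp,
# and beyond it the blur level grows with depth, so near-view, middle-view and
# far-view objects have decreasing sharpness. All constants are assumptions.

PRESET_DEPTH = 2.0      # metres (assumed; see S1701 for how it may be chosen)
BLUR_PER_METRE = 1.5    # assumed growth of blur with depth beyond the preset depth
MAX_BLUR = 8            # assumed cap

def blur_level_by_depth(object_depth: float) -> int:
    """Deeper objects get a larger blur level and therefore lower sharpness."""
    if object_depth <= PRESET_DEPTH:
        return 0                           # near objects are not blurred
    return min(MAX_BLUR, int(round((object_depth - PRESET_DEPTH) * BLUR_PER_METRE)))
```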
S1704, generating an image and blurring objects on the image.
In one possible implementation, the VR device may generate an image first, and then perform different degrees of blurring on different objects on the image by using an image blurring algorithm. The image blurring algorithm includes at least one of gaussian blur, image down-sampling, out-of-focus blur (defocus blur) algorithm based on deep learning, level of detail (LOD) data structure, and the like, which is not limited in the embodiment of the present application.
The LOD is described below as an example.
The LOD is briefly introduced first. The LOD is a multi-layer data structure that can be understood as comprising multiple layers of image processing algorithms. Assume levels LOD0 through LOD3, where each of LOD0 to LOD3 corresponds to one image processing algorithm. The algorithms corresponding to different levels differ; specifically, the higher the level, the simpler the corresponding image processing algorithm. For example, LOD3 is the highest level and its image processing algorithm is the simplest, while LOD0 is the lowest level and its image processing algorithm is the most complex.
The LOD may be used to generate a three-dimensional image. Continuing with LOD0 to LOD3 as an example, each of LOD0 to LOD3 can be used to generate one layer of the three-dimensional image, and the different layers are then combined to obtain the three-dimensional image, where different layers correspond to different depth ranges. For example, the image depth may be divided according to the number of LOD levels: with the four levels LOD0 to LOD3, the image depth is divided into four ranges. LOD0 corresponds to depth range 1, i.e., the image processing algorithm corresponding to LOD0 processes the layer within depth range 1; LOD1 corresponds to depth range 2; LOD2 corresponds to depth range 3; and LOD3 corresponds to depth range 4. Because the present application treats objects at greater depth as more blurred, a layer covering a larger depth range (a distant layer) needs to be processed by an LOD level with a simpler algorithm, since a layer processed by a simpler algorithm comes out more blurred. Therefore, a layer with a larger depth range corresponds to a higher LOD level (the higher the level, the simpler the algorithm, as described above), and similarly a layer with a smaller depth range corresponds to a lower LOD level, because the algorithm of a lower level is more complex and the processed layer is sharper.
Illustratively, assume depth range 1 is 0-0.3 m and corresponds to LOD0 (the layer generated by LOD0 has the highest definition); depth range 2 is 0.3-0.5 m and corresponds to LOD1 (the layer generated by LOD1 has lower definition than that of LOD0); depth range 3 is 0.5-0.8 m and corresponds to LOD2 (the layer generated by LOD2 has lower definition than that of LOD1); and depth range 4 is 0.8-1 m and corresponds to LOD3 (the layer generated by LOD3 has lower definition than that of LOD2). That is, the greater the depth, the lower the definition of the layer. Finally, the layers corresponding to the different depth ranges are composited, and the image is displayed through the VR display device.
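A minimal sketch of this depth-range-to-LOD mapping is given below; the depth ranges follow the example above, while the treatment of depths beyond 1 m is an assumption.

```python
# A minimal sketch of the depth-range-to-LOD mapping described in the text.
# How each LOD level actually renders its layer is left abstract here.

LOD_DEPTH_RANGES = [
    (0.0, 0.3, 0),   # LOD0: most complex algorithm, sharpest layer
    (0.3, 0.5, 1),   # LOD1
    (0.5, 0.8, 2),   # LOD2
    (0.8, 1.0, 3),   # LOD3: simplest algorithm, most blurred layer
]

def lod_level_for_depth(depth_m: float) -> int:
    """Return the LOD level whose depth range contains `depth_m`."""
    for lo, hi, level in LOD_DEPTH_RANGES:
        if lo <= depth_m < hi:
            return level
    return LOD_DEPTH_RANGES[-1][2]  # beyond the last range: coarsest level (assumed)

# Each depth range's layer is generated at its LOD level and the layers are
# composited into one frame, so sharpness falls as depth increases.
```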
Fig. 16 is described by taking the example that VR glasses display one frame of image. It is understood that, in general, VR glasses display an image stream including a plurality of frames of images.
In the first mode, the VR image generation device (such as the image generation device 200 in fig. 2) generates the image stream using the flow shown in fig. 17 by default (i.e., distant objects on each frame image are blurred). For example, taking a mobile phone as the VR image generation device, when the mobile phone detects at least one of the following: the VR glasses being connected, the VR glasses being turned on, or a VR application (such as a VR game) being started, the mobile phone starts to generate the image stream using the flow shown in fig. 17 by default, and then displays the image stream through the VR glasses.
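The default trigger of the first mode can be summarized as a simple event check, sketched below. The event names are assumptions introduced for illustration; the patent does not define an API.

```python
# A minimal sketch (assumed event names) of the first-mode default behaviour:
# once any listed event is detected, the phone starts generating the image
# stream with distant objects blurred and sends it to the VR glasses.

TRIGGER_EVENTS = {"vr_glasses_connected", "vr_glasses_powered_on", "vr_app_started"}

def should_start_blurred_stream(detected_events: set) -> bool:
    """Return True if at least one trigger event has been detected."""
    return bool(TRIGGER_EVENTS & detected_events)

# Example: the phone detects that a VR game was launched.
print(should_start_blurred_stream({"vr_app_started"}))  # True
```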
In the second mode, the VR image generation device generates images in the existing manner by default (i.e., all objects on the image have the same definition), and switches to the flow shown in fig. 17 when an indication for starting the second eye-protection mode is detected. For the second eye-protection mode, refer to the foregoing description. That is, the VR glasses initially display images in which all objects have the same definition; after the indication for starting the second eye-protection mode is detected, distant objects (e.g., objects whose depth is greater than a preset depth) on the displayed images are blurred. Illustratively, referring to fig. 18, before the (i+1)-th frame, the definition of all objects on the image is the same; when the indication for starting the second eye-protection mode is detected, distant objects (e.g., the mountain and the sun) are blurred on the (i+1)-th to n-th frame images. The indication for starting the second eye-protection mode includes, but is not limited to, at least one of the following: detecting that the user triggers an operation for starting the second eye-protection mode (for example, the VR application includes a button for starting the second eye-protection mode, and the operation may be clicking the button), the user's watching duration being longer than a preset duration, or the number of times the user blinks/squints exceeding a preset number of times. Optionally, before the second eye-protection mode is started, prompt information may be output to ask the user whether to switch to the second eye-protection mode, and the switch is made after an instruction confirming the switch is detected from the user. After the second eye-protection mode is started, distant objects (such as the mountain) on the image are blurred, so that brain fatigue is relieved and user experience is improved.
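A hedged sketch of these triggers follows; the concrete thresholds are assumptions, since the text only speaks of a preset duration and a preset number of times.

```python
# A sketch of the triggers for the second eye-protection mode. The thresholds
# (30 minutes, 20 blinks/squints per minute) are illustrative assumptions.

def should_prompt_second_eye_protection(watch_seconds: float,
                                        blink_or_squint_per_min: float,
                                        user_pressed_button: bool) -> bool:
    """Return True if any listed trigger fires; the device may then ask the
    user to confirm before actually switching modes."""
    WATCH_LIMIT_S = 30 * 60        # "watching duration longer than a preset duration"
    BLINK_LIMIT_PER_MIN = 20       # "blink/squint count exceeding a preset number"
    return (user_pressed_button
            or watch_seconds > WATCH_LIMIT_S
            or blink_or_squint_per_min > BLINK_LIMIT_PER_MIN)
```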
In both the first mode and the second mode above, distant objects in the image stream generated by the VR image generation device are blurred, which can relieve brain fatigue but easily loses the details of the distant objects. For example, in the first mode, the distant objects are always blurred and the user cannot obtain their details; in the second mode, once the indication for starting the second eye-protection mode is detected, the distant objects remain blurred and the user likewise cannot obtain their details.
In order to both relieve fatigue and allow the user to acquire the details of distant objects, in the first or second mode, the definition of distant objects in the image stream generated by the VR image generation device may alternately increase and decrease. For example, the image stream includes a plurality of periods, each period includes a plurality of frames of images, and within each period the definition of the first object on the images first increases and then decreases.
Illustratively, referring to fig. 19, within one period, the definition of a distant object (e.g., the mountain) in the ith frame image is lower than that in the jth frame image, and the definition of the mountain in the jth frame image is higher than that in the kth frame image. That is, within one period, the definition of the distant object follows the variation trend of "blur-clear-blur". In the next period, the definition of the mountain in the kth frame image is lower than that in the pth frame image, and the definition of the mountain in the pth frame image is higher than that in the qth frame image. That is, in the next period, the definition of the distant object also follows the "blur-clear-blur" variation trend. The two periods in fig. 19 may be the same or different, which is not limited. This variation trend relieves brain fatigue on the one hand and prevents the user from losing the image details of the distant object on the other.
Illustratively, i, j, k, p and q satisfy: j = i + n, k = j + m, p = k + w, q = p + s; that is, the jth frame is the nth frame after the ith frame, the kth frame is the mth frame after the jth frame, the pth frame is the wth frame after the kth frame, and the qth frame is the sth frame after the pth frame, where n, m, w and s are integers greater than or equal to 1. For example, n, m, w and s are all 1, i.e., the jth frame image is the frame next to the ith frame image, the kth frame image is the frame next to the jth frame image, the pth frame is the frame next to the kth frame, and the qth frame is the frame next to the pth frame. Alternatively, n, m, w and s may be determined according to the user's visual retention time and the image refresh frame rate, with the same implementation principle as in the first embodiment, which is not repeated here.
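A minimal sketch of deriving n and m from the visual retention (persistence-of-vision) time and the refresh frame rate is shown below; the 100 ms retention time and 90 Hz refresh rate are assumed example values.

```python
# A minimal sketch (assumed example values) of deriving the frame offsets from
# the user's visual retention time and the display refresh rate.

def frame_offset(retention_s: float = 0.1, refresh_hz: float = 90.0) -> int:
    """Largest offset whose display-time gap stays within the retention time."""
    return max(1, int(retention_s * refresh_hz))

n = m = frame_offset()  # e.g. 9 frames at 90 Hz with 100 ms retention

i = 0
j = i + n   # frame where the distant object is shown clearly again
k = j + m   # frame where it is blurred again -> "blur-clear-blur" per period
```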
Optionally, continuing with the example of fig. 19, the definition of the close-up object in the image stream may be unchanged. For example, in fig. 19, the tree definition may be constant.
The first embodiment and the second embodiment may be implemented separately or in combination. For example, the VR image generation device may by default use the technical solution of the first embodiment (for example, the first manner or the second manner of the first embodiment) or the technical solution of the second embodiment (for example, the first manner or the second manner of the second embodiment); or the VR image generation device includes a switch button through which it can be set whether the device uses the technical solution of the first embodiment or that of the second embodiment; or the VR application includes a button through which the user can make the same setting.
In one embodiment, the VR glasses have two display screens, a first display screen and a second display screen. The first display screen is used for presenting images to the left eye of a user, and the second display screen is used for presenting images to the right eye of the user. For convenience of description, the display screen corresponding to the left eye is referred to as a left-eye display screen, and the display screen corresponding to the right eye is referred to as a right-eye display screen.
The left eye display screen and the right eye display screen are respectively used for displaying the image streams. The image stream may be an image stream generated using the method of the first embodiment (e.g., the image stream shown in fig. 13 or 12), or an image stream generated using the method of the second embodiment (e.g., the image stream shown in fig. 18 or 19).
In one case, the image stream displayed on the left-eye display screen and the image stream displayed on the right-eye display screen are both the image stream shown in fig. 13 in the first embodiment.
In order to ensure that the image information obtained by the left eye and that obtained by the right eye are the same, the images displayed on the left-eye display screen and the right-eye display screen are synchronized. For example, referring to fig. 20, when the left-eye display screen displays the ith frame image, the right-eye display screen also displays the ith frame image. Since the object far from the user's gaze point (e.g., the tree) on the ith frame image is blurred, the tree seen by both the left and right eyes is blurred at this time. With continued reference to fig. 20, when the left-eye display screen displays the jth frame image, the right-eye display screen also displays the jth frame image. Since the object far from the user's gaze point (e.g., the tree) on the jth frame image is clear, the tree seen by both eyes is clear at this time. In this manner, when the two display screens synchronously display the ith frame image, the object far from the user's gaze point on the image synthesized in the human brain from the two displayed images is blurred, which can relieve fatigue; when they synchronously display the jth frame image, the object far from the user's gaze point on the synthesized image is clear, and details of that object can be acquired.
In fig. 20, the sharpness of the object far from the user's gaze point changes with the same trend in the image stream displayed on the left-eye display screen and in the image stream displayed on the right-eye display screen: both follow the "blur-clear-blur-clear" trend. In other embodiments, the trends in the two image streams may be opposite. For example, the definition of the object far from the user's gaze point alternates as "blur-clear-blur-clear" in the image stream displayed on the left-eye display screen and as "clear-blur-clear-blur" in the image stream displayed on the right-eye display screen. For example, referring to fig. 21, when the left-eye display screen displays the ith frame image, the right-eye display screen also displays the ith frame image; the object far from the user's gaze point (e.g., the tree) is blurred in the ith frame image on the left-eye display screen and clear in the ith frame image on the right-eye display screen, so the tree seen by the left eye is blurred and the tree seen by the right eye is clear. With continued reference to fig. 21, when the left-eye display screen displays the jth frame image, the right-eye display screen also displays the jth frame image; the tree is clear in the jth frame image on the left-eye display screen and blurred in the jth frame image on the right-eye display screen, so the tree seen by the left eye is clear and the tree seen by the right eye is blurred. In this manner, whenever the two display screens synchronously display an image (the ith frame or the jth frame), the object far from the gaze point is clear on the image for one eye and blurred on the image for the other eye, which can relieve the fatigue of the human brain to a certain extent.
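The opposite-phase scheme can be sketched as a per-frame schedule, as below; the function name and the choice of which eye starts blurred are illustrative assumptions.

```python
# A sketch of the alternating scheme: both eyes display the same frame, but the
# object far from the gaze point is sharp for one eye and blurred for the
# other, swapping every frame. Which eye starts blurred is an assumption.

def eye_sharpness_schedule(frame_index: int) -> dict:
    """Return whether the far-from-gaze object is sharp per eye for this frame."""
    left_sharp = (frame_index % 2 == 1)   # left eye: blur, sharp, blur, sharp...
    return {"left_sharp": left_sharp, "right_sharp": not left_sharp}

for f in range(4):
    print(f, eye_sharpness_schedule(f))
# 0 {'left_sharp': False, 'right_sharp': True}
# 1 {'left_sharp': True, 'right_sharp': False}
# ...
```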
In another case, the image stream displayed on the left-eye display screen and the image stream displayed on the right-eye display screen are both the image stream shown in fig. 19 in the second embodiment.
In order to ensure that the image information obtained by the left eye and that obtained by the right eye are the same, the images displayed on the left-eye display screen and the right-eye display screen are synchronized. For example, referring to fig. 22, when the left-eye display screen displays the ith frame image, the right-eye display screen also displays the ith frame image. Since distant objects (e.g., the mountain and the sun) are blurred on the ith frame image, the distant objects seen by both the left and right eyes are blurred at this time. With continued reference to fig. 22, when the left-eye display screen displays the jth frame image, the right-eye display screen also displays the jth frame image. Since the distant objects are clear on the jth frame image, the distant objects seen by both eyes are clear at this time. In this manner, when the two display screens synchronously display the ith frame image, the distant objects on the image synthesized in the human brain from the two displayed images are blurred, which can relieve fatigue; when they synchronously display the jth frame image, the distant objects on the synthesized image are clear, and their details can be acquired.
In fig. 22, the sharpness of distant objects changes with the same trend in the image stream displayed on the left-eye display screen and in the image stream displayed on the right-eye display screen: both follow the "blur-clear-blur-clear" trend. In other embodiments, the trends in the two image streams may be opposite. For example, the definition of distant objects alternates as "blur-clear-blur-clear" in the image stream displayed on the left-eye display screen and as "clear-blur-clear-blur" in the image stream displayed on the right-eye display screen. For example, referring to fig. 23, when the left-eye display screen displays the ith frame image, the right-eye display screen also displays the ith frame image; distant objects (e.g., the mountain and the sun) are blurred in the ith frame image on the left-eye display screen and clear in the ith frame image on the right-eye display screen, so the distant objects seen by the left eye are blurred and those seen by the right eye are clear. With continued reference to fig. 23, when the left-eye display screen displays the jth frame image, the right-eye display screen also displays the jth frame image; the distant objects are clear in the jth frame image on the left-eye display screen and blurred in the jth frame image on the right-eye display screen, so the distant objects seen by the left eye are clear and those seen by the right eye are blurred. In this manner, whenever the two display screens synchronously display an image (the ith frame or the jth frame), the distant objects are clear on the image for one eye and blurred on the image for the other eye, which can relieve the brain fatigue of the user to a certain extent.
Based on the same concept, fig. 24 shows an electronic device 2400 provided in the present application. The electronic device 2400 may be the VR wearable device (e.g., the VR glasses) described above. As shown in fig. 24, the electronic device 2400 may include: one or more processors 2401; one or more memories 2402; a communication interface 2403; and one or more computer programs 2404, which may be connected through one or more communication buses 2405. The one or more computer programs 2404 are stored in the memory 2402 and configured to be executed by the one or more processors 2401, and include instructions that may be used to perform the steps associated with the mobile phone in the respective embodiments above. The communication interface 2403, for example a transceiver, is used to communicate with other devices.
In the embodiments provided in the present application, the method provided in the embodiments of the present application is described from the perspective of an electronic device (e.g., a mobile phone) as an execution subject. In order to implement the functions in the method provided by the embodiments of the present application, the electronic device may include a hardware structure and/or a software module, and the functions are implemented in the form of a hardware structure, a software module, or a hardware structure and a software module. Whether any of the above functions is implemented as a hardware structure, a software module, or a combination of a hardware structure and a software module depends upon the particular application and design constraints imposed on the technical solution.
As used in the above embodiments, the term "when ..." or "after ..." may be interpreted to mean "if ...", "after ...", "in response to determining ...", or "in response to detecting ...", depending on the context. Similarly, the phrase "when it is determined that ..." or "if (a stated condition or event) is detected" may be interpreted to mean "if it is determined that ...", "in response to determining that ...", "upon detecting (a stated condition or event)", or "in response to detecting (a stated condition or event)", depending on the context. In addition, in the above-described embodiments, relational terms such as first and second are used to distinguish one entity from another and do not limit any actual relationship or order between the entities.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless otherwise specifically stated.
In the above embodiments, all or part of the implementation may be realized by software, hardware, firmware, or any combination thereof. When implemented in software, the implementation may take the form, in whole or in part, of a computer program product. The computer program product includes one or more computer instructions. When the computer program instructions are loaded and executed on a computer, the procedures or functions described in accordance with the embodiments are performed in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable device. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired (e.g., coaxial cable, optical fiber, digital subscriber line (DSL)) or wireless (e.g., infrared, radio, microwave) manner. The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or a data center, that integrates one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk (SSD)), among others. The aspects of the above embodiments may be used in combination without conflict.
It is noted that a portion of this patent application contains material which is subject to copyright protection. The copyright owner reserves all copyright rights whatsoever, except for the reproduction of the patent document or the patent disclosure as it appears in the patent office's patent files or records.

Claims (18)

1. A display method, comprising:
displaying the N frames of images to a user through a display device; n is a positive integer;
on the ith frame image in the N frames of images, the definition of a first object at a first depth of field is first definition;
on the jth frame image in the N frame images, the definition of the first object at the first depth of field is a second definition;
the definition of the first object at the first depth of field on the k frame image in the N frame images is a third definition;
the first definition is smaller than the second definition, the second definition is larger than the third definition, i, j and k are positive integers smaller than N, and i is larger than j and smaller than k;
the first depth of field is larger than the second depth of field, or the distance between the first depth of field and the depth of field of the user fixation point is larger than the first distance.
2. The method of claim 1, wherein the user's gaze point is unchanged during the display of the ith frame image, the jth frame image, and the kth frame image.
3. The method according to claim 1 or 2,
the second depth of field comprises: the user specifies at least one of the depth of field, the depth of field where the user gazing point is located, the default depth of field of the system, the depth of field where the virtual image surface is located, the depth of field corresponding to the virtual scene, and the depth of field where the main body object is located on the ith frame of image.
4. The method of any of claims 1-3, wherein prior to displaying the N frames of images, the method further comprises: detecting at least one of the following: the user triggering an operation for starting an eye-protection mode, the watching time of the user being greater than a first time length, or the number of times the user blinks/squints within a second time length being greater than a first number of times.
5. The method according to any one of claims 1-4, wherein the sharpness of the second object at the third depth of field in the N-frame images is the same.
6. The method of claim 5, wherein the third depth of field is less than the first depth of field.
7. The method of claim 5, wherein a distance between the third depth of field and the depth of field at which the user's gaze point is located is less than a distance between the first depth of field and the depth of field at which the user's gaze point is located.
8. The method according to any one of claims 1 to 7,
the time interval between the display time of the jth frame image and the display time of the ith frame image is less than or equal to the visual dwell duration of the user; and/or,
the time interval between the display time of the kth frame image and the display time of the jth frame image is less than or equal to the visual dwell duration.
9. The method according to any one of claims 1 to 8,
j = i + n, where n is greater than or equal to 1, or n varies with the user's visual dwell duration and the image refresh frame rate of the display device; and/or,
k = j + m, where m is greater than or equal to 1, or m varies with the user's visual dwell duration and the image refresh frame rate of the display device.
10. The method of any one of claims 1 to 9,
the display device comprises a first display screen and a second display screen, wherein the first display screen is used for presenting images to the left eye of a user, and the second display screen is used for presenting images to the right eye of the user; the images displayed on the first display screen and the second display screen are synchronous;
the first display screen and the second display screen respectively display the N frames of images; or,
the first display screen displays the N frames of images; the second display screen displays another N frames of images; the image content of the other N frames of images is the same as that of the N frames of images;
the definition of the first object at the first depth of field on the ith frame image in the other N frames of images is a fourth definition;
the definition of the first object at the first depth of field on the jth image in the other N images is a fifth definition;
the definition of the first object at the first depth of field on the k frame image in the other N frame images is a sixth definition;
wherein the fourth definition is greater than the fifth definition and the fourth definition is less than the sixth definition.
11. The method of claim 10,
the fourth definition is greater than the first definition; and/or,
the fifth definition is less than the second definition; and/or,
the sixth definition is greater than the third definition.
12. The method of any one of claims 1 to 11,
the N frames of images are game related images;
the second depth of field comprises: in a game scene, the depth of field where the game role corresponding to the user is located, or the depth of field where a body part of the game role corresponding to the user is located, or the depth of field where game equipment currently held by the game role corresponding to the user is located; and/or,
the depth of field where the user's gaze point is located includes: in a game scene, the depth of field where a game role corresponding to a game counterpart is located, or the depth of field where a building is located, or the depth of field where a body part of the game role corresponding to the user is located, or the depth of field where game equipment currently held by the game role corresponding to the user is located.
13. The method according to any one of claims 1 to 11,
the N frames of images are images related to vehicle driving;
the second depth of field comprises: in a vehicle driving scene, the depth of field where the vehicle currently driven by the user is located, or the depth of field where a steering wheel of the vehicle currently driven by the user is located, or the depth of field where a windshield of the vehicle currently driven by the user is located; and/or,
the depth of field where the user's gaze point is located includes: in a vehicle driving scene, the depth of field where a vehicle driven by another user on the driving road is located, or the depth of field where a roadside object of the driving road is located.
14. The method of any one of claims 1 to 13,
the ith frame image is an image obtained by performing blurring processing on the first object on the ith frame original image;
the j frame image is the j frame original image or an image obtained by performing sharpening processing on the first object on the j frame original image;
the k frame image is an image obtained by blurring the first object on the k frame original image;
the definition of all objects on the original image of the ith frame, the original image of the jth frame and the original image of the kth frame is the same;
the image of the jth frame is an image obtained by performing sharpening processing on the first object on the original image of the jth frame, and includes:
the jth frame image is an image obtained by fusing the ith frame image and the jth frame original image; or,
the jth frame image is an image obtained by fusing, with the jth frame original image, the image information lost when the ith frame image was subjected to blurring processing.
15. The method of claim 14, wherein the j frame image is an image obtained by fusing the i frame image and the j frame original image, and the method comprises:
the image block in the area of the first object on the jth frame of image is obtained by fusing a first image block and a second image block; the first image block is an image block in an area where the first object is located on the ith frame of image, and the second image block is an image block in an area where the first object is located on the jth frame of original image.
16. An electronic device, comprising:
a processor, a memory, and one or more programs;
wherein the one or more programs are stored in the memory, the one or more programs comprising instructions which, when executed by the processor, cause the electronic device to perform the method steps of any of claims 1-15.
17. A computer-readable storage medium, for storing a computer program which, when run on a computer, causes the computer to perform the method of any one of claims 1 to 15.
18. A computer program product comprising a computer program which, when run on a computer, causes the computer to perform the method of any one of claims 1 to 15.
CN202110824187.7A 2021-07-21 2021-07-21 Display method and electronic equipment Pending CN115686181A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110824187.7A CN115686181A (en) 2021-07-21 2021-07-21 Display method and electronic equipment
PCT/CN2022/106280 WO2023001113A1 (en) 2021-07-21 2022-07-18 Display method and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110824187.7A CN115686181A (en) 2021-07-21 2021-07-21 Display method and electronic equipment

Publications (1)

Publication Number Publication Date
CN115686181A true CN115686181A (en) 2023-02-03

Family

ID=84980024

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110824187.7A Pending CN115686181A (en) 2021-07-21 2021-07-21 Display method and electronic equipment

Country Status (2)

Country Link
CN (1) CN115686181A (en)
WO (1) WO2023001113A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116850012A (en) * 2023-06-30 2023-10-10 广州视景医疗软件有限公司 Visual training method and system based on binocular vision

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105072436A (en) * 2015-08-28 2015-11-18 胡东海 Automatic adjustment method and adjustment device of virtual reality and augmented reality imaging depth-of-field
US10241569B2 (en) * 2015-12-08 2019-03-26 Facebook Technologies, Llc Focus adjustment method for a virtual reality headset
CN106484116B (en) * 2016-10-19 2019-01-08 腾讯科技(深圳)有限公司 The treating method and apparatus of media file
CN110095870B (en) * 2019-05-28 2022-04-19 京东方科技集团股份有限公司 Optical display system, display control device and augmented reality equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116850012A (en) * 2023-06-30 2023-10-10 广州视景医疗软件有限公司 Visual training method and system based on binocular vision
CN116850012B (en) * 2023-06-30 2024-03-12 广州视景医疗软件有限公司 Visual training method and system based on binocular vision

Also Published As

Publication number Publication date
WO2023001113A1 (en) 2023-01-26

Similar Documents

Publication Publication Date Title
US10009542B2 (en) Systems and methods for environment content sharing
EP3862845B1 (en) Method for controlling display screen according to eyeball focus and head-mounted electronic equipment
US11899212B2 (en) Image display method and device for head mounted display
US11024083B2 (en) Server, user terminal device, and control method therefor
JP6444886B2 (en) Reduction of display update time for near eye display
CN106484116B (en) The treating method and apparatus of media file
CN110708533B (en) Visual assistance method based on augmented reality and intelligent wearable device
US20210004081A1 (en) Information processing apparatus, information processing method, and program
CN111103975B (en) Display method, electronic equipment and system
CN112241199B (en) Interaction method and device in virtual reality scene
WO2023001113A1 (en) Display method and electronic device
US11743447B2 (en) Gaze tracking apparatus and systems
EP4400941A1 (en) Display method and electronic device
WO2022247482A1 (en) Virtual display device and virtual display method
EP3961572A1 (en) Image rendering system and method
JP2017079389A (en) Display device, display device control method, and program
CN112565735B (en) Virtual reality measuring and displaying method, device and system
CN115309256A (en) Display method and electronic equipment
WO2023035911A1 (en) Display method and electronic device
WO2021057420A1 (en) Method for displaying control interface and head-mounted display
CN116934584A (en) Display method and electronic equipment
WO2023116541A1 (en) Eye tracking apparatus, display device, and storage medium
EP4261768A1 (en) Image processing system and method
CN116301301A (en) Eye movement tracking device and eye movement tracking method
CN117452642A (en) Auto-focusing method, system, head-mounted device and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination