WO2024111475A1 - Information processing device, information processing method, and information processing program


Info

Publication number
WO2024111475A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
information processing
distance value
face
dimensional coordinate
Prior art date
Application number
PCT/JP2023/041042
Other languages
English (en)
Japanese (ja)
Inventor
一ノ瀬 勉 (Tsutomu Ichinose)
Original Assignee
Sony Group Corporation (ソニーグループ株式会社)
Priority date
Filing date
Publication date
Application filed by Sony Group Corporation
Publication of WO2024111475A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00 Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/86 Combinations of sonar systems with lidar systems; Combinations of sonar systems with systems not using wave reflection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/302 Image reproducers for viewing without the aid of special glasses, i.e. using autostereoscopic displays
    • H04N13/305 Image reproducers for viewing without the aid of special glasses, i.e. using autostereoscopic displays using lenticular lenses, e.g. arrangements of cylindrical lenses
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N13/00 Stereoscopic video systems; Multi-view video systems; Details thereof
    • H04N13/30 Image reproducers
    • H04N13/366 Image reproducers using viewer tracking

Definitions

  • This disclosure relates to an information processing device, an information processing method, and an information processing program.
  • a representative display for naked-eye stereoscopic viewing is the light field display, typified by the lenticular method.
  • the viewpoint positions of the user's right and left eyes are detected, and the optimal light beam is focused at the viewpoint positions to generate an image for the right eye and an image for the left eye.
  • a technique has been proposed for detecting the viewpoint position that detects feature points in an image including the user's face and tracks the viewpoint position based on these feature points.
  • a known method calculates a distance value based on the distance between the user's right and left eyes (interocular distance) as feature points.
  • this method is affected by individual differences in interocular distance, and errors may occur in the distance value.
  • methods have been proposed to eliminate errors in distance values due to individual differences.
  • the above conventional technology has room for improvement in terms of appropriately correcting errors in distance values.
  • errors in distance values may occur when camera images are affected by disturbances such as lighting, but the above conventional technology does not take such disturbances into account, and therefore cannot be said to be able to appropriately correct errors in distance values.
  • the above conventional technology does not necessarily enable appropriate tracking of the viewpoint position.
  • this disclosure proposes an information processing device, an information processing method, and an information processing program that can achieve appropriate tracking of the viewpoint position.
  • an information processing device includes an imaging unit that captures an image of a user to obtain the captured image, an ultrasonic device that is installed near the imaging unit to detect the user, and an information processing unit that acquires three-dimensional coordinate information of the user's face shown in the captured image, determines whether the user's face has been detected continuously for a predetermined time based on the three-dimensional coordinate information, and corrects a first distance value indicated by the three-dimensional coordinate information based on a second distance value based on a signal acquired from the ultrasonic device based on a determination that the user's face has been detected continuously for the predetermined time.
  • FIG. 1 is a diagram showing an example of the appearance of a stereoscopic display device according to an embodiment.
  • FIG. 2 is a block diagram showing an example of a system configuration of the stereoscopic display device according to the embodiment.
  • FIG. 3 is a block diagram showing an example of the configuration of an information processing device according to the embodiment.
  • FIG. 4 is a diagram showing an outline of distance measurement by face detection.
  • FIG. 5 is a diagram showing the relationship between the directivity of a camera and the directivity of an ultrasonic device.
  • FIG. 6 is a diagram illustrating a specific example of correction determination processing based on a face frame.
  • FIG. 7 is an explanatory diagram illustrating a response delay.
  • FIG. 8 is a flowchart showing a processing procedure of the information processing device according to the embodiment.
  • FIG. 9 is a diagram showing the relationship between panel temperature and ambient temperature.
  • FIG. 10 is a flowchart showing a calculation process of a distance value Zu.
  • FIG. 11 is a block diagram showing an example of a hardware configuration of a computer corresponding to the information processing device according to the embodiment.
  • the information processing according to the proposed technology of this disclosure corrects errors in the distance value caused by individual differences in interocular distance and errors caused by disturbances such as lighting. Specifically, when the face frame obtained by detecting the user's face in the image captured by the camera is in the central area of the image, the difference between the distance value measured by a separate ultrasonic device, which is less susceptible to disturbances such as lighting, and the distance value estimated from face detection is set as a scale factor.
  • This information processing is described in detail below.
  • [Example of the appearance of a stereoscopic display device] FIG. 1 is a diagram showing an example of the appearance of the stereoscopic display device 1 according to the embodiment.
  • the stereoscopic display device 1 is, for example, about the same size as a notebook personal computer, but can also be made smaller or larger.
  • the stereoscopic display device 1 corresponds to a spatial reproduction display device that can provide a stereoscopic experience with the naked eye without using an attached tool such as a wearable device equipped with liquid crystal shutters.
  • the stereoscopic display device 1 has a base 2 and a display 3 standing upright from the base 2.
  • the stereoscopic display device 1 has a camera 4 above the display 3, and is configured so that the camera 4 can capture an image of a user positioned in front of the display 3.
  • the stereoscopic display device 1 also has an ultrasonic device 5 disposed thereon that uses ultrasonic waves to measure the distance to the user.
  • the stereoscopic display device 1 can display, for example, stereoscopic images using a lenticular method on the display 3.
  • the stereoscopic display device 1 detects the viewpoint position of a naked-eye user who is not using a dedicated wearable device for stereoscopic display, using images captured by a camera 4.
  • the stereoscopic display device 1 then generates images (parallax images) for the right and left eyes using light rays that are focused at the viewpoint positions of the right and left eyes, respectively, and displays the generated images on the display 3 equipped with a lenticular lens.
  • the user can view stereoscopic images without using a wearable device with liquid crystal shutters, a head-mounted display (HMD), or the like.
  • the camera 4 is integrated with the stereoscopic display device 1, but a configuration in which the camera 4 is externally attached to the stereoscopic display device 1 may also be adopted.
  • the camera 4 is preferably installed near the display 3 so as to include a user viewing the display 3, for example, as shown in Fig. 4.
  • the camera 4 may be installed at a position where it is possible to capture an image of the user, or at a position where the user is included in the captured image.
  • the camera 4 may be a high-speed camera capable of high-speed imaging.
  • the ultrasonic device 5 is integrated into the stereoscopic display device 1, but a configuration in which it is externally attached to the stereoscopic display device 1 may also be adopted.
  • the ultrasonic device 5 is installed near the camera 4 so as to detect the user.
  • the ultrasonic device 5 may be installed directly above the camera 4, as shown in FIG. 4.
  • the ultrasonic device 5 is installed in a positional relationship such that the distance between the ultrasonic device 5 and the user is the same as the distance between the camera 4 and the user.
  • the ultrasonic device 5 is less responsive than the camera 4. Specifically, the ultrasonic device 5 measures distances using ultrasonic waves, which have a speed slower than that of light, and therefore has a slower response speed than the camera 4, which uses the characteristics of light. On the other hand, the ultrasonic device 5 does not use the distance between feature points detected from the captured image (e.g., interocular distance), but calculates distance values using sound characteristics as described above, so the distance values obtained by the ultrasonic device 5 have less error due to individual differences and less error due to external disturbances, and can be said to be highly accurate distance values. In other words, the distance values derived from sound characteristics obtained by the ultrasonic device 5 can be said to be more accurate distance values than the distance values derived from feature points obtained by the camera 4.
  • [Example of the configuration of a stereoscopic display device] FIG. 2 is a block diagram showing an example of the system configuration of the stereoscopic display device 1 according to the embodiment.
  • the stereoscopic display device 1 generally includes an information processing device 100 and a parallax image processing unit 20.
  • the information processing device 100 outputs information indicating a user's viewpoint position, for example, three-dimensional coordinates of the viewpoint position, to the subsequent parallax image processing unit 20. Details of the configuration, operation example, etc. of the information processing device 100 will be described later.
  • the parallax image processing unit 20 has a spatial viewpoint coordinate generation unit 21, a parallax image generation unit 22, and a parallax image display unit 23.
  • the spatial viewpoint coordinate generation unit 21 converts three-dimensional coordinates indicating the viewpoint position output from the information processing device 100 into viewpoint coordinates in a spatial position by applying a known method, and generates viewpoint coordinates in space. More specifically, the viewpoint coordinates are obtained by converting the camera coordinates corresponding to the camera 4 into rendering coordinates by a known coordinate system conversion method.
  • the known conversion method may include a translation process of the coordinate system, an optical axis correction process, conversion to a world coordinate system, etc.
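  • As an illustration of the kind of conversion described above, the following Python sketch applies an optical-axis (tilt) correction and a translation to map camera coordinates into display-centered rendering coordinates; the offset and tilt values are assumptions for illustration only, not parameters from this publication.

```python
import numpy as np

# Illustrative sketch of the coordinate conversion described above.
# The transform parameters (camera mounting offset, optical-axis tilt)
# are device-specific assumptions, not values from this publication.

CAMERA_OFFSET_M = np.array([0.0, 0.18, 0.0])  # camera ~18 cm above panel center (assumed)
TILT_DEG = -10.0                              # camera pitched down toward the user (assumed)

def camera_to_render_coords(p_cam: np.ndarray) -> np.ndarray:
    """Map a 3D point in camera coordinates to rendering (world) coordinates."""
    t = np.radians(TILT_DEG)
    # Optical-axis correction: rotate about the X axis to undo the camera tilt.
    rot_x = np.array([
        [1.0, 0.0, 0.0],
        [0.0, np.cos(t), -np.sin(t)],
        [0.0, np.sin(t), np.cos(t)],
    ])
    # Translation: shift from the camera origin to the display-centered origin.
    return rot_x @ p_cam + CAMERA_OFFSET_M

viewpoint_cam = np.array([0.02, -0.05, 0.60])  # (Xf, Yf, Zf) in meters
print(camera_to_render_coords(viewpoint_cam))
```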
  • the parallax image generation unit 22 generates a stereoscopic image by generating light rays (images) corresponding to the viewpoint coordinates in space.
  • the parallax image display unit 23 is a device that presents a stereoscopic video by continuously displaying the parallax images generated by the parallax image generation unit 22, and corresponds to the display 3 described above.
  • [Configuration example of the information processing device] Fig. 3 is a block diagram showing an example of the configuration of the information processing device 100 according to the embodiment.
  • the information processing device 100 includes an image sensor 101, an ultrasonic sensor 102, a face detection unit 103, a correction determination unit 104, an error calculation unit 105, and a multiplier 106.
  • in the example shown in Fig. 3, the image sensor 101, the ultrasonic sensor 102, the face detection unit 103, the correction determination unit 104, the error calculation unit 105, and the multiplier 106 are included in the information processing device 100, but the configuration is not limited to this example.
  • the image sensor 101, which is an example of an imaging unit, is, for example, a CMOS (Complementary Metal Oxide Semiconductor) sensor. Other sensors such as a CCD (Charge Coupled Device) sensor may also be applied as the image sensor 101.
  • the image sensor 101 captures an image of a user positioned in front of the display 3, more specifically, the area around the user's face, and acquires the captured image.
  • the captured image acquired by the image sensor 101 is output after being A/D (Analog to Digital) converted.
  • the image sensor 101 corresponds to the camera 4.
  • an A/D converter or the like may be implemented on the image sensor 101, or may be provided between the image sensor 101 and the face detection unit 103.
  • the image sensor 101 according to the embodiment is configured to be capable of capturing images at a high frame rate.
  • the image sensor 101 is capable of capturing images at 1000 fps (frames per second) or more.
  • the image sensor 101 is described as being capable of capturing images at 1000 fps.
  • the ultrasonic sensor 102 is a device used to measure the distance to a user located in front of the display 3. For example, the ultrasonic sensor 102 transmits an ultrasonic signal. The ultrasonic sensor 102 may output the timing of transmitting the ultrasonic signal and the timing of receiving the ultrasonic signal to the error calculation unit 105 as information based on the ultrasonic signal. Although the ultrasonic sensor 102 is slower and has a narrower angle than the image sensor 101, it enables direct and highly accurate distance measurement without relying on image recognition. The ultrasonic sensor 102 corresponds to the ultrasonic device 5.
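  • The measurement itself is ordinary time-of-flight ranging: the one-way distance is the speed of sound multiplied by half the round-trip time between transmission and reception. A minimal sketch using a nominal speed of sound (temperature compensation is discussed later):

```python
SPEED_OF_SOUND_M_S = 343.0  # nominal value near 20 degC; refined later with temperature

def tof_distance_m(t_transmit_s: float, t_receive_s: float) -> float:
    """Round-trip time of flight -> one-way distance to the user."""
    return SPEED_OF_SOUND_M_S * (t_receive_s - t_transmit_s) / 2.0

# e.g. a 3.5 ms round trip corresponds to roughly 0.6 m
print(tof_distance_m(0.0, 0.0035))
```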
  • the face detection unit 103 performs face detection based on the captured image acquired by the image sensor 101, and generates and acquires face detection information including the position information of the user's face and face frame information based on the face detection results.
  • FIG. 4 shows an overview of distance measurement by face detection.
  • FIG. 4 shows a scene in which the distance to the user U is measured by a camera 4 (image sensor 101) and an ultrasonic device 5 (ultrasonic sensor 102) that are externally attached to the stereoscopic display device 1.
  • the camera 4 is installed on top of the display 3 of the stereoscopic display device 1, and the ultrasonic device 5 is installed directly above the camera 4, so that the distances from the camera 4 and the ultrasonic device 5 to the user U are adjusted to be equal.
  • the face detection unit 103 calculates three-dimensional position information of the face as position information of the user U's face.
  • this may be expressed as three-dimensional coordinate information (Xf, Yf, Zf).
  • the X-coordinate component "Xf" included in the three-dimensional coordinate information (Xf, Yf, Zf) is a horizontal component with respect to the ground surface, as shown in FIG. 4, and is expressed as horizontal information (Xf).
  • the Y-coordinate component "Yf” is a vertical component with respect to the ground surface, and is expressed as vertical information (Yf).
  • the Z-coordinate component "Zf" is a depth component with respect to the two-dimensional plane formed by the horizontal information (Xf) and vertical information (Yf), and is distance information indicating the distance between the user U and the camera 4.
  • the distance information (Zf) corresponds to the distance value Zf indicating the distance between the user U and the camera 4.
  • the horizontal information (Xf) and vertical information (Yf) correspond to information indicating the viewpoint position of the user U, for example, the two-dimensional coordinates of the viewpoint position.
  • the horizontal information (Xf) and vertical information (Yf) include the coordinates of the right eye of the user U in the captured image and the coordinates of the left eye of the user U in the captured image, and are used when generating viewpoint coordinates in space.
  • the face detection unit 103 outputs the horizontal information (Xf) and vertical information (Yf) to the parallax image processing unit 20 as tracking data.
  • the distance information (Zf) is a distance value (distance value Zf) from the user U calculated based on the distance (e.g., interocular distance) between multiple feature points (e.g., the right eye and the left eye) detected from the captured image. Therefore, as explained above, there is a possibility that the distance value Zf contains an error due to individual differences in interocular distance when compared with the true distance value.
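  • The publication does not give the estimation formula, but a common way to obtain such a distance value is the pinhole-camera similar-triangles relation with an assumed population-average interocular distance; the sketch below uses illustrative values and shows how an individual's deviation from the assumed interocular distance produces a proportional error in Zf.

```python
ASSUMED_IPD_M = 0.063      # assumed population-average interocular distance
FOCAL_LENGTH_PX = 1200.0   # camera focal length in pixels (illustrative)

def estimate_zf_m(ipd_pixels: float) -> float:
    """Similar-triangles depth estimate from the interocular distance in the image."""
    return FOCAL_LENGTH_PX * ASSUMED_IPD_M / ipd_pixels

# A user whose true interocular distance is 0.058 m standing at 0.60 m appears with
# ipd_pixels = 1200 * 0.058 / 0.60 = 116 px, but the estimator assumes 0.063 m:
print(estimate_zf_m(116.0))  # ~0.652 m, an individual-difference error of roughly 9%
```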
  • the distance value Zf is also information indicating the viewpoint position of the user U, and is therefore used when generating viewpoint coordinates in space.
  • the face detection unit 103 outputs the distance value Zf to the error calculation unit 105 and the multiplier 106 so that a correction process is performed on the distance value Zf.
  • the face detection unit 103 also outputs face frame information to the correction determination unit 104.
  • the distance value Zf (an example of the first distance value) is corrected using the distance value Zu (an example of the second distance value), which is the distance value obtained by the ultrasonic device 5. Since the distance value Zu is calculated using the characteristics of the sound, there is little error due to individual differences or due to the influence of disturbances, and it can be treated as a true value indicating the correct distance from the user U.
  • the correction determination unit 104 determines whether the user's face has been detected continuously for a predetermined time based on the three-dimensional coordinate information (Xf, Yf, Zf), and if the user's face has been detected continuously for the predetermined time, determines that correction processing may be performed. This processing is performed to eliminate the delay in distance measurement caused by the time difference between the time required for the image sensor 101 to respond and the time required for the ultrasonic sensor 102 to respond. Therefore, for example, the correction determination unit 104 determines whether the user's face has been detected continuously during the delay in distance measurement caused by the time difference between the time required for the image sensor 101 to respond and the time required for the ultrasonic sensor 102 to respond. Note that the delay in distance measurement here refers to the delay time between the image sensor 101 and the ultrasonic sensor 102.
  • as described above, the camera 4 (image sensor 101) and the ultrasonic device 5 (ultrasonic sensor 102) differ in detection range and responsiveness.
  • the correction determination process by the correction determination unit 104 is performed in order to prevent a decrease in distance measurement accuracy due to these differing characteristics.
  • This correction determination process is performed to take advantage of the strength of using the ultrasonic device 5 in combination (a detection range narrower than that of the camera 4) while resolving its weakness (responsiveness poorer than that of the camera 4).
  • the correction judgment process will be described in detail below.
  • FIG. 5 is a diagram showing the relationship between the directivity of the camera 4 and the directivity of the ultrasonic device 5.
  • FIG. 5 shows a detection range AR4 according to the directivity of the camera 4, and a detection range AR5 according to the directivity of the ultrasonic device 5.
  • the detection range AR4 may be set to a wider angle than the detection range AR5 in order to expand the viewing range of the user U. This allows the camera 4 to maintain high face detection performance even if the user U moves up, down, left, or right. However, widening the detection range AR4 may reduce resolution due to the use of a wide-angle lens, cause the lighting conditions on the face of the user U to fluctuate with the position of the user U, and introduce errors into the three-dimensional coordinate information (Xf, Yf, Zf), in particular increasing the error in the distance value Zf.
  • the correction determination unit 104 therefore sets the imaging range corresponding to the detection range AR5, which is narrower than the detection range AR4, as a face position determination frame FL12, and uses containment of the face frame FL11 detected by the face detection unit 103 within the face position determination frame FL12 as one of the conditions for permitting execution of the correction process. This point will be explained in more detail using FIG. 6.
  • FIG. 6 is a diagram showing a specific example of the correction determination process based on the face frame.
  • FIG. 6(a) shows a captured image IM1, which is an example of an image captured by the image sensor 101.
  • the captured image IM1 includes a user U.
  • the face detection unit 103 detects the face of the user U using the captured image IM1. As a result of the face detection, a face frame FL11 is set in the area including the face as shown in FIG. 6(b), and face frame information indicating the area of the face frame FL11 is obtained.
  • the method of detecting the face of the user U can be a known method, such as a method that utilizes the characteristics of the captured image IM1.
  • the face detection unit 103 outputs the face frame information to the correction determination unit 104.
  • when the correction determination unit 104 acquires the face frame information, it determines whether or not the face frame FL11 indicated by the face frame information is within the face position determination frame FL12.
  • the face position determination frame FL12 corresponds to the imaging range generated according to the detection range AR5 of the ultrasonic device 5.
  • the correction determination unit 104 may hold information indicating the imaging range generated according to the detection range AR5 in advance. Alternatively, the correction determination unit 104 may identify the information indicating the imaging range when acquiring the face frame information.
  • the process in which the correction determination unit 104 determines whether the face frame FL11 is within the face position determination frame FL12 leverages the strength of the ultrasonic device 5 (its detection range is narrower than that of the camera 4) and eliminates the drawbacks of the wide-angle detection range AR4; a sketch of this containment check follows below.
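```python
from dataclasses import dataclass

@dataclass
class Frame:
    left: int
    top: int
    right: int
    bottom: int

def frame_within(inner: Frame, outer: Frame) -> bool:
    """True if the detected face frame lies entirely inside the determination frame."""
    return (inner.left >= outer.left and inner.top >= outer.top and
            inner.right <= outer.right and inner.bottom <= outer.bottom)

# FL12 covers the central part of the image corresponding to the ultrasonic
# detection range AR5 (the extents here are illustrative, not from the publication).
FL12 = Frame(480, 200, 1440, 880)
FL11 = Frame(820, 340, 1080, 620)
print(frame_within(FL11, FL12))  # True -> continue the correction determination
```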
  • when the correction determination unit 104 determines that the face frame FL11 is within the face position determination frame FL12, it continues the correction determination process using another determination condition for permitting the execution of the correction process.
  • the ultrasonic device 5 is known to have slower response than the camera 4. This is because the speed of sound is slower than the speed of light.
  • the camera 4 uses the characteristics of light to capture images, enabling high-speed distance measurement (measurement of distance value Zf) by the face detection unit 103, whereas the ultrasonic device 5 uses the characteristics of sound to measure distance (measurement of distance value Zu), and is therefore less responsive than the camera 4.
  • the ultrasonic device 5 is less responsive than the camera 4, a delay in distance measurement occurs.
  • the time required for the camera 4 to acquire an image and for the face detection unit 103 to calculate the distance value Zf based on the acquired image is taken as required time t1.
  • the time required for the ultrasonic device 5 to calculate the distance value Zu using ultrasonic waves is taken as required time t2.
  • in this case, required time t1 < required time t2 holds.
  • let time TM1 be the time at which the distance value Zf is obtained, corresponding to the required time t1, and let time TM2 be the time at which the distance value Zu is obtained, corresponding to the required time t2.
  • since required time t1 < required time t2, time TM1 is earlier than time TM2.
  • the time difference TM2 - TM1 is the period by which the ultrasonic device 5 takes longer over distance measurement than the camera 4, and can be said to be the delay time in distance measurement.
  • FIG. 7 is an explanatory diagram of the response delay.
  • FIG. 7 shows the transition of the distance value Zf and the transition of the distance value Zu over time.
  • FIG. 7 also shows the delay time DTM based on the time difference TM2-TM1.
  • the delay time DTM caused by the responsiveness of the ultrasonic device 5 is used as another judgment condition for permitting the execution of the correction process.
  • when the correction determination unit 104 determines that the face frame FL11 is within the face position determination frame FL12, it uses the delay time DTM to determine whether or not to execute the correction process.
  • the correction determination unit 104 determines whether or not the delay time DTM has elapsed with the face frame FL11 within the face position determination frame FL12. Then, when the correction determination unit 104 determines that the delay time DTM has elapsed, it permits the execution of the correction process. For example, when the correction determination unit 104 determines that the delay time DTM has elapsed with the face frame FL11 within the face position determination frame FL12, it outputs a signal permitting the execution of the correction process to the error calculation unit 105.
  • the process in which the correction determination unit 104 determines whether the delay time DTM has elapsed when the face frame FL11 is within the face position determination frame FL12 eliminates the problem (poor responsiveness) of the ultrasonic device 5.
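  • The two conditions together amount to a simple timing gate: correction is permitted only once the face frame has remained inside the determination frame continuously for the delay time DTM. A sketch of such a gate, with a purely illustrative DTM value:

```python
import time
from typing import Optional

class CorrectionGate:
    """Permit correction only after the face frame has stayed inside the
    determination frame continuously for the delay time DTM."""

    def __init__(self, dtm_s: float):
        self.dtm_s = dtm_s                       # delay time DTM (sensor-specific)
        self._inside_since: Optional[float] = None

    def update(self, frame_inside: bool, now_s: Optional[float] = None) -> bool:
        now_s = time.monotonic() if now_s is None else now_s
        if not frame_inside:
            self._inside_since = None            # leaving FL12 resets the timer
            return False
        if self._inside_since is None:
            self._inside_since = now_s           # frame just entered FL12
        return (now_s - self._inside_since) >= self.dtm_s

gate = CorrectionGate(dtm_s=0.05)                # DTM of 50 ms, illustrative only
print(gate.update(True, now_s=0.00))             # False: timer just started
print(gate.update(True, now_s=0.06))             # True: DTM elapsed, correction permitted
```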
  • the error calculation unit 105 calculates the distance value Zu based on the ultrasonic signal acquired from the ultrasonic sensor 102. For example, the error calculation unit 105 calculates the distance value Zu based on the time from when the ultrasonic signal is transmitted to when it is received as information based on the ultrasonic signal. Then, the error calculation unit 105 executes the correction process using the distance value Zf input from the face detection unit 103 and the distance value Zu. Specifically, the error calculation unit 105 calculates the absolute value (absolute error) of the difference between the distance value Zf and the distance value Zu.
  • the error calculation unit 105 calculates the error rate SFZ (relative error), which is the ratio of that absolute value to the distance value Zu, taking the distance value Zu as the true value.
  • the error calculation unit 105 also outputs the error rate SFZ to the multiplier 106.
  • the multiplier 106 corrects the distance value Zf, which includes an error, by multiplying the distance value Zf input from the face detection unit 103 by the error rate SFZ.
  • the multiplier 106 also outputs the corrected distance value Zf to the parallax image processing unit 20 as tracking data.
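  • Read literally, the translated text defines SFZ as the relative error |Zf - Zu| / Zu and then outputs Zf x SFZ; for the corrected output to coincide with the true value Zu, the effective scale factor must be Zu / Zf, so the sketch below adopts that reading and notes the literal one in a comment.

```python
def error_rate(zf_m: float, zu_m: float) -> float:
    # Literal reading of the text: relative error with Zu as the true value,
    # SFZ = abs(zf_m - zu_m) / zu_m.  For Zf * SFZ to land on the true distance,
    # the effective scale factor must be Zu / Zf, which is the reading used here.
    return zu_m / zf_m

def correct_zf(zf_m: float, sfz: float) -> float:
    """Multiplier 106: corrected distance = Zf x SFZ."""
    return zf_m * sfz

zf, zu = 0.652, 0.600       # face-detection estimate vs. ultrasonic measurement
sfz = error_rate(zf, zu)
print(correct_zf(zf, sfz))  # 0.600: Zf pulled onto the ultrasonic value
```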
  • Fig. 8 is a flowchart showing the processing procedure of the information processing device 100 according to the embodiment.
  • the correction process by the information processing device 100 starts after the power supply of the stereoscopic display device 1 is turned on and initial settings have been completed.
  • in the following, the stereoscopic display device 1 is described as including a processor, a memory, the camera 4, the ultrasonic device 5, and other components, at least some of which constitute a PC.
  • a captured image is acquired via the image sensor 101, and the acquired captured image is supplied to the face detection unit 103 (step S801).
  • the image sensor 101 constantly inputs the captured image to the face detection unit 103 via the memory. For example, if the camera 4 is a high-speed camera with a frame rate of 1000 fps, images are captured into the memory at intervals of 1 ms.
  • the face detection unit 103 determines whether or not the user's face has been detected by face detection processing based on the captured image (step S802). If the user's face has not been detected by the face detection unit 103 (step S802; No), the process returns to step S801.
  • if the user's face has been detected (step S802; Yes), information generated based on the face detection result, specifically face detection information including the three-dimensional coordinate information (Xf, Yf, Zf) of the user's face and the face frame information, is output (step S803).
  • the face detection unit 103 outputs the distance value Zf, which is distance information included in the three-dimensional coordinate information (Xf, Yf, Zf), to the error calculation unit 105 and the multiplier 106.
  • the face detection unit 103 outputs the face frame information to the correction determination unit 104.
  • the correction determination unit 104 determines whether or not the face frame FL11 indicated by the face frame information is within the face position determination frame FL12 (step S804). If the correction determination unit 104 determines that the face frame FL11 is not within the face position determination frame FL12 (step S804; No), the process returns to step S801.
  • if the correction determination unit 104 determines that the face frame FL11 is within the face position determination frame FL12 (step S804; Yes), it determines whether or not the delay time DTM has elapsed with the face frame FL11 within the face position determination frame FL12 (step S805). If the correction determination unit 104 determines that the state in which the face frame FL11 is within the face position determination frame FL12 has not continued for the delay time DTM (step S805; No), the process returns to step S804.
  • if the correction determination unit 104 determines that the delay time DTM has elapsed while the face frame FL11 is within the face position determination frame FL12 (step S805; Yes), it outputs a signal permitting the execution of correction processing to the error calculation unit 105 (step S806).
  • the error calculation unit 105 acquires an ultrasonic signal from the ultrasonic sensor 102, and calculates the distance value Zu from information based on the acquired ultrasonic signal (step S807). For example, the error calculation unit 105 calculates the distance value Zu based on the time from when the ultrasonic signal is transmitted to when it is received, as information based on the ultrasonic signal. The error calculation unit 105 may obtain the speed of sound according to the ambient temperature of the display 3, and calculate a more accurate distance value Zu based on this speed of sound. A method for calculating the distance value Zu taking the ambient temperature into account will be described later.
  • once execution of the correction process is permitted, the error calculation unit 105 calculates the error rate SFZ using the distance value Zf and the distance value Zu (step S808). The multiplier 106 then corrects the error of the distance value Zf relative to the distance value Zu based on the error rate SFZ (step S809). Specifically, the multiplier 106 multiplies the distance value Zf input from the face detection unit 103 by the error rate SFZ to correct the distance value Zf that includes the error.
  • the multiplier 106 outputs the three-dimensional position information (Xf, Yf, Zf x SFZ) obtained by the correction process in step S809 to the parallax image processing unit 20 as tracking data (step S810).
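  • Stringing the flowchart steps together, one iteration of the loop might look like the following sketch; the camera, face_detector, and ultrasonic objects are placeholders, and frame_within, FL12, CorrectionGate, error_rate, and correct_zf refer to the earlier sketches.

```python
def tracking_step(camera, face_detector, ultrasonic, gate, state):
    """One iteration of the S801-S810 loop, sketched from the flowchart text."""
    image = camera.capture()                      # S801: acquire captured image
    face = face_detector.detect(image)            # S802: face detection
    if face is None:
        return None                               # S802 No: back to S801 (next frame)
    xf, yf, zf, face_frame = face                 # S803: (Xf, Yf, Zf) and face frame
    inside = frame_within(face_frame, FL12)       # S804: frame containment check
    if gate.update(inside):                       # S805/S806: DTM elapsed inside FL12
        zu = ultrasonic.measure()                 # S807: ultrasonic distance value Zu
        state["sfz"] = error_rate(zf, zu)         # S808: error rate SFZ
    sfz = state.get("sfz", 1.0)                   # before calibration, pass Zf through
    return (xf, yf, correct_zf(zf, sfz))          # S809/S810: tracking data out
```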
  • the speed of sound, which is the speed at which sound travels through the air, is affected by the air temperature.
  • the speed of sound is 331.5 meters per second at 1 atmosphere and 0°C, and has the characteristic of increasing or decreasing by 0.6 meters per second for every 1°C temperature change.
  • the error calculation unit 105 can also use the sound speed formula to calculate the distance value Zu using the following formula (1), reconstructed here from the definitions that follow: L = 100 × (331.5 + 0.6 × T) × (t / 2), where the factor of 100 converts meters to centimeters.
  • L (cm) is a value indicating the distance value Zu.
  • T (°C) is a value indicating the air temperature.
  • t (sec) is the time it takes for the ultrasonic device 5 to transmit an ultrasonic signal and receive it. Therefore, “t/2” indicates the time it takes for the ultrasonic wave to reach the user.
  • in this embodiment, a temperature based on the ambient temperature of the display 3 is used as the actual air temperature T.
  • the panel of the display 3 exhibits certain temperature characteristics due to heat generation, so the ambient temperature of the display 3 substantially follows these characteristics. Moreover, the ultrasonic device 5 is installed near the display 3, as shown in FIG. 4.
  • FIG. 9 shows the relationship between the panel temperature and the ambient temperature.
  • FIG. 9 shows a table TB showing the relationship between the panel temperature, which is the temperature generated on the panel, and the ambient temperature generated around the display 3 as a result of the influence of the panel temperature.
  • the panel temperature may be detected by a temperature sensor provided on the panel.
  • the following formula (2) can be derived to estimate the ambient temperature T2 of the display 3 from the panel temperature T1, which is the actual measured value of the current panel temperature.
  • the ambient temperature T2 is approximated using the panel temperature T1 as shown in formula (2).
  • the error calculation unit 105 applies the current panel temperature T1 detected by the temperature sensor to equation (2) to estimate the ambient temperature T2 of the display 3.
  • the error calculation unit 105 then corrects the outside air temperature T0 using the temperature difference T1-T2 between the panel temperature T1 and the ambient temperature T2.
  • the error calculation unit 105 corrects the outside air temperature T0 by adding the temperature difference T1-T2 to the outside air temperature T0.
  • the error calculation unit 105 uses the corrected outside air temperature T0 as the air temperature T in the above equation (1), and solves equation (1) to calculate the value of "L", i.e., the distance value Zu.
  • the distance value Zu calculated here takes into account the temperature (the ambient temperature of the display 3 due to heat generation by the panel) that is affected by the ultrasonic device 5 being placed near the display 3, so it can be said to be a more accurate value than if a general temperature of 15°C were used.
  • Fig. 10 is a flowchart showing the procedure for calculating the distance value Zu.
  • Fig. 10 shows a procedure for correcting the temperature characteristic that the ultrasonic device 5 exhibits due to the influence of the panel of the display 3.
  • the error calculation unit 105 acquires the outside air temperature T0 (step S1001).
  • the error calculation unit 105 may acquire the outside air temperature T0 when the panel is powered on.
  • the outside air temperature T0 here refers to the air temperature in the space in which the stereoscopic display device 1 is placed.
  • the error calculation unit 105 also acquires the current panel temperature T1 (step S1002), and applies the panel temperature T1 to equation (2) to estimate the ambient temperature T2 of the display 3 (step S1003).
  • the error calculation unit 105 then calculates the temperature difference T1-T2 between the panel temperature T1 and the ambient temperature T2, and performs a correction by adding the temperature difference T1-T2 to the outside air temperature T0 (step S1004).
  • This method is based on Newton's law of cooling.
  • the corrected outside air temperature T0 can be said to be the temperature of the space in which the stereoscopic display device 1 is placed, that is, the space through which ultrasonic waves are transmitted.
  • the error calculation unit 105 calculates the sound speed c by using the corrected outside air temperature T0 as the sound transmission temperature (step S1005). Specifically, the error calculation unit 105 uses the corrected outside air temperature T0 as T indicated in the sound speed formula to calculate the sound speed c.
  • the error calculation unit 105 applies the sound speed c to the above formula (1) to calculate the distance value Zu (step S1006).
  • the distance value Zu obtained in step S1006 is input to the information processing device 100. As described in step S807, the input distance value Zu is obtained by the error calculation unit 105.
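  • The temperature-compensation procedure of steps S1001 to S1006 can be summarized in a short sketch; the linear form of estimate_ambient_t2 is a placeholder for the missing formula (2), whose coefficients would come from the relationship shown in FIG. 9.

```python
def sound_speed_m_s(temp_c: float) -> float:
    """Sound speed formula cited in the text: 331.5 m/s at 0 degC, +0.6 m/s per degC."""
    return 331.5 + 0.6 * temp_c

def estimate_ambient_t2(panel_t1_c: float) -> float:
    # Formula (2) is not reproduced in the text; this linear form is a placeholder
    # standing in for the approximation derived from the table in FIG. 9.
    A, B = 0.5, 10.0   # illustrative coefficients only
    return A * panel_t1_c + B

def distance_zu_m(t0_outside_c: float, panel_t1_c: float, round_trip_s: float) -> float:
    t2 = estimate_ambient_t2(panel_t1_c)             # S1003: ambient temperature T2
    t_corrected = t0_outside_c + (panel_t1_c - t2)   # S1004: T0 + (T1 - T2)
    c = sound_speed_m_s(t_corrected)                 # S1005: sound speed c
    return c * round_trip_s / 2.0                    # S1006: formula (1), in meters

# 25 degC room, 40 degC panel, 3.4 ms round trip -> roughly 0.60 m
print(distance_zu_m(25.0, 40.0, 0.0034))
```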
  • FIG. 11 is a block diagram showing an example of the hardware configuration of a computer corresponding to the information processing device 100 according to the embodiment. Note that Fig. 11 shows an example of the hardware configuration of a computer corresponding to the information processing device 100 according to the embodiment, and the configuration does not need to be limited to that shown in Fig. 11.
  • the computer 1000 has a CPU (Central Processing Unit) 1100, a RAM (Random Access Memory) 1200, a ROM (Read Only Memory) 1300, an HDD (Hard Disk Drive) 1400, a communication interface 1500, and an input/output interface 1600.
  • the CPU 1100 operates based on the programs stored in the ROM 1300 or the HDD 1400 and controls each component. For example, the CPU 1100 loads the programs stored in the ROM 1300 or the HDD 1400 into the RAM 1200 and executes processes corresponding to the various programs.
  • the ROM 1300 stores boot programs such as the Basic Input Output System (BIOS) that is executed by the CPU 1100 when the computer 1000 starts up, as well as programs that depend on the hardware of the computer 1000.
  • HDD 1400 is a computer-readable recording medium that non-temporarily records programs executed by CPU 1100 and data used by such programs. Specifically, HDD 1400 records program data 1450.
  • Program data 1450 is an example of an information processing program for realizing an information processing method according to an embodiment of the present disclosure, and data used by such information processing program.
  • the communication interface 1500 is an interface for connecting the computer 1000 to an external network 1550 (e.g., the Internet).
  • the CPU 1100 receives data from other devices and transmits data generated by the CPU 1100 to other devices via the communication interface 1500.
  • the input/output interface 1600 is an interface for connecting the input/output device 1650 and the computer 1000.
  • the CPU 1100 receives data from an input device such as a keyboard or a mouse via the input/output interface 1600.
  • the CPU 1100 also transmits data to an output device such as a display device, a speaker, or a printer via the input/output interface 1600.
  • the input/output interface 1600 may also function as a media interface that reads programs and the like recorded on a specific recording medium.
  • Examples of media include optical recording media such as DVDs (Digital Versatile Discs) and PDs (Phase change rewritable Disks), magneto-optical recording media such as MOs (Magneto-Optical Disks), tape media, magnetic recording media, and semiconductor memories.
  • the CPU 1100 of the computer 1000 executes an information processing program loaded onto the RAM 1200, thereby implementing the various processing functions executed by each process shown in FIG. 3. That is, the CPU 1100 and the RAM 1200, etc., work together with the software (the information processing program loaded onto the RAM 1200) to implement the information processing method by the information processing device 100 according to the embodiment.
  • (1) An information processing device comprising: an imaging unit that captures an image of a user and acquires a captured image; an ultrasonic device disposed near the imaging unit so as to detect the user; and an information processing unit that acquires three-dimensional coordinate information of the face of the user shown in the captured image, determines whether the user's face has been detected continuously for a predetermined period of time based on the three-dimensional coordinate information, and, based on a determination that the user's face has been detected continuously for the predetermined period of time, corrects a first distance value indicated by the three-dimensional coordinate information based on a second distance value based on a signal acquired from the ultrasonic device.
  • (2) The information processing device, wherein the information processing unit corrects an error of the first distance value with respect to the second distance value based on the second distance value.
  • (3) The information processing device according to (2), wherein the information processing unit calculates an error rate, taking the second distance value as a true value, by using a difference between the first distance value and the second distance value, and corrects an error of the first distance value with respect to the second distance value based on the error rate.
  • (4) The information processing device according to (1), wherein the imaging unit is installed near a display so that the captured image includes a user viewing the display, and the ultrasonic device is installed near the display.
  • (5) The information processing device, wherein the information processing unit calculates the second distance value according to an ambient temperature of the display based on a signal acquired from the ultrasonic device.
  • (6) The information processing device according to (5), wherein the information processing unit estimates the ambient temperature based on a panel temperature detected by a temperature sensor provided on a panel of the display, calculates a sound speed according to a correction temperature based on a temperature difference between the panel temperature and the ambient temperature, and calculates the second distance value based on the calculated sound speed.
  • (7) The information processing device, further comprising, as the display, a stereoscopic display on which a stereoscopic image generated using a viewpoint position of the user is displayed.
  • (8) The information processing device, wherein the predetermined time represents a delay time that is a difference between a time required for the imaging unit to respond and a time required for the ultrasonic device to respond, and the information processing unit determines, as the determination of whether the user's face has been detected, whether the user's face has been detected continuously during the delay time.
  • (9) An information processing method in which a computer executes processing of: acquiring three-dimensional coordinate information of a face of a user shown in an image of the user captured by an imaging unit; determining, based on the three-dimensional coordinate information, whether the user's face has been detected continuously for a predetermined period of time; and correcting, based on a determination that the user's face has been detected continuously for the predetermined period of time, a first distance value indicated by the three-dimensional coordinate information based on a second distance value based on a signal obtained from an ultrasonic device installed near the imaging unit so as to detect the user.
  • (10) An information processing program for causing a computer to function as an information processing unit that: acquires three-dimensional coordinate information of a face of a user shown in an image of the user captured by an imaging unit; determines, based on the three-dimensional coordinate information, whether the user's face has been detected continuously for a predetermined period of time; and corrects, based on a determination that the user's face has been detected continuously for the predetermined period of time, a first distance value indicated by the three-dimensional coordinate information based on a second distance value based on a signal obtained from an ultrasonic device installed near the imaging unit so as to detect the user.

Landscapes

  • Engineering & Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

An information processing device in an embodiment according to the present disclosure includes: an imaging unit that captures an image of a user and acquires the captured image; an ultrasonic device that is installed near the imaging unit so as to detect the user; and an information processing unit that acquires three-dimensional coordinate information of the user's face shown in the captured image, determines, based on the three-dimensional coordinate information, whether the user's face is detected continuously for a prescribed period of time, and, based on a determination that the user's face is detected continuously for the prescribed period of time, corrects a first distance value indicated by the three-dimensional coordinate information based on a second distance value that is based on a signal acquired from the ultrasonic device.
PCT/JP2023/041042 2022-11-25 2023-11-15 Information processing device, information processing method, and information processing program WO2024111475A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2022-188652 2022-11-25
JP2022188652 2022-11-25

Publications (1)

Publication Number Publication Date
WO2024111475A1 (fr) 2024-05-30

Family

Family ID: 91195651

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2023/041042 WO2024111475A1 (fr) 2023-11-15 Information processing device, information processing method, and information processing program

Country Status (1)

Country Link
WO (1) WO2024111475A1 (fr)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH1062552A (ja) * 1996-08-14 1998-03-06 Mitsubishi Heavy Ind Ltd Distance measuring device
JP2012209677A (ja) * 2011-03-29 2012-10-25 Kyocera Corp Portable electronic device
JP2014112757A (ja) * 2012-12-05 2014-06-19 Nlt Technologies Ltd Stereoscopic image display device
US20150326968A1 * 2014-05-08 2015-11-12 Panasonic Intellectual Property Management Co., Ltd. Directivity control apparatus, directivity control method, storage medium and directivity control system
WO2020130048A1 (fr) * 2018-12-21 2020-06-25 Kyocera Corporation Three-dimensional display device, head-up display system, and movable body

Similar Documents

Publication Publication Date Title
JP6747504B2 (ja) Information processing device, information processing method, and program
EP2556410B1 (fr) 3D camera with foreground object distance detection
WO2017092334A1 (fr) Method and device for image rendering processing
US20110242286A1 Stereoscopic Camera With Automatic Obstruction Removal
JP2003242527A (ja) Information processing device and method
JP2015166890A (ja) Information processing device, information processing system, information processing method, and program
JP2009505247A (ja) Method and circuit arrangement for tracking and real-time detection of the eyes of multiple observers
US12010288B2 Information processing device, information processing method, and program
CN103139463A (zh) Augmented reality method, *** and mobile device
US10630890B2 Three-dimensional measurement method and three-dimensional measurement device using the same
JP2017531388A (ja) Head-mounted display device controlled by taps, control method therefor, and computer program for the control
US11557020B2 Eye tracking method and apparatus
WO2015194075A1 (fr) Image processing device, image processing method, and program
WO2019233125A1 (fr) Image processing method, image capture device, computer device, and readable storage medium
KR101139287B1 (ko) Panoramic video playback device reflecting user movement and panoramic video playback method
TW201501508A (zh) Stereoscopic display method
JP2020042206A (ja) Image display control device and image display control program
US20190369807A1 Information processing device, information processing method, and program
WO2024111475A1 (fr) Information processing device, information processing method, and information processing program
US20240031551A1 Image capturing apparatus for capturing a plurality of eyeball images, image capturing method for image capturing apparatus, and storage medium
JPWO2020071029A1 (ja) Information processing device, information processing method, and recording medium
JP2012080294A (ja) Electronic device, video processing method, and program
US8064655B2 Face image detection device, face image detection method and imaging apparatus
JP2006287813A (ja) Three-dimensional image display device
US11181977B2 (en) Slippage compensation in eye tracking