CN107341789B - System and method for predicting pathway of visually impaired people based on RGB-D camera and stereo - Google Patents


Info

Publication number
CN107341789B
Authority
CN
China
Prior art keywords
color
infrared
image
camera
depth
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611048370.8A
Other languages
Chinese (zh)
Other versions
CN107341789A (en)
Inventor
于红雷
杨恺伦
程瑞琦
陈浩
汪凯巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Vision Krypton Technology Co Ltd
Original Assignee
Hangzhou Vision Krypton Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Vision Krypton Technology Co Ltd filed Critical Hangzhou Vision Krypton Technology Co Ltd
Priority to CN201611048370.8A
Publication of CN107341789A
Application granted
Publication of CN107341789B
Legal status: Active


Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61H PHYSICAL THERAPY APPARATUS, e.g. DEVICES FOR LOCATING OR STIMULATING REFLEX POINTS IN THE BODY; ARTIFICIAL RESPIRATION; MASSAGE; BATHING DEVICES FOR SPECIAL THERAPEUTIC OR HYGIENIC PURPOSES OR SPECIFIC PARTS OF THE BODY
    • A61H3/00 Appliances for aiding patients or disabled persons to walk about
    • A61H3/06 Walking aids for blind persons
    • A61H3/061 Walking aids for blind persons with electronic detecting or guiding means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10024 Color image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10048 Infrared image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30244 Camera pose

Landscapes

  • Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Pain & Pain Management (AREA)
  • Physical Education & Sports Medicine (AREA)
  • Rehabilitation Therapy (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Measurement Of Optical Distance (AREA)

Abstract

The invention discloses a pathway prediction system and method for visually impaired people based on an RGB-D camera and stereo sound. An infrared projector projects invisible near-infrared static speckles, two infrared cameras and one RGB color camera collect images, and a small processor processes the collected images to compute a depth image. An attitude angle sensor supplies the attitude angles of the cameras. The small processor computes a height image from the depth and attitude angle information, divides the height image into blocks, converts the depth information obtained from the blocks into stereo signals, and finally transmits the stereo signals to the visually impaired user, meeting the user's need for path prediction well.

Description

System and method for predicting pathway of visually impaired people based on RGB-D camera and stereo
Technical Field
The invention belongs to the technical fields of assistive technology for the visually impaired, binocular vision, three-dimensional environment perception and stereo sound interaction. It provides a pathway prediction method for visually impaired people based on an RGB-D camera and stereo sound, in which an infrared projector projects invisible near-infrared static speckles, two infrared cameras and one RGB camera acquire images, and a small processor processes the acquired images to compute a depth image. The small processor obtains the attitude angles of the cameras from an attitude angle sensor, computes a height image from the depth and attitude angle information, divides the height image into blocks, converts the blocked depth information into a stereo signal, and finally transmits the stereo signal to the visually impaired user through a bone conduction headset to assist path prediction.
Background
According to World Health Organization statistics, there are 285 million visually impaired people in the world. Having lost normal vision, they have difficulty perceiving color, shape, distance and movement, which greatly affects daily life and travel.
The traditional aid for the visually impaired is the white cane: by sweeping the cane back and forth, the user learns about the conditions just ahead of it, which is time-consuming and laborious. The cane's detection range is limited; it can only find obstacles near the feet and cannot reflect conditions farther away or in the air. A guide dog can help the visually impaired, but training and upkeep are expensive, beyond the means of an ordinary family, and in some places, such as buses and railway stations, guide dogs may not accompany the blind, so their assistance is limited. A bionic eye can help the visually impaired recover partial vision, but implantation requires surgery and is costly, and it is only suitable for people blinded by retinitis pigmentosa or age-related macular degeneration; visually impaired people with damaged optic nerves cannot restore vision through a bionic eye implant.
Electronic aids for the visually impaired mainly use ultrasonic ranging, laser ranging, binocular vision, laser speckle coding, lidar, millimeter-wave radar, thermal imaging and the Global Positioning System (GPS). Ultrasonic and laser ranging are limited to single-point measurements: the information obtained is too sparse, power consumption is high, the equipment is heavy, only an alarm function can be provided, and they are easily disturbed by the environment. Aids based on binocular vision depend on abundant feature points and textures in the environment and fail in scenes with uniform texture, such as indoor white walls and smooth floors; binocular vision can also be deceived by special situations such as mirror reflections, causing missed or false detections. Aids based on laser speckle coding fail outdoors because the actively projected structured light is overwhelmed by sunlight, so the coded speckles cannot be identified; owing to power limits, the technique also has a maximum working distance, beyond which objects cannot be ranged. Lidar-based aids are expensive, generally have low sampling rates, are sensitive to dust, haze and rain, and cannot acquire color or texture information. Millimeter-wave radar offers low resolution and difficult signal processing. Thermal imaging offers low resolution and a complicated calibration process, and can only detect heat-emitting objects such as humans and animals. GPS assistance has low accuracy, is prone to signal loss, cannot be used indoors, and cannot capture local dynamic obstacles.
The traditional interaction modes for the visually impaired are voice prompts and tactile vibration. Voice prompts generally announce the distance and direction of obstacles; the announcement takes time, causing delay and risk of accident, and the amount of information that can be conveyed is small. Tactile vibration prompts the position of obstacles through vibrating belts or vests; vibration solves the delay problem, but the hardware burdens the visually impaired, and the wearing experience varies from person to person.
Disclosure of Invention
The present invention is directed to a pathway prediction system and method for visually impaired people based on an RGB-D camera and stereo sound.
The purpose of the invention is achieved by the following technical scheme: a pathway prediction system for visually impaired people based on an RGB-D camera and stereo sound comprises an infrared projector, two identical infrared cameras, a color camera, an attitude angle sensor, a USB hub, a small processor, a bone conduction earphone module, two bone conduction vibration modules and a battery module. The infrared projector, the two infrared cameras, the color camera and the attitude angle sensor are connected to the small processor through the USB hub, and the battery module is connected to the small processor. The color camera and the infrared projector are located between the two infrared cameras. The optical axes of the two infrared cameras and the color camera are parallel to each other. The attitude angles of the three cameras are identical and are acquired in real time by the attitude angle sensor. The small processor controls the infrared projector to project invisible static near-infrared speckles onto the three-dimensional scene ahead, and the two infrared cameras collect two infrared images of the projected scene in real time. The color camera acquires color images of the scene in real time. The USB hub transmits the two infrared images, the color image and the attitude angle information to the small processor. The small processor processes the two collected infrared images and the color image to obtain a depth image of the scene, then processes the depth information and the attitude angle information to obtain a height image. The small processor divides the height image into blocks, converts the depth information after blocking into a stereo signal and transmits it to the bone conduction earphone module. The bone conduction earphone module converts the stereo signal into bone conduction vibration signals and passes them to the two bone conduction vibration modules. The two bone conduction vibration modules deliver the vibration signals to the visually impaired user.
The path prediction method of the system comprises the following steps:
(1) Calibrate the two infrared cameras once to obtain their common focal length f_IR, the principal point position (c_IR-x, c_IR-y) of the left infrared camera, and the baseline distance B_IR-IR between the two infrared cameras.
(2) Calibrate the color camera once to obtain its focal length f_color and principal point position (c_COLOR-x, c_COLOR-y).
(3) Perform binocular calibration of the color camera and the left infrared camera once to obtain the baseline distance B_IR-COLOR between them.
(4) The infrared projector projects invisible static near-infrared speckles into the three-dimensional scene in real time.
(5) The two infrared cameras capture two infrared images IR_left and IR_right of the three-dimensional scene.
(6) The color camera captures a color image Color of the three-dimensional scene.
(7) The attitude angle sensor measures the rotation angles Angle_X, Angle_Y, Angle_Z of the three cameras about the X, Y and Z axes.
(8) The USB hub transmits the two infrared images IR_left and IR_right, the color image Color, and the rotation angles Angle_X, Angle_Y, Angle_Z to the small processor.
(9) The small processor extracts Sobel edges from the two infrared images IR_left and IR_right, obtaining two Sobel edge images Sobel_left and Sobel_right.
(10) With the left Sobel edge image Sobel_left as reference, block-based image matching is performed between Sobel_left and Sobel_right, yielding a set of well-matched effective points E = {e_1, e_2, e_3, ..., e_M}. In Sobel_left, each effective point is e = (u, v, d)^T, where u is the horizontal pixel coordinate, v is the vertical pixel coordinate and d is the disparity value.
(11) Taking the matched effective points E as a reference, every three effective points define a parallax plane; the equation of the i-th parallax plane is d = a_i·u + b_i·v + c_i, where a_i, b_i, c_i are its coefficients.
(12) On the basis of the parallax planes, each unmatched pixel point (u', v', d')^T is converted into a matched effective point (u, v, d)^T. Specifically, the distance dist_i from the pixel point (u', v', d')^T to the i-th parallax plane is computed, and an energy function Energy(d') is defined in terms of dist_i, where ε and σ are constants. All disparity values d' = d'_min, ..., d'_max in the disparity search range are traversed, and the disparity value that minimizes Energy(d') is taken as the disparity value d of the pixel point; further, u = u' and v = v'.
(13) All unmatched pixel points are traversed to obtain their disparity values, yielding the disparity image Disparity_left referenced to the left infrared camera.
(14) Using the focal length f_IR and baseline distance B_IR-IR of the two infrared cameras, each point (u, v, d) of the disparity image is traversed and its depth value depth = f_IR·B_IR-IR/d is computed. Each point of the depth image is (u, v, depth), yielding the depth image Depth_left referenced to the left infrared camera.
(15) Using the depth image Depth_left and the color image Color, the focal length f_IR of the infrared cameras, the principal point position (c_IR-x, c_IR-y) of the left infrared camera, the focal length f_color and principal point position (c_COLOR-x, c_COLOR-y) of the color camera, and the baseline distance B_IR-COLOR between the left infrared camera and the color camera, the depth image is aligned with the color image to obtain the depth image Depth_color of the color camera's field of view.
(16) From the depth image Depth_color, the focal length f_color of the color camera and its principal point position (c_COLOR-x, c_COLOR-y), the three-dimensional coordinates (X, Y, Z) of each point in the color camera coordinate system can be calculated. For a point of Depth_color with pixel coordinates (u, v) and depth value depth, the three-dimensional coordinates are given by equation (1):
X = (u - c_COLOR-x)·depth/f_color, Y = (v - c_COLOR-y)·depth/f_color, Z = depth (1)
(17) From the three-dimensional coordinates (X, Y, Z) of each point in the camera coordinate system and the rotation angles Angle_X = α, Angle_Y = β, Angle_Z = γ measured by the attitude angle sensor, the coordinates (X_w, Y_w, Z_w) of each point in the world coordinate system are calculated by equation (2):
(X_w, Y_w, Z_w)^T = R(α, β, γ)·(X, Y, Z)^T (2)
where R(α, β, γ) is the rotation matrix composed of the rotations about the X, Y and Z axes.
(18) The coordinate Y_w of each point in the world coordinate system is the vertical height of that point relative to the wearing position of the color camera, so a height image Height can be acquired.
(19) The height image Height is divided into K blocks from left to right, and the average height height_K of each block Height_K is calculated (K is generally between 2 and 10).
(20) The K blocks Height_K are represented by an ensemble of K musical instruments with different timbres, each block voiced by one instrument. Given the height H of the visually impaired user, the loudness Volume of each instrument decreases with the difference between the block's average height height_K and H: the closer height_K is to H, the closer the corresponding objects are to the ground, the more passable the terrain, and the larger the loudness Volume; the farther height_K is from H, the farther the objects are from the ground, the less passable the terrain, and the smaller the loudness Volume. The instrument sound for each direction is rendered in stereo. Instruments with distinctive, pleasant timbres, such as piano, violin, gong, trumpet or xylophone, may be chosen.
(21) The small processor transmits the stereo signal to the bone conduction earphone module.
(22) The bone conduction earphone module converts the stereo signal into a bone conduction vibration signal.
(23) The bone conduction vibration module transmits the bone conduction vibration signal to the visually impaired user.
Compared with existing assistive methods for the visually impaired, the method has the following advantages:
1. Environmental adaptability. Because an infrared projector and two infrared cameras are used, the method works both indoors and outdoors. Indoors, the static near-infrared speckles projected by the infrared projector add texture to the three-dimensional scene, which helps obtain a dense depth image. Outdoors, the near-infrared component of sunlight textures the scene, so a dense depth image can also be acquired. A dense depth image guarantees the accuracy of the block heights and the quality of the assistive interaction.
2. Day and night applicability. Because an infrared projector and two infrared cameras are used, the method works both by day and at night. In the daytime, the projected static near-infrared speckles and the near-infrared component of sunlight add texture to the scene, favoring dense depth images. At night, the projected speckles texture the nearby scene, so a depth image of the near scene can still be acquired. Reliable depth images by day and night guarantee the accuracy of the block heights and the quality of the assistive interaction.
3. Road conditions such as stairs and slopes can be distinguished. Because stereo interaction is used and the stereo signal encodes the height values in every frontal direction, the sound representing stairs or slopes differs from the sound representing a flat passable surface; the sound signal therefore predicts not only passable areas but also conditions such as stairs and slopes.
4. Conditions such as potholes can be distinguished. Because the stereo signal encodes the height values in every frontal direction, and a pothole's height differs from that of a normal road, the sound representing a pothole differs from the sound representing a flat passable surface; the sound signal therefore predicts potholes as well as passable areas.
5. The ears are left free. The method uses bone conduction earphones to deliver signals, so the user can still hear outside sounds. Most visually impaired people rely on external sounds for interpretation, for example judging the direction of a road from traffic noise.
6. Both hands are left free. The device is wearable and the small processor is portable, fitting in a pocket or small bag, so it imposes little burden and the visually impaired user does not need to hold any aid.
7. The user is not annoyed. The stereo interaction uses pleasant instrument sounds, so visually impaired users are not irritated and can find their way as if listening to music.
8. Sufficient information is fed back. Compared with semantic voice announcements, stereo feedback uses different loudness levels and instruments of different timbres to represent the traversability of the terrain, conveying the road conditions in several frontal directions simultaneously and predicting the direction of passable areas.
9. Easy to learn and understand. Compared with complex sound-coding schemes, the stereo interaction is based on height blocks; the blocked height information is not complicated, so a visually impaired user can quickly learn the meaning of the stereo signal and choose a walking direction accordingly.
10. Timely feedback. Compared with semantic voice announcements, stereo feedback is immediate and without delay, so the visually impaired user can choose the correct passable path in time, ensuring the safety of the method.
Drawings
FIG. 1 is a schematic block diagram of a pathway prediction system for visually impaired persons;
FIG. 2 is a schematic structural view of the pathway prediction glasses for visually impaired people;
FIG. 3 shows the two infrared images IR_left and IR_right;
FIG. 4 is the depth image Depth_left after grayscale conversion (the original depth image is rendered in pseudo-color, with red indicating near and blue indicating far);
FIG. 5 is the grayed color image (in the original color image, regions whose vertical height below the RGB-D camera's wearing position is close to the height of the visually impaired user are marked green as passable areas);
FIG. 6 is a schematic diagram of how the instrument stereo sound represents the path.
Detailed Description
As shown in FIG. 1, a pathway prediction system for visually impaired people based on an RGB-D camera and stereo sound comprises an infrared projector, two identical infrared cameras, a color camera, an attitude angle sensor, a USB hub, a small processor, a bone conduction earphone module, two bone conduction vibration modules and a battery module. The infrared projector, the two infrared cameras, the color camera and the attitude angle sensor are connected to the small processor through the USB hub, and the battery module is connected to the small processor. The color camera and the infrared projector are located between the two infrared cameras. The optical axes of the two infrared cameras and the color camera are parallel to each other. The attitude angles of the three cameras are identical and are acquired in real time by the attitude angle sensor. The small processor controls the infrared projector to project invisible static near-infrared speckles onto the three-dimensional scene ahead, and the two infrared cameras collect two infrared images of the projected scene in real time. The color camera acquires color images of the scene in real time. The USB hub transmits the two infrared images, the color image and the attitude angle information to the small processor. The small processor processes the two collected infrared images and the color image to obtain a depth image of the scene, then processes the depth information and the attitude angle information to obtain a height image. The small processor divides the height image into blocks, converts the depth information after blocking into a stereo signal and transmits it to the bone conduction earphone module. The bone conduction earphone module converts the stereo signal into bone conduction vibration signals and passes them to the two bone conduction vibration modules. The two bone conduction vibration modules deliver the vibration signals to the visually impaired user. The system can be styled as the pair of glasses shown in FIG. 2 for an aesthetically pleasing form.
The path prediction method of the system comprises the following steps:
(1) Calibrate the two infrared cameras once to obtain their common focal length f_IR, the principal point position (c_IR-x, c_IR-y) of the left infrared camera, and the baseline distance B_IR-IR between the two infrared cameras.
(2) Calibrate the color camera once to obtain its focal length f_color and principal point position (c_COLOR-x, c_COLOR-y).
(3) Perform binocular calibration of the color camera and the left infrared camera once to obtain the baseline distance B_IR-COLOR between them.
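As a concrete illustration of steps (1)-(3), the following is a minimal Python/OpenCV sketch of the calibration using a checkerboard target. The board geometry, the image lists ir_left_views, ir_right_views and color_views, and the assumption that the board is detected in every view are all illustrative and not part of the patent.

```python
import cv2
import numpy as np

PATTERN = (9, 6)   # inner checkerboard corners (assumed board)
SQUARE = 0.025     # square size in metres (assumed)

objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

def corners(views):
    """Checkerboard corners per grayscale view; assumes detection succeeds."""
    pts = []
    for img in views:
        ok, c = cv2.findChessboardCorners(img, PATTERN)
        assert ok, "board not found"
        pts.append(c)
    return pts

ir_l = corners(ir_left_views)
ir_r = corners(ir_right_views)
col = corners(color_views)
obj = [objp] * len(ir_l)
size = ir_left_views[0].shape[::-1]   # (width, height), grayscale assumed

# Steps (1) and (2): intrinsics give f_IR, (c_IR-x, c_IR-y), f_color, ...
_, K_ir, D_ir, _, _ = cv2.calibrateCamera(obj, ir_l, size, None, None)
_, K_col, D_col, _, _ = cv2.calibrateCamera(obj, col, size, None, None)

# Step (1), baseline of the IR pair; step (3) would repeat this with
# the color views in place of ir_r to obtain B_IR-COLOR.
_, _, _, _, _, R, T, _, _ = cv2.stereoCalibrate(
    obj, ir_l, ir_r, K_ir, D_ir, K_ir, D_ir, size,
    flags=cv2.CALIB_FIX_INTRINSIC)
B_ir_ir = float(np.linalg.norm(T))   # metres, since objp is in metres
```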
(4) The infrared projector projects invisible static near-infrared speckles into the three-dimensional scene in real time.
(5) The two infrared cameras capture two infrared images IR_left and IR_right of the three-dimensional scene.
(6) The color camera captures a color image Color of the three-dimensional scene.
(7) The attitude angle sensor measures the rotation angles Angle_X, Angle_Y, Angle_Z of the three cameras about the X, Y and Z axes.
(8) The USB hub transmits the two infrared images IR_left and IR_right, the color image Color, and the rotation angles Angle_X, Angle_Y, Angle_Z to the small processor.
(9) The small processor extracts Sobel edges from the two infrared images IR_left and IR_right, obtaining two Sobel edge images Sobel_left and Sobel_right.
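A minimal sketch of step (9) with OpenCV; combining the horizontal and vertical gradients into a single edge-magnitude image is our assumption, since the patent only states that Sobel edges are extracted.

```python
import cv2

def sobel_edges(ir_img):
    """Edge-magnitude image from horizontal and vertical Sobel gradients."""
    gx = cv2.Sobel(ir_img, cv2.CV_32F, 1, 0, ksize=3)  # gradient along u
    gy = cv2.Sobel(ir_img, cv2.CV_32F, 0, 1, ksize=3)  # gradient along v
    return cv2.convertScaleAbs(cv2.magnitude(gx, gy))

sobel_left = sobel_edges(ir_left)    # IR_left  -> Sobel_left
sobel_right = sobel_edges(ir_right)  # IR_right -> Sobel_right
```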
(10) With the left Sobel edge image Sobel_left as reference, block-based image matching is performed between Sobel_left and Sobel_right, yielding a set of well-matched effective points E = {e_1, e_2, e_3, ..., e_M}. In Sobel_left, each effective point is e = (u, v, d)^T, where u is the horizontal pixel coordinate, v is the vertical pixel coordinate and d is the disparity value.
(11) Taking the matched effective points E as a reference, every three effective points define a parallax plane; the equation of the i-th parallax plane is d = a_i·u + b_i·v + c_i, where a_i, b_i, c_i are its coefficients.
(12) On the basis of the parallax planes, each unmatched pixel point (u', v', d')^T is converted into a matched effective point (u, v, d)^T. Specifically, the distance dist_i from the pixel point (u', v', d')^T to the i-th parallax plane is computed, and an energy function Energy(d') is defined in terms of dist_i, where ε and σ are constants. All disparity values d' = d'_min, ..., d'_max in the disparity search range are traversed, and the disparity value that minimizes Energy(d') is taken as the disparity value d of the pixel point; further, u = u' and v = v'.
(13) All unmatched pixel points are traversed to obtain their disparity values, yielding the disparity image Disparity_left referenced to the left infrared camera.
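The plane-guided interpolation of steps (11)-(13) can be sketched as below. The concrete energy, a negative log of a Gaussian in the plane distance plus a small uniform floor ε, is an assumption: the patent states only that Energy(d') depends on the distance to the parallax plane and on the constants ε and σ.

```python
import numpy as np

def fit_parallax_plane(p1, p2, p3):
    """Coefficients (a_i, b_i, c_i) of d = a*u + b*v + c through 3 points."""
    A = np.array([[p[0], p[1], 1.0] for p in (p1, p2, p3)])
    d = np.array([p[2] for p in (p1, p2, p3)], dtype=float)
    return np.linalg.solve(A, d)          # assumes non-collinear points

def interpolate_disparity(u, v, plane, d_min, d_max, eps=0.05, sigma=1.0):
    a, b, c = plane
    d_plane = a * u + b * v + c           # disparity predicted by the plane
    cand = np.arange(d_min, d_max + 1, dtype=float)
    dist = np.abs(cand - d_plane)         # distance to the parallax plane
    energy = -np.log(eps + np.exp(-dist**2 / (2 * sigma**2)))  # assumed form
    return cand[np.argmin(energy)]        # d minimizing Energy(d')
```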
(14) Using the focal length f_IR and baseline distance B_IR-IR of the two infrared cameras, each point (u, v, d) of the disparity image is traversed and its depth value depth = f_IR·B_IR-IR/d is computed. Each point of the depth image is (u, v, depth), yielding the depth image Depth_left referenced to the left infrared camera.
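Step (14) is plain stereo triangulation; a sketch, assuming f_ir is in pixels and the baseline in metres:

```python
import numpy as np

def disparity_to_depth(disparity, f_ir, b_ir_ir):
    """Depth_left from Disparity_left via depth = f_IR * B_IR-IR / d."""
    depth = np.zeros_like(disparity, dtype=np.float32)
    valid = disparity > 0                 # d = 0 marks an unmatched pixel
    depth[valid] = f_ir * b_ir_ir / disparity[valid]
    return depth                          # metres
```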
(15) Using the depth image Depth_left and the color image Color, the focal length f_IR of the infrared cameras, the principal point position (c_IR-x, c_IR-y) of the left infrared camera, the focal length f_color and principal point position (c_COLOR-x, c_COLOR-y) of the color camera, and the baseline distance B_IR-COLOR between the left infrared camera and the color camera, the depth image is aligned with the color image to obtain the depth image Depth_color of the color camera's field of view.
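One possible reading of the alignment in step (15) is to back-project every pixel of Depth_left with the infrared intrinsics, translate by the IR-color baseline, and re-project with the color intrinsics. The purely horizontal displacement and the per-pixel loop are simplifying assumptions.

```python
import numpy as np

def align_depth_to_color(depth_left, f_ir, c_ir, f_col, c_col, b_ir_color):
    """Depth_color: Depth_left re-rendered from the color camera viewpoint."""
    h, w = depth_left.shape
    depth_color = np.zeros_like(depth_left)
    for v in range(h):
        for u in range(w):
            z = depth_left[v, u]
            if z <= 0:
                continue
            x = (u - c_ir[0]) * z / f_ir - b_ir_color  # shift into color frame
            y = (v - c_ir[1]) * z / f_ir
            uc = int(round(f_col * x / z + c_col[0]))
            vc = int(round(f_col * y / z + c_col[1]))
            if 0 <= uc < w and 0 <= vc < h:
                depth_color[vc, uc] = z
    return depth_color
```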
(16) From the depth image Depth_color, the focal length f_color of the color camera and its principal point position (c_COLOR-x, c_COLOR-y), the three-dimensional coordinates (X, Y, Z) of each point in the color camera coordinate system can be calculated. For a point of Depth_color with pixel coordinates (u, v) and depth value depth, the three-dimensional coordinates are given by equation (1):
X = (u - c_COLOR-x)·depth/f_color, Y = (v - c_COLOR-y)·depth/f_color, Z = depth (1)
(17) From the three-dimensional coordinates (X, Y, Z) of each point in the camera coordinate system and the rotation angles Angle_X = α, Angle_Y = β, Angle_Z = γ measured by the attitude angle sensor, the coordinates (X_w, Y_w, Z_w) of each point in the world coordinate system are calculated by equation (2):
(X_w, Y_w, Z_w)^T = R(α, β, γ)·(X, Y, Z)^T (2)
where R(α, β, γ) is the rotation matrix composed of the rotations about the X, Y and Z axes.
(18) The coordinate Y_w of each point in the world coordinate system is the vertical height of that point relative to the wearing position of the color camera, so a height image Height can be acquired.
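Steps (16)-(18) chain together as below: back-projection through formula (1), rotation into the world frame per formula (2), and extraction of Y_w as the height image. The X-Y-Z composition order of the rotation is an assumption, as the patent does not state it.

```python
import numpy as np

def height_image(depth, f_col, cx, cy, alpha, beta, gamma):
    """Height image: Y_w of every pixel of Depth_color in the world frame."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    X = (u - cx) * depth / f_col          # formula (1)
    Y = (v - cy) * depth / f_col
    Z = depth
    pts = np.stack([X, Y, Z]).reshape(3, -1)

    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    Rx = np.array([[1, 0, 0], [0, ca, -sa], [0, sa, ca]])
    Ry = np.array([[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]])
    Rz = np.array([[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]])
    world = Rx @ Ry @ Rz @ pts            # formula (2), assumed X-Y-Z order
    return world[1].reshape(h, w)         # Y_w = vertical height image
```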
(19) The height image Height is divided into K blocks from left to right, and the average height height_K of each block Height_K is calculated (K is generally between 2 and 10).
(20) The K blocks Height_K are represented by an ensemble of K musical instruments with different timbres, each block voiced by one instrument. Given the height H of the visually impaired user, the loudness Volume of each instrument decreases with the difference between the block's average height height_K and H: the closer height_K is to H, the closer the corresponding objects are to the ground, the more passable the terrain, and the larger the loudness Volume; the farther height_K is from H, the farther the objects are from the ground, the less passable the terrain, and the smaller the loudness Volume. The instrument sound for each direction is rendered in stereo. Instruments with distinctive, pleasant timbres, such as piano, violin, gong, trumpet or xylophone, may be chosen.
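Steps (19)-(20) might be realized as below. The linear loudness law and the equally spaced stereo panning are assumptions; the patent requires only that Volume shrink as |height_K - H| grows and that each block sound from its own direction.

```python
import numpy as np

def blocks_to_stereo(height_img, K=5, user_height=1.7, h_span=3.0):
    """Per-block (volume, pan) pairs driving K instruments, left to right."""
    out = []
    for k, blk in enumerate(np.array_split(height_img, K, axis=1)):
        height_k = float(np.nanmean(blk))             # average height_K
        volume = max(0.0, 1.0 - abs(height_k - user_height) / h_span)
        pan = 0.0 if K == 1 else -1.0 + 2.0 * k / (K - 1)  # -1 left, +1 right
        out.append((volume, pan))                     # instrument k's drive
    return out
```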
(21) The small processor transmits the stereo signal to the bone conduction earphone module.
(22) The bone conduction earphone module converts the stereo signal into a bone conduction vibration signal.
(23) The bone conduction vibration module transmits the bone conduction vibration signal to the visually impaired user.

Claims (1)

1. A pathway prediction system for visually impaired people based on an RGB-D camera and stereo sound, characterized by comprising an infrared projector, two identical infrared cameras, a color camera, an attitude angle sensor, a USB hub, a small processor, a bone conduction earphone module, two bone conduction vibration modules and a battery module; the infrared projector, the two infrared cameras, the color camera and the attitude angle sensor are connected to the small processor through the USB hub, and the battery module is connected to the small processor; the color camera and the infrared projector are positioned between the two infrared cameras; the optical axes of the two infrared cameras and the color camera are parallel to each other; the attitude angles of the three cameras are identical and are acquired in real time by the attitude angle sensor; the small processor controls the infrared projector to project invisible static near-infrared speckles onto the three-dimensional scene ahead, and the two infrared cameras collect two infrared images of the projected scene in real time; the color camera collects color images of the scene in real time; the USB hub transmits the two infrared images, the color image and the attitude angle information to the small processor; the small processor processes the two collected infrared images and the color image to obtain a depth image of the scene; the small processor processes the depth information and the attitude angle information to obtain a height image of the scene; the small processor divides the height image into blocks, converts the depth information after blocking into a stereo signal and transmits it to the bone conduction earphone module; the bone conduction earphone module converts the stereo signal into a bone conduction vibration signal and transmits it to the two bone conduction vibration modules; the two bone conduction vibration modules transmit the bone conduction vibration signals to the visually impaired user; the path prediction method of the system comprises the following steps:
(1) calibrating the two infrared cameras once to obtain their common focal length f_IR, the principal point position (c_IR-x, c_IR-y) of the left infrared camera, and the baseline distance B_IR-IR between the two infrared cameras;
(2) calibrating the color camera once to obtain its focal length f_color and principal point position (c_COLOR-x, c_COLOR-y);
(3) performing binocular calibration of the color camera and the left infrared camera once to obtain the baseline distance B_IR-COLOR between them;
(4) the infrared projector projects invisible static near-infrared speckles into the three-dimensional scene in real time;
(5) the two infrared cameras capture two infrared images IR_left and IR_right of the three-dimensional scene;
(6) the color camera captures a color image Color of the three-dimensional scene;
(7) the attitude angle sensor measures the rotation angles Angle_X, Angle_Y, Angle_Z of the three cameras about the X, Y and Z axes;
(8) the USB hub transmits the two infrared images IR_left and IR_right, the color image Color, and the rotation angles Angle_X, Angle_Y, Angle_Z to the small processor;
(9) the small processor extracts Sobel edges from the two infrared images IR_left and IR_right, obtaining two Sobel edge images Sobel_left and Sobel_right;
(10) with the left Sobel edge image Sobel_left as reference, block-based image matching is performed between Sobel_left and Sobel_right, yielding a set of well-matched effective points E = {e_1, e_2, e_3, ..., e_M}; in Sobel_left, each effective point is e = (u, v, d)^T, where u is the horizontal pixel coordinate, v is the vertical pixel coordinate and d is the disparity value;
(11) taking the matched effective points E as a reference, every three effective points define a parallax plane, the equation of the i-th parallax plane being d = a_i·u + b_i·v + c_i, where a_i, b_i, c_i are its coefficients;
(12) on the basis of the parallax planes, each unmatched pixel point (u', v', d')^T is converted into a matched effective point (u, v, d)^T, specifically: the distance dist_i from the pixel point (u', v', d')^T to the i-th parallax plane is computed, and an energy function Energy(d') is defined in terms of dist_i, where ε and σ are constants; all disparity values d' = d'_min, ..., d'_max in the disparity search range are traversed, and the disparity value minimizing Energy(d') is taken as the disparity value d of the pixel point; further, u = u' and v = v';
(13) all unmatched pixel points are traversed to obtain their disparity values, yielding the disparity image Disparity_left referenced to the left infrared camera;
(14) using the focal length f_IR and baseline distance B_IR-IR of the two infrared cameras, each point (u, v, d) of the disparity image is traversed and its depth value depth = f_IR·B_IR-IR/d is computed; each point of the depth image is (u, v, depth), yielding the depth image Depth_left referenced to the left infrared camera;
(15) using the depth image Depth_left and the color image Color, the focal length f_IR of the infrared cameras, the principal point position (c_IR-x, c_IR-y) of the left infrared camera, the focal length f_color and principal point position (c_COLOR-x, c_COLOR-y) of the color camera, and the baseline distance B_IR-COLOR between the left infrared camera and the color camera, the depth image is aligned with the color image to obtain the depth image Depth_color of the color camera's field of view;
(16) from the depth image Depth_color, the focal length f_color of the color camera and its principal point position (c_COLOR-x, c_COLOR-y), the three-dimensional coordinates (X, Y, Z) of each point in the color camera coordinate system are calculated; for a point of Depth_color with pixel coordinates (u, v) and depth value depth, the three-dimensional coordinates are given by equation (1):
X = (u - c_COLOR-x)·depth/f_color, Y = (v - c_COLOR-y)·depth/f_color, Z = depth (1);
(17) from the three-dimensional coordinates (X, Y, Z) of each point in the camera coordinate system and the rotation angles Angle_X = α, Angle_Y = β, Angle_Z = γ measured by the attitude angle sensor, the coordinates (X_w, Y_w, Z_w) of each point in the world coordinate system are calculated by equation (2):
(X_w, Y_w, Z_w)^T = R(α, β, γ)·(X, Y, Z)^T (2),
where R(α, β, γ) is the rotation matrix composed of the rotations about the X, Y and Z axes;
(18) the coordinate Y_w of each point in the world coordinate system, namely the vertical height of that point relative to the wearing position of the color camera, yields a height image Height;
(19) the height image Height is divided into K blocks from left to right, and the average height height_K of each block Height_K is calculated, the value of K generally being between 2 and 10;
(20) the K blocks Height_K are represented by an ensemble of K musical instruments with different timbres, each block voiced by one instrument; given the height H of the visually impaired user, the loudness Volume of each instrument decreases with the difference between the block's average height height_K and H: the closer height_K is to H, the closer the corresponding objects are to the ground, the more passable the terrain, and the larger the loudness Volume; the farther height_K is from H, the farther the objects are from the ground, the less passable the terrain, and the smaller the loudness Volume; the instrument sound for each direction is rendered in stereo;
(21) the small processor transmits the stereo signal to the bone conduction earphone module;
(22) the bone conduction earphone module converts the stereo signal into a bone conduction vibration signal;
(23) the bone conduction vibration module transmits the bone conduction vibration signal to the visually impaired user.
CN201611048370.8A 2016-11-23 2016-11-23 System and method for predicting pathway of visually impaired people based on RGB-D camera and stereo Active CN107341789B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611048370.8A CN107341789B (en) 2016-11-23 2016-11-23 System and method for predicting pathway of visually impaired people based on RGB-D camera and stereo

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611048370.8A CN107341789B (en) 2016-11-23 2016-11-23 System and method for predicting pathway of visually impaired people based on RGB-D camera and stereo

Publications (2)

Publication Number Publication Date
CN107341789A CN107341789A (en) 2017-11-10
CN107341789B (en) 2019-12-17

Family

ID=60222763

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611048370.8A Active CN107341789B (en) 2016-11-23 2016-11-23 System and method for predicting pathway of visually impaired people based on RGB-D camera and stereo

Country Status (1)

Country Link
CN (1) CN107341789B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6955783B2 (en) 2018-01-10 2021-10-27 達闥機器人有限公司Cloudminds (Shanghai) Robotics Co., Ltd. Information processing methods, equipment, cloud processing devices and computer program products
CN108245385B (en) * 2018-01-16 2019-10-29 曹醒龙 A kind of device helping visually impaired people's trip
CN108876798B (en) * 2018-06-12 2022-03-18 杭州视氪科技有限公司 Stair detection system and method
CN109084700B (en) * 2018-06-29 2020-06-05 上海摩软通讯技术有限公司 Method and system for acquiring three-dimensional position information of article
CN110399807B (en) * 2019-07-04 2021-07-16 达闼机器人有限公司 Method and device for detecting ground obstacle, readable storage medium and electronic equipment
CN111932866A (en) * 2020-08-11 2020-11-13 中国科学技术大学先进技术研究院 Wearable blind person outdoor traffic information sensing equipment
CN112700484A (en) * 2020-12-31 2021-04-23 南京理工大学智能计算成像研究院有限公司 Depth map colorization method based on monocular depth camera
CN114724053B (en) * 2022-04-11 2024-02-20 合肥工业大学 Outdoor visual impairment assisting method based on deep intelligent interaction

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104021388A (en) * 2014-05-14 2014-09-03 西安理工大学 Reversing obstacle automatic detection and early warning method based on binocular vision
CN204766392U (en) * 2015-05-14 2015-11-18 广州龙天软件科技有限公司 Lead blind information processing apparatus
CN105701811A (en) * 2016-01-12 2016-06-22 浙江大学 Sound coding interaction method based on RGB-IR camera

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104021388A (en) * 2014-05-14 2014-09-03 西安理工大学 Reversing obstacle automatic detection and early warning method based on binocular vision
CN204766392U (en) * 2015-05-14 2015-11-18 广州龙天软件科技有限公司 Lead blind information processing apparatus
CN105701811A (en) * 2016-01-12 2016-06-22 浙江大学 Sound coding interaction method based on RGB-IR camera

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A Ground and Obstacle Detection Algorithm for the Visually Impaired; Ruiqi Cheng et al.; ICBISP 2015; 2015-11-19; Abstract, Sections 2-3 and 6 *

Also Published As

Publication number Publication date
CN107341789A (en) 2017-11-10

Similar Documents

Publication Publication Date Title
CN107341789B (en) System and method for predicting pathway of visually impaired people based on RGB-D camera and stereo
CN106203390B (en) A kind of intelligent blind auxiliary system
CN106597690B (en) One kind predicting glasses based on RGB-D camera and stereosonic visually impaired people's access
CN106846350B (en) One kind is based on RGB-D camera and stereosonic visually impaired people's barrier early warning system and method
US9370459B2 (en) System and method for alerting visually impaired users of nearby objects
CN106817577B (en) One kind is based on RGB-D cameras and stereosonic visually impaired people's barrier early warning glasses
US7755744B1 (en) Environment sensor that conveys information about objects in the vicinity of the visually impaired user
US20090122161A1 (en) Image to sound conversion device
US9801778B2 (en) System and method for alerting visually impaired users of nearby objects
Dunai et al. Sensory navigation device for blind people
US20060098089A1 (en) Method and apparatus for a multisensor imaging and scene interpretation system to aid the visually impaired
US10579138B2 (en) Head-mounted sensor system
CN106651873B (en) One kind detecting glasses based on RGB-D camera and stereosonic visually impaired people's zebra stripes
CN108245385A (en) A kind of device for helping visually impaired people's trip
CN105686936A (en) Sound coding interaction system based on RGB-IR camera
CN106821692A (en) One kind is based on RGB D cameras and stereosonic visually impaired people's stair detecting system and method
WO2018119403A1 (en) Head mounted sensor system
CN105701811B (en) A kind of acoustic coding exchange method based on RGB-IR cameras
CN106920260B (en) Three-dimensional inertial blind guiding method, device and system
Vítek et al. New possibilities for blind people navigation
Dunai et al. Virtual sound localization by blind people
CN107049717B (en) One kind is based on RGB-D camera and stereosonic visually impaired people's zebra stripes detection system and method
CN107817614B (en) It is a kind of for hiding blind person's auxiliary eyeglasses of the water surface and barrier
CN117323185A (en) Blind person indoor navigation system and method based on computer vision and training method
Hossain et al. State of the art review on walking support system for visually impaired people

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant